{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "oCetcC1MQz13"
   },
   "source": [
    "# Using pycorese\n",
    "\n",
    "This notebook demonstrates how to use the **pycorese** package:\n",
    "\n",
    "- to load a knowledge graph\n",
    "- to perform a SPARQL query\n",
    "- to validate a graph against SHACL shapes\n",
    "- to access the classes of the Corese Java API"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "tZjvQGgGe64i"
   },
   "source": [
    "## Install pycorese"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nFeJr1PbQz18"
   },
   "source": [
    "Java Runtime Environment (JRE) 11 or higher is required to run **pycorese**.\n",
    "\n",
    "If you don't have Java installed, please refer to the [official website](https://www.java.com/en/download/help/download_options.html) to download and install it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "IKx255qaQz1_",
    "outputId": "29b40851-6439-459b-c5f5-1e8cb89f7e84"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "openjdk version \"11.0.25\" 2024-10-15\n",
      "OpenJDK Runtime Environment (build 11.0.25+9-post-Ubuntu-1ubuntu122.04)\n",
      "OpenJDK 64-Bit Server VM (build 11.0.25+9-post-Ubuntu-1ubuntu122.04, mixed mode, sharing)\n"
     ]
    }
   ],
   "source": [
    "!java -version"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "QzKUfvL8Qz2G"
   },
   "source": [
    "**pycorese** is available on PyPI and can be installed using pip:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "OdY7kuBeQz2I",
    "outputId": "f0deca77-241c-4c58-970c-2906bcbc4078"
   },
   "outputs": [],
   "source": [
    "!pip install pycorese"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pRlL21fgQz2M"
   },
   "source": [
    "Download the data files from the GitHub repository:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "kOvrNs-ze64n",
    "outputId": "731259ca-8854-4497-fadc-aca4b4ec3714"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "beatles.rdf  beatles-validator.ttl\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import sys\n",
    "\n",
    "# Download the example files only if they are not already present.\n",
    "if not os.path.exists('./data/beatles.rdf'):\n",
    "    print('Downloading the data files...')\n",
    "    # os.makedirs works on every platform, unlike `mkdir -p`.\n",
    "    os.makedirs('./data', exist_ok=True)\n",
    "    !wget https://raw.githubusercontent.com/corese-stack/corese-python/main/examples/data/beatles.rdf -O ./data/beatles.rdf\n",
    "    !wget https://raw.githubusercontent.com/corese-stack/corese-python/main/examples/data/beatles-validator.ttl -O ./data/beatles-validator.ttl\n",
    "\n",
    "# List the data directory; the listing command depends on the platform.\n",
    "if sys.platform == 'win32':\n",
    "    !dir /b .\\data\\*.*\n",
    "else:\n",
    "    !ls ./data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PLBixnURe64o"
   },
   "source": [
    "### Connect to Corese API\n",
    "\n",
    "This section demonstrates loading and querying data with `CoreseAPI`, which connects to Corese through either the `Py4J` or the `JPype` package. If you don't specify the Java bridge type, `Py4J` is used by default."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "id": "wN4TDhjXe64p"
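   },
   "outputs": [],
   "source": [
    "#%%timeit -n 1 -r 1\n",
    "from pycorese.api import CoreseAPI\n",
    "\n",
    "# Choose the Python-to-Java bridge; Py4J is the default, JPype is the alternative.\n",
    "python_to_java_bridge = 'py4j'\n",
    "corese = CoreseAPI(java_bridge=python_to_java_bridge)\n",
    "\n",
    "# Load the Corese Java library through the selected bridge.\n",
    "corese.loadCorese()"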
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7WzP7gCle64p"
   },
   "source": [
    "### High-level API"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1hHYhnIve64p"
   },
   "source": [
    "#### Run SELECT query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "KiVYUBGhe64p",
    "outputId": "786d7754-23a2-4ba6-800d-e36bd199adc7"
   },
   "outputs": [
    {
     "data": {
      "application/vnd.google.colaboratory.intrinsic+json": {
       "summary": "{\n  \"name\": \"results\",\n  \"rows\": 5,\n  \"fields\": [\n    {\n      \"column\": \"subject\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 3,\n        \"samples\": [\n          \"http://example.com/Please_Please_Me\",\n          \"http://example.com/McCartney\",\n          \"http://example.com/Imagine\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"p\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 2,\n        \"samples\": [\n          \"http://example.com/date\",\n          \"http://example.com/artist\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"o\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 5,\n        \"samples\": [\n          \"http://example.com/Paul_McCartney\",\n          \"1970-04-17\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    }\n  ]\n}",
       "type": "dataframe",
       "variable_name": "results"
      },
      "text/html": [
       "\n",
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>subject</th>\n",
       "      <th>p</th>\n",
       "      <th>o</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>http://example.com/Please_Please_Me</td>\n",
       "      <td>http://example.com/artist</td>\n",
       "      <td>http://example.com/The_Beatles</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>http://example.com/McCartney</td>\n",
       "      <td>http://example.com/artist</td>\n",
       "      <td>http://example.com/Paul_McCartney</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>http://example.com/Imagine</td>\n",
       "      <td>http://example.com/artist</td>\n",
       "      <td>http://example.com/John_Lennon</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>http://example.com/Please_Please_Me</td>\n",
       "      <td>http://example.com/date</td>\n",
       "      <td>1963-03-22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>http://example.com/McCartney</td>\n",
       "      <td>http://example.com/date</td>\n",
       "      <td>1970-04-17</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "\n",
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>type</th>\n",
       "      <th>focusNode</th>\n",
       "      <th>resultMessage</th>\n",
       "      <th>resultPath</th>\n",
       "      <th>resultSeverity</th>\n",
       "      <th>sourceConstraintComponent</th>\n",
       "      <th>sourceShape</th>\n",
       "      <th>value</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>o</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>urn:uuid:66d7b5ea-0065-4f84-b0e4-d65ba0b16a11</th>\n",
       "      <td>http://www.w3.org/ns/shacl#ValidationResult</td>\n",
       "      <td>http://example.com/Love_Me_Do</td>\n",
       "      <td>Fail at: [sh:minCount 1 ;\n",
       "  sh:nodeKind sh:IRI...</td>\n",
       "      <td>http://example.com/performer</td>\n",
       "      <td>http://www.w3.org/ns/shacl#Violation</td>\n",
       "      <td>http://www.w3.org/ns/shacl#MinCountConstraintC...</td>\n",
       "      <td>_:b9</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "