Imported all the notebooks
[tm351-notebooks.git] / notebooks / Example Linked Data Notebooks / LD.02.2 OU LInked Data Demo - Multiple Graphs.ipynb
diff --git a/notebooks/Example Linked Data Notebooks/LD.02.2 OU LInked Data Demo - Multiple Graphs.ipynb b/notebooks/Example Linked Data Notebooks/LD.02.2 OU LInked Data Demo - Multiple Graphs.ipynb
new file mode 100644 (file)
index 0000000..06074de
--- /dev/null
@@ -0,0 +1,360 @@
+{
+ "metadata": {
+  "name": "",
+  "signature": "sha256:3380b46353ffd8c3993f20058d404949d079b28375c35403650ab24c7dccd593"
+ },
+ "nbformat": 3,
+ "nbformat_minor": 0,
+ "worksheets": [
+  {
+   "cells": [
+    {
+     "cell_type": "heading",
+     "level": 1,
+     "metadata": {},
+     "source": [
+      "OU Linked Data Demo - Multiple Graphs"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The Open Univesity Linked Data platform is home to several Linked Data graphs that cover:\n",
+      "\n",
+      "- OU course (i.e. module) and qualifications descriptions;\n",
+      "- the OU's Key Information Set (KIS) from Unistats;\n",
+      "- articles published in the Explore area of OpenLearn, as well as information about learning units on OpenLearn;\n",
+      "- Open Research Online (the OU's research repository);\n",
+      "- information about OU podcasts and data relating to the OU's Youtube channels, playlists and video.\n",
+      "\n",
+      "We can search across data from all these graphs at the same time, or specify which graph we wish to pull our results from."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "As ever, we need to set up our environment with the SPARQL endpoint and some helper routines."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "from SPARQLWrapper import SPARQLWrapper, JSON\n",
+      "endpoint_ou=\"http://data.open.ac.uk/query\""
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "#We should perhaps show how to create a simple package in the first OU notebook that we can then just import?\n",
+      "def runQuery(endpoint,prefix,q):\n",
+      "    ''' Run a SPARQL query with a declared prefix over a specified endpoint '''\n",
+      "    sparql = SPARQLWrapper(endpoint)\n",
+      "    sparql.setQuery(prefix+q)\n",
+      "    sparql.setReturnFormat(JSON)\n",
+      "    return sparql.query().convert()\n",
+      "\n",
+      "import pandas as pd\n",
+      "#And some more helpers\n",
+      "def dict2df(results):\n",
+      "    ''' Hack a function to flatten the SPARQL query results and return the column values '''\n",
+      "    data=[]\n",
+      "    for result in results[\"results\"][\"bindings\"]:\n",
+      "        tmp={}\n",
+      "        for el in result:\n",
+      "            tmp[el]=result[el]['value']\n",
+      "        data.append(tmp)\n",
+      "\n",
+      "    df = pd.DataFrame(data)\n",
+      "    return df\n",
+      "\n",
+      "def dfResults(endpoint,prefix,q):\n",
+      "    ''' Generate a data frame containing the results of running\n",
+      "        a SPARQL query with a declared prefix over a specified endpoint '''\n",
+      "    return dict2df( runQuery( endpoint, prefix, q ) )\n"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "To run a query that will pull responses just from a specified graph within the OU Linked Data platform, we specify the required graph using the `FROM` statement."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "prefix='''\n",
+      "PREFIX mlo: <http://purl.org/net/mlo/>\n",
+      "PREFIX aiiso: <http://purl.org/vocab/aiiso/schema#>\n",
+      "'''\n",
+      "\n",
+      "q='''\n",
+      "SELECT ?course \n",
+      "FROM <http://data.open.ac.uk/context/course>\n",
+      "WHERE {\n",
+      "    ?course mlo:location <http://sws.geonames.org/2328926/> .\n",
+      "    ?course a aiiso:Module.\n",
+      "    ?course <http://data.open.ac.uk/saou/ontology#OUCourseLevel> \"1\"^^<http://www.w3.org/2001/XMLSchema#string>.\n",
+      "}\n",
+      "'''\n",
+      "dfResults(endpoint_ou,prefix,q)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We can count the number of triples associated with a graph by using a `COUNT()` statement applied to a query that specifies all the triples in the graph."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "q='''\n",
+      "SELECT COUNT(*) \n",
+      "FROM <http://data.open.ac.uk/context/course>\n",
+      "WHERE {?x ?y ?z}\n",
+      "'''\n",
+      "runQuery(endpoint_ou,'',q)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Compare this result to running a query over the whole platfrom, without a particular graph specified."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "q='''\n",
+      "SELECT COUNT(*) \n",
+      "WHERE {?x ?y ?z}\n",
+      "'''\n",
+      "\n",
+      "runQuery(endpoint_ou,'',q)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "According the data.open.ac.uk webpages, the following subgraphs are defined in the OU datastore:"
+     ]
+    },
+    {
+     "cell_type": "raw",
+     "metadata": {},
+     "source": [
+      "http://data.open.ac.uk/context/podcast\n",
+      "http://data.open.ac.uk/context/qualification\n",
+      "http://data.open.ac.uk/context/youtube\n",
+      "http://data.open.ac.uk/context/people/kmifoaf\n",
+      "http://data.open.ac.uk/context/openlearn\n",
+      "http://data.open.ac.uk/context/kmiplanet\n",
+      "http://data.open.ac.uk/context/oro\n",
+      "http://data.open.ac.uk/context/openlearnexplore\n",
+      "http://data.open.ac.uk/context/kis\n",
+      "http://data.open.ac.uk/context/course\n",
+      "http://data.open.ac.uk/context/audioboo"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We can look to see whether there are any additional, perhaps undocumented, graphs by running a query that requests distinct graph names.\n",
+      "\n",
+      "(A query such as this is a really useful one to have in your toolbox when exploring a new endpoint!)"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "q='''\n",
+      "SELECT DISTINCT ?g { GRAPH ?g { } }\n",
+      "'''\n",
+      "runQuery(endpoint_ou,'',q)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "This more direct query suggests that there is a graph describing BBC resources. The Open University co-produces a wide range of content for broadcast with the BBC, some of which is also associated with OU courses. Perhaps this graph describes some of those resources?"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "q='''\n",
+      "SELECT ?x ?y ?z \n",
+      "FROM <http://data.open.ac.uk/context/bbc>\n",
+      "WHERE {?x ?y ?z} LIMIT 10\n",
+      "'''\n",
+      "runQuery(endpoint_ou,'',q)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "q='''\n",
+      "SELECT DISTINCT ?property FROM <http://data.open.ac.uk/context/bbc> WHERE {\n",
+      " ?x ?property ?y.\n",
+      "}\n",
+      "'''\n",
+      "\n",
+      "runQuery(endpoint_ou,prefix,q)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The http://data.open.ac.uk/bbc/ontology/relatesToCourse property looks interesting - maybe we can use it to find BBC resources associated with a course resource? Let's try it and see:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "q='''\n",
+      "SELECT ?prog ?course FROM <http://data.open.ac.uk/context/bbc> WHERE {\n",
+      " ?prog <http://data.open.ac.uk/bbc/ontology/relatesToCourse> <http://data.open.ac.uk/course/s175>.\n",
+      "}\n",
+      "'''\n",
+      "\n",
+      "runQuery(endpoint_ou,prefix,q)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "How about if we try to pull back resources from the BBC graph that are associated with things that are modules?"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Hmm... it seems as if the http://data.open.ac.uk/context/bbc graph does not contain that information. In this case, we need to search over the courses graph as well."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "q='''\n",
+      "SELECT ?prog ?course\n",
+      "FROM <http://data.open.ac.uk/context/bbc>\n",
+      "FROM <http://data.open.ac.uk/context/course>\n",
+      "WHERE {\n",
+      " ?prog <http://data.open.ac.uk/bbc/ontology/relatesToCourse> ?course.\n",
+      " ?course a aiiso:Module.\n",
+      "}\n",
+      "'''\n",
+      "\n",
+      "runQuery(endpoint_ou,prefix,q)\n"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Alternatively, we could have searched over every graph in the platform."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "q='''\n",
+      "SELECT ?prog ?course ?level WHERE {\n",
+      " ?prog <http://data.open.ac.uk/bbc/ontology/relatesToCourse> ?course.\n",
+      " ?course a aiiso:Module.\n",
+      " ?course <http://data.open.ac.uk/saou/ontology#OUCourseLevel> ?level.\n",
+      "}\n",
+      "'''\n",
+      "\n",
+      "runQuery(endpoint_ou,prefix,q)\n"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "What do you think are the advabtages and disadvantages of searching over a specified graph or set of graphs within a platform compared to a search over all the graphs in a platform?"
+     ]
+    },
+    {
+     "cell_type": "heading",
+     "level": 2,
+     "metadata": {},
+     "source": [
+      "Activity"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Spend thirty minutes or so exploring some of the other Open University Linked Data Platform graphs.\n",
+      "\n",
+      "For example, explore the qualifications graph and see how it relates to the courses graph.\n",
+      "\n",
+      "Try to come up with some interesting queries that work over two or three specified graphs.\n",
+      "\n",
+      "Share any notebooks you produce in the Open Design Studio."
+     ]
+    }
+   ],
+   "metadata": {}
+  }
+ ]
+}
\ No newline at end of file