SPARQL Function

LDScript: a Linked Data Script Language

Created: 2015, November 5th
Modified: 2020, December 1st

Authors

Olivier Corby <olivier.corby@inria.fr>
Catherine Faron Zucker <faron@i3s.unice.fr>
Fabien Gandon <fabien.gandon@inria.fr>

Abstract

This document defines a function definition language on top of the SPARQL filter language. It enables users to define and use simple extension functions directly in (extended) SPARQL. The body of a function is written using the SPARQL filter language augmented with additional statements.

Table of contents

1 Introduction
  1.1 Relationship to W3C Recommendations
2 Function Definition
  2.1 Function
  2.2 Anonymous Function
  2.3 Annotation
3 Statement
  3.1 SPARQL
  3.2 Let
  3.3 For
  3.4 Pattern Matching
  3.5 If Then Else
  3.6 Return
  3.7 Error
  3.8 Safe
4 Second Order Function
  4.1 Funcall
  4.2 Apply
  4.3 Map
  4.4 Reduce
5 Predefined Extension Function
  5.1 General Purpose
  5.2 SPARQL Transformation
  5.3 SHACL
6 Datatype
  6.1 LDScript Datatype
  6.2 RDF Datatype
  6.3 SPARQL Datatype
7 Language Syntax
8 SPARQL Extension
  8.1 LDScript in SPARQL
  8.2 Aggregate
  8.3 Values Unnest
  8.4 Property Path Variable
  8.5 Named Graph Pattern
9 Use Case
  9.1 Functional Property
  9.2 Functional Service
  9.3 Approximate Match
  9.4 Recursive Match
  9.5 Event Driven Function
  9.6 Predefined Query
10 Implementation
11 Conclusion

 

1 Introduction

In addition to the existing standards dedicated to representation or querying, Semantic Web programmers would benefit from a dedicated programming language enabling them to define functions directly on RDF terms, RDF graphs or SPARQL results. This is especially the case when defining SPARQL extension functions. We propose a function definition language on top of the SPARQL filter language by introducing a function clause. It enables users to define and use simple extension functions directly in (extended) SPARQL. The body of a function is written using the SPARQL filter language augmented with additional statements. The language can be seen as a kind of SPARQLScript with respect to SPARQL, in the spirit of JavaScript with respect to HTML.

LDScript is provided with extension datatypes that enable programmers to manipulate RDF objects such as graphs and triples as well as XML and JSON objects in a seamless way.

Example

The example below defines and uses the factorial function. Function definitions occur just after query definition.

select *
where {
  ?x rdf:value ?n
  filter (?n >= us:fac(10))
}

function us:fac(?n) {
  if (?n = 0, 1, ?n * us:fac(?n - 1))
}

For the sake of usability, LDScript variables outside a SPARQL query can be written without "?", as shown below. The variables n and ?n are the same variable.

function us:fac(n) {
  if (n = 0, 1, n * us:fac(n - 1))
}

1.1 Relationship to W3C Recommendations

This proposition is strongly related to SPARQL 1.1 Query Language and to RDF 1.1 Concepts and Abstract Syntax.

 

2 Function Definition

The language is built on top of SPARQL filter language extended with the statements defined in this proposition. The objects of the language are SPARQL variables and RDF terms: URI, literal, blank node. The language objects can also be RDF triple and graph as well as SPARQL query solution (mapping) and solution sequence (mappings). A list datatype is also introduced whose elements can be any of the objects listed above, including lists. List elements do not need to be of the same kind or of the same datatype. Triple, graph, mapping, mappings and list are managed as RDF literals with extension datatypes: dt:triple, dt:graph, dt:mapping, dt:mappings and dt:list. Their content is accessed by pattern matching and they are iterable. We call LDScript terms the union of RDF terms and LDScript literals with extension datatype in the dt: namespace.

In this document, we use the prefixes and namespaces shown below:

prefix rq:  <http://ns.inria.fr/sparql-function/>
prefix dt:  <http://ns.inria.fr/sparql-datatype/>
prefix st:  <http://ns.inria.fr/sparql-template/>
prefix xt:  <http://ns.inria.fr/sparql-extension/>
prefix us:  <http://ns.inria.fr/sparql-extension/user/>
prefix dom: <http://ns.inria.fr/sparql-extension/dom/>
prefix sh:  <http://www.w3.org/ns/shacl#> 

2.1 Function

The function statement defines a function that can be used in a query, a rule, a template or another function. The name of a function is a URI; a function can have zero, one or several arguments, which are variables. Function overloading is provided: several functions can be defined with the same name and a different number of arguments. Functions can call other LDScript functions (including themselves), SPARQL functions or extension functions. The body is a sequence of expressions. The result of the function is the result of the last expression of the body, or the result of the return statement if any.

function us:fun(x, y) {
  x + y
}
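
As an illustration of overloading, the sketch below defines two functions with the same name and a different number of arguments. The name us:area is hypothetical and is not part of LDScript.

function us:area(side) {
  us:area(side, side)
}

function us:area(width, height) {
  width * height
}

With these definitions, us:area(3) is expected to call the two-argument version and return 9.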

The parameters and the result of a function may be typed as shown below.

function xsd:integer us:fun(xsd:integer x, xsd:integer y) {
  x + y
}

2.2 Anonymous Function

This statement defines an anonymous function that can be used with second order functions such as: apply, funcall, map and reduce. As it is an expression of the language, it can be bound to a variable, passed to a function call as parameter, it can be an element of a list, etc. Compiling an anonymous function produces a function definition with a generated URI. This URI is transparently used at runtime to call and execute the function.

function(x) { 1 / (x * x) }
maplist(function(x) { 1 / (x * x) } , xt:iota(5))
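
Since an anonymous function is an expression, it can for instance be bound to a local variable and called later with funcall. A minimal sketch, where the variable name square is an arbitrary choice:

let (square = function(x) { x * x }) {
  funcall(square, 4)
}

Under the semantics described in this document, the let statement above should return 16.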

2.3 Annotation

Public

The @public annotation exports function definitions to the SPARQL interpreter so that subsequent SPARQL queries can use them within the current runtime session.

@public 
function us:foo() {
  xt:display("Hello World")
}

@public  {

  function us:bar(x, y) {
    us:gee(x * y)
  }
  
  function us:gee(x) {
    x * x
  }
}

 

3 Statement

This section details LDScript statements.

3.1 SPARQL

LDScript inherits the statements of the SPARQL filter language, including the exists clause, as well as SPARQL select and construct queries and Update queries. These statements are evaluated with the Dataset of the embedding SPARQL query that runs the LDScript function. For syntactic reasons, SPARQL queries are embedded in a query statement, except in the let and for statements where it can be avoided.

query(select ?x ?y where { ?x foaf:knows ?y })

The result of a select (resp. construct) query is a dt:mappings (resp. dt:graph) datatype extension literal. These datatypes act as "pointers" to the underlying data structure that implements the result of the query.

datatype(query(select ?x ?y where {?x foaf:knows ?y})) = dt:mappings
datatype(query(construct    where {?x foaf:knows ?y})) = dt:graph

At runtime, variables that are bound in the LDScript stack and that are projected in the select clause of a query are dynamically bound in the where clause. They are bound using an extended values clause that is generated dynamically. It is extended because it accepts blank node values in addition to URIs and literals. In the example below, we call us:foo(v): the value of ?x is v in the stack and an appropriate values clause is dynamically generated.

us:foo(v) 

function us:foo(?x) {
  query(select ?x ?y where { values ?x { v }  ?x foaf:knows ?y })
}

For construct queries, variables that are in-scope in the where clause and that are bound in the LDScript stack are dynamically bound in the where clause using an extended values clause.

For the exists { BGP } clause, variables that are in-scope in the BGP and that are bound in the LDScript stack are dynamically bound in the BGP using an extended values clause.
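
For instance, the hypothetical function below tests whether its argument knows somebody. When it is called with a value v, the variable ?x of the basic graph pattern is expected to be bound to v, so that only the triples whose subject is v are considered.

function us:hasFriend(?x) {
  exists { ?x foaf:knows ?y }
}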

Statements such as if, bound, coalesce are also available. SPARQL predefined functions are also available with the rq: prefix, e.g. rq:contains.

Query and Anonymous Function

It is worth noticing that, as any statement, a SPARQL query can be embedded in an anonymous function.

let (query = function() { query(select .. where ..) }) {
    datatype(funcall(query)) = dt:mappings
}

3.2 Let

The let statement defines local variables whose scope is the body of the let statement. The result of the statement is the result of the last expression of the body, or the result of the return statement if any.

let (z = x + y, t = 2 * z) {
  us:foo(t)
}

Dynamic Let

The dynamic let statement is a variant of the let statement where the scope of the declared variable is not only the body but also the functions that are called in the body (and recursively). In the example below, the variable x in the anonymous function is bound by the dynamic let.

letdyn (x = exp) { maplist(function(y) { us:fun(x, y) }, list) }

Let List

The let statement enables users to map list elements to variables. The number of variables may be less than the size of the list.

let ((x y z) = list) {
  us:foo(x, y, z)
}

The left argument can be a list of lists of variables.

let (((x y), (z t)) = @((1 2)(3 4))) {
  
}

Let Select Query

The let statement can have a select-where query as argument. In this case, the variables in the select clause are defined and bound, in the body of the let clause, with the values of the first query solution. If a variable has no value in the first solution (e.g. due to an optional), the body is executed with the variable left unbound. If there is no solution, the body is executed with all select variables left unbound. These cases can be trapped in the body by the bound or coalesce functions.

let (select ?x ?y where { ?x foaf:knows ?y }) {
    us:bar(?x, ?y)
}
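
The sketch below illustrates how the body can handle the case where the query has no solution. The function name us:friend and the default value us:nobody are hypothetical.

function us:friend(?x) {
  let (select ?x ?y where { ?x foaf:knows ?y }) {
    if (bound(?y), ?y, us:nobody)
  }
}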

If the left argument is a list of variables, each variable is bound to the corresponding query solution (a mapping) in order.

let ((s1 s2) = select * where { ?x foaf:knows ?y }) {
  us:foo(s1, s2)
}

If the left argument is a list of lists of variables, each variable is bound to the value of the corresponding variable (with the same name) in the first query solution.

let (((x y)) = select * where { ?x foaf:knows ?y }) {
  us:foo(x, y)
}

Let Construct Query

The let statement can take as second argument a construct-where query. The value of the variable is the RDF result graph.

function us:foo(?x) {
    let (g = construct where { ?x foaf:knows ?y}) {
        g
    }
}

Variables in-scope in the where clause that are bound in the LDScript stack are bound in the where clause using an extended values clause generated at runtime. The query above is evaluated as shown below if the value of ?x is v in the stack.

function us:foo(?x) {
    let (g = construct where { values ?x { v }  ?x foaf:knows ?y}) {
        g
    }
}

Set

This statement assigns a value to a variable.

set (var = val)
set (x = x + 1)

Global variable

Local variables are defined by let (var = exp), for (var in exp) and function us:fun(var), whereas set(var = exp) assigns the result of an expression to a variable.

Global variables are defined by set(var = exp) when var is not a local variable at that time. The runtime scope of a global variable is the runtime scope of the outermost query within which the variable is defined, including functions and subqueries. When LDScript is used with STTL, the scope of a global variable is the whole STTL transformation. When a global variable is defined in a function, the definition remains valid after the function returns, until the outermost query completes. A local variable definition temporarily hides a global variable with the same name within the lexical scope of the statement that defines the local variable. A global variable cannot be referenced directly in a SPARQL query; however, it can be accessed by means of a function call that returns the value of the global variable. In other words, global variables belong to LDScript, not to SPARQL.
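
A sketch of a global variable used as a counter is shown below. The function and variable names are hypothetical, and the sketch assumes that us:init is called before the other two functions. As counter is never declared as a local variable, set is expected to define and update a global variable, which a query can read through a call to us:counter().

function us:init() {
  set (counter = 0)
}

function us:increment() {
  set (counter = counter + 1)
}

function us:counter() {
  counter
}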

3.3 For

The for statement defines a loop on LDScript terms that are iterable datatypes. The list below specifies the kind of the term iterated in the statement: for (VAR in EXP).

  1. VAR : Term in EXP : dt:list
  2. VAR : Term in EXP : dt:triple
  3. VAR : dt:triple in EXP : dt:path
  4. VAR : dt:triple in EXP : dt:graph
  5. VAR : dt:mapping in EXP : dt:mappings
  6. VAR : dt:list(xsd:string, Term) in EXP : dt:mapping where first element is the variable name and second is the variable value

The result returned by the for statement is the boolean value true. A specific result can be returned using the return statement, which has the effect of interrupting the loop. If an iteration of the loop returns an error, the loop terminates and returns an error.

for (n in xt:list(1, 2, 3)) {
  if (us:prime(n)) {
    xt:display(n)
  }
}

For Select Query

The for statement can take as argument a select-where query. In this case, the loop iterates on the solutions of the query and, in the body of the loop, the variables projected in the select clause are bound to their values in the current solution. If a variable has no value, it remains unbound.

for (select ?x ?y where {?x foaf:knows ?y}) {
    us:foo(x, y)
}

For Construct Query

The for statement can take as second argument a construct-where query. In this case, the loop iterates on the triples of the result graph.

for (t in construct where {?x foaf:knows ?y}) {
  let ((s p o) = t) {
    
  }
}
for ((s p o) in construct where {?x foaf:knows ?y}) {
  
}

3.4 Pattern Matching

The access to the content of extension datatypes can be done by declarative pattern matching.

Let Pattern Matching

Iterable datatypes can be mapped to a list of variables, by pattern matching, using the let statement.

let ((e1 e2 e3) = list)
let ((t1 t2 t3) = graph)
let ((s p o)    = triple)
let ((m1 m2 m3) = mappings)

Pattern matching with dt:mapping datatype is done by variable name and not by position. In the example below, variable x is bound to the value of variable x in current mapping.

let ((x y) = mapping) 

Extended Let Pattern Matching

Extended datatypes can be accessed with pattern matching that focuses on first element(s), rest of the elements and last element(s). For this purpose, LDScript introduces two Pattern Matching operators: "." and "|" that can be combined.

The "." operator identifies the last element(s) of an extension datatype. In the example below, the z variable matches the last element whereas the x variable matches the first element. If there is only one element, the first and the last element are the same. If the extended datatype is empty, the variables remain unbound but the statement does not fail.

let ((x . z) = term) 

It is possible to match several elements among the first ones and/or several elements among the last ones, as shown below. If there are not enough elements, some variables remain unbound.

let ((x y . z t) = term) 

The "|" operator enables LDScript to match a sublist of elements, after the first element(s). In the example below, the rest variable is bound to the sublist starting after the first two elements. The sublist may be empty if there are not enough elements.

let ((x y | rest) = term) 

It is possible to combine the two operators. In the example below, the sublist starts after the first two elements and stops before the last two elements. If there are not enough elements, the sublist may be empty.

let ((x y | rest . z t) = term) 

Sublist and last operators can be used on their own.

let (( | list) = term) 
let (( | list . z t) = term) 
let (( . z t) = term) 

The "." and "|" operators can be used with these datatypes: dt:list, dt:map, dt:graph, dt:triple, dt:path, dt:mappings.

Examples

let ((x y | rest . z t) = xt:iota(5))
x = 1 ; y = 2 ; rest = (3) ; z = 4 ; t = 5
let ((x y | rest . z t) = xt:iota(4))
x = 1 ; y = 2 ; rest = () ; z = 3 ; t = 4
let ((x y | rest . z t) = xt:iota(3))
x = 1 ; y = 2 ; rest = () ; z = 2 ; t = 3
let ((x y | rest . z t) = xt:iota(2))
x = 1 ; y = 2 ; rest = () ; z = 1 ; t = 2
let ((x y | rest . z t) = xt:iota(1))
x = 1 ; y is UNBOUND ; rest = () ; z is UNBOUND ; t = 1
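
As a usage sketch, the "|" operator lends itself to recursive list processing. The hypothetical function below sums the elements of a list of numbers. The empty list is tested explicitly with xt:size, so that the pattern is only applied to a non-empty list.

function us:sum(list) {
  if (xt:size(list) = 0) {
    return(0)
  }
  else {
    let ((head | rest) = list) {
      return(head + us:sum(rest))
    }
  }
}

With this definition, us:sum(xt:iota(5)) should return 15.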

For Pattern Matching

LDScript extension datatypes can be iterated and mapped to a list of variables using the for statement.

for (elem in list)
for ((x y) in listOfPairs)
for (triple in graph)
for ((s p o) in graph)
for (term in triple)
for (mapping in mappings)
for ((var val) in mapping) 

A mappings datatype is iterated as mapping elements. Pattern matching with a mapping element is done by variable name and not by position. In the example below, variables x and y are bound to the values of variables x and y in the current mapping.

for ((x y) in mappings)

3.5 If Then Else

This statement is a syntactic extension of SPARQL if then else statement.

if (x > 0) {
  us:foo(x)
}
else if (y > 0) {
  us:bar(y)
}
else {
  us:gee(x, y) 
}

3.6 Return

This statement ends the execution of a function and returns its result.

term return(term t)
function us:test(a, b) {
  for (x in xt:iota(a, b)) {
     if (us:prime(x)) { return(x) }
  }
}

3.7 Error

This statement returns an error: the execution of the LDScript expression stops and returns an error. An error can be trapped by the coalesce statement as in SPARQL.

if (x < 0) {
  error()
}
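
A sketch of raising and trapping an error is shown below. The function us:ratio is hypothetical. The coalesce call is expected to return 0 when us:ratio returns an error.

function us:ratio(x, y) {
  if (y = 0) {
    error()
  }
  else {
    x / y
  }
}

coalesce(us:ratio(a, b), 0)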

3.8 Safe

This statement checks that the evaluation of an expression does not produce an error and returns a boolean accordingly. It is a generalization of the bound statement with any expression as argument.

safe(x / y)
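
For instance, safe can guard an expression that may fail. In the sketch below, the quotient is returned only when the division does not produce an error, otherwise 0 is returned.

if (safe(x / y), x / y, 0)

This is expected to behave like coalesce(x / y, 0).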

4 Second Order Function

A second order function is a function whose first argument evaluation returns a function (a function URI or an anonymous function) and which calls this function with the other arguments. Second order functions are funcall, apply, map and reduce. They are useful in the context of Linked Data because the name URI of a function to be applied on resources can be determined by a SPARQL query.
We use the abstract function type to denote either the URI of a function or an anonymous function.

4.1 Funcall

This statement applies a function which is the result of the evaluation of an expression. The first argument of the statement is an expression that must return either the URI of a function or an anonymous function.

term funcall (function fun, term t1, ... term tn)
funcall (us:getMethod(us:surface, x), x)
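
As an illustration of a function URI determined by a SPARQL query, the sketch below retrieves the function associated with a resource and applies it. The function us:process and the property us:method are hypothetical. The sketch assumes that the graph links each resource to the URI of a function that takes one argument.

function us:process(?x) {
  let (select ?x ?fun where { ?x us:method ?fun }) {
    funcall(?fun, ?x)
  }
}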

4.2 Apply

This statement is similar to the funcall statement but the arguments of the function call are given as a list.

term apply (function fun, dt:list arglist)
apply (rq:regex, xt:list("test", "e", "i"))

4.3 Map

The map statement applies a function iteratively on the elements of an iterable datatype: dt:list, dt:map, dt:graph, dt:mappings, dt:mapping. We use the abstract iterable type to denote any of these types. The first argument of the statement is an expression that must return the URI of a function or an anonymous function. SPARQL filter functions, as well as second order functions, are available as URIs with the rq: prefix. If one of the function evaluations returns an error, the map terminates and returns an error.

map (function fun, iterable term)
map (xt:display, xt:list(1, 2, 3))

The map functions described here can also have other arguments. In this case, the values of the arguments are considered at each step of the iteration of the iterable datatypes. The map functions iterate the first argument that is iterable. If an additional argument is iterable, it is not iterated.

map (us:fun, xt:list(1, 2, 3), 4)
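
To illustrate how an additional argument is passed at each step, the sketch below uses maplist, described later in this section, so that the results are visible. It assumes that maplist handles additional arguments in the same way as map.

maplist (function(x, y) { x + y }, xt:list(1, 2, 3), 10)

Under this assumption, the function is called with (1, 10), (2, 10) and (3, 10), and the result should be xt:list(11, 12, 13).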

The map functions described here can operate on iterable datatypes such as graph (iterate triple) or mappings (iterate mapping).

map (us:foo, query(select * where { ?x ?p ?y }))

The maplist statement applies a function on the elements of a list and returns the list of results.

dt:list maplist (function fun, iterable term)
maplist (function(x) { 1 / (x * x) }, xt:list(1, 2, 3))

The mapfind statement searches for elements for which the function returns true. It returns the first such element, or error() if there is no such element. In the latter case, error() can be trapped using coalesce().

term mapfind (function fun, iterable term)
mapfind (us:prime, xt:list(1, 2, 3))
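
A sketch of trapping this error is shown below, where us:prime is assumed to be a user-defined function and the list contains no prime number.

coalesce(mapfind (us:prime, xt:list(4, 6, 8)), 0)

This expression is expected to return 0.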

The mapfindlist statement finds the elements of an iterable datatype for which the function returns true and returns the list of such elements.

dt:list mapfindlist (function fun, iterable term)
mapfindlist (us:prime, xt:list(1, 2, 3))

The mapevery statement returns true if the function returns true on all elements, false otherwise.

xsd:boolean mapevery (function fun, iterable term)
mapevery (us:prime, xt:list(1, 2, 3))

The mapany statement returns true if the function returns true on any element, false otherwise.

xsd:boolean mapany (function fun, iterable term)
mapany (function(y) { exists { ?x ?p ?y } }, xt:list(1, 2, 3))

4.4 Reduce

This statement applies a binary function iteratively to a list of arguments and produces one final result. The first argument of the statement is an expression that must return the URI of a function or an anonymous function. When the list is empty, if there is a function definition with the same name and zero argument, this function is called and its result is returned.

term reduce (function fun, dt:list list)
reduce (rq:plus, xt:iota(5)) = 15
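
The sketch below illustrates the empty list case. The function us:add is hypothetical, and the sketch assumes that xt:list() called without argument returns an empty list.

function us:add(x, y) {
  x + y
}

function us:add() {
  0
}

reduce (us:add, xt:list())

According to the rule above, the zero-argument definition is called and the result should be 0.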

Combining second order functions

Second order functions are available with the rq: prefix and can be combined.

reduce(rq:concat, maplist(rq:funcall, 
  xt:list(rq:year, rq:month, rq:day, rq:hours, rq:minutes, rq:seconds), 
  now()))

 

5 Predefined Extension Function

LDScript introduces general purpose extension functions.

5.1 General Purpose

Display

Display RDF terms in Turtle syntax.

xt:display(term t)

Print

Display the string value of RDF terms.

xt:print(term t)

Turtle

Return an xsd:string Turtle representation of an RDF term.

xsd:string xt:turtle(term t)

Content

Return an xsd:string representation of the content of an extension datatype in the dt: namespace.

xsd:string xt:content(LDScript term t)

Self

Return the result of the evaluation of its argument.

term xt:self(term t)

Graph

Return the current RDF graph.

dt:graph xt:graph()

SPARQL Query

Execute a SPARQL query whose text is the result of an expression, with possibly a list of variable value bindings.

dt:mappings xt:sparql(xsd:string selectQuery)
dt:mappings xt:sparql(xsd:string selectQuery, xsd:string var, term val, ...)

dt:graph xt:sparql(xsd:string constructQuery)
dt:graph xt:sparql(xsd:string constructQuery, xsd:string var, term val, ...)

Load

The xt:load function implements URI dereferencing: it returns the RDF graph resulting from the parsing of an RDF document.

dt:graph xt:load(URI uri)

If there is a graph argument, the RDF document is loaded in the graph.

dt:graph xt:load(URI uri, dt:graph g)

Sequence

The sequence statement evaluates its arguments in order and returns the result of the last argument. If an argument returns an error, the sequence returns an error.

xt:sequence(exp e1, .., exp en)

Focus Statement

The first argument MUST return a graph with datatype dt:graph. The focus statement evaluates the other arguments with the graph as current dataset.

term xt:focus(dt:graph g, exp e1, .., exp en)
xt:focus(
    xt:load(<http://example.org/test>),
    exists { ?x rdf:value 2.718 })

5.2 SPARQL Transformation

LDScript implementations MAY provide functions to execute STTL transformations. STTL (SPARQL Template Transformation Language) enables users to apply transformations to RDF entities, for example Turtle, RDF/XML or JSON transformations of RDF graphs and resources, or the functional syntax transformation of OWL ontologies. Note that these functions belong to the st: namespace. In the signatures below, transform is the URI of a transformation such as st:turtle, st:rdfxml, st:json, st:owl or st:spin.

xsd:string st:apply-templates-with(URI transform)
xsd:string st:apply-templates-with(URI transform, term node)

xsd:string st:call-template(URI name, term node_1, .., term node_n)
xsd:string st:call-template-with(URI transform, URI name, term node_1, .., term node_n)

5.3 SHACL

LDScript implementations MAY provide functions to evaluate SHACL shapes on the current focus graph. The result of shape functions is the validation report graph.

dt:graph sh:shacl() 
dt:graph sh:shaclshape(shape)
dt:graph sh:shaclshape(shape, node)
dt:graph sh:shaclnode(node)
xsd:boolean sh:conform(graph)

Generate the Turtle syntax of the report graph.

xsd:string xt:turtle(dt:graph g)

Format

Generate the SPIN RDF graph of a SPARQL string query.

dt:graph xt:spin(xsd:string q)

Generate an RDF graph for SPARQL Query Results using the W3C result set vocabulary (https://www.w3.org/2001/sw/DataAccess/tests/result-set).

dt:graph xt:tograph(dt:mappings m)

Generate XML, JSON or RDF text format for SPARQL Query Results. The RDF format is the same as the one returned by the xt:tograph function.

xsd:string xt:xml (dt:mappings m) 
xsd:string xt:json(dt:mappings m)  
xsd:string xt:rdf (dt:mappings m)

 

6 Datatype

The objects of the language are RDF terms and LDScript terms. RDF terms are, as usual, URI, Blank Node and Literal with XSD datatype. LDScript terms are RDF graph and triple, SPARQL query solution sequence (called mappings), SPARQL query solution (called mapping) and SPARQL property path solution (called path). LDScript terms include list whose elements are LDScript objects and map whose keys and values are LDScript objects. LDScript terms also include datatypes for XML and JSON objects. The XML datatype is provided with (a subset of) the DOM API.

LDScript objects other than RDF terms are implemented by means of literals with specific extension datatypes in the dt: namespace: dt:list, dt:map, dt:xml, dt:json, dt:graph, dt:triple, dt:path, dt:mappings, dt:mapping. Hence, they are implemented as RDF terms (i.e. RDF literals with extension datatypes) and their content can be accessed by specific statements as shown below. These datatypes are iterable by means of the for and map statements. By extension, we call LDScript terms the objects of the language.

6.1 LDScript Datatype

List

The dt:list extension datatype implements lists of LDScript terms, including lists. Although similar, it is distinct from the RDF list (rdf:List class with rdf:first, rdf:rest and rdf:nil). List elements need to be neither of the same kind nor of the same datatype. The dt:list datatype is provided with a set of functions.

The xt:list function is the list constructor.

dt:list xt:list(term t...)
xt:list(1, 2, 3)
xt:list(xt:list(1, 2), xt:list(3, 4))

The xt:iota function generates a list of successive integers or characters.

dt:list xt:iota(term t)
dt:list xt:iota(term t1 , term t2)
xt:iota(5)        = xt:list(1, 2, 3, 4, 5)
xt:iota(5, 7)     = xt:list(5, 6, 7)
xt:iota('a', 'c') = xt:list('a', 'b', 'c')

The xt:size function returns the number of elements of a list.

xsd:integer xt:size(dt:list list)

The xt:first function returns the first element of a list.

term xt:first(dt:list list)

The xt:rest function returns the sublist after the first element.

dt:list xt:rest(dt:list list)

The xt:get function returns the nth element of a list.

term xt:get(dt:list list, xsd:integer n)

The xt:set function sets the value of the nth element of a list. It is an error if there is no nth element.

xt:set(dt:list list, xsd:integer n, term t)

The xt:add function adds a tail element to a list. The list is modified.

xt:add(dt:list list, term t)

With an index argument, the xt:add function inserts an element into a list at the nth position. The list is modified.

xt:add(dt:list list, xsd:integer n, term t)

The xt:cons function adds a head element to a list. It returns a copy of the list.

dt:list xt:cons(term t, dt:list list)

The xt:member function tests whether an element is a member of the list.

xsd:boolean xt:member(term t, dt:list list)

The xt:swap function swaps the elements at the given indexes in the list.

dt:list xt:swap(dt:list list, xsd:integer i1, xsd:integer i2)

The xt:remove function removes the first occurrence of an element from the list, if it is present. The list is modified.

xt:remove(dt:list list, term t)

The xt:removeindex function removes the nth element of the list. The list is modified.

xt:removeindex(dt:list list, xsd:integer n)

The xt:append function appends two lists and keeps duplicates.

dt:list xt:append(dt:list l1, dt:list l2)

The xt:merge function merges two lists and removes duplicates.

dt:list xt:merge(dt:list l1, dt:list l2)

The xt:reverse function reverses a list.

dt:list xt:reverse(dt:list list)

The xt:sort function sorts a list.

dt:list xt:sort(dt:list list)

The xt:sort function sorts a list according to a comparison function.

dt:list xt:sort(dt:list list, function fun)
xt:sort(list, us:compare)

function us:compare(x, y) {
    if (x < y, -1, if(x = y, 0, 1))
}
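
The list functions can be combined. A small sketch is shown below, assuming that xt:sort without a comparison function sorts numbers in ascending order.

xt:reverse(xt:sort(xt:merge(xt:list(3, 1, 2), xt:list(2, 4))))

Here xt:merge keeps a single occurrence of 2, so under the assumption above the result should be xt:list(4, 3, 2, 1).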

Map

The dt:map extension datatype implements a Map whose keys and values are LDScript objects.

It is provided with an xt:map constructor function.

dt:map xt:map()

The xt:size function returns the size of the map.

The xt:set function enables users to set a key value pair in the map.

xt:set(dt:map amap, term key, term value)

The xt:get function enables users to retrieve the value of a key in the map.

term xt:get(dt:map amap, term key)

The xt:has function checks whether the key is present in the map.

term xt:has(dt:map amap, term key)

The dt:map datatype is iterable as pairs (key, value).

for ((key val) in amap) { }

map (function((key, val)) { }, amap)
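
A usage sketch of the map functions is shown below, with xt:sequence used to chain the calls. The keys and values are arbitrary.

let (amap = xt:map()) {
  xt:sequence(
    xt:set(amap, "a", 1),
    xt:set(amap, "b", 2),
    xt:get(amap, "a"))
}

The let statement should return 1, the value associated with the key "a".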

JSON

The dt:json extension datatype implements a JSON Map whose keys and values are LDScript objects.

It is provided with an xt:json constructor function.

dt:json xt:json()

dt:json xt:json(xsd:string jsonString)

The xt:size function returns the size of the json map.

The xt:set function enables users to set a key value pair in the json map.

xt:set(dt:json json, term key, term value)

The xt:get function enables users to retrieve the value of a key in the json map.

term xt:get(dt:json json, term key)

The dt:json datatype is iterable as pairs (key, value).

for ((key val) in json) { }

map (function((key, val)) { }, json)

XML

The dt:xml extension datatype represents XML objects provided with an XPath function and the DOM API. More precisely, the XML datatype manages org.w3c.dom.Node objects from Java DOM.

It is provided with an xt:xml constructor function.

dt:xml xt:xml(xsd:string xmlString)
dt:xml xt:xml(URI uri)

The XML datatype is provided with an xpath function where exp is an XPath expression.

dt:list(dt:xml) xpath(dt:xml doc, exp)

The XML datatype is provided with a subset of the DOM API. Implementations MAY provide more functions from the DOM API.

xsd:string dom:getNodeType(dt:xml node)     
xsd:string dom:getNodeName(dt:xml node)     
xsd:string dom:getLocalName(dt:xml node)
xsd:string dom:getNodeValue(dt:xml node)    
xsd:string dom:getTextContent(dt:xml node)

URI dom:getNamespaceURI(dt:xml node)  
URI dom:getBaseURI(dt:xml node) 

dt:map(xsd:string, xsd:string) dom:getAttributes(dt:xml node)  

dt:list(dt:xml) dom:getElementsByTagName(dt:xml node, xsd:string name)     
dt:list(dt:xml) dom:getElementsByTagNameNS(dt:xml node, xsd:string ns, xsd:string name)
dt:list(dt:xml) dom:getChildNodes(dt:xml node)  

dt:xml dom:getElementById(dt:xml node)   
dt:xml dom:getFirstChild(dt:xml node)    
dt:xml dom:getNodeParent(dt:xml node)    
dt:xml dom:getOwnerDocument(dt:xml node) 

xsd:boolean dom:hasAttribute(dt:xml node, xsd:string name)
xsd:string  dom:getAttribute(dt:xml node, xsd:string name)
xsd:boolean dom:hasAttributeNS(dt:xml node, xsd:string ns, xsd:string name)
xsd:string  dom:getAttributeNS(dt:xml node, xsd:string ns, xsd:string name)

The dt:xml datatype is iterable on child nodes.

for (node in xml) { }

map (function(node) { }, xml)
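
As a sketch, the expression below parses a small XML document and collects the text content of its item elements. The XML string is made up for the example, and the sketch assumes that a function of the dom: namespace can be passed by URI to maplist.

let (doc = xt:xml("<doc><item>one</item><item>two</item></doc>")) {
  maplist (dom:getTextContent, dom:getElementsByTagName(doc, "item"))
}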

6.2 RDF Datatype

There are two datatypes for RDF entities, dt:graph for RDF graph and dt:triple for RDF triple.

Graph Datatype

The dt:graph datatype is provided with functions. Function xt:size returns the number of triples of a graph.

xsd:integer xt:size(dt:graph g)

Function xt:graph returns the current graph.

dt:graph xt:graph()

Function xt:union computes a graph that is the union of two graphs. The arguments are LDScript terms with dt:graph datatype and the result is returned as an LDScript term with dt:graph datatype.

dt:graph xt:union(dt:graph g1, dt:graph g2)

The dt:graph datatype is iterable on its triples.

for (atriple in agraph) { }
for ((s p o) in agraph) { }

map (function(atriple)   { }, agraph)
map (function((s, p, o)) { }, agraph)
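
The behaviour of xt:size, xt:union and triple iteration can be sketched in Python by modelling a graph as a set of triples, which is how RDF defines a graph. The prefixed names and the helper functions size and union are illustrative stand-ins, not LDScript API.

```python
# A graph is modelled as a set of (subject, property, object) tuples.
g1 = {("ex:a", "foaf:knows", "ex:b"), ("ex:b", "foaf:knows", "ex:c")}
g2 = {("ex:b", "foaf:knows", "ex:c"), ("ex:c", "foaf:name", "Carol")}

def size(graph):
    """Number of triples, cf. xt:size."""
    return len(graph)

def union(a, b):
    """Set union of two graphs, cf. xt:union; shared triples appear once."""
    return a | b

merged = union(g1, g2)

# Destructuring iteration, cf. for ((s p o) in agraph) { }
subjects = sorted({s for (s, p, o) in merged})
```

Because a graph is a set, the triple shared by g1 and g2 is counted once in the union.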

Triple Datatype

The dt:triple datatype is provided with functions to access the subject, the property and the object. Implementations MAY provide a function to access the named graph when triples are quads.

term xt:subject(dt:triple t)
term xt:property(dt:triple t)
term xt:object(dt:triple t)
term xt:graph(dt:triple t)

Triple's elements are accessible by pattern matching.

let ((s p o) = atriple) { }

6.3 SPARQL Datatype

There are datatypes for SPARQL entities: dt:mappings for SPARQL Query solution sequence, dt:mapping for SPARQL Query solution and dt:path for Property Path solutions.

Mappings

The dt:mappings datatype is the datatype of SPARQL Query Solution Sequences, i.e. of select-where SPARQL queries. It is provided with a function xt:size that returns the number of solutions.

xsd:integer xt:size(dt:mappings m)

The dt:mappings datatype is provided with functions that perform SPARQL algebra operations on SPARQL query solutions of select-where queries. The results are returned as literals with dt:mappings datatype.

dt:mappings xt:join(dt:mappings m1, dt:mappings m2) 
dt:mappings xt:union(dt:mappings m1, dt:mappings m2) 
dt:mappings xt:minus(dt:mappings m1, dt:mappings m2)
dt:mappings xt:optional(dt:mappings m1, dt:mappings m2) 

The dt:mappings datatype is iterable on its mapping elements.

for (mapping in mappings) { }

map (function(mapping) { }, mappings)
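
As an illustration of what these four operations compute, here is a minimal Python sketch that follows the definitions of the SPARQL 1.1 algebra, with a solution sequence modelled as a list of dictionaries. That xt:join, xt:union, xt:minus and xt:optional behave exactly like this is an assumption of the sketch, and the sample data is made up.

```python
def compatible(m1, m2):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())

def join(ms1, ms2):
    """Merge every pair of compatible solutions."""
    return [{**a, **b} for a in ms1 for b in ms2 if compatible(a, b)]

def union(ms1, ms2):
    """Concatenate the two solution sequences."""
    return ms1 + ms2

def minus(ms1, ms2):
    """Drop a solution when a compatible one shares a variable with it."""
    return [a for a in ms1
            if not any(compatible(a, b) and a.keys() & b.keys() for b in ms2)]

def optional(ms1, ms2):
    """Left join without filter: extend where possible, else keep as is."""
    result = []
    for a in ms1:
        extended = [{**a, **b} for b in ms2 if compatible(a, b)]
        result.extend(extended or [a])
    return result

# Hypothetical solution sequences.
people = [{"x": "ex:a"}, {"x": "ex:b"}]
names = [{"x": "ex:a", "n": "Alice"}]
```

With this data, join keeps only the solution for ex:a, minus keeps only the solution for ex:b, and optional keeps both while adding the name where one is known.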

Mapping

The dt:mapping datatype is the datatype of SPARQL Query Solutions. The datatype is iterable as (variable, value) pairs where variable is the name of a variable represented as an xsd:string.

for ((var val) in mapping) { }

map (function((var, val)) { }, mapping)

Path

The dt:path datatype is provided for the case where the implementation provides Property Path variables. It is provided with the xt:size function.

The datatype is iterable on its triples.

for (atriple in apath) { }

map (function(atriple) { },  apath) 

 

7 Language Syntax

The syntax is given in EBNF and relies on SPARQL syntax.

LDScript   ::= SPARQL_QueryUnit Fun
Fun        ::= (Annotation Function | Annotation Package)*
Function   ::= 'function' Type? Uri FunVarList  Body 
Package    ::= '{' Function+ '}'
Annotation ::= ( '@public' | '@debug' )*

Body ::= '{' '}' | '{' Exp (';' Exp)* '}'

Exp  ::= SPARQL_Constraint -- with BuiltInCall extended below

BuiltInCall ::= SPARQL_BuiltInCall 
| Let | For | If | Return | Error | SecondOrder | Lambda | Query

SecondOrder ::= Funcall | Apply | MapFun | Reduce
  
Query ::= 'query' '('  (SelectQuery | ConstructQuery | Update1)  ')'

ExpQuery ::= Exp | '@' List | SelectQuery | ConstructQuery

Let ::=
LetName '(' LetDecl (',' LetDecl)* ')'  Body  |
LetName '(' SelectQuery ')'  Body  

LetName ::= 'let' | 'letdyn'

LetDecl ::= Var '=' ExpQuery | VarExp '=' ExpQuery

Type ::= Uri

VarExp ::= '(' Var+  ('|' Var )?  ('.' Var+)?  ')'

VarList    ::= '(' Var  (Var)* ')'
VarListSep ::= '(' Var  (',' Var)* ')'

FunVarList ::= '(' ')' | '(' Type? Var (',' Type? Var)* ')'

For ::=
'for' '(' Var     'in' ExpQuery ')'  Body |
'for' '(' VarList 'in' ExpQuery ')'  Body |
'for' '(' SelectQuery ')'  Body  

If ::= 'if'  '(' Exp ')'  Body  ('else' (  Body | If )) ?

Funcall ::= 'funcall' '(' Exp (',' Exp)* ')'
Apply  ::= 'apply'   '(' Exp ',' Exp ')'
Reduce ::= 'reduce'  '(' Exp ',' Exp ')'
MapFun ::= Map       '(' Exp (',' Exp)+ ')' 
Map    ::= 'map' | 'maplist' |  'mapfind' | 'mapfindlist' 
         | 'mapany' | 'mapevery'

Lambda ::= 'function' LambdaVarList  Body 
LambdaVarList ::= FunVarList | '(' VarListSep ')'

Error ::= 'error' '(' ')'

Return ::= 'return' '(' Exp ')'

List ::= '(' (RDFTerm | List)* ')'

 

8 SPARQL Extension

LDScript enables us to propose and implement natural SPARQL extensions.

8.1 LDScript in SPARQL

LDScript statements MAY be available within extended SPARQL Query Filter Constraints.

select * where {
    ?x us:method [ us:name us:validate ; us:function ?fun ]
    filter funcall(?fun, ?x) 
}

8.2 Aggregate

This statement defines an extension aggregate which computes the list of values of the expression and returns a dt:list literal.

select (aggregate(distinct ?n) as ?list)
where {
  ?x rdf:value ?n
}

8.3 Values Unnest

This statement is a values clause where the values are computed by an expression.

values ?n { unnest(xt:list(1, 2, 3)) }

It is equivalent to the values clause below.

values ?n { 1 2 3 }

The extended values statement can be used with several variables.

values (?n ?m) { unnest(xt:list(xt:list(1, 2), xt:list(3, 4))) }

The statement values unnest can be used on iterable datatypes such as: list, map, json, xml, graph.

values ?elem    { unnest(?list) }
values ?node    { unnest(?xml) }
values ?triple  { unnest(?graph) }

values (?key ?val) { unnest(?map) }
values (?key ?val) { unnest(?json) }
values (?s ?p ?o)  { unnest(?graph) }
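
The effect of values unnest can be pictured as turning each element of an iterable into one solution row. The Python sketch below illustrates that reading; the function name unnest and its variables parameter are introduced here for illustration and do not reflect how an implementation evaluates the clause.

```python
def unnest(variables, values):
    """Turn an iterable into solution rows for the given variable names."""
    rows = []
    for item in values:
        # A single variable binds each element; several variables bind the
        # components of each element positionally.
        parts = [item] if len(variables) == 1 else list(item)
        rows.append(dict(zip(variables, parts)))
    return rows

single = unnest(["n"], [1, 2, 3])                     # values ?n { unnest(...) }
pairs = unnest(["n", "m"], [[1, 2], [3, 4]])          # values (?n ?m) { ... }
from_map = unnest(["key", "val"], {"a": 1, "b": 2}.items())
```

Each row corresponds to one line of an ordinary values clause, so single is the counterpart of values ?n { 1 2 3 }.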

8.4 Property Path Variable

The dt:path datatype is provided for the case where the implementation gives access to Property Path solutions. In the example below, SPARQL is extended with path variables: the $path variable is bound to the property path that relates ?x and ?y. The datatype of the value of $path is dt:path, which is conceptually equivalent to dt:list(dt:triple).

select * where {
    ?x foaf:knows+ :: $path ?y
}

8.5 Named Graph Pattern

When the value of a variable is an RDF graph, the variable can be used in a named graph pattern, which is then evaluated on the content of that graph. The example below shows this case with variable ?g.

select * where {
    bind (us:getGraph() as ?g)
    graph ?g {  }
}

LDScript SPARQL Extension

Aggregate ::= SPARQL_Aggregate |
'aggregate' '(' ('distinct')? Exp ')' 

ValuesClause ::= SPARQL_ValuesClause |
'values' Var     '{' 'unnest' '(' Exp ')' '}' |
'values' VarList '{' 'unnest' '(' Exp ')' '}'

VerbPath ::= Path ( '::' Var )?

 

9 Use Case

9.1 Functional Property

In an ontology, properties may be defined as functions of other properties. For example, the surface can be defined as the product of the length and the width.

select * where {
  ?x a us:Figure 
  bind (us:surface(?x) as ?s)
}

function us:surface(?x) {
  let ((?w, ?l) = select * where { ?x us:width ?w ; us:length ?l }) {
    ?w * ?l
  }
}

9.2 Functional Service

A function can be implemented by a call to a remote SPARQL service.

function us:service(?x) {
  let (select ?x ?l where {
        service <http://fr.dbpedia.org/sparql> {
          ?x rdfs:label ?l}}) {
    ?l
  }
}

9.3 Approximate Match

Functions can be used to program approximate match.

select * where {
  ?x a ?t
  filter us:match(foaf:Person, ?t)
}

function us:match(?q, ?t) { 
  exists { 
    { ?t rdfs:subClassOf* ?q } union 
    { ?q rdfs:subClassOf/(rdfs:subClassOf*|^rdfs:subClassOf) ?t }
  }    
}
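
To make the two branches of the exists pattern explicit, the following Python sketch evaluates the same test over a small in-memory class hierarchy. The hierarchy, the SUPER table and the helper ancestors_or_self are assumptions made for illustration.

```python
# Hypothetical class hierarchy: class -> set of direct superclasses.
SUPER = {
    "ex:Student": {"foaf:Person"},
    "ex:Teacher": {"foaf:Person"},
    "foaf:Person": {"foaf:Agent"},
    "foaf:Organization": {"foaf:Agent"},
}

def ancestors_or_self(c):
    """All classes reachable through rdfs:subClassOf*, including c itself."""
    seen, todo = set(), [c]
    while todo:
        cur = todo.pop()
        if cur not in seen:
            seen.add(cur)
            todo.extend(SUPER.get(cur, ()))
    return seen

def match(q, t):
    # First branch: ?t rdfs:subClassOf* ?q
    if q in ancestors_or_self(t):
        return True
    # Second branch: one step up from q to a parent p, then either
    # p rdfs:subClassOf* t (t generalizes q) or t rdfs:subClassOf p (sibling).
    for p in SUPER.get(q, ()):
        if t in ancestors_or_self(p) or p in SUPER.get(t, ()):
            return True
    return False
```

With this hierarchy, a query for foaf:Person accepts ex:Student (exact branch), foaf:Agent (generalization) and foaf:Organization (sibling), whereas a query for ex:Student does not accept foaf:Organization.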

9.4 Recursive Match

Functions can be used to program recursive match.

select * where {
  ?x a foaf:Person 
  ?y a foaf:Person 
  filter us:match(?x, ?y)
}

function us:match(?x, ?y) { 
  exists { 
    { ?x foaf:knows ?y } union 
    { ?x foaf:knows ?z . ?y a foaf:Person filter us:match(?z, ?y) }
  }    
}
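
The recursion can be sketched in Python over a small foaf:knows table. The table is made up, and the visited set is an addition of this sketch, not part of the LDScript function above: it guarantees termination when the data contains cycles.

```python
# Hypothetical foaf:knows relation: person -> set of known persons.
KNOWS = {"ex:a": {"ex:b"}, "ex:b": {"ex:c"}, "ex:c": {"ex:a"}, "ex:d": set()}

def match(x, y, visited=None):
    """True if y is reachable from x through one or more knows links."""
    visited = set() if visited is None else visited
    if x in visited:
        return False              # already explored: avoid looping on cycles
    visited.add(x)
    friends = KNOWS.get(x, ())
    if y in friends:              # base case: ?x foaf:knows ?y
        return True
    # recursive case: ?x foaf:knows ?z and us:match(?z, ?y)
    return any(match(z, y, visited) for z in friends)
```

The two cases of the function correspond to the two branches of the union in the exists pattern.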

9.5 Event Driven Function Call

A SPARQL interpreter may define a set of events and emit them during query processing. It may be provided with an event manager that traps events. If a SPARQL query comes with appropriate function definitions for the events, the event manager calls these functions. An event is associated with a function by an annotation, which is an identifier prefixed by the '@' character. The function name is free whereas the annotation name is fixed.

Function called when query processing starts.

@before function us:before(query) 

Function called when query processing ends.

@after function us:after(mappings) 

Function called when a solution is found.

@result function us:result(mapping)
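
The dispatch mechanism described above can be sketched in Python with a registry that maps event names to functions. Everything in this sketch (the on decorator, emit, the toy run interpreter) is hypothetical and only illustrates the principle that the annotation name is fixed while the function name is free.

```python
HANDLERS = {}

def on(event):
    """Register the decorated function for an event, like an @annotation."""
    def register(func):
        HANDLERS[event] = func
        return func
    return register

def emit(event, arg):
    """Call the handler bound to the event, if the query defines one."""
    handler = HANDLERS.get(event)
    if handler is not None:
        handler(arg)

log = []

@on("before")
def my_before(query):            # the function name is free
    log.append(("start", query))

@on("result")
def my_result(mapping):
    log.append(("solution", mapping))

def run(query, solutions):
    """Toy interpreter: emits before, one result per solution, then after."""
    emit("before", query)
    for mapping in solutions:
        emit("result", mapping)
    emit("after", solutions)     # no handler registered: silently ignored
    return solutions

run("select * where { ?x ?p ?y }", [{"x": 1}, {"x": 2}])
```

Events without a registered function are ignored, so a query only needs to define the functions it is interested in.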

9.6 Predefined Query

LDScript can be used to manage predefined queries by means of anonymous functions.

function us:foo() {
    let (list = xt:list(
        function() { query(select .. where ..) },
        function() { query(select .. where ..) }
    )) {
        maplist(rq:funcall, list)
    }
}

9.7 Mapping rdf:List with dt:list

Translate recursively an rdf:List into a dt:list.

select ?x (us:list(?l) as ?list) where {
    ?x rdf:value ?l .
}
}

function us:list(l) {
  let (select ?l 
       (aggregate (if (?b, us:list(?e), 
                   if (?e = rdf:nil, xt:list(), ?e))) as ?list) 
        where {
            ?l rdf:rest*/rdf:first ?e
            bind (exists { ?e rdf:rest ?a } as ?b)
        } ) {
    return (list)
  }
}
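
The same recursive translation can be sketched in Python, assuming the list cells are available as a table from node to (rdf:first, rdf:rest). The CELLS table and the blank node labels are hypothetical.

```python
NIL = "rdf:nil"

# Hypothetical store of list cells: node -> (rdf:first, rdf:rest).
CELLS = {
    "_:l1": (1, "_:l2"),
    "_:l2": ("_:n1", "_:l3"),    # the second element is itself a list
    "_:l3": (3, NIL),
    "_:n1": ("a", "_:n2"),
    "_:n2": ("b", NIL),
}

def to_list(node):
    """Convert the rdf:List starting at node into a nested Python list."""
    result = []
    while node != NIL:
        first, rest = CELLS[node]
        # An element that is a list cell (or rdf:nil) is converted recursively,
        # any other element is kept as is.
        is_list = first == NIL or first in CELLS
        result.append(to_list(first) if is_list else first)
        node = rest
    return result
```

The loop plays the role of the rdf:rest*/rdf:first path, and the is_list test plays the role of the exists { ?e rdf:rest ?a } check combined with the rdf:nil case.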

9.8 Aggregate

This statement defines an extension aggregate. The first expression (e.g. aggregate(?n)) is the expression to aggregate. The aggregate function computes the list of values of this expression. The second expression (e.g. us:median(?list)) is the function to be applied to the list of values. In the example below, the aggregate computes the median of the values.

select (aggregate(?n) as ?list) (us:median(?list) as ?med)
where {
  ?x rdf:value ?n
}

function us:median(?list) {
   xt:get(xt:sort(?list), xsd:integer(xt:size(?list) / 2)) 
}
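
For comparison, here is the same computation in Python. The sketch assumes that xt:get uses zero-based indices and that the xsd:integer cast truncates the quotient; under these assumptions us:median returns the middle element of the sorted list, and the upper of the two middle elements when the list has an even size (it does not average them).

```python
def median(values):
    """Middle element of the sorted values (upper middle for even sizes)."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]

odd = median([5, 1, 3])        # sorted: [1, 3, 5], index 1
even = median([4, 1, 3, 2])    # sorted: [1, 2, 3, 4], index 2
```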

9.9 SHACL to SPARQL path translator

prefix sh: <http://www.w3.org/ns/shacl#> 
# 
# path = URI | bnode
# bnode : [sh:zeroOrOnePath exp ] | (exp1 .. expn)
#
function sh:path(path) {
    if (isURI(path)) {
        return (xt:turtle(path))
    }
    else {
        let (select * where { ?path ?oper ?val filter (?oper not in (rdf:first)) } ) {
            return (if (oper = rdf:rest, sh:sequencePath(path), funcall(oper, val)))
        }
    }
}

function sh:paren(path) {
    if (isURI(path), sh:path(path), concat("(", sh:path(path), ")"))
}

function sh:oneOrMorePath(path) {
     concat(sh:paren(path), "+")
}

function sh:zeroOrOnePath(path) {
     concat(sh:paren(path), "?")
}    

function sh:zeroOrMorePath(path) {
     concat(sh:paren(path), "*")
}    

function sh:inversePath(path) {
     concat("^", sh:paren(path))
}

# path = (e1 .. en)
function sh:alternativePath(path) {
    sh:reduce(path, "|")
}

# path = (e1 .. en)
function sh:sequencePath(path) {
    sh:reduce(path, "/")
}

function sh:reduce(path, sep) {
    letdyn (astr = sep) {
         reduce(function(x, y) { concat(x, astr, y) }, 
            maplist(sh:path, sh:list(path)))
    }
}

function sh:list(path) {
    let (select ?path (aggregate(?exp) as ?list) 
         where { ?path rdf:rest*/rdf:first ?exp } ) {
         return (list)
    }
}
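
The structure of the translator may be easier to follow in a Python sketch that works on nested Python values instead of RDF. The encoding is an assumption of this sketch: a string stands for a URI, a list for a sequence path (e1 .. en), and a one-entry dictionary for a blank node carrying a single SHACL path operator.

```python
def path(p):
    """Translate a SHACL-like path structure into SPARQL path syntax."""
    if isinstance(p, str):           # a URI, already in prefixed form
        return p
    if isinstance(p, list):          # (e1 .. en) is a sequence path
        return "/".join(path(e) for e in p)
    (oper, val), = p.items()         # a blank node with a single operator
    return OPERATORS[oper](val)      # cf. funcall(oper, val)

def paren(p):
    """Parenthesize everything except a plain URI, cf. sh:paren."""
    return path(p) if isinstance(p, str) else "(" + path(p) + ")"

OPERATORS = {
    "inversePath":     lambda v: "^" + paren(v),
    "zeroOrMorePath":  lambda v: paren(v) + "*",
    "oneOrMorePath":   lambda v: paren(v) + "+",
    "zeroOrOnePath":   lambda v: paren(v) + "?",
    "alternativePath": lambda v: "|".join(path(e) for e in v),
}

seq = path(["foaf:knows", {"zeroOrMorePath": "rdfs:subClassOf"}])
plus = path({"oneOrMorePath": ["ex:p", "ex:q"]})
inv = path({"inversePath": {"alternativePath": ["ex:p", "ex:q"]}})
```

The OPERATORS table mirrors the way the LDScript version uses the operator URI itself as the name of the function to call. As in the LDScript version, elements of a sequence or an alternative are not parenthesized.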

9.10 Create RDF from XML

This example shows how to parse an XML document and create RDF triples.

insert {
    ?uri foaf:name ?author .  
    [ us:author ?uri ; us:title ?title ]
}
where {
    values ?book { unnest(xpath(us:xml(), "/doc/book")) }
    bind (dom:getTextContent(xpath(?book, "title"))  as ?title)
    bind (dom:getTextContent(xpath(?book, "author")) as ?author)
    bind (uri(concat(us:, replace(?author, " ", "")))   as ?uri)
}


# XML document
function us:xml() {
xt:xml(
"""
<doc>
<book><title>1984</title><author>George Orwell</author></book>
<book><title>Le Capital au XXIe siècle</title><author>Thomas Piketty</author></book>
<book><title>Capital et idéologie</title><author>Thomas Piketty</author></book>
</doc>
"""
)
}
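
The data flow of this update (select the book elements, extract two text values, mint a URI, emit three triples per book) can be sketched in Python with the standard ElementTree module. The namespace bound to us: and the blank node labels are assumptions, and the sample document is an abbreviated version of the one above.

```python
import xml.etree.ElementTree as ET

XML = """
<doc>
<book><title>Le Capital au XXIe siècle</title><author>Thomas Piketty</author></book>
<book><title>Capital et idéologie</title><author>Thomas Piketty</author></book>
</doc>
"""

US = "http://example.org/us/"    # hypothetical namespace for the us: prefix

def triples_from_xml(text):
    """Build (subject, property, object) tuples from the book elements."""
    triples = set()
    for index, book in enumerate(ET.fromstring(text).findall("book")):
        title = book.findtext("title")
        author = book.findtext("author")
        uri = US + author.replace(" ", "")   # cf. uri(concat(us:, replace(...)))
        node = "_:book%d" % index            # stands for the blank node [ ... ]
        triples.add((uri, "foaf:name", author))
        triples.add((node, "us:author", uri))
        triples.add((node, "us:title", title))
    return triples

result = triples_from_xml(XML)
```

Because the result is a set, the foaf:name triple of an author who wrote several books is produced only once, as it would be in an RDF graph.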

 

10 Implementation

LDScript is implemented and available in the Corese Semantic Web Factory. SPARQL-Generate provides an implementation of a subset of LDScript where the body of a function is written solely with SPARQL Filter language.

Examples

We have also written a SHACL interpreter using SPARQL Function.

 

11 Conclusion

A dedicated programming language that enables Semantic Web programmers to define functions on RDF terms, triples and graphs or on SPARQL query results can facilitate development and improve the reuse and maintenance of the code produced for Linked Data. We propose to extend SPARQL with LDScript, a script language that enables users to define extension functions. Its main characteristics are:

  1. Function definition
  2. Design on top of SPARQL Filter language
  3. SPARQL predefined functions, including exists clause
  4. Select, Construct and Update SPARQL query
  5. Second order functions: funcall, apply, map, reduce
  6. Statements: let, for, if then else, return
  7. Pattern matching
  8. List, Map, JSON, XML extension datatypes
  9. Graph, Triple, Query Solution Mappings extension datatypes
  10. LDScript predefined functions

In the future we wish to provide a second implementation on top of another Semantic Web Factory. We also wish to provide a compiler to the Java language and to work on performance. We would like to design a type checker and investigate Linked Functions.

 
