Provenance

The provenance plugin generates the inference closure of a specific named graph at query time. This is useful when you want to trace which implicit statements are derived from a specific graph, together with the axiomatic triples that are part of the configured ruleset, i.e., the ones inserted with the special predicate sys:schemaTransaction. For more information, see Reasoning.

By default, GraphDB’s forward-chaining inferencer materializes all implicit statements in the default graph. As a result, it is impossible to trace which graph a given implicit statement comes from. The provenance plugin takes the opposite approach: using the configured ruleset, the reasoner does forward-chaining over a specific named graph and generates all of its implicit statements at query time.

Predicates

The plugin predicates give you easy access to the graph whose implicit statements you want to generate. The process is similar to RDF reification. All of the plugin’s predicates start with <http://www.ontotext.com/provenance/>:

Plugin predicates                               Semantics

http://www.ontotext.com/provenance/derivedFrom  Creates a request scope for the graph with the inference closure
http://www.ontotext.com/provenance/subject      Binds all subjects part of the inference closure
http://www.ontotext.com/provenance/predicate    Binds all predicates part of the inference closure
http://www.ontotext.com/provenance/object       Binds all objects part of the inference closure

Registering the plugin

The plugin is not registered by default.

  1. To register it, start GraphDB with the following parameter:

    ./graphdb -Dregister-plugins=com.ontotext.trree.plugin.provenance.ProvenancePlugin
    
  2. Check the startup log to validate that the plugin has started correctly.

    [INFO ] 2016-11-18 19:47:19,134 [http-nio-7200-exec-2 c.o.t.s.i.PluginManager] Initializing plugin 'provenance'
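
The log check in step 2 can also be scripted. The following is a minimal Python sketch, not part of GraphDB: the helper name `initialized_plugins` is hypothetical, and the pattern is taken only from the sample log line above, so it assumes the log layout of your GraphDB version matches that line.

```python
import re

# Assumption: plugin start-up lines look like the sample line shown above.
INIT_PATTERN = re.compile(r"PluginManager\]\s+Initializing plugin '([^']+)'")


def initialized_plugins(log_lines):
    """Return the names of all plugins reported as initializing in the given lines."""
    names = set()
    for line in log_lines:
        match = INIT_PATTERN.search(line)
        if match:
            names.add(match.group(1))
    return names


# The sample line from step 2, used here as test input.
sample = (
    "[INFO ] 2016-11-18 19:47:19,134 "
    "[http-nio-7200-exec-2 c.o.t.s.i.PluginManager] Initializing plugin 'provenance'"
)
print("provenance" in initialized_plugins([sample]))  # prints True
```

To check a real installation, pass an open file handle for its log file instead of the sample list; the location of that file depends on your setup.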
    

Usage and examples

  1. In the Workbench SPARQL editor, add the following data as a schema transaction:

    PREFIX ex: <http://example.com/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    INSERT DATA {
        [] <http://www.ontotext.com/owlim/system#schemaTransaction> [] .
    
        ex:BusStop a rdfs:Class .
        ex:SkiResort a rdfs:Class .
        ex:WebCam a rdfs:Class .
        ex:OutdoorWebCam a rdfs:Class .
        ex:Place a rdfs:Class .
    
        ex:BusStop rdfs:subClassOf ex:Place .
        ex:SkiResort rdfs:subClassOf ex:Place .
        ex:WebCam rdfs:subClassOf ex:Place .
        ex:OutdoorWebCam rdfs:subClassOf ex:WebCam .
    }
    
  2. Add the following data as a normal transaction:

    PREFIX ex: <http://example.com/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    
    INSERT data {
        GRAPH ex:g1{
            ex:webcam_g1 a ex:OutdoorWebCam .
            ex:webcam_g1 ex:containedIn ex:skiresort .
        }
        GRAPH ex:g1a{
            ex:webcam_g1a a ex:OutdoorWebCam .
            ex:webcam_g1a ex:containedIn ex:skiresort .
        }
        GRAPH ex:g2{
            ex:skiresort a ex:SkiResort .
            ex:skiresort ex:publicTransport ex:busstop .
    
        }
        GRAPH ex:g3{
            ex:busstop a ex:BusStop .
            ex:busstop ex:nearBy ex:skiresort .
        }
    }
    
  3. If we run the following query without using the plugin, it is evaluated over all statements in the repository, including those inferred while the data was loaded:

    PREFIX ex: <http://example.com/>
    SELECT * {
        ?webCam a ex:WebCam .
        ?webCam ex:containedIn ?skiresort .
        ?busstop ex:nearBy ?skiresort .
    }
    

    The result has two solutions, one for ex:webcam_g1 and one for ex:webcam_g1a.

  4. This query showcases the Provenance plugin predicate pr:derive. The computed closure is accessible in a dedicated special context whose name is given as the subject of the pr:derive pattern. Patterns within the scope of that context are evaluated by the provenance plugin over the closure computed from the selected data. This makes the plugin flexible to use, since more than one context can be selected to supply data for processing, e.g., pr:derive (ex:g1 ex:g2).

    PREFIX pr: <http://www.ontotext.com/provenance/>
    PREFIX ex: <http://example.com/>
    select * {
        pr:derive1 pr:derive (ex:g1 ex:g2 ex:g3)  .
        GRAPH pr:derive1
        {
            ?webCam a ex:WebCam .
            ?webCam ex:containedIn ?skiresort .
            ?busstop ex:nearBy ?skiresort .
        }
    }
    

    This returns one solution: only the data from ex:g1, ex:g2, and ex:g3 is used, so the solution that depends on the data in ex:g1a is not part of the result.
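
The difference between steps 3 and 4 can be reproduced outside GraphDB with a small Python sketch. It is only an illustration and not how the plugin is implemented: the function names `closure` and `solutions` are hypothetical, and the ruleset is assumed to consist of just the two rdfs:subClassOf rules (type propagation and subclass transitivity), whereas the real ruleset is whatever the repository is configured with.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Subclass axioms from the schema transaction in step 1.
SCHEMA = {
    ("ex:BusStop", SUBCLASS, "ex:Place"),
    ("ex:SkiResort", SUBCLASS, "ex:Place"),
    ("ex:WebCam", SUBCLASS, "ex:Place"),
    ("ex:OutdoorWebCam", SUBCLASS, "ex:WebCam"),
}

# Named graphs from the normal transaction in step 2.
GRAPHS = {
    "ex:g1": {
        ("ex:webcam_g1", RDF_TYPE, "ex:OutdoorWebCam"),
        ("ex:webcam_g1", "ex:containedIn", "ex:skiresort"),
    },
    "ex:g1a": {
        ("ex:webcam_g1a", RDF_TYPE, "ex:OutdoorWebCam"),
        ("ex:webcam_g1a", "ex:containedIn", "ex:skiresort"),
    },
    "ex:g2": {
        ("ex:skiresort", RDF_TYPE, "ex:SkiResort"),
        ("ex:skiresort", "ex:publicTransport", "ex:busstop"),
    },
    "ex:g3": {
        ("ex:busstop", RDF_TYPE, "ex:BusStop"),
        ("ex:busstop", "ex:nearBy", "ex:skiresort"),
    },
}


def closure(graph_names):
    """Forward-chain the assumed subclass rules over the schema plus the selected graphs."""
    triples = set(SCHEMA)
    for name in graph_names:
        triples |= GRAPHS[name]
    while True:
        subclass_of = [(c, d) for (c, p, d) in triples if p == SUBCLASS]
        derived = set()
        for (s, p, o) in triples:
            if p in (RDF_TYPE, SUBCLASS):
                # (s p C) and (C subClassOf D) => (s p D)
                derived |= {(s, p, d) for (c, d) in subclass_of if c == o}
        added = derived - triples
        if not added:
            return triples
        triples |= added


def solutions(triples):
    """Evaluate the three triple patterns of the example query as a nested join."""
    result = set()
    for (webcam, p1, cls) in triples:
        if p1 != RDF_TYPE or cls != "ex:WebCam":
            continue
        for (s2, p2, resort) in triples:
            if s2 != webcam or p2 != "ex:containedIn":
                continue
            for (busstop, p3, o3) in triples:
                if p3 == "ex:nearBy" and o3 == resort:
                    result.add((webcam, resort, busstop))
    return result


all_graphs = solutions(closure(GRAPHS))                     # step 3: every graph
selected = solutions(closure(["ex:g1", "ex:g2", "ex:g3"]))  # step 4: ex:g1a left out
print(len(all_graphs))  # prints 2
print(len(selected))    # prints 1
```

Because the closure is recomputed from only the selected graphs, ex:webcam_g1a is never typed as ex:WebCam in the second call, which is why its solution disappears.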

Note

During evaluation, the inferences and the data are kept in memory, so the plugin should be used with relatively small sets of statements placed in contexts.