Using GraphDB with Jena

GraphDB can also be used with the Jena framework through a customised Jena/RDF4J/GraphDB adapter component.

Jena is a Java framework for building Semantic Web applications. It provides a programmatic environment for RDF, RDFS, OWL and SPARQL and includes a rule-based inference engine. Access to GraphDB via the Jena framework is provided by a special adapter, which is essentially an implementation of the Jena ARQ DatasetGraph interface. It gives access to individual triples managed by a GraphDB repository through the RDF4J API interfaces.

Note

The GraphDB-specific Jena adapter can only be used with ‘local’ repositories, i.e., not ‘remote’ repositories that are accessed using the RDF4J HTTP protocol. If you want to use GraphDB remotely, consider using the Joseki server as described below.

Installing GraphDB with Jena

Required software

  • Jena version 2.7 (tested with version 2.7.3)
  • ARQ (tested with version 2.9.3)

Description of the GraphDB Jena adapter

The GraphDB Jena adapter is essentially an implementation of the Jena DatasetGraph interface that provides access to individual triples managed by a GraphDB repository through the RDF4J API interfaces.

It is not a general-purpose RDF4J adapter and cannot be used to access arbitrary RDF4J-compatible repositories, because it relies on an internal GraphDB API to provide more efficient methods for processing RDF data and evaluating queries.

The adapter comes with its own implementation of the Jena ‘assembler’ factory, which makes it easier to instantiate and use with the related parts of the Jena framework. You can also instantiate an adapter directly by providing an instance of an RDF4J SailRepository (a GraphDB GraphDBRepository implementation). Query evaluation is controlled by the ARQ engine, but specific parts of a query (mostly batches of statement patterns) are evaluated natively through a modified StageGenerator plugged into the Jena runtime framework for efficiency. This also avoids unnecessary cross-API data transformations during query evaluation.

Instantiate Jena adapter using a SailRepository

In this approach, a GraphDB repository is first created and wrapped in an RDF4J SailRepository. Then a connection to it is used to instantiate the adapter class SesameDataset, as the following example shows:

import java.io.File;
import com.ontotext.trree.OwlimSchemaRepository;
import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import com.ontotext.jena.SesameDataset;

...

OwlimSchemaRepository schema = new OwlimSchemaRepository();

// set the data folder where GraphDB will persist its data
schema.setDataDir(new File("./local-storage"));

// configure GraphDB with some parameters
schema.setParameter("storage-folder", "./");
schema.setParameter("repository-type", "file-repository");
schema.setParameter("ruleset", "rdfs");

// wrap it into a RDF4J SailRepository
SailRepository repository = new SailRepository(schema);

// initialize
repository.initialize();
RepositoryConnection connection = repository.getConnection();

// finally, create the DatasetGraph instance
SesameDataset dataset = new SesameDataset(connection);

From now on, the SesameDataset object can be used through the Jena API as a regular dataset. For example, to add some data to it, you could do something like the following:

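// Jena 2.x packages
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.DC;

...

// view the default graph of the adapter as a Jena model
Model model = ModelFactory.createModelForGraph(dataset.getDefaultGraph());
Resource r1 = model.createResource("http://example.org/book#1");
Resource r2 = model.createResource("http://example.org/book#2");

r1.addProperty(DC.title, "SPARQL - the book")
    .addProperty(DC.description, "A book about SPARQL");

r2.addProperty(DC.title, "Advanced techniques for SPARQL");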

It can also be used to evaluate queries through the ARQ engine:

// Jena 2.x packages
import java.util.Iterator;
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.DC;

...

// Query string.
String queryString = "PREFIX dc: <" + DC.getURI() + "> " +
    "SELECT ?title WHERE {?x dc:title ?title . }";

Query query = QueryFactory.create(queryString);

// Create a single execution of this query against the dataset,
// then fetch the results
QueryExecution qexec = QueryExecutionFactory.create(query, dataset.asDataset());
try {
    // Assumption: it's a SELECT query.
    ResultSet rs = qexec.execSelect();
    // The order of results is undefined.
    while (rs.hasNext()) {
        QuerySolution rb = rs.nextSolution();
        // print every bound variable of this solution as name=value
        for (Iterator<String> iter = rb.varNames(); iter.hasNext();) {
            String name = iter.next();
            RDFNode x = rb.get(name);
            if (x.isLiteral()) {
                Literal titleStr = (Literal) x;
                System.out.print(name + "=" + titleStr + "\t");
            } else if (x.isURIResource()) {
                Resource res = (Resource) x;
                System.out.print(name + "=" + res.getURI() + "\t");
            } else {
                System.out.print(name + "=" + x.toString() + "\t");
            }
        }
        System.out.println();
    }
} catch (Exception e) {
    System.out.println("Exception occurred: " + e);
} finally {
    // QueryExecution objects should be closed to free any system resources
    qexec.close();
}

Instantiate GraphDB adapter using the provided Assembler

Another approach is to use the Jena assembler infrastructure to instantiate a GraphDB Jena adapter. For this purpose, the required configuration must be stored in a valid RDF serialisation format and its contents read into a Jena model. Then, the assembler can be invoked to get an instance of the Jena adapter. The following example specifies an adapter instance in N3 format.

@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix otjena: <http://www.ontotext.com/jena/> .

@prefix :       <#> .

[] ja:loadClass "com.ontotext.jena.SesameVocab" .
otjena:SesameDataset rdfs:subClassOf ja:Object .
otjena:SesameDataset ja:assembler "com.ontotext.jena.SesameAssembler" .
<#dataset>  rdf:type otjena:SesameDataset ;
            otjena:datasetParam "./location" .

The ja:loadClass statement ensures that the GraphDB Jena adapter factory class file(s) are initialised and plugged into the Jena framework prior to being invoked. Then, the <#dataset> description tells the Jena framework to expect instances of otjena:SesameDataset to be created by this factory. The following example uses such a description stored in the file owlimbridge.n3 to get an instance of the Jena adapter:

Model spec = FileManager.get().loadModel( "owlimbridge.n3" );
Resource root = spec.createResource( spec.expandPrefix( ":dataset" ) );
DataSource datasource = (DataSource)Assembler.general.open( root );
DatasetGraph dataset = datasource.asDatasetGraph();

After this, the adapter is ready to be used, for example, to evaluate queries through the ARQ engine using the same approach as in the SailRepository example above.

Using GraphDB with the Joseki server

To use a GraphDB repository with the Joseki server, you only need to configure it as a dataset, so that the Jena assembler framework is able to instantiate it. An example Joseki configuration file that makes use of such a dataset description could look like the following. First, a service that hosts the dataset is described:

<#service1>
    rdf:type            joseki:Service ;
    rdfs:label          "service point" ;
    joseki:dataset      otjena:bridge ;
    joseki:serviceRef   "sparql" ;
    joseki:processor    joseki:ProcessorSPARQL ;
    .

Then, the dataset is described:

[] ja:loadClass "com.ontotext.jena.SesameVocab" .
otjena:DatasetSesame  rdfs:subClassOf  ja:RDFDataset .
otjena:bridge  rdf:type otjena:DatasetSesame ;
    rdfs:label "GraphDB repository" ;
    otjena:datasetParam "./location" .
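
Once Joseki serves the dataset, a remote client could query it over HTTP using the SPARQL protocol. The following is only a minimal sketch that uses the Java standard library (Java 11 or later). The endpoint URL http://localhost:2020/sparql is an assumption: the host and port depend on how Joseki is deployed, and the path is taken from the serviceRef "sparql" in the service description. The class and method names are hypothetical and are not part of GraphDB, Jena or Joseki.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of GraphDB, Jena or Joseki.
class JosekiClientSketch {

    // Builds a SPARQL protocol GET URL: <endpoint>?query=<URL-encoded query>
    static URI buildQueryUri(String endpoint, String sparql) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?query=" + encoded);
    }

    // Sends the query, asking for SPARQL XML results, and returns the raw response body.
    static String runQuery(String endpoint, String sparql)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(buildQueryUri(endpoint, sparql))
                .header("Accept", "application/sparql-results+xml")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Assumed endpoint: adjust host, port and path to your Joseki deployment.
        String endpoint = "http://localhost:2020/sparql";
        String sparql = "PREFIX dc: <http://purl.org/dc/elements/1.1/> "
                + "SELECT ?title WHERE { ?x dc:title ?title }";
        // Only prints the request URL; call runQuery(...) when the server is running.
        System.out.println(buildQueryUri(endpoint, sparql));
    }
}
```

With a running server, runQuery(endpoint, sparql) would return the result document as a string, which the client can then parse with any SPARQL results parser.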

If a RepositoryConnection has already been obtained (as in the SailRepository example above), the Jena adapter can be instantiated directly as follows:

import com.ontotext.jena.SesameDataset;

// Create the DatasetGraph instance
SesameDataset dataset = new SesameDataset(repositoryConnection);

From now on, the SesameDataset object can be used through the Jena API as a regular dataset. For example, to add some data to it, you could do something like the following:

Model model = ModelFactory.createModelForGraph(dataset.getDefaultGraph());
Resource r1 = model.createResource("http://example.org/book#1");
Resource r2 = model.createResource("http://example.org/book#2");
r1.addProperty(DC.title, "SPARQL - the book")
    .addProperty(DC.description, "A book about SPARQL");
r2.addProperty(DC.title, "Advanced techniques for SPARQL");

When GraphDB is used through Jena, its performance is quite similar to using it through the RDF4J APIs. For most scenarios and tasks, GraphDB can deliver considerable performance improvements when used as a replacement for TDB, Jena’s own native RDF backend.