Quick start guide

Start the database

Run GraphDB as a stand-alone server

The default way of running GraphDB is as a stand-alone server. The server is platform-independent and includes all recommended JVM parameters for immediate use.

Running GraphDB

  1. Download your GraphDB distribution file and unzip it.

  2. Start the GraphDB Server and Workbench interface by executing the startup script located in the /bin folder:

    graphdb
    

    A message appears in your console telling you that GraphDB has been started in workbench mode. To access the Workbench, open http://localhost:7200/ in your browser.

Options

The startup script supports the following options:

Option        Description
-d            daemonise (run in the background); not available on Windows
-s            run in server-only mode (no Workbench)
-p pidfile    write the PID to <pidfile>
-h, --help    print the command-line options
-v            print the GraphDB version, then exit
-Dprop        set a Java system property
-Xprop        set a non-standard Java system property

Note

Run graphdb -s to start GraphDB in server-only mode without the web interface (no workbench). A remote workbench can still be attached to the instance.
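As an illustration, the options above can be combined to run GraphDB in the background, in server-only mode, while recording its PID. The sketch below wraps that call in a small helper. The GRAPHDB_BIN variable, the helper name and the PID file path are assumptions made for this example and are not part of the distribution.

```shell
#!/bin/sh
# Path to the startup script; GRAPHDB_BIN is a hypothetical variable for this sketch.
GRAPHDB_BIN="${GRAPHDB_BIN:-./bin/graphdb}"

# Start GraphDB as a daemon (-d) in server-only mode (-s) and write its PID (-p).
# Note: -d is not available on Windows.
start_graphdb_daemon() {
    pidfile="$1"
    "$GRAPHDB_BIN" -d -s -p "$pidfile"
}

# Example call (commented out so that sourcing this file starts nothing):
# start_graphdb_daemon /tmp/graphdb.pid
```

Keeping the PID in a file makes it easier to stop the background process later.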

Configuring GraphDB

Paths and network settings

The configuration of all GraphDB directory paths and network settings is read from the conf/graphdb.properties file. The file controls where the database data, log files and internal data are stored. To assign a new value, modify the file or override the setting by adding -D<property>=<new-value> as a parameter to the startup script. For example, to change the database port number:

graphdb -Dgraphdb.connector.port=<your-port>

The configuration properties can also be set in the environment variable GDB_JAVA_OPTS using the same -D<property>=<new-value> syntax.

Note

The order of precedence for GraphDB configuration properties is: config file < GDB_JAVA_OPTS < command line supplied arguments.
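The precedence rule can be pictured with a toy model: start from the value in the config file, then let any matching -D setting in GDB_JAVA_OPTS override it, then let any matching command-line argument override that. The function below is only an illustration of the rule, not GraphDB's own resolution code, and its name and argument order are made up for this sketch.

```shell
#!/bin/sh
# Toy model of: config file < GDB_JAVA_OPTS < command-line arguments.
#   $1 = property name, $2 = value from the config file,
#   $3 = contents of GDB_JAVA_OPTS (may be empty), remaining args = command line.
effective_property() {
    prop="$1"
    value="$2"
    env_opts="$3"
    shift 3
    # Later sources win: GDB_JAVA_OPTS entries are scanned first, then the command line.
    for arg in $env_opts "$@"; do
        case "$arg" in
            "-D$prop="*) value="${arg#*=}" ;;
        esac
    done
    printf '%s\n' "$value"
}

# Example: the command-line value is the one that applies.
effective_property graphdb.connector.port 7200 \
    "-Dgraphdb.connector.port=7300" -Dgraphdb.connector.port=7400
```

In practice this corresponds to setting GDB_JAVA_OPTS="-Dgraphdb.connector.port=7300" in the environment and then running graphdb -Dgraphdb.connector.port=7400; the port values here are arbitrary.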

Java virtual machine settings

It is strongly recommended to set explicit values for Java heap space. You can control the heap size by supplying an explicit value to the startup script such as graphdb -Xms10g -Xmx10g or setting one of the following environment variables:

  • GDB_HEAP_SIZE environment variable to set both the minimum and the maximum heap size (recommended).
  • GDB_MIN_MEM environment variable to set only the minimum heap size.
  • GDB_MAX_MEM environment variable to set only the maximum heap size.

For more information on how to change the default Java settings, check the instructions in the graphdb startup script.

Note

The order of precedence for JVM options is: GDB_MIN_MEM/GDB_MAX_MEM < GDB_HEAP_SIZE < GDB_JAVA_OPTS < command line supplied arguments.
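A minimal sketch of the recommended approach, setting GDB_HEAP_SIZE for a single run. The helper name, the GRAPHDB_BIN variable and the 4g value are assumptions; the value format is assumed to match the one used with -Xms and -Xmx above.

```shell
#!/bin/sh
# Path to the startup script; GRAPHDB_BIN is a hypothetical variable for this sketch.
GRAPHDB_BIN="${GRAPHDB_BIN:-./bin/graphdb}"

# Run the startup script with GDB_HEAP_SIZE set only for that process.
#   $1 = heap size (minimum and maximum), remaining args are passed to the script.
start_with_heap() {
    heap="$1"
    shift
    GDB_HEAP_SIZE="$heap" "$GRAPHDB_BIN" "$@"
}

# Example call (commented out so that sourcing this file starts nothing):
# start_with_heap 4g -s
```

Because command-line arguments take precedence, adding -Xmx to the same call would override the maximum set through GDB_HEAP_SIZE.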

Stopping the database

To stop the database, find the GraphDB process identifier and run kill <process-id>. This sends a shutdown signal and the database stops. If the database is running in non-daemon mode, you can also press Ctrl+C to stop it.
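If GraphDB was started with the -p option, the process identifier can be read from the PID file instead of being looked up by hand. A small sketch, assuming a PID file path of your choosing (the helper name is hypothetical):

```shell
#!/bin/sh
# Send the default shutdown signal (SIGTERM) to the process recorded in a PID file.
#   $1 = path of the PID file written by "graphdb -p <pidfile>"
stop_graphdb() {
    pidfile="$1"
    if [ ! -r "$pidfile" ]; then
        echo "PID file not found: $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    kill "$pid"
}

# Example call (commented out so that sourcing this file stops nothing):
# stop_graphdb /tmp/graphdb.pid
```

The helper only sends the signal; it does not wait for the database to finish shutting down.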

Set up your license

_images/no-license.png

To set up your license, follow these steps:

  1. Add, view or update your license by clicking on the key icon next to the active location path.

    _images/register-license.png
  2. Select the license file and register it.

    _images/select-license-file.png

    You can also copy and paste it in the text area.

    _images/copy-paste-license.png

    Warning

    In GraphDB 7.0.0 there is a known issue with registering a pasted license in some versions of Safari on Mac. If you are unable to use this method, try a different browser or use the upload method instead.

  3. Validate your license.

Create a repository

Now let’s create your first repository. All examples given below are based on the News sample dataset provided in the distribution folder.

Tip

You can also use public datasets such as the w3.org Wine ontology by pasting its data URL - https://www.w3.org/TR/owl-guide/wine.rdf - in the Remote content tab of the Import page.

Hint

Locations represent individual GraphDB servers, where the repository data is stored. They can be local (a directory on the disk) or remote (an end-point URL). When started, GraphDB creates the GraphDB-HOME/data directory as the default location. See Managing locations.

  1. In Locations and Repositories, click the Create Repository button.

    _images/create_repo.png
  2. Enter News as a Repository ID and leave all other optional configuration settings with their default values.

    Tip

    For repositories with more than a few tens of millions of statements, see Configuring a repository.

  3. Set the newly created News repository as the default repository for this location with the Connect button.

    _images/connect_to_repo.png

Load your data

Load data through the GraphDB Workbench

Load data from local files

Let’s load your data.

  1. Go to Data -> Import.
  2. Open the Local files tab and click the Select files icon to upload the files from the News sample dataset provided in the distribution folder.

    _images/import_local_file.png
  3. Click the Import button.
  4. Enter the import settings in the pop-up window.

    _images/import_settings.png

    Import Settings

    • Base URI: the default prefix for all local names in the file;
    • Context: specifies a graph within the repository;
    • Chunk size: the size of the batch operation; used for very large files (e.g., 10,000 - 100,000 triples per chunk);
    • Retry times: the number of times the Workbench will try to upload the chunk before canceling (in case of an HTTP error during the data transfer);
    • Preserve BNode IDs: when selected, the parser keeps the blank node IDs with their original strings.

    Tip

    Chunking a file is optional, but we recommend it for files larger than 200 MB.

  5. Click the Import button.

Note

You can also import data from files on the server where the Workbench is located, from a remote URL (with a format extension or by specifying the data format), directly from a SPARQL CONSTRUCT query, or by pasting the RDF data in the Text area tab.

Load data through SPARQL or Sesame API

GraphDB also provides a standard SPARQL or Sesame endpoint, to which data can be posted with cURL, a local Java client API or the Sesame console. The endpoint follows the corresponding protocol standards and allows every database operation to be executed via an HTTP client request.

  1. Locate the correct GraphDB URL endpoint:

    • select Admin -> Locations and Repositories

    • click the link icon next to the repository name

      _images/locate_repo_URL.png
    • copy the repository URL.

  2. Go to the folder where your local data files are.

  3. Execute the script:

    curl -X POST -H "Content-Type:application/x-turtle" -T localfilename.ttl \
      http://localhost:7200/repositories/repository-id/statements
    

    where localfilename.ttl is the data file you want to import and http://localhost:7200/repositories/repository-id/statements is the GraphDB URL endpoint of your repository.

    Tip

    Alternatively, use the full path to your local file.
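When there are several Turtle files to load, the same request can be repeated in a loop. The following is a minimal sketch under stated assumptions: the helper names are made up for this example, the base URL and repository ID must be replaced with your own, and the request body is sent with curl's --data-binary option rather than -T.

```shell
#!/bin/sh
# Build the statements endpoint for a repository, following the URL pattern shown above.
#   $1 = server base URL, $2 = repository ID
statements_url() {
    printf '%s/repositories/%s/statements\n' "${1%/}" "$2"
}

# POST every .ttl file in a directory to the repository, stopping at the first failure.
#   $1 = server base URL, $2 = repository ID, $3 = directory with Turtle files
load_turtle_dir() {
    url=$(statements_url "$1" "$2")
    for f in "$3"/*.ttl; do
        [ -e "$f" ] || continue   # the glob matched nothing
        curl -sS -f -X POST -H "Content-Type: application/x-turtle" \
            --data-binary "@$f" "$url" || return 1
    done
}

# Example call (commented out so that sourcing this file sends nothing):
# load_turtle_dir http://localhost:7200 News ./data
```

The -f option makes curl return a non-zero exit status on an HTTP error, so the loop stops instead of silently skipping a failed file.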

Load data through the GraphDB LoadRDF tool

LoadRDF is a low-level bulk load tool, which writes directly to the database index structures. It is ultra fast and supports parallel inference. For more information, see the LoadRDF tool.

Note

Loading data through the GraphDB LoadRDF tool can be performed only if the repository is empty, e.g., the initial loading after the database was down.

Explore your data and class relationships

Class hierarchy

To explore your data, navigate to Data -> Class hierarchy. You can see a diagram depicting the hierarchy of the imported RDF classes by number of instances. The biggest circles are the parent classes and the nested ones are their children.

Note

If your data has no ontology (hierarchy), the RDF classes will be visualised as separate circles, instead of nested ones.

_images/rdf-class-hierarchy-diagram-news.png

Explore your data - different actions

  • To see what classes each parent has, hover over the nested circles.

  • To explore a given class, click its circle. The selected class is highlighted with a dashed line and a side panel with its instances opens for further exploration. For each RDF class you can see its local name, URI and a list of its first 1000 class instances. The class instances are represented by their URIs, which when clicked lead to another view, where you can further explore their metadata.

    _images/rdf-class-hierarchy-diagram-selected-class-news.png

    The side panel includes the following:

    • Local name;
    • URI (Press Ctrl+C / Cmd+C to copy to clipboard and Enter to close);
    • Domain-Range Graph button;
    • Class instances count;
    • Scrollable list of the first 1000 class instances;
    • View Instances in SPARQL View button. It redirects to the SPARQL view and executes an auto-generated query that lists all class instances without LIMIT.
  • To go to the Domain-Range Graph diagram, double-click a class circle or click the Domain-Range Graph button in the side panel.

  • To explore an instance, click its URI from the side panel.

    _images/rdf-class-hierarchy-diagram-class-instance-resource-view-news.png
  • To adjust the number of classes displayed, drag the slider on the left-hand side of the screen. Classes are sorted by instance count in descending order, and the diagram displays only as many classes as the current slider value.

    _images/rdf-class-hierarchy-diagram-slider-low-value-news.png
  • To administer your data view, use the toolbar options on the right-hand side of the screen.

    _images/rdf-class-hierarchy-diagram-toolbar.png
    • To see only the class labels, click the Hide/Show Prefixes icon. You can still view the prefixes when you hover over the class that interests you.
    _images/rdf-class-hierarchy-diagram-no-prefix-classes-news.png
    • To zoom out of a particular class, click the Focus diagram home icon.
    • To reload the data on the diagram, click the Reload diagram icon. This is recommended when you have updated the data in your repository or you experience some strange behaviour, for example you cannot see a given class.
    • To export the diagram as an .svg image, click the Export Diagram download icon.

RDF domain-range graph

To see all properties of a given class as well as their domain and range, double-click its class circle or click the Domain-Range Graph button in the side panel. The RDF Domain-Range Graph view opens, enabling you to further explore the class connectedness by clicking the green nodes (object property class).

_images/rdf-domain-range-graph-diagram-news.png
  • To administer your graph view, use the toolbar options on the right-hand side of the screen.

    _images/rdf-domain-range-graph-diagram-toolbar.png
    • To go back to your class in the Class hierarchy, click the Back to Class hierarchy diagram button.
    • To export the diagram as an .svg image, click the Export Diagram download icon.

Class relationships

To explore the relationships between the classes, navigate to Data -> Class relationships. You can see a diagram showing only the top relationships, where each relationship is a bundle of links between the individual instances of two classes. Each link is an RDF statement where the subject is an instance of one class, the object is an instance of another class, and the link is the predicate. Depending on the number of links between the instances of two classes, the bundle is thicker or thinner, and it takes the color of the class with more incoming links. The links can go in both directions.

In the example below, you can see the relationships between the classes of the News sample dataset provided in the distribution folder. You can observe that the class with the biggest number of links (the thickest bundle) is pub-old:Document.

_images/news-scenario-dependencies.png

To see how many annotations (mentions) there are in the documents, hover over the blue bundle representing the relationship between the classes pub-old:Document and pub-old:TextMention. The tooltip shows that there are 6197 annotations.

_images/news-scenario-class-document.png

To see how many of these annotations are about people, hover over the brown bundle representing the relationship between the classes pub-old:TextMention and pub:Person. The tooltip shows that 274 annotations are about people.

_images/news-scenario-class-person.png

Query your data

Query data through the GraphDB Workbench

Hint

SPARQL is an SQL-like query language for RDF graph databases. It supports the following query and update types:

  • SELECT - returns tabular results;
  • CONSTRUCT - creates a new RDF graph based on query results;
  • ASK - returns “YES” if the query has a solution, otherwise “NO”;
  • DESCRIBE - returns RDF data about a resource; useful when you do not know the RDF data structure in the data source;
  • INSERT - inserts triples into a graph;
  • DELETE - deletes triples from a graph.

For more information, see the Additional resources section.

Now it’s time to delve into your data. The following is one possible scenario for searching in it.

  1. Select the repository you want to work with, in this example News, and click the SPARQL menu tab.

  2. Let’s say you are interested in people. Find all people mentioned in the documents from this news articles dataset.

    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    select distinct ?x ?Person where {
        ?x a pub:Person .
        ?x pub:preferredLabel ?Person .
        ?doc pub-old:containsMention / pub-old:hasInstance ?x .
    }
    
    _images/news-scenario-all-people.png
  3. Run a query to calculate the RDF ranking of the instances based on their interconnectedness.

    PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
    INSERT DATA { _:b1 rank:compute _:b2. }
    
  4. Find all people mentioned in the documents, ordered by popularity in the repository.

    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
    select distinct ?x ?PersonLabel ?rank where {
        ?x a pub:Person .
        ?x pub:preferredLabel ?PersonLabel .
        ?doc pub-old:containsMention / pub-old:hasInstance ?x .
        ?x rank:hasRDFRank ?rank .
    } ORDER by DESC (?rank)
    
    _images/news-scenario-ordred-by-popularity.png
  5. Find all people who were mentioned together with their political parties in the documents.

    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    select distinct ?personLabel ?partyLabel where {
        ?document pub-old:containsMention ?mention .
        ?mention pub-old:hasInstance ?person .
        ?person pub:preferredLabel ?personLabel .
        ?person pub:memberOfPoliticalParty ?party .
        ?party pub:hasValue ?value .
        ?value pub:preferredLabel ?partyLabel .
    }
    
    _images/news-scenario-people-and-political-parites.png
  6. Did you know that Marlon Brando was from the Democratic Party? Find what other mentions occur together with Marlon Brando in the given news article.

    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    select distinct ?Mentions where {
        <http://www.reuters.com/article/2014/10/06/us-art-auction-idUSKCN0HV21B20141006> pub-old:containsMention / pub-old:hasInstance ?x .
        ?x pub:preferredLabel ?Mentions .
    }
    
    _images/news-scenario-MB-and-other-mentions.png
  7. Find everything there is in the database about Marlon Brando.

    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    select distinct ?p ?objectLabel where {
        <http://ontology.ontotext.com/resource/tsk78dfdet4w> ?p ?o .
        {
            ?o pub:hasValue ?value .
            ?value pub:preferredLabel ?objectLabel .
        } union {
            ?o pub:hasValue ?objectLabel .
            filter (isLiteral(?objectLabel)) .
        }
    }
    
    _images/news-scenario-MB-data.png
  8. Find all documents that mention members of the Democratic Party and their names.

    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    select distinct ?document ?personLabel where {
        ?document pub-old:containsMention ?mention .
        ?mention pub-old:hasInstance ?person .
        ?person pub:preferredLabel ?personLabel .
        ?person pub:memberOfPoliticalParty ?party .
        ?party pub:hasValue ?value .
        ?value pub:preferredLabel "Democratic Party"@en .
    }
    
    _images/news-scenario-all-DP-members-and-names.png
  9. Find when all people from the Democratic Party mentioned in the news articles were born and when they died.

    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    select distinct ?person ?personLabel ?dateOfbirth ?dateOfDeath where {
        ?document pub-old:containsMention / pub-old:hasInstance ?person .
        ?person pub:preferredLabel ?personLabel .
        OPTIONAL {
           ?person pub:dateOfBirth / pub:hasValue ?dateOfbirth .
        }
        OPTIONAL {
           ?person pub:dateOfDeath / pub:hasValue ?dateOfDeath .
        }
        ?person pub:memberOfPoliticalParty / pub:hasValue / pub:preferredLabel "Democratic Party"@en .
    } order by ?dateOfbirth
    
    _images/news-scenario-dob-dod.png
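Update operations such as the RDFRank computation in step 3 do not have to be run from the Workbench. The sketch below assumes that the repository accepts SPARQL updates as a form-encoded update parameter posted to its statements endpoint, as in the Sesame and SPARQL protocols. The helper name and the repository URL are illustrative.

```shell
#!/bin/sh
# Send a SPARQL update to a repository.
#   $1 = repository URL, $2 = SPARQL update string
# --data-urlencode makes curl issue a POST with a URL-encoded form body.
sparql_update() {
    curl -sS -f --data-urlencode "update=$2" "${1%/}/statements"
}

# Example call (commented out so that sourcing this file sends nothing):
# sparql_update http://localhost:7200/repositories/News \
#     'PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#> INSERT DATA { _:b1 rank:compute _:b2. }'
```

Letting curl do the encoding avoids having to escape braces, angle brackets and spaces in the update by hand.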

Tip

You can play with more example queries from the Example_queries.rtf file provided in the distribution folder.

Note

In version 7.0, GraphDB also introduces an Autocomplete Index, which offers suggestions for the local names of URIs in the SPARQL editor and the View Resource page. For more information, go to the Autocomplete Index section of the Workbench User Guide.

Query data programmatically

SPARQL is not only a standard query language, but also a protocol for communicating with RDF databases. GraphDB stays compliant with the protocol specification and allows querying data with standard HTTP requests.

Execute the example query with an HTTP GET request:

curl -G -H "Accept:application/x-trig" \
  -d query=CONSTRUCT+%7B%3Fs+%3Fp+%3Fo%7D+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10 \
  http://localhost:7200/repositories/yourrepository

Execute the example query with a POST operation:

curl -X POST --data-binary @file.sparql -H "Accept: application/rdf+xml" \
  -H "Content-type: application/x-www-form-urlencoded" \
  http://localhost:7200/repositories/worker-node

where file.sparql contains a URL-encoded query:

query=CONSTRUCT+%7B%3Fs+%3Fp+%3Fo%7D+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10
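Encoding the query by hand is error-prone, so one option is to let curl do it with --data-urlencode. The following sketch wraps an HTTP GET query in a helper. The helper name, the repository URL and the default Accept type (application/sparql-results+json, suitable for SELECT and ASK results) are assumptions for this example.

```shell
#!/bin/sh
# Run a SPARQL query with an HTTP GET request.
#   $1 = repository URL, $2 = plain (unencoded) query string,
#   $3 = optional Accept media type
# -G tells curl to append the URL-encoded data to the URL as a query string.
sparql_get() {
    repo_url="$1"
    query="$2"
    accept="${3:-application/sparql-results+json}"
    curl -sS -f -G -H "Accept: $accept" --data-urlencode "query=$query" "$repo_url"
}

# Example call (commented out so that sourcing this file sends nothing):
# sparql_get http://localhost:7200/repositories/yourrepository \
#     'CONSTRUCT {?s ?p ?o} WHERE {?s ?p ?o} LIMIT 10' application/x-trig
```

The commented call is intended to send the same query as the GET example above, with the encoding left to curl.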

Tip

For more information on how to interact with GraphDB APIs, refer to the Sesame and SPARQL protocols or the Linked Data Platform specifications.