Quick start guide

Run GraphDB as a stand-alone server

The default way of running GraphDB is as a stand-alone server. The server is platform-independent and includes all recommended JVM parameters for immediate use.

Running GraphDB

  1. Download your GraphDB distribution file and unzip it.

  2. Start the GraphDB Server and Workbench interface by executing the startup script located in the /bin folder:

    graphdb
    

    A message appears in your console telling you that GraphDB has been started in workbench mode. To access the Workbench, open http://localhost:7200/ in your browser.

Options

The startup script supports the following options:

Option        Description
-d            daemonise (run in background); not available on Windows
-s            run in server-only mode (no workbench)
-p pidfile    write PID to <pidfile>
-h, --help    print command line options
-v            print GraphDB version, then exit
-Dprop        set Java system property
-Xprop        set non-standard Java system property

Note

Run graphdb -s to start GraphDB in server-only mode without the web interface (no workbench). A remote workbench can still be attached to the instance.

Configuring GraphDB

Paths and network settings

The configuration of all GraphDB directory paths and network settings is read from the conf/graphdb.properties file. It controls where to store the database data, log files and internal data. To assign a new value, modify the file or override the setting by adding -D<property>=<new-value> as a parameter to the startup script. For example, to change the database port number:

graphdb -Dgraphdb.connector.port=<your-port>

The configuration properties can also be set in the environment variable GDB_JAVA_OPTS, using the same -D<property>=<new-value> syntax.

Note

The order of precedence for GraphDB configuration properties is: config file < GDB_JAVA_OPTS < command line supplied arguments.
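As a rough illustration of the precedence order above, the following sketch sets the port through GDB_JAVA_OPTS and then overrides it on the command line. The port numbers 7300 and 7400 and the helper name start_with_port are arbitrary choices for this example, and it assumes the graphdb startup script is on your PATH.

```shell
# Assumed example value: a port set through the environment variable.
export GDB_JAVA_OPTS="-Dgraphdb.connector.port=7300"

# Hypothetical helper: starts GraphDB with the port given as the first
# argument (7400 if omitted). Assumes the graphdb script is on the PATH.
start_with_port() {
    graphdb "-Dgraphdb.connector.port=${1:-7400}"
}
```

Following the stated order, calling start_with_port 7400 should make GraphDB listen on port 7400 rather than 7300, because command line arguments take precedence over GDB_JAVA_OPTS.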

Java virtual machine settings

It is strongly recommended to set explicit values for the Java heap space. You can control the heap size by supplying an explicit value to the startup script such as graphdb -Xms10g -Xmx10g or setting one of the following environment variables:

  • GDB_HEAP_SIZE environment variable to set both the minimum and the maximum heap size (recommended).
  • GDB_MIN_MEM environment variable to set only the minimum heap size.
  • GDB_MAX_MEM environment variable to set only the maximum heap size.

For more information on how to change the default Java settings, see the instructions in the graphdb startup script.

Note

The order of precedence for JVM options is: GDB_MIN_MEM/GDB_MAX_MEM < GDB_HEAP_SIZE < GDB_JAVA_OPTS < command line supplied arguments.
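The heap settings can be sketched in the same way. The sizes below are placeholders, the value format of GDB_HEAP_SIZE (4g) is assumed to mirror the -Xms/-Xmx syntax, and start_with_heap is a hypothetical helper name.

```shell
# Assumed example size; choose a value that fits your machine.
# GDB_HEAP_SIZE sets both the minimum and the maximum heap size.
export GDB_HEAP_SIZE=4g

# Hypothetical helper: passes explicit heap flags on the command line,
# which, per the order above, take precedence over GDB_HEAP_SIZE.
start_with_heap() {
    graphdb "-Xms$1" "-Xmx$1"
}
```

With GDB_HEAP_SIZE exported, a plain graphdb start would use 4g, while start_with_heap 8g is expected to run with an 8g heap.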

Stopping the database

To stop the database, find the GraphDB process identifier and run kill <process-id>. This sends a shutdown signal and the database stops. If the database is running in non-daemon mode, you can also press Ctrl+C in its console to stop it.
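One way to avoid looking up the process identifier by hand is to combine the -d and -p startup options with kill. The sketch below assumes that the PID written by -p belongs to the running server process; the PID file path and the function names are hypothetical.

```shell
# Hypothetical PID file location; any writable path works.
PIDFILE="${PIDFILE:-/tmp/graphdb.pid}"

# Start GraphDB in the background (-d) and record its PID (-p <pidfile>).
start_graphdb() {
    graphdb -d -p "$PIDFILE"
}

# Send the default termination signal to the recorded process.
stop_graphdb() {
    if [ ! -f "$PIDFILE" ]; then
        echo "PID file $PIDFILE not found" >&2
        return 1
    fi
    # Remove the PID file only if the signal was delivered.
    kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE"
}
```

Call start_graphdb to launch the server and stop_graphdb later to shut it down.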

Set up your license

GraphDB SE is available under an RDBMS-like commercial license on a per-server-CPU basis. It is neither free nor open-source. To purchase a license or obtain a copy for evaluation, please contact GraphDB-info@ontotext.com.

When installing GraphDB SE, the license file can be set through the GraphDB Workbench or programmatically:

_images/no-license.png

To do that, follow the steps:

  1. Add, view or update your license by clicking on the key icon next to the active location path.

    _images/register-license.png
  2. Select the license file and register it.

    _images/select-license-file.png

    You can also copy and paste it in the text area.

    _images/copy-paste-license.png
  3. Validate your license.

    _images/validate-new-license.png
  4. When you are ready, you will see the details of your license.

Create a repository

Now let’s create your first repository.

Hint

When started, GraphDB creates GraphDB-HOME/data directory as an active location. To change the directory, see Configuring GraphDB Data Directory.

  1. Go to Setup -> Repositories.

  2. Click Create new repository.

    _images/createRepository.png
  3. Enter myrepo as a Repository ID and leave all other optional configuration settings with their default values.

    Tip

    For repositories with more than a few tens of millions of statements, see Configuring a repository.

  4. Click the Connect button to set the newly created repository as the repository for this location.

    _images/connect_to_repo.png
  5. Use the pin to select it as the default repository.

    _images/default-repo-pin.png

Load your data

All examples given below are based on the News sample dataset provided in the distribution folder.

Tip

You can also use public datasets such as the w3.org Wine ontology by pasting its data URL - https://www.w3.org/TR/owl-guide/wine.rdf - in the Remote content tab of the Import page.

Load data through the GraphDB Workbench

Load data from local files

Let’s load your data.

  1. Go to Import -> RDF.
  2. Open the Local files tab and click the Select files icon to upload the files from the News sample dataset provided in the distribution folder.
_images/import_local_file.png
  3. Click the Import button.
  4. Enter the import settings in the pop-up window.
_images/import_settings.png

Import Settings

  • Base URI: the default prefix for all local names in the file;
  • Context: specifies a graph within the repository;
  • Chunk size: the size of the batch operation; used for very large files (e.g., 10,000 - 100,000 triples per chunk);
  • Retry times: the number of times the workbench will try to upload the chunk before canceling (in case of an HTTP error during the data transfer);
  • Preserve BNode IDs: when selected, the parser keeps the blank node IDs with their original strings.

Tip

Chunking a file is optional, but we recommend it for files larger than 200 MB.

  5. Click the Import button.

Note

You can also import data from files on the server where the workbench is located, from a remote URL (with a format extension or by specifying the data format), from a SPARQL construct query directly, or by pasting the RDF data in the Text area tab.

Load data through SPARQL or RDF4J API

GraphDB also exposes a standard SPARQL and RDF4J endpoint to which data can be posted with cURL, a local Java client API, or an RDF4J console. The endpoint is standards-compliant and allows every database operation to be executed via an HTTP client request.

  1. Locate the correct GraphDB URL endpoint:

    • select Setup -> Repositories

    • click the link icon next to the repository name

      _images/locate_repo_URL.png
    • copy the repository URL.

  2. Go to the folder where your local data files are.

  3. Execute the script:

    curl -X POST -H "Content-Type:application/x-turtle" -T localfilename.ttl \
      http://localhost:7200/repositories/repository-id/statements
    

    where localfilename.ttl is the data file you want to import and http://localhost:7200/repositories/repository-id/statements is the GraphDB URL endpoint of your repository.

    Tip

    Alternatively, use the full path to your local file.
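If you have several Turtle files, the upload step above can be wrapped in a small loop. This is only a sketch: the server URL and the repository ID myrepo are assumed defaults taken from this guide, and the function names are made up for the example.

```shell
# Assumed values: default local server and the repository ID from this guide.
GRAPHDB_URL="${GRAPHDB_URL:-http://localhost:7200}"
REPOSITORY_ID="${REPOSITORY_ID:-myrepo}"

# Build the statements endpoint of a repository ($1 = repository ID).
statements_url() {
    printf '%s/repositories/%s/statements\n' "$GRAPHDB_URL" "$1"
}

# POST every .ttl file in a directory ($1) to the repository.
load_turtle_dir() {
    for file in "$1"/*.ttl; do
        [ -e "$file" ] || continue   # skip if the glob matched nothing
        echo "Loading $file"
        # --fail makes curl return a non-zero status on HTTP errors.
        curl --fail -X POST -H "Content-Type: application/x-turtle" \
             -T "$file" "$(statements_url "$REPOSITORY_ID")" || return 1
    done
}
```

For example, load_turtle_dir ./data posts each .ttl file found in ./data and stops at the first failed upload.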

Load data through the GraphDB LoadRDF tool

LoadRDF is a low-level bulk load tool that writes directly to the database index structures. It is very fast and supports parallel inference. For more information, see Loading data using the LoadRDF tool.

Note

Loading data through the GraphDB LoadRDF tool can be performed only if the repository is empty, e.g., the initial loading after the database was down.

Explore your data and class relationships

Explore instances

To explore instances and their relationships, navigate to Explore -> Visual graph and find an instance of interest through the search box, or click the Visual Graph button in the Resource view. The graph of the instance and its relationships is shown.

_images/visual-graph-news.png

Click on a node to see a menu with the following actions:

  • Expand the node to show its relationships, or collapse it to hide them if already expanded.
  • Click the info icon to show more information about the node. The side panel includes a short description (rdfs:comment), labels (rdfs:label), RDF rank, image (foaf:depiction) if present, and all DataType properties. You can search by DataType property if you are interested in its value.
  • Focus the node to restart the graph with this instance as the central one. Note that you will lose the current state of your graph.
  • Delete a node to hide its relationships and remove the node itself from the graph.

Click on the settings icon for advanced graph settings, where you can control the number of links and the types and predicates to hide or show.

_images/visual-graph-global.png

A side panel opens with the available settings:

_images/visual-graph-settings.png

Class hierarchy

To explore your data, navigate to Explore -> Class hierarchy. You can see a diagram depicting the hierarchy of the imported RDF classes by the number of instances. The biggest circles are the parent classes and the nested ones are their children.

Note

If your data has no ontology (hierarchy), the RDF classes will be visualised as separate circles, instead of nested ones.

_images/rdf-class-hierarchy-diagram-news.png

Explore your data - different actions

  • To see what classes each parent has, hover over the nested circles.

  • To explore a given class, click its circle. The selected class is highlighted with a dashed line and a side panel with its instances opens for further exploration. For each RDF class you can see its local name, URI and a list of its first 1000 class instances. The class instances are represented by their URIs, which when clicked, lead to another view, where you can further explore their metadata.

    _images/rdf-class-hierarchy-diagram-selected-class-news.png

    The side panel includes the following:

    • Local name;
    • URI (Press Ctrl+C / Cmd+C to copy to clipboard and Enter to close);
    • Domain-Range Graph button;
    • Class instances count;
    • Scrollable list of the first 1000 class instances;
    • View Instances in SPARQL View button. It redirects to the SPARQL view and executes an auto-generated query that lists all class instances without LIMIT.
  • To go to the Domain-Range Graph diagram, double click a class circle or the Domain-Range Graph button from the side panel.

  • To explore an instance, click its URI from the side panel.

    _images/rdf-class-hierarchy-diagram-class-instance-resource-view-news.png
  • To adjust the number of classes displayed, drag the slider on the left-hand side of the screen. Classes are sorted by the maximum instance count and the diagram displays only the current slider value.

    _images/rdf-class-hierarchy-diagram-slider-low-value-news.png
  • To administer your data view, use the toolbar options on the right-hand side of the screen.

    _images/rdf-class-hierarchy-diagram-toolbar.png
    • To see only the class labels, click the Hide/Show Prefixes icon. You can still view the prefixes when you hover over the class that interests you.
    • To zoom out of a particular class, click the Focus diagram home icon.
    • To reload the data on the diagram, click the Reload diagram icon. This is recommended when you have updated the data in your repository or you experience some strange behaviour, for example you cannot see a given class.
    • To export the diagram as an .svg image, click the Export Diagram download icon.

Domain-Range graph

To explore the connectedness of a given class, double click the class circle or the Domain-Range Graph button from the side panel. You can see a diagram that shows this class and its properties with their domain and range, where domain refers to all subject resources and range refers to all object resources. For example, if you start from class pub:Company, you see something like: <pub-old:Mention pub-old:hasInstance pub:Company> <pub:Company pub:description xsd:string>.

_images/rdf-domain-range-graph-diagram-news.png

You can also further explore the class connectedness by clicking:

  • the green nodes (object property class).
  • the labels - they lead to the View resource page, where you can find more information about the current class or property.
  • the slider Show collapsed predicates to hide all edges sharing the same source and target nodes.
_images/rdf-domain-range-graph-diagram-collapsed-news.png

To see all predicate labels contained in a collapsed edge, click the collapsed edge count label, which is always in the format <count> predicates. A side panel opens with the target node label, a list of the collapsed predicate labels and the type of the property (explicit or implicit). You can click these labels to see the resource in the View resource page.

_images/rdf-domain-range-graph-diagram-collapsed-side-panel-news.png

Administering the diagram view

To administer your diagram view, use the toolbar options on the right-hand side of the screen.

_images/rdf-domain-range-graph-diagram-toolbar.png
  • To go back to your class in the Class hierarchy, click the Back to Class hierarchy diagram button.
  • To collapse edges with common source/target nodes, in order to see the diagram more clearly, click the Show all predicates/Show collapsed predicates button. The default is collapsed.
  • To export the diagram as an .svg image, click the Export Diagram download icon.

Class relationships

To explore the relationships between the classes, navigate to Explore -> Class relationships. You can see a complicated diagram showing only the top relationships, where each of them is a bundle of links between the individual instances of two classes. Each link is an RDF statement where the subject is an instance of one class, the object is an instance of another class, and the link is the predicate. Depending on the number of links between the instances of two classes, the bundle can be thicker or thinner and gets the color of the class with more incoming links. These links can be in both directions.

In the example below, you can see the relationships between the classes of the News sample dataset provided in the distribution folder. You can observe that the class with the biggest number of links (the thickest bundle) is pub-old:Document.

_images/news-scenario-dependencies.png

To remove all classes, use the X icon.

_images/news-scenario-remove-all.png

To control which classes to display in the diagram, use the add/remove icon next to each class.

_images/news-scenario-add-class.png

To see how many annotations (mentions) there are in the documents, click on the blue bundle representing the relationship between the classes pub-old:Document and pub-old:TextMention. The tooltip shows that there are 6197 annotations linked by the pub-old:containsMention predicate.

_images/news-scenario-class-document.png

To see how many of these annotations are about people, click on the light purple bundle representing the relationship between the classes pub-old:TextMention and pub:Person. The tooltip shows that 274 annotations are about people linked by the pub-old:hasInstance predicate.

_images/news-scenario-class-person.png

Query your data

Query data through the Workbench

Hint

SPARQL is an SQL-like query language for RDF graph databases with the following query and update types:

  • SELECT - returns tabular results;
  • CONSTRUCT - creates a new RDF graph based on query results;
  • ASK - returns “YES”, if the query has a solution, otherwise “NO”;
  • DESCRIBE - returns RDF data about a resource; useful when you do not know the RDF data structure in the data source;
  • INSERT - inserts triples into a graph;
  • DELETE - deletes triples from a graph.

For more information, see the Additional resources section.

Now it’s time to delve into your data. The following is one possible scenario for searching in it.

  1. Select the repository you want to work with, in this example News, and click the SPARQL menu tab.

  2. Let’s say you are interested in people. Find all people mentioned in the documents from this news articles dataset.

    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    select distinct ?x ?Person where {
        ?x a pub:Person .
        ?x pub:preferredLabel ?Person .
        ?doc pub-old:containsMention / pub-old:hasInstance ?x .
    }
    
    _images/news-scenario-all-people.png
  3. Run a query to calculate the RDF rank of the instances based on their interconnectedness.

    PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
    INSERT DATA { _:b1 rank:compute _:b2. }
    
  4. Find all people mentioned in the documents, ordered by popularity in the repository.

    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
    select distinct ?x ?PersonLabel ?rank where {
        ?x a pub:Person .
        ?x pub:preferredLabel ?PersonLabel .
        ?doc pub-old:containsMention / pub-old:hasInstance ?x .
        ?x rank:hasRDFRank ?rank .
    } ORDER by DESC (?rank)
    
    _images/news-scenario-ordred-by-popularity.png
  5. Find all people who are mentioned together with their political parties.

    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    select distinct ?personLabel ?partyLabel where {
        ?document pub-old:containsMention ?mention .
        ?mention pub-old:hasInstance ?person .
        ?person pub:preferredLabel ?personLabel .
        ?person pub:memberOfPoliticalParty ?party .
        ?party pub:hasValue ?value .
        ?value pub:preferredLabel ?partyLabel .
    }
    
    _images/news-scenario-people-and-political-parites.png
  6. Did you know that Marlon Brando was from the Democratic Party? Find what other mentions occur together with Marlon Brando in the given news article.

    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    select distinct ?Mentions where {
        <http://www.reuters.com/article/2014/10/06/us-art-auction-idUSKCN0HV21B20141006> pub-old:containsMention / pub-old:hasInstance ?x .
        ?x pub:preferredLabel ?Mentions .
    }
    
    _images/news-scenario-MB-and-other-mentions.png
  7. Find everything available about Marlon Brando in the database.

    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    select distinct ?p ?objectLabel where {
        <http://ontology.ontotext.com/resource/tsk78dfdet4w> ?p ?o .
        {
            ?o pub:hasValue ?value .
            ?value pub:preferredLabel ?objectLabel .
        } union {
            ?o pub:hasValue ?objectLabel .
            filter (isLiteral(?objectLabel)) .
        }
    }
    
    _images/news-scenario-MB-data.png
  8. Find all documents that mention members of the Democratic Party and the names of these people.

    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    select distinct ?document ?personLabel where {
        ?document pub-old:containsMention ?mention .
        ?mention pub-old:hasInstance ?person .
        ?person pub:preferredLabel ?personLabel .
        ?person pub:memberOfPoliticalParty ?party .
        ?party pub:hasValue ?value .
        ?value pub:preferredLabel "Democratic Party"@en .
    }
    
    _images/news-scenario-all-DP-members-and-names.png
  9. Find when these people were born and died.

    PREFIX pub-old: <http://ontology.ontotext.com/publishing#>
    PREFIX pub: <http://ontology.ontotext.com/taxonomy/>
    select distinct ?person ?personLabel ?dateOfbirth ?dateOfDeath where {
        ?document pub-old:containsMention / pub-old:hasInstance ?person .
        ?person pub:preferredLabel ?personLabel .
        OPTIONAL {
           ?person pub:dateOfBirth / pub:hasValue ?dateOfbirth .
        }
        OPTIONAL {
           ?person pub:dateOfDeath / pub:hasValue ?dateOfDeath .
        }
        ?person pub:memberOfPoliticalParty / pub:hasValue / pub:preferredLabel "Democratic Party"@en .
    } order by ?dateOfbirth
    
    _images/news-scenario-dob-dod.png

Tip

You can play with more example queries from the Example_queries.rtf file provided in the distribution folder.

Note

GraphDB also features an Autocomplete index, which offers suggestions for the local names of URIs in the SPARQL editor and the View resource page.

Query data programmatically

SPARQL is not only a standard query language, but also a protocol for communicating with RDF databases. GraphDB stays compliant with the protocol specification and allows querying data with standard HTTP requests.

Execute the example query with an HTTP GET request:

curl -G -H "Accept:application/x-trig" \
  -d query=CONSTRUCT+%7B%3Fs+%3Fp+%3Fo%7D+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10 \
  http://localhost:7200/repositories/yourrepository

Execute the example query with a POST operation:

curl -X POST --data-binary @file.sparql -H "Accept: application/rdf+xml" \
  -H "Content-type: application/x-www-form-urlencoded" \
  http://localhost:7200/repositories/worker-node

where file.sparql contains a URL-encoded query:

query=CONSTRUCT+%7B%3Fs+%3Fp+%3Fo%7D+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10
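Encoding the query by hand is error-prone, so it may be easier to let cURL do it with its --data-urlencode option. The sketch below is one possible wrapper for SELECT queries; the server URL, the repository ID myrepo, and the function name sparql_select are assumptions made for this example.

```shell
# Assumed values: default local server and the repository ID from this guide.
GRAPHDB_URL="${GRAPHDB_URL:-http://localhost:7200}"
REPOSITORY_ID="${REPOSITORY_ID:-myrepo}"

# Run a SELECT query ($1) with an HTTP GET and print the results as JSON.
# curl's --data-urlencode percent-encodes the query text, and -G appends
# the encoded parameter to the URL instead of sending it as a request body.
sparql_select() {
    curl --fail -G \
         -H "Accept: application/sparql-results+json" \
         --data-urlencode "query=$1" \
         "$GRAPHDB_URL/repositories/$REPOSITORY_ID"
}
```

For example, sparql_select 'SELECT * WHERE { ?s ?p ?o } LIMIT 10' sends the query as plain text on the command line, with no manual encoding.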

Tip

For more information on how to interact with GraphDB APIs, refer to the RDF4J and SPARQL protocols or the Linked Data Platform specifications.