Loading and Updating Data

Interfaces for Loading RDF Data

GraphDB exposes multiple interfaces for loading RDF data.

GraphDB’s data loading interfaces

| Interface | Use cases | Mode | Speed |
|---|---|---|---|
| SPARQL endpoint | No limits on the file size | Online parallel | Moderate speed |
| Workbench import of a text snippet | Small text snippets | Online parallel | Moderate speed |
| Workbench import of a local or a remote RDF file | Small files limited up to 200 MB | Online parallel | Moderate speed |
| Workbench import of a server file | No limits on the file size | Online parallel | Fast, ignoring all HTTP protocol overheads |
| ImportRDF Load | Batch import of very big files | Initial offline import with no plugins | Fast, with small speed degradation |
| ImportRDF Preload | Import huge datasets with no inference | Initial offline import with no inference and plugins | Ultra-fast without speed degradation |

All interfaces support multiple RDF formats.

Tip

It’s often useful to use GraphDB’s rdfvalidator command line utility to check that an RDF file parses properly before attempting to load it.

Updating data in GraphDB is done via smart updates using server-side SPARQL templates.

GraphDB supports SHACL validation for efficient data consistency checking.

The GraphDB change tracking plugin allows you to track changes within the context of a transaction identified by a unique ID.

The GraphDB sequences plugin provides transactional sequences for GraphDB. A sequence is a long counter that can be atomically incremented in a transaction to provide incremental IDs.

Loading via HTTP with curl

Using curl lets you script the import calls described in this section from an application. See also the Help ‣ REST API view of the GraphDB Workbench, where you will find a complete reference of all REST APIs and can run API calls directly from the browser.

The RDF4J API is also available.

Most data import requests either accept the following attributes in the request body or return them in the response.

  • fileNames (string list): A list of filenames that are to be imported.

  • importSettings (JSON object): Import settings.

    • baseURI (string): Base URI for the files to be imported.

    • context (string): Context for the files to be imported.

    • data (string): Inline data.

    • forceSerial (boolean): Force use of the serial statements pipeline.

    • name (string): Filename.

    • status (string): Status of an import. One of pending, importing, done, error, none, interrupting.

    • timestamp (integer): Time at which the import was started.

    • type (string): The type of the import.

    • replaceGraphs (string list): A list of graphs that you want to be completely replaced by the import.

    • parserSettings (JSON object): Parser settings.

      • failOnUnknownDataTypes (boolean): Fail parsing if datatypes are not recognized.

      • failOnUnknownLanguageTags (boolean): Fail parsing if languages are not recognized.

      • normalizeDataTypeValues (boolean): Normalize the values of recognized datatypes.

      • normalizeLanguageTags (boolean): Normalize recognized language tags.

      • preserveBNodeIds (boolean): Use blank node IDs found in the file instead of assigning new ones.

      • stopOnError (boolean): Stop on error. If false, the error will be logged and parsing will continue.

      • verifyDataTypeValues (boolean): Verify recognized datatypes.

      • verifyLanguageTags (boolean): Verify language based on a given set of definitions for valid languages.
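To illustrate how these attributes nest, here is a minimal sketch of a shell helper that prints a request body combining fileNames with a few importSettings. The helper name import_payload, the file name, and the graph IRI are hypothetical, the chosen setting values are only examples, and the sketch assumes the two top-level attributes sit side by side in one JSON object, as the list above suggests.

```shell
# Print a JSON request body for an import.
# Hypothetical helper: $1 = file name, $2 = context (named graph) for the data.
# Assumption: neither argument contains characters that need JSON escaping.
import_payload() {
  cat <<EOF
{
  "fileNames": ["$1"],
  "importSettings": {
    "context": "$2",
    "forceSerial": false,
    "parserSettings": {
      "stopOnError": true,
      "preserveBNodeIds": false
    }
  }
}
EOF
}
```

Calling import_payload data.ttl http://example.org/graph prints a body that could be passed to curl with -d. Only the attributes you want to set need to appear in such a body.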

Cancel server file import operation

DELETE /rest/repositories/<repo_id>/import/server

Example:

curl -X DELETE '<base_url>/rest/repositories/<repo_id>/import/server?name=<encoded_filepath>'
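The <encoded_filepath> value has to be percent-encoded before it goes into the query string. The following sketch shows one way to do that in plain shell. The function names urlencode and cancel_url are hypothetical, and the encoder is meant for ASCII paths only; non-ASCII file names may need a more robust encoder.

```shell
# Percent-encode a string for use as a query parameter value.
# Unreserved characters (letters, digits, "-", ".", "_", "~") are kept;
# every other character is written as %XX. Intended for ASCII input.
# The body runs in a subshell "( ... )" so its variables stay local.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out"
)

# Build the URL for cancelling a server file import.
# $1 = base URL, $2 = repository ID, $3 = unencoded file path.
cancel_url() {
  printf '%s/rest/repositories/%s/import/server?name=%s\n' \
    "$1" "$2" "$(urlencode "$3")"
}
```

With these helpers, a cancel request could be issued as curl -X DELETE "$(cancel_url <base_url> <repo_id> 'my data/file 1.ttl')", which sends name=my%20data%2Ffile%201.ttl.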

Get server files available for import

GET /rest/repositories/<repo_id>/import/server

Example:

curl <base_url>/rest/repositories/<repo_id>/import/server

Import a server file into the repository

POST /rest/repositories/<repo_id>/import/server

Example:

curl -X POST --header 'Content-Type: application/json' -d '{
  "fileNames": [
    "<data_url>",
    "<data_url>"
  ]
}' <base_url>/rest/repositories/<repo_id>/import/server
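When the list of files is not known in advance, the request body can be generated rather than typed by hand. The sketch below builds the fileNames body from command-line arguments and wraps the same POST call shown above. The function names and the GRAPHDB_URL variable are hypothetical, and the fallback http://localhost:7200 is an assumption about a local installation.

```shell
# Print a {"fileNames": [...]} JSON body built from the arguments.
# Assumption: names contain no double quotes or backslashes,
# since no JSON escaping is performed.
# The body runs in a subshell "( ... )" so its variables stay local.
file_names_body() (
  list=
  for name in "$@"; do
    # Add ", " before every entry except the first one.
    list="$list${list:+, }\"$name\""
  done
  printf '{"fileNames": [%s]}\n' "$list"
)

# Request the import of one or more server files.
# $1 = repository ID, remaining arguments = file names.
# GRAPHDB_URL is a hypothetical variable standing in for <base_url>.
import_server_files() {
  repo=$1
  shift
  curl -X POST --header 'Content-Type: application/json' \
    -d "$(file_names_body "$@")" \
    "${GRAPHDB_URL:-http://localhost:7200}/rest/repositories/$repo/import/server"
}
```

For example, import_server_files myrepo a.ttl b.ttl would send the body {"fileNames": ["a.ttl", "b.ttl"]} to the import endpoint of the repository myrepo.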

Tip

Common parameters:

<base_url>: The URL host and path leading to the deployed GraphDB Workbench webapp;

<repo_id>: The ID string that identifies the target repository;

<encoded_filepath>: Encoded filepath leading to a server file that is in the process of being imported.