Loading and Updating Data
Interfaces for Loading RDF Data
GraphDB exposes multiple interfaces for loading RDF data.
| Interface | Use cases | Mode | Speed |
|---|---|---|---|
| SPARQL endpoint | No limits on the file size | Online parallel | Moderate speed |
| Workbench import of a text snippet | Small text snippets | Online parallel | Moderate speed |
| Workbench import of a local or a remote RDF file | Small files, limited to 200 MB | Online parallel | Moderate speed |
| Workbench import of a server file | No limits on the file size | Online parallel | Fast, ignoring all HTTP protocol overhead |
| ImportRDF Load | Batch import of very big files | Initial offline import with no plugins | Fast, with small speed degradation |
| ImportRDF Preload | Import huge datasets with no inference | Initial offline import with no inference and no plugins | Ultra-fast, without speed degradation |
All interfaces support multiple RDF formats.
Tip
It’s often useful to use GraphDB’s rdfvalidator command line utility to check that an RDF file parses properly before attempting to load it.
Updating data in GraphDB is done via smart updates using server-side SPARQL templates.
GraphDB supports SHACL validation, which provides efficient data consistency checking.
The GraphDB change tracking plugin allows you to track changes within the context of a transaction identified by a unique ID.
The GraphDB sequences plugin provides transactional sequences for GraphDB. A sequence is a long counter that can be atomically incremented in a transaction to provide incremental IDs.
Loading via HTTP with curl
Using curl lets you script these calls in an application. The GraphDB Workbench also includes a view where you will find a complete reference of all REST APIs and can run API calls directly from the browser. In addition to this, the RDF4J API is also available.
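As an illustration of the RDF4J API mentioned above, the following minimal sketch builds (but does not send) a request that adds Turtle data through the RDF4J REST protocol's statements endpoint. The base URL `http://localhost:7200`, the repository ID `my_repo`, the example triple, and the helper name `build_statements_request` are all assumptions made for the example, not values taken from this page.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_statements_request(base_url, repo_id, turtle, graph_iri=None):
    """Build a POST request that adds Turtle data to a repository.

    Hypothetical helper: it targets <base_url>/repositories/<repo_id>/statements,
    the statements endpoint of the RDF4J REST protocol.
    """
    url = f"{base_url.rstrip('/')}/repositories/{quote(repo_id, safe='')}/statements"
    if graph_iri is not None:
        # The RDF4J protocol expects the context as an N-Triples encoded IRI.
        url += "?" + urlencode({"context": f"<{graph_iri}>"})
    return Request(
        url,
        data=turtle.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )


# Assumed values, for illustration only.
req = build_statements_request(
    "http://localhost:7200",
    "my_repo",
    '<http://example.org/a> <http://example.org/b> "c" .',
    graph_iri="http://example.org/g",
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would send it. That step is left out so the sketch runs without a server.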
Most data import queries can either take the following set of attributes as an argument or return them as a response.
fileNames (string list): A list of filenames that are to be imported.
importSettings (JSON object): Import settings.
baseURI (string): Base URI for the files to be imported.
context (string): Context for the files to be imported.
data (string): Inline data.
forceSerial (boolean): Force use of the serial statements pipeline.
name (string): Filename.
status (string): Status of an import - pending, importing, done, error, none, interrupting.
timestamp (integer): When was the import started.
type (string): The type of the import.
replaceGraphs (string list): A list of graphs that you want to be completely replaced by the import.
parserSettings (JSON object): Parser settings.
failOnUnknownDataTypes (boolean): Fail parsing if datatypes are not recognized.
failOnUnknownLanguageTags (boolean): Fail parsing if languages are not recognized.
normalizeDataTypeValues (boolean): Normalize recognized datatypes values.
normalizeLanguageTags (boolean): Normalize recognized language tags.
preserveBNodeIds (boolean): Use blank node IDs found in the file instead of assigning them.
stopOnError (boolean): Stop on error. If false, the error will be logged and parsing will continue.
verifyDataTypeValues (boolean): Verify recognized datatypes.
verifyLanguageTags (boolean): Verify language based on a given set of definitions for valid languages.
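Because the parser flags are all booleans with fixed names, a client can catch typos before any request is sent. The sketch below is one possible way to do that. The helper `make_parser_settings` is hypothetical, and the flag names are copied from the attribute list above rather than checked against a running server.

```python
import json

# Flag names as documented in the attribute list above.
PARSER_FLAGS = {
    "failOnUnknownDataTypes",
    "failOnUnknownLanguageTags",
    "normalizeDataTypeValues",
    "normalizeLanguageTags",
    "preserveBNodeIds",
    "stopOnError",
    "verifyDataTypeValues",
    "verifyLanguageTags",
}


def make_parser_settings(**flags):
    """Return a parserSettings dict, rejecting unknown names and non-boolean values."""
    unknown = set(flags) - PARSER_FLAGS
    if unknown:
        raise ValueError(f"unknown parser settings: {sorted(unknown)}")
    for name, value in flags.items():
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a boolean, got {type(value).__name__}")
    return dict(flags)


# Log errors and keep parsing, and keep the blank node IDs found in the file.
settings = make_parser_settings(stopOnError=False, preserveBNodeIds=True)
print(json.dumps(settings, indent=2, sort_keys=True))
```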
Cancel server file import operation
DELETE /rest/repositories/<repo_id>/import/server
Example:
curl -X DELETE '<base_url>/rest/repositories/<repo_id>/import/server?name=<encoded_filepath>'
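The `name` query parameter has to be encoded before it is placed in the URL. A minimal sketch of building that URL follows. It assumes plain percent-encoding of every reserved character (including `/`) is what the server expects, and the base URL, repository ID, and file path are made up for the example.

```python
from urllib.parse import quote


def cancel_import_url(base_url, repo_id, filepath):
    """Build the URL for cancelling the import of one server file.

    Hypothetical helper: percent-encodes the file path for the 'name' parameter.
    """
    return (
        f"{base_url.rstrip('/')}/rest/repositories/{quote(repo_id, safe='')}"
        f"/import/server?name={quote(filepath, safe='')}"
    )


# Assumed values, for illustration only.
url = cancel_import_url("http://localhost:7200", "my_repo", "data/my file.ttl")
print(url)
```

The resulting string can be passed to `curl -X DELETE` inside single quotes, so that the shell does not interpret `?` or `&`.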
Get server files available for import
GET /rest/repositories/<repo_id>/import/server
Example:
curl <base_url>/rest/repositories/<repo_id>/import/server
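A script will usually want to know which of the listed files are still being processed. The sketch below filters a response by the documented `status` values. It assumes the response is a JSON array of objects that carry `name` and `status` fields. The sample text is hand-written for the example and is not real server output.

```python
import json

# Status values (from the attribute list) that mean an import is not finished yet.
ACTIVE_STATUSES = {"pending", "importing", "interrupting"}


def active_imports(response_text):
    """Return the names of entries whose import has not finished yet."""
    entries = json.loads(response_text)
    return [e["name"] for e in entries if e.get("status") in ACTIVE_STATUSES]


# Hand-written sample in the assumed response shape.
sample = """[
  {"name": "a.ttl", "status": "done"},
  {"name": "b.ttl", "status": "importing"},
  {"name": "c.ttl", "status": "none"}
]"""
print(active_imports(sample))
```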
Import a server file into the repository
POST /rest/repositories/<repo_id>/import/server
Example:
curl -X POST --header 'Content-Type: application/json' -d '{
  "fileNames": [
    "<data_url>",
    "<data_url>"
  ]
}' <base_url>/rest/repositories/<repo_id>/import/server
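The same call can be prepared from application code. The following sketch mirrors the curl example by building the POST request with a `fileNames` body, without sending it. The base URL, repository ID, file names, and the helper name `build_server_import_request` are assumptions for illustration.

```python
import json
from urllib.request import Request


def build_server_import_request(base_url, repo_id, file_names):
    """Build a POST request asking the server to import the given server files."""
    url = f"{base_url.rstrip('/')}/rest/repositories/{repo_id}/import/server"
    body = json.dumps({"fileNames": list(file_names)}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed values, for illustration only.
import_req = build_server_import_request(
    "http://localhost:7200", "my_repo", ["dataset-1.ttl", "dataset-2.ttl"]
)
print(import_req.get_method(), import_req.full_url)
print(import_req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen(import_req)` would submit it. Serializing the body with `json.dumps` rather than hand-written strings avoids quoting mistakes in file names.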
Tip
Common parameters:
<base_url>
: The URL host and path leading to the deployed GraphDB Workbench webapp;
<repo_id>
: The id string with which the current repository can be referred to;
<encoded_filepath>
: Encoded filepath leading to a server file that is in the process of being imported.