Loading data using the Workbench

The Workbench Import view hosts many of the interfaces used for loading RDF data into your repository, such as loading data:

  • from local files;

  • from a remote URL (with a format extension or by specifying the data format);

  • by pasting the RDF data in the Text area tab;

  • from files on the server where the Workbench is located.

Screenshot: the GraphDB Import view with two tabs, User data and Server files, showing the options to upload RDF files, get RDF data from a URL, and import an RDF text snippet; a list of uploaded files (e.g., 'statements.jsonld', 'publishing-ontology.ttl'), each with an Import button; and a Name filter for the already uploaded files.

In addition to that, RDF data can also be loaded with an INSERT query from the SPARQL editor.

All import methods support asynchronous execution of the import tasks, with the exception of the text area import, which is intended for very fast and simple imports. Currently, only one import task of a given type is executed at a time, while the others wait in the queue as pending.

Note

When importing data into a local repository, the Workbench performs the parsing itself, so GraphDB supports interrupting the import and applying additional settings. When importing data into a remote location, the data is sent to the remote endpoint, and the parsing and loading are performed there.

There are two tabs inside the view — User data, for RDF data provided by the user, and Server files, for files that are already hosted on the server. Both tabs provide a file name filter to narrow down the list of available files.

Note

When importing data from local files, from a remote URL, or by pasting a text snippet, the files are first uploaded to the server. Once uploaded, they can be imported into the repository.

Importing local RDF files

Go to Import ‣ User data ‣ Upload RDF files.

This option allows you to select, configure, and import data from various RDF formats.

Note

This method is limited to files of a certain maximum size. The default is 200 megabytes, and is controlled by the graphdb.workbench.maxUploadSize property. The value is in bytes (e.g., -Dgraphdb.workbench.maxUploadSize=20971520 sets a 20 MB limit).
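For example, the limit can be raised to 500 megabytes by setting the property in the GraphDB properties file (a minimal sketch; 524288000 bytes = 500 MB):

    # Raises the Workbench upload limit to 500 MB (value in bytes)
    graphdb.workbench.maxUploadSize = 524288000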

Loading data from your local machine directly streams the file to the RDF4J statements endpoint (see the sketch after the steps below):

  1. Click the button to browse files for uploading.

  2. When the files appear in the table, either import a file by clicking Import on its line, or select multiple files and click Import from the header.

  3. The import settings modal appears, allowing you to adjust the settings before the import starts.
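For illustration, the same statements endpoint can be called directly over HTTP, which is roughly what the Workbench upload does behind the scenes. The following is a minimal sketch in Python, assuming a local GraphDB instance at http://localhost:7200, a hypothetical repository named myrepo, and a Turtle file:

    import requests

    GRAPHDB = "http://localhost:7200"  # assumed local instance
    REPO = "myrepo"                    # hypothetical repository ID

    # POST the file to the RDF4J statements endpoint; passing the open file
    # object streams the upload instead of buffering it in memory.
    with open("statements.ttl", "rb") as rdf_file:
        response = requests.post(
            f"{GRAPHDB}/repositories/{REPO}/statements",
            data=rdf_file,
            headers={"Content-Type": "text/turtle"},  # must match the RDF format
        )
    response.raise_for_status()  # a successful import returns 204 No Content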

Importing remote RDF files

Go to Import ‣ User data ‣ Get RDF data from a URL.

You can import from a URL that returns RDF data; any endpoint that returns RDF data can be used.

If the URL has an extension, it is used to detect the correct data type (e.g., http://linkedlifedata.com/resource/umls-concept/C0024117.rdf). Otherwise, you have to provide the Data Format parameter, which is sent as an Accept header to the endpoint and then passed to the import loader.
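To illustrate the content negotiation involved, this is roughly the equivalent request when the Data Format parameter is set to RDF/XML (a Python sketch; the Workbench's internal mechanics may differ):

    import requests

    # The chosen Data Format is sent as an Accept header so that the remote
    # endpoint can return the requested RDF serialization.
    response = requests.get(
        "http://linkedlifedata.com/resource/umls-concept/C0024117",
        headers={"Accept": "application/rdf+xml"},
    )
    print(response.headers.get("Content-Type"))  # serialization actually returned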

Importing RDF data from a text snippet

Go to Import ‣ User data ‣ Import RDF text snippet.

You can import data by typing or pasting it directly in the text area. This is functionally identical to uploading a small RDF file and importing it.

Importing server files

Go to Import ‣ Server files.

The server files import allows you to load files of arbitrary sizes. Its limitation is that the files must be placed in a specific directory (symbolic links are supported). By default, this is $user.home/graphdb-import/, which you need to create beforehand.

If you want to tweak the directory location, see the graphdb.workbench.importDirectory system property. The directory is scanned recursively and all files with a semantic MIME type are visible in the Server files tab.
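For example, the Workbench can be pointed at a custom import directory by setting the property in the GraphDB properties file (a sketch; the path is hypothetical):

    # Overrides the default $user.home/graphdb-import/ location
    graphdb.workbench.importDirectory = /data/graphdb-import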

Import settings

The settings for each import are saved so that you can reuse them if you re-import a file. You can see them in the dialog that opens after you have uploaded a document and pressed Import:

  • Base IRI — Specifies the base IRI against which to resolve any relative IRIs found in the uploaded data (e.g., with a base IRI of http://example.com/, the relative IRI <book1> resolves to http://example.com/book1). When the data does not contain relative IRIs, this field may be left empty.

  • Target graphs — When specified, imports the data into one or more graphs. Some RDF formats can specify graphs, while others cannot; the latter are treated as if they specify the default graph.

    • From data — Imports data into the graph(s) specified by the data source.

    • The default graph — Imports all data into the default graph.

    • Named graph — Imports everything into a user-specified named graph.

  • Enable replacement of existing data — Enable this to replace the data in one or more graphs with the imported data. When enabled:

    • Replaced graph(s) — All specified graphs will be cleared before the import is run. If a graph IRI ends in *, it is treated as a prefix matching all named graphs that start with that prefix (excluding the * itself); for example, http://example.com/graphs/* matches every named graph whose IRI starts with http://example.com/graphs/. This option provides the most flexibility when the target graphs are determined from the data.

    • I understand that data in the replaced graphs will be cleared before importing new data — This option must be checked when data replacement is enabled.

  • JSON-LD context — Only visible when importing JSON-LD or NDJSON-LD files. Specifies an external context as a URL. Only whitelisted URLs can be used.

    Tip

    The whitelist is a comma-separated list of URLs. The wildcard (*) allows for fine-grained control, enabling administrators to specify a set of URLs, including entire directories. Each entry in the list represents a source that is considered safe for JSON-LD and NDJSON-LD operations.

    # Sets whitelist for JSON-LD resources
    graphdb.jsonld.whitelist = https://example.com/my/jsonld/*, file:///usr/local/my/jsonld/*
    

Advanced settings:

  • Preserve BNode IDs: when enabled, uses the blank node IDs found in the file; when disabled, GraphDB assigns its own internal blank node identifiers.

  • Fail parsing if datatypes are not recognized: determines whether to fail parsing if datatypes are unknown.

  • Verify recognized datatypes: verifies that the values of the datatype properties in the file are valid.

  • Normalize recognized datatypes values: indicates whether the values of recognized datatypes should be normalized.

  • Fail parsing if languages are not recognized: determines whether to fail parsing if languages are unknown.

  • Verify language based on a given set of definitions for valid languages: determines whether language tags are verified.

  • Normalize recognized language tags: indicates whether language tags should be normalized to a canonical form.

  • Should stop on error: determines whether the import stops on non-fatal errors or ignores them.

  • Force serial pipeline: enforces the use of the serial pipeline when importing data.

Note

Import without changing settings imports the selected files or folders using their saved settings, or the default ones if none have been saved.

Import data with an INSERT query using the SPARQL endpoint

You can also insert triples into a graph with an INSERT query in the SPARQL editor.
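For example, the following query inserts two statements into a named graph (all IRIs here are hypothetical placeholders):

    # Inserts two statements into a user-specified named graph
    INSERT DATA {
      GRAPH <http://example.com/graphs/books> {
        <http://example.com/book/1> a <http://example.com/Book> ;
                                    <http://example.com/title> "GraphDB Basics" .
      }
    }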