Loading data using the Workbench

There are several ways of importing data:

  • from local files;

  • from files on the server where the Workbench is located;

  • from a remote URL (with a format extension or by specifying the data format);

  • by pasting the RDF data in the Text area tab;

  • from a SPARQL CONSTRUCT query directly.

All import methods run their import tasks asynchronously, except for the text area import, which is intended for quick and simple imports.

Note

Currently, only one import task of each type is executed at a time, while the others wait in the queue as pending.

Note

For local repositories, imports can be interrupted and configured with additional settings, because the parsing is done by the Workbench. When the location is remote, the data is simply sent to the remote endpoint, and the parsing and loading are performed there.

If you have many files, a file name filter is available to narrow the list down.

Import settings

The settings for each import are saved so that you can reuse them when you re-import a file. You can see them in the dialog that opens after you upload a document and click Import:

  • Base IRI - specifies the base IRI against which to resolve any relative IRIs found in the uploaded data. When data does not contain relative IRIs, this field may be left empty.

  • Target graphs - when specified, imports the data into one or more graphs. Some RDF formats may specify graphs, while others do not support that. The latter are treated as if they specify the default graph.

    • From data - Imports data into the graph(s) specified by the data source.

    • The default graph - Imports all data into the default graph.

    • Named graph - Imports everything into a user-specified named graph.

  • Enable replacement of existing data - Enable this to replace the data in one or more graphs with the imported data. When enabled:

    • Replaced graph(s) - All specified graphs will be cleared before the import is run. If a graph name ends in *, it is treated as a prefix that matches all named graphs starting with the text before the *. This option provides the most flexibility when the target graphs are determined from data.

    • I understand that data in the replaced graphs will be cleared before importing new data - this option must be checked when the data replacement is enabled.
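The Base IRI and Replaced graph(s) settings above can be illustrated with a short Python sketch. It is not the Workbench's own code: the function names resolve_iri and graphs_to_clear and all example.org IRIs are hypothetical, and the matching logic simply follows the description above (a trailing * means prefix match, anything else is an exact match).

```python
from urllib.parse import urljoin


def resolve_iri(base_iri: str, iri: str) -> str:
    """Resolve a possibly relative IRI against the base IRI.

    Absolute IRIs are returned unchanged.
    """
    return urljoin(base_iri, iri)


def graphs_to_clear(patterns, existing_graphs):
    """Return the named graphs selected by the 'Replaced graph(s)' patterns.

    A pattern ending in '*' matches every graph that starts with the text
    before the '*'; any other pattern must match the graph name exactly.
    """
    selected = []
    for graph in existing_graphs:
        for pattern in patterns:
            if pattern.endswith("*"):
                matched = graph.startswith(pattern[:-1])
            else:
                matched = graph == pattern
            if matched:
                selected.append(graph)
                break  # one matching pattern is enough
    return selected


# Hypothetical named graphs already present in a repository.
existing = [
    "http://example.org/graph/2023",
    "http://example.org/graph/2024",
    "http://example.org/other",
]

print(resolve_iri("http://example.org/data/", "people/alice"))
print(graphs_to_clear(["http://example.org/graph/*"], existing))
```

With these inputs, the relative IRI resolves under http://example.org/data/, and the wildcard pattern selects the two graphs under http://example.org/graph/ while leaving http://example.org/other untouched.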

Advanced settings:

  • Preserve BNode IDs: determines whether the import assigns its own internal blank node identifiers or uses the blank node IDs it finds in the file.

  • Fail parsing if datatypes are not recognized: determines whether to fail parsing if datatypes are unknown.

  • Verify recognized datatypes: verifies that the values of the datatype properties in the file are valid.

  • Normalize recognized datatypes values: indicates whether the values of recognized datatypes are normalized.

  • Fail parsing if languages are not recognized: determines whether to fail parsing if languages are unknown.

  • Verify language based on a given set of definitions for valid languages: determines whether language tags are to be verified.

  • Normalize recognized language tags: indicates whether language tags are normalized, and to which format they should be normalized.

  • Should stop on error: determines whether non-fatal errors stop the import or are ignored.

  • Force serial pipeline: enforces the use of the serial pipeline when importing data.
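To make the difference between verifying and normalizing a recognized datatype concrete, here is a minimal Python sketch for xsd:integer literals. It only illustrates the idea and is not how the Workbench or its parser implements these settings; the function names are hypothetical.

```python
import re

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Lexical form of xsd:integer: optional sign followed by ASCII digits.
_INTEGER_PATTERN = re.compile(r"[+-]?[0-9]+")


def is_valid_integer(lexical: str) -> bool:
    """'Verify': check that the literal is a valid xsd:integer lexical form."""
    return _INTEGER_PATTERN.fullmatch(lexical) is not None


def normalize_integer(lexical: str) -> str:
    """'Normalize': rewrite a valid literal into its canonical form.

    The canonical form has no leading '+', no leading zeros, and zero is '0'.
    """
    if not is_valid_integer(lexical):
        raise ValueError(f"not a valid {XSD_INTEGER} literal: {lexical!r}")
    return str(int(lexical))


print(is_valid_integer("12.5"))   # prints False
print(normalize_integer("+007"))  # prints 7
```

Verification alone would accept "+007" as is; normalization additionally rewrites it, so that "+007" and "7" end up as the same stored literal.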

Note

Import without changing settings imports the selected files or folders using their saved settings, or the default settings if none have been saved.

_images/import_settings.png

Importing local files

Go to Import ‣ RDF ‣ User data ‣ Upload RDF files.

This option allows you to select, configure, and import data in various formats.

Note

The limitation of this method is that it only supports files up to a certain size. The default limit is 200 megabytes, and it is controlled by the graphdb.workbench.maxUploadSize property. The value is given in bytes; for example, -Dgraphdb.workbench.maxUploadSize=20971520 sets the limit to 20 megabytes.
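Because the property takes a value in bytes, it is easy to be off by a factor of ten. The following Python sketch converts a size in megabytes (counted as 1024 × 1024 bytes, which is an assumption about how the limit is measured) into the corresponding JVM option. The helper name max_upload_size_flag is hypothetical.

```python
def max_upload_size_flag(megabytes: int) -> str:
    """Build the JVM option for a given upload limit in megabytes.

    Assumes 1 megabyte = 1024 * 1024 bytes.
    """
    size_in_bytes = megabytes * 1024 * 1024
    return f"-Dgraphdb.workbench.maxUploadSize={size_in_bytes}"


print(max_upload_size_flag(20))  # prints -Dgraphdb.workbench.maxUploadSize=20971520
```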

Loading data from your local machine streams the file directly to the RDF4J statements endpoint:

  1. Click the icon to browse files for uploading.

  2. When the files appear in the table, either import a file by clicking Import on its line, or select multiple files and click Import from the header.

  3. The import settings modal appears so that you can adjust the settings if needed.
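For scripted use, a similar upload can be sketched outside the Workbench by sending the file content to the repository's statements endpoint of the RDF4J REST API. The sketch below only builds the HTTP request. The base URL http://localhost:7200, the repository ID my-repo, and the helper name build_upload_request are assumptions for illustration, and the payload is passed as bytes rather than streamed.

```python
import urllib.request


def build_upload_request(base_url: str, repository_id: str, data: bytes,
                         content_type: str = "text/turtle") -> urllib.request.Request:
    """Build a POST request that adds RDF data to a repository.

    The Content-Type header tells the server which RDF format the body uses.
    """
    url = f"{base_url.rstrip('/')}/repositories/{repository_id}/statements"
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": content_type},
    )


# Hypothetical server address, repository ID, and a tiny Turtle payload.
turtle = b"<http://example.org/a> <http://example.org/p> <http://example.org/b> .\n"
request = build_upload_request("http://localhost:7200", "my-repo", turtle)
print(request.get_method(), request.full_url)

# To actually send it (requires a running server):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it keeps the example runnable without a server and makes the target URL and headers easy to inspect.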

_images/import_local_file.png

Importing remote content

Go to Import ‣ RDF ‣ User data ‣ Get RDF data from a URL.

You can import RDF data from a URL. Any endpoint that returns RDF data can be used.

_images/import_remote_content.png

If the URL has an extension, it is used to detect the correct data type (e.g., http://linkedlifedata.com/resource/umls-concept/C0024117.rdf). Otherwise, you have to provide the Data Format parameter, which is sent as the Accept header to the endpoint and then passed to the import loader.
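The format selection described above can be sketched in Python as follows. The extension table and the function name resolve_data_format are illustrative assumptions, not the Workbench's actual lookup; the MIME types themselves are the standard ones for these RDF serializations.

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse

# A small, non-exhaustive table of common RDF file extensions.
RDF_MIME_TYPES = {
    ".rdf": "application/rdf+xml",
    ".ttl": "text/turtle",
    ".nt": "application/n-triples",
    ".nq": "application/n-quads",
    ".trig": "application/trig",
    ".jsonld": "application/ld+json",
}


def resolve_data_format(url: str, data_format: Optional[str] = None) -> str:
    """Pick the RDF MIME type for a remote import.

    The URL extension wins when it is recognized; otherwise the explicitly
    supplied data format is used (it would be sent as the Accept header).
    """
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    if extension in RDF_MIME_TYPES:
        return RDF_MIME_TYPES[extension]
    if data_format is None:
        raise ValueError("no recognizable extension: a data format is required")
    return data_format


print(resolve_data_format(
    "http://linkedlifedata.com/resource/umls-concept/C0024117.rdf"))
# prints application/rdf+xml
```

Only the path component of the URL is inspected, so query strings do not interfere with the extension check.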

Importing RDF data from a text snippet

Go to Import ‣ RDF ‣ User data ‣ Import RDF text snippet.

You can import data by typing or pasting it directly in the Text area control. This very simple text import sends the data to the Repository Statements Endpoint.

_images/import_text_area.png

Importing server files

Go to Import ‣ RDF ‣ Server files.

The server files import allows you to load files of arbitrary size. Its limitation is that the files must be placed in a specific directory (symbolic links are supported). By default, this directory is $user.home/graphdb-import/.

To change the directory location, set the graphdb.workbench.importDirectory system property. The directory is scanned recursively, and all files with a semantic MIME type are visible in the Server files tab.
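To preview which files such a recursive scan might pick up, a rough Python sketch is shown below. It approximates "semantic MIME type" with a hypothetical list of file extensions and follows symbolic links; the actual detection rules of the Workbench may differ, and the function name list_importable_files is an assumption.

```python
import os

# Assumed stand-in for "files with a semantic MIME type".
RDF_EXTENSIONS = {".rdf", ".ttl", ".nt", ".nq", ".trig", ".jsonld"}


def list_importable_files(import_dir: str) -> list:
    """Recursively list RDF-looking files, relative to the import directory."""
    found = []
    # followlinks=True so that symbolically linked directories are scanned too.
    for root, _dirs, files in os.walk(import_dir, followlinks=True):
        for name in files:
            if os.path.splitext(name)[1].lower() in RDF_EXTENSIONS:
                full_path = os.path.join(root, name)
                found.append(os.path.relpath(full_path, import_dir))
    return sorted(found)


# Approximation of the default location: the user's home directory
# plus "graphdb-import".
default_dir = os.path.join(os.path.expanduser("~"), "graphdb-import")
print(list_importable_files(default_dir))
```

If the directory does not exist, the function returns an empty list rather than raising an error.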

Importing data with an INSERT query

You can also insert triples into a graph with an INSERT query in the SPARQL editor.
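As an illustration, the Python sketch below holds a small INSERT DATA update and shows one way to submit it programmatically through the statements endpoint of the RDF4J REST API, using the form-encoded update parameter. The query text can equally be pasted into the SPARQL editor. The example.org IRIs, the base URL http://localhost:7200, the repository ID my-repo, and the helper name build_update_request are all hypothetical.

```python
import urllib.parse
import urllib.request

# A minimal SPARQL update that inserts one triple into a named graph.
INSERT_QUERY = """
PREFIX ex: <http://example.org/>
INSERT DATA {
  GRAPH ex:people {
    ex:alice ex:knows ex:bob .
  }
}
"""


def build_update_request(base_url: str, repository_id: str,
                         update: str) -> urllib.request.Request:
    """Build a POST request that submits a SPARQL update as a form parameter."""
    url = f"{base_url.rstrip('/')}/repositories/{repository_id}/statements"
    body = urllib.parse.urlencode({"update": update}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_update_request("http://localhost:7200", "my-repo", INSERT_QUERY)
print(request.get_method(), request.full_url)

# To actually run the update (requires a running server):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```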

_images/INSERT_query.png