Backing up and restoring a repository

Back up a repository

Repository backups allow users to revert a GraphDB repository to a previous state. The database offers two different approaches to copying the repository state.

  • Export the repository to an RDF file - this operation can run in parallel with read and write operations, but it takes more time to complete.
  • Copy the repository image directory to a backup - this is a much faster option, but in non-cluster setups it requires shutting down the database process.

Note

We recommend scheduling all repository backups during periods of low user activity.

Export repository to an RDF file

The repository export works without having to stop GraphDB. This operation usually takes longer than copying the files at the file-system level, because all explicit RDF statements must be serialized and transferred over HTTP. Updates made after the export operation starts are not included in the dump. Several interfaces are available for invoking the export operation:

Option 1: Export the repository with the GraphDB Workbench.

Export the database contents using the Workbench. To preserve the contexts (named graphs) when exporting/importing the whole database, use a context-aware RDF file format, e.g., TriG.

  1. Go to Explore/Graphs overview.
  2. Choose the graphs you want to export.
  3. Click Export graph as TriG.

Option 2: Export all statements with curl.

The repository's statements endpoint supports dumping all explicit statements (replace repositoryId with a valid repository name):

curl -X GET -H "Accept:application/x-trig" "http://localhost:7200/repositories/repositoryId/statements?infer=false" > export.trig

This method streams a snapshot of the database’s explicit statements into the export.trig file.
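For scheduled backups, the curl call above can be wrapped in a small script. The following is a minimal sketch, not an official GraphDB tool: the function names export_url and export_repo, the base URL argument, and the timestamped file name in the example call are assumptions made for illustration. It adds --fail so that an HTTP error yields a non-zero exit code rather than passing unnoticed.

```shell
#!/bin/sh
# Hypothetical helpers; names and arguments are illustrative assumptions.

# Build the statements URL for a repository (explicit statements only).
export_url() {
    base_url=$1
    repo_id=$2
    printf '%s/repositories/%s/statements?infer=false\n' "$base_url" "$repo_id"
}

# Stream the explicit statements of a repository into a TriG file.
# --fail makes curl exit non-zero on HTTP errors (e.g. unknown repository).
export_repo() {
    base_url=$1
    repo_id=$2
    out_file=$3
    curl --fail --silent --show-error \
        -H "Accept: application/x-trig" \
        -o "$out_file" \
        "$(export_url "$base_url" "$repo_id")"
}

# Example call (commented out so that sourcing the file has no side effects):
# export_repo "http://localhost:7200" "repositoryId" "export-$(date +%Y%m%d-%H%M%S).trig"
```

Checking the exit status of export_repo before rotating or deleting older backups avoids replacing a good backup with a failed one.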

Option 3: Export all statements using the RDF4J API.

The same operation can be executed from Java code by calling the RepositoryConnection.exportStatements() method with the includeInferred flag set to false (to return only the explicit statements).

Example:

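// Requires: java.io.FileOutputStream, java.io.OutputStream,
// org.eclipse.rdf4j.repository.RepositoryConnection, org.eclipse.rdf4j.rio.{RDFFormat, RDFWriter, Rio}
// try-with-resources closes the connection and the stream even if the export fails
try (RepositoryConnection connection = repository.getConnection();
     OutputStream outputStream = new FileOutputStream("export.nq")) {
    RDFWriter writer = Rio.createWriter(RDFFormat.NQUADS, outputStream);
    // includeInferred = false: export only the explicit statements
    connection.exportStatements(null, null, null, false, writer);
}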

The RDFWriter passed as the handler receives every explicit statement in the repository, and any of the RDF4J RDF writer implementations can be used to output the statements in the chosen format.

Note

If the data will be re-imported, we recommend the N-Quads format. Because it is line-based, the file can easily be broken into large ‘chunks’ that can be inserted and committed separately.
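As an illustration of such chunking, the sketch below splits an N-Quads export by line count using the standard split utility. It assumes the export contains one statement per line; the function name split_nquads, the chunk size, and the file prefix are hypothetical. Blank node labels may not refer to the same node across chunks that are imported separately, so this approach may need care for data with blank nodes.

```shell
#!/bin/sh
# Hypothetical helper: split an N-Quads export into files of at most
# lines_per_chunk statements each, named <prefix>aa, <prefix>ab, ...
split_nquads() {
    export_file=$1
    lines_per_chunk=$2
    prefix=$3
    split -l "$lines_per_chunk" "$export_file" "$prefix"
}

# Example call (commented out):
# split_nquads export.nq 1000000 export-chunk-
```

Each resulting file is itself a valid N-Quads document and can be imported and committed on its own.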

Back up GraphDB by copying the binary image

Note

This is the fastest method to back up a repository, but it requires stopping the database.

  1. Stop the GraphDB server.

  2. Manually copy the storage folders to the backup location.

    kill <pid-of-graphdb>
    sleep 10  # give the database time to stop; make sure the process has exited before copying
    cp -r {graphdb.home.data}/repositories/your-repo backup-dest/date/  # copies the repository data
    cp -r {graphdb.home.data}/repositories/SYSTEM backup-dest/date/  # copies the SYSTEM repository
    

Tip

For more information about the data directory, see here.

RDF4J’s SYSTEM repository contains all the information required to instantiate the GraphDB repository. The RDF data itself is stored only in your repository's directory.
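A fixed sleep may be too short for a large repository to shut down cleanly. The sketch below polls until the process has exited before copying. It is one possible way to script the steps above under stated assumptions: the function names wait_for_exit and backup_repo, the timeout, and the GRAPHDB_PID variable in the example are all hypothetical, and the script is assumed to run as a user allowed to signal the GraphDB process.

```shell
#!/bin/sh
# Hypothetical helpers; names, arguments, and the timeout are illustrative assumptions.

# Wait until the process with the given PID has exited, or give up after
# timeout_s seconds. Returns 0 if the process is gone, 1 on timeout.
wait_for_exit() {
    pid=$1
    timeout_s=$2
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "$timeout_s" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Copy one repository directory and the SYSTEM repository into dest_dir.
backup_repo() {
    data_dir=$1     # corresponds to {graphdb.home.data}
    repo_id=$2
    dest_dir=$3
    mkdir -p "$dest_dir" &&
    cp -r "$data_dir/repositories/$repo_id" "$dest_dir/" &&
    cp -r "$data_dir/repositories/SYSTEM" "$dest_dir/"
}

# Example sequence (commented out):
# kill "$GRAPHDB_PID" && wait_for_exit "$GRAPHDB_PID" 60 &&
#     backup_repo /path/to/data your-repo "backup-dest/$(date +%Y-%m-%d)"
```

Chaining the steps with && means the copy runs only if the server has actually stopped, so a backup is never taken from files that are still being written.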

Restore a repository

The restore options depend on the backup format.

Option 1: Restore a repository from an RDF export.

This option will import a previously exported file into an empty repository.

  1. Make sure that the target repository is empty, or recreate it with the same repository configuration settings.
  2. Go to Import > RDF and then select the Server files tab.
  3. On that page, note the directory path shown after the text "Put files that you want to import in".
  4. Copy the RDF backup file into this directory and refresh the page.
  5. Start the file import and wait for the data to be imported.
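The steps above use the Workbench. As a scripted alternative, a backup file can in principle be posted to the same statements endpoint that the curl export reads from, since a POST to that endpoint in the RDF4J REST API adds statements to the repository. The following is a hedged sketch only: the function names rdf_content_type and import_file and the extension-to-MIME mapping are assumptions for illustration. Note that curl reads the whole file into memory with --data-binary, so very large exports may be better imported through the Server files tab or in smaller files.

```shell
#!/bin/sh
# Hypothetical helpers; names and the extension mapping are illustrative assumptions.

# Map a backup file name to the Content-Type header for the upload.
rdf_content_type() {
    case $1 in
        *.trig) echo "application/x-trig" ;;
        *.nq)   echo "application/n-quads" ;;
        *)      echo "unsupported file extension: $1" >&2; return 1 ;;
    esac
}

# POST the file to the statements endpoint, which adds its statements
# to the repository.
import_file() {
    base_url=$1
    repo_id=$2
    rdf_file=$3
    content_type=$(rdf_content_type "$rdf_file") || return 1
    curl --fail --silent --show-error -X POST \
        -H "Content-Type: $content_type" \
        --data-binary "@$rdf_file" \
        "$base_url/repositories/$repo_id/statements"
}

# Example call (commented out):
# import_file "http://localhost:7200" "repositoryId" export.trig
```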

Option 2: Restore the database from a binary image backup.

  1. Stop the GraphDB server.
  2. Replace the entire contents of {graphdb.home.data}/repositories/SYSTEM with the backup copy. (Note: this overwrites the SYSTEM repository, so the current metadata for all other repositories on this GraphDB server is lost!)
  3. Replace the entire contents of {graphdb.home.data}/repositories/your-repo with the backup copy.
  4. Start the GraphDB server.
  5. Run a quick test read query to check that the repository is initialized correctly.
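The replace and verify steps can be sketched in shell as follows. The function names restore_dir and check_repo are hypothetical, the paths in the example are placeholders, and the ASK query is just one possible quick read test, sent to the repository's SPARQL endpoint through the RDF4J REST API.

```shell
#!/bin/sh
# Hypothetical helpers; names and arguments are illustrative assumptions.

# Replace the entire contents of target_dir with a copy of backup_dir.
# Run this only while the GraphDB server is stopped.
restore_dir() {
    backup_dir=$1
    target_dir=$2
    [ -n "$target_dir" ] || { echo "target directory not given" >&2; return 1; }
    [ -d "$backup_dir" ] || { echo "no such backup: $backup_dir" >&2; return 1; }
    rm -rf "$target_dir" &&
    mkdir -p "$target_dir" &&
    cp -r "$backup_dir/." "$target_dir/"
}

# Quick read test after restart: ASK whether the repository holds any statement.
check_repo() {
    base_url=$1
    repo_id=$2
    curl --fail --silent --show-error -G \
        -H "Accept: application/sparql-results+json" \
        --data-urlencode "query=ASK { ?s ?p ?o }" \
        "$base_url/repositories/$repo_id"
}

# Example sequence (commented out; server stopped for the first two calls):
# restore_dir backup-dest/date/SYSTEM    /path/to/data/repositories/SYSTEM
# restore_dir backup-dest/date/your-repo /path/to/data/repositories/your-repo
# ...start the GraphDB server, then:
# check_repo "http://localhost:7200" "your-repo"
```

restore_dir checks that the backup directory exists before deleting anything, so a mistyped backup path does not wipe the live repository.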