Backing up and restoring a cluster

Back up a cluster

Backing up a GraphDB cluster is similar to backing up a repository. The key difference is that a cluster has multiple worker nodes, each of which can be in a different state, such as OK (ready to serve reads and writes), OUT_OF_SYNC (wrong data checksum), or REPLICATION_CLIENT/REPLICATION_SERVER (accepting a replication from, or serving one to, another node). For more information, see the cluster’s basic operations.

There are three options to back up a cluster:

  • Back up the cluster from the master node
  • Export repository to an RDF file
  • Back up a cluster incrementally

Back up the cluster from the master node

This is the recommended backup approach, because the master node checks each worker’s state before selecting the backup source.

  1. Invoke the backup operation with a single optional parameter: the name used to identify the image. The backup image is written to a directory under the master node’s repository data directory.
curl 'http://masterNode:7200/jolokia/exec/ReplicationCluster:name=ClusterInfo!/repositoryId/backup/\[null\]'

Note

During normal operation, at least two notification messages are sent: one to indicate that the backup operation has started, and another to indicate that it has completed successfully.

Important

  • The image name “default” is reserved for internal use only. Backing up to an image named “default” interferes with the cluster’s failure recovery capabilities.
  • Backing up a second time with the same image name overwrites the previous image with that name.
  • Each master node maintains its own set of backup images. The set of backup images available from one master is completely unrelated to the sets of images available from other masters.
  • Only images named “default” may be used by the master to recover worker nodes if there are no other workers available to serve the replication.
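
The backup call and the naming rules above can be sketched as a small shell wrapper. This is an illustration only: backup_url and run_backup are hypothetical helper names, masterNode:7200 and repositoryId are the placeholders already used in this section, and the image name is assumed to need no URL encoding.

```shell
# Sketch only: backup_url and run_backup are hypothetical helper names.

# Build the Jolokia exec URL for the backup operation.
# $1 = master host:port, $2 = repository ID, $3 = image name (optional).
backup_url() {
  image="${3:-}"
  if [ "$image" = "default" ]; then
    echo "refusing image name 'default': reserved for internal use" >&2
    return 1
  fi
  # With no name given, pass Jolokia's null marker. The backslashes stop
  # curl from treating the square brackets as a URL glob range.
  [ -n "$image" ] || image='\[null\]'
  printf 'http://%s/jolokia/exec/ReplicationCluster:name=ClusterInfo!/%s/backup/%s\n' \
    "$1" "$2" "$image"
}

# Trigger the backup. --fail makes curl exit non-zero on HTTP error codes.
run_backup() {
  url=$(backup_url "$@") || return 1
  curl --fail --silent --show-error "$url"
}
```

For example, `run_backup masterNode:7200 repositoryId nightly` would request a backup image named "nightly". The JSON response should still be inspected, since an HTTP success code alone may not mean that the operation succeeded.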

Export repository to an RDF file

Note

This method is best used for a small running system during periods without writes.

Export the database contents using the Workbench. To preserve the contexts (named graphs) when exporting/importing the whole database, use a context-aware RDF file format, e.g., TriG.

  1. Go to Explore/Graphs overview.
  2. Choose the graphs you want to export.
  3. Click Export graph as TriG.
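
As a command-line alternative to the Workbench steps, the same export can be sketched with curl. This assumes that the node exposes the standard RDF4J REST endpoint for repository statements. The helper names statements_url and export_trig are hypothetical, and the host, port, and repository ID are placeholders.

```shell
# Sketch only: statements_url and export_trig are hypothetical helper names.

# Build the URL of the repository's statements endpoint (RDF4J REST API).
# $1 = host:port, $2 = repository ID.
statements_url() {
  printf 'http://%s/repositories/%s/statements\n' "$1" "$2"
}

# Download all statements, including their named graphs, as TriG.
# $3 = output file.
export_trig() {
  curl --fail --silent --show-error \
       -H 'Accept: application/trig' \
       -o "$3" \
       "$(statements_url "$1" "$2")"
}
```

For example, `export_trig masterNode:7200 repositoryId export.trig` would write the export to export.trig. As with the Workbench export, this is best done during a period without writes.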

Back up a cluster incrementally

curl -H 'Content-Type: application/json' -d '{"type":"exec","mbean":"ReplicationCluster:name=ClusterInfo/Repository_ID","operation":"incrementalBackup","arguments":["sourceBackup","targetBackup",true]}' http://masterNode:7200/jolokia/
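
The request above can be parameterized so that the repository ID and backup names are not hard-coded. The sketch below only rebuilds the same JSON body. The helper names are hypothetical, and the meaning of the third (boolean) argument is not described in this section, so it is passed through unchanged with true as the default.

```shell
# Sketch only: both function names are hypothetical helpers.

# Build the Jolokia POST body for the incrementalBackup operation.
# $1 = repository ID, $2 = source backup, $3 = target backup,
# $4 = boolean flag from the original request (defaults to true).
incremental_backup_payload() {
  printf '{"type":"exec","mbean":"ReplicationCluster:name=ClusterInfo/%s","operation":"incrementalBackup","arguments":["%s","%s",%s]}\n' \
    "$1" "$2" "$3" "${4:-true}"
}

# Send the payload to the master node. $1 = master host:port; the remaining
# arguments are handed to incremental_backup_payload.
run_incremental_backup() {
  master="$1"
  shift
  # -d @- makes curl read the request body from standard input.
  incremental_backup_payload "$@" |
    curl --fail --silent --show-error \
         -H 'Content-Type: application/json' \
         -d @- "http://$master/jolokia/"
}
```

For example, `run_incremental_backup masterNode:7200 Repository_ID sourceBackup targetBackup` would send the same request as the command above.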

Restore a cluster

The cluster restore is initiated from a backup image stored on the master node. The backup image is replicated through all worker nodes in the cluster and propagated by other peered master nodes as needed.

  1. Invoke the restoreFromImage operation with an existing backup name as a parameter. The image created by the backup will be replicated to every worker throughout the cluster.
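
A minimal sketch of the restore call follows. It assumes that restoreFromImage is exposed on the same ClusterInfo MBean and through the same Jolokia exec URL pattern as the backup operation. The helper names restore_url and run_restore are hypothetical, and masterNode:7200 and repositoryId are placeholders.

```shell
# Sketch only: restore_url and run_restore are hypothetical helper names,
# and the URL pattern is assumed to mirror the backup call.

# Build the Jolokia exec URL for restoreFromImage.
# $1 = master host:port, $2 = repository ID, $3 = existing backup image name.
restore_url() {
  if [ -z "${3:-}" ]; then
    echo "restoreFromImage needs the name of an existing backup image" >&2
    return 1
  fi
  printf 'http://%s/jolokia/exec/ReplicationCluster:name=ClusterInfo!/%s/restoreFromImage/%s\n' \
    "$1" "$2" "$3"
}

# Trigger the restore. --fail makes curl exit non-zero on HTTP error codes.
run_restore() {
  url=$(restore_url "$@") || return 1
  curl --fail --silent --show-error "$url"
}
```

For example, `run_restore masterNode:7200 repositoryId nightly` would request a restore from the image named "nightly".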

Note

At least two notification messages will be sent under normal operation, one to indicate that the restore operation has been started, and another to indicate that it has completed successfully.

Important

  • After a successful restore operation, any updates executed from the time the backup image was created to the time the cluster was restored will be lost irreversibly.
  • Upon failure of a restore operation, the cluster may be in an inconsistent state.
  • External stores such as Solr and Elasticsearch that are synchronized by the GraphDB Connector will not be restored. The user will have to drop and recreate all connectors.

Warning

Trying to initiate backup or restore while another backup or restore operation is already in progress on the same master node will be treated as an error and will result in immediate failure. The failure of the second operation will not interfere with the operation that was already in progress.
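
Scripts that trigger backups and restores can avoid this error by serializing their own calls. The sketch below uses a lock directory for that purpose. It is a client-side convenience only: with_cluster_lock and the lock path are assumptions, and the lock only protects scripts that run on the same machine and share the same LOCK_DIR.

```shell
# Sketch only: with_cluster_lock and LOCK_DIR are hypothetical names.
LOCK_DIR="${TMPDIR:-/tmp}/graphdb-backup-restore.lock"

# Run the given command only if no other guarded command is running.
with_cluster_lock() {
  # mkdir is atomic: it succeeds for exactly one caller at a time.
  if ! mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "another backup or restore appears to be in progress" >&2
    return 1
  fi
  status=0
  "$@" || status=$?
  # Release the lock whether or not the command succeeded.
  rmdir "$LOCK_DIR"
  return "$status"
}
```

Any command that starts a backup or restore can be passed as arguments, for example `with_cluster_lock curl ...`. The master node still rejects overlapping operations that are started from elsewhere.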