Backing up and restoring a cluster

Back up a cluster

The GraphDB cluster backup is similar to the repository backup, with the added complexity of synchronizing all concurrent cluster operations. The master node performs the cluster backup by selecting a worker node with the latest timestamp and an OK status. In addition to the standard repository backup, the backup includes the latest entry from the master’s transaction log so that the master’s state can also be recovered.

GraphDB supports:

  • Full backup of the cluster
  • Incremental backup of the cluster

The backup image is stored in a directory under the master node’s repository data directory, which by default is $GDB_HOME/data/repositories/<repositoryId>/backup.
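
To see which backup images currently exist on a master node, you can simply list this directory. A minimal sketch, assuming the default data directory location (substitute your own repository ID):

# Lists all backup images available on this master node
ls $GDB_HOME/data/repositories/<repositoryId>/backup/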

Each master node maintains its own set of backup images. The images available on one master are entirely independent of those available on other masters. Backing up again with the same image name will overwrite the previous image with that name.

Full backup of the cluster

The full backup operation switches one of the workers into read-only mode and copies all of its data to the master. The operation accepts a single optional parameter that identifies the backup name. If the parameter is left null, the name defaults to default. The backup image will appear under $GDB_HOME/data/repositories/<repositoryId>/backup/<backup-name>.

Start a full backup named default with:

curl -H 'content-type: application/json' -d '{"type":"exec", "mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId", "operation":"backup", "arguments":[null]}' http://masterNode:7200/jolokia/

Start a named backup 20190501 with:

curl -H 'content-type: application/json' -d '{"type":"exec", "mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId", "operation":"backup", "arguments":["20190501"]}' http://masterNode:7200/jolokia/
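
To check the current cluster status while a backup is running, the same Jolokia endpoint also serves read requests against the ClusterInfo MBean. A minimal sketch, assuming the MBean exposes a status attribute named NodeStatus (use a Jolokia list request to see the attributes actually available in your version):

# Reads a status attribute from the master's ClusterInfo MBean (the attribute name is an assumption)
curl -H 'content-type: application/json' -d '{"type":"read", "mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId", "attribute":"NodeStatus"}' http://masterNode:7200/jolokia/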

Note

At least two notification messages will be sent under normal operation: one to indicate that the backup operation has started, and another to indicate that it has completed successfully.

Important

  • The image name default is reserved for internal use only. Backing up to an image named default would interfere with the cluster’s failure recovery capabilities.
  • The master may only use images named default to recover worker nodes if there are no other workers available to serve the replication.

Incremental backup of the cluster

Incremental backups are optimized for large repositories that require frequent recovery points. During an incremental backup, a worker with the latest timestamp determines which pages have been modified since the last full backup and stores only those changes. The backup also includes a full backup of all plugin data.

The incremental backup operation expects three parameters:

  • the name of the last full backup, against which the cluster will calculate the delta
  • the name of the incremental backup
  • whether to rebuild a full image from the delta, ready for a direct restore (see the example with this flag enabled after the schedule below)

For example, if the standard backup pattern is to perform a full backup on Sunday and then an incremental backup every following day:

# Performs a full backup on Sunday
curl -H 'content-type: application/json' -d '{"type":"exec", "mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId", "operation":"backup", "arguments":["20190504-sunday"]}' http://masterNode:7200/jolokia/

# Performs the first incremental backup storing only the delta compared to Sunday
curl -H 'content-type: application/json' -d '{"type":"exec", "mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId", "operation":"incrementalBackup", "arguments":["20190504-sunday","20190505-monday", false]}' http://masterNode:7200/jolokia/

# Performs the second incremental backup, storing the delta compared to Sunday (which also includes Monday's changes)
curl -H 'content-type: application/json' -d '{"type":"exec", "mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId", "operation":"incrementalBackup", "arguments":["20190504-sunday","20190506-tuesday", false]}' http://masterNode:7200/jolokia/
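
If the master should also rebuild a full image from the delta, ready for a direct restore, pass true as the third parameter. A sketch of the same Tuesday backup with the rebuild flag enabled (the operation and the other arguments are identical to the example above):

# Performs the incremental backup and rebuilds a full image ready for a direct restore
curl -H 'content-type: application/json' -d '{"type":"exec", "mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId", "operation":"incrementalBackup", "arguments":["20190504-sunday","20190506-tuesday", true]}' http://masterNode:7200/jolokia/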

Restore a cluster

The cluster restore is initiated from a backup image stored on the master node. The backup image is replicated to all worker nodes in the cluster and propagated by other peered master nodes as needed.

Start the cluster restore from the default backup with:

curl -H 'content-type: application/json' -d '{"type":"exec","mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId","operation":"restoreFromImage","arguments":[null]}' http://masterNode:7200/jolokia/

Start the cluster restore from a named backup with:

curl -H 'content-type: application/json' -d '{"type":"exec","mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId","operation":"restoreFromImage","arguments":["20190504-sunday"]}' http://masterNode:7200/jolokia/

To restore an incremental backup, you first need to apply the delta to the latest full backup with the incremental backup recovery tool $GDB_HOME/bin/ibrtool. The tool lists all incremental backups for the current master and guides you through the process. After rebuilding the full image from the delta, you can restore the incremental backup with:

curl -H 'content-type: application/json' -d '{"type":"exec","mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId","operation":"restoreFromImage","arguments":["20190505-monday"]}' http://masterNode:7200/jolokia/

Note

At least two notification messages will be sent under normal operation: one to indicate that the restore operation has started, and another to indicate that it has completed successfully.

Important

  • After a successful restore operation, any updates executed from the time the backup image was created to the time the cluster was restored will be lost irreversibly.
  • Upon failure of a restore operation, the cluster may be in an inconsistent state.
  • External stores such as Solr and Elasticsearch that are synchronized by the GraphDB Connectors will not be restored. You will have to drop and recreate all connectors.

Warning

Initiating a backup or restore while another backup or restore operation is already in progress on the same master node is treated as an error and results in immediate failure. The failure of the second operation does not interfere with the operation that was already in progress.
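
One way to detect such a failure from a script is to check the Jolokia response body, which carries a status field (200 on success) and an error field when the call fails. A minimal shell sketch, assuming jq is installed and reusing the full backup example from above:

# Sends the backup request and captures the Jolokia response
response=$(curl -s -H 'content-type: application/json' -d '{"type":"exec", "mbean":"ReplicationCluster:name=ClusterInfo\/repositoryId", "operation":"backup", "arguments":["20190504-sunday"]}' http://masterNode:7200/jolokia/)

# A non-200 status indicates the request was rejected, e.g. another backup or restore is still running
if [ "$(echo "$response" | jq -r '.status')" != "200" ]; then
    echo "Backup request failed: $(echo "$response" | jq -r '.error')" >&2
fi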