Replication

Normal operations

During normal operations, the master node keeps the cluster synchronized by processing updates.

When the master detects an out-of-sync worker - because it is offline or unreachable due to network problems - it tries to bring the problem worker node to the same state as the other worker nodes. The master checks whether the signature of the out-of-sync worker corresponds to a known state that was reached back in the transaction log. If so, it starts to replay the missing transactions to the problem worker node until it reaches the same state as the other worker nodes. During this process, the cluster remains in read-write mode and can process updates.

Replication

When the master detects an out of sync worker whose signature does not correspond to a known state in the transaction log, it initiates replication. It chooses an up-to-date worker that is further ahead in the execution of the transaction logs to replicate to the problem worker node. Replicating a worker involves shutting down both workers and initiating a binary transfer of the good worker’s storage folder directly to the storage folder of the bad worker. When replication is complete, both nodes become available.

Note that setting the parameter IncrementalUpdateLimit to a number of missing updates is no longer supported.

The logic behind

When the signature of the problem worker corresponds to a known state, the master automatically chooses between replaying the missing transactions (incremental updates) and replicating the worker, based on which option will be faster.

Let’s say that the estimate of incremental update is incrementalDurationS, and the estimate of replication is replicationDurationS. GraphDB EE prefers the replication when both of these are true:

  • incrementalDurationS > replicationDurationS * FullRreplicationTimeFactor - this is the speed-up
  • incrementalDurationS > MinTimeToConsiderFullReplicationS - this handles the case when the relative difference is big but the absolute difference is small. E.g., 1sec for full replication vs. 2sec for incremental updates.

Parameters

There are three parameters that control the smart replication process. They are controlled via the JMX bean ReplicationCluster:name=ClusterInfo/{$MASTER}, and are persisted in the master’s configuration file.

Parameter Type Default value Description
NetSpeedBitsPerSec bits/sec (long) 104857600 (100Mbps) The network speed. Used to estimate the time for full replication.
FullReplicationTimeFactor ratio (float) 1.3 Speed-up ratio.
MinTimeToConsiderFullReplicationS seconds (long) 600 (10 minutes) Minimum absolute time.

Log messages showing the reasons for starting replication

Replicating (out of sync)
The worker was detected out of sync. This state is either reported by the worker when it finds itself irreparably out of sync with the master’s expectations, or detected by the master when the worker does not report a valid current state.
Replicating (out of sync), forced
The worker was detected out of sync after a reportedly successful completion of an operation (e.g., a replication or an update execution).
Replicating (empty)
The worker was empty upon initialization, and there was at least one non-empty worker attached to the cluster.
Replicating (catch up)
The worker was cued to replicate through smart replication. This message is preceded by a message from the smart replication indicating why replication was preferred over ordinary execution of transactions.

Other log messages

Couldn’t find storage size
The master cannot find a suitable worker to query its storage size.

Compression for replications

Hint

The below example for compressing the repositories is using Snappy.

Note

Enable the compression on all masters and all workers. A mixed setup will result in problems with the replication, including backup and restore.

The compression is enabled by default. To disable the replication compression, add this setting:

-Dreplication.compression=false.

Expected speed up

The following table indicates the expected replication speed tested in a local area network.

Repository size: 41 gigabytes

Network link Without compression With compression (default)
1Gb/s 6m30.074s 3m10.519s
100Mb/s 57m38.210s 23m58.711s