Configuring GraphDB Memory
Configure Java heap memory
The following diagram shows how memory is used by the GraphDB structures and processes:

To specify the maximum amount of heap space used by a JVM, use the -Xmx virtual machine parameter.
Single global page cache
GraphDB’s cache strategy, the single global page cache, uses one global cache shared between all internal structures of all repositories. This way, you no longer have to configure cache-memory, tuple-index-memory, and predicate-memory, or size every repository and calculate the amount of memory dedicated to it. If one of the repositories is used more heavily at a given moment, it naturally gets more slots in the cache.
The global page cache size is dynamic and is determined by the given -Xmx value. It is set as follows:
Heap size | Global page cache size |
---|---|
Less than 4GB | 25% |
4-8GB | Linear, starting at 25% and ending at 30% |
8-16GB | Linear, starting at 30% and ending at 35% |
16-32GB | Linear, starting at 35% and ending at 40% |
32-100GB | 40% |
Over 100GB | Max value of 40GB |
You can set the global page cache size manually, for example: -Dgraphdb.page.cache.size=3G.
You can disable the current global page cache implementation by setting -Dgraphdb.global.page.cache=false.
If you do not specify graphdb.page.cache.size, the size is determined by the heap range as outlined above.
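The sizing table above can be sketched as a small function. This is only an illustration of the table, not GraphDB's actual implementation: the function name default_page_cache_gb is hypothetical, sizes are expressed in GB, and how GraphDB treats the exact band boundaries or rounds the result is an assumption.

```python
def default_page_cache_gb(heap_gb: float) -> float:
    """Approximate the default global page cache size (GB) for a given -Xmx heap (GB).

    Hypothetical helper that mirrors the documented table; boundary handling
    and rounding are assumptions, not GraphDB's real code.
    """
    if heap_gb <= 0:
        raise ValueError("heap size must be positive")

    # (heap lower bound, heap upper bound, fraction at lower, fraction at upper)
    linear_bands = [
        (4, 8, 0.25, 0.30),
        (8, 16, 0.30, 0.35),
        (16, 32, 0.35, 0.40),
    ]

    if heap_gb < 4:
        fraction = 0.25
    elif heap_gb >= 32:
        fraction = 0.40
    else:
        for low, high, f_low, f_high in linear_bands:
            if low <= heap_gb < high:
                # Linear interpolation of the fraction within the band.
                fraction = f_low + (f_high - f_low) * (heap_gb - low) / (high - low)
                break

    # Heaps over 100 GB are capped at a 40 GB cache (100 GB * 40% = 40 GB).
    return min(heap_gb * fraction, 40.0)
```

Under these assumptions, a 10 GB heap falls a quarter of the way into the 8-16GB band, which gives a fraction of 31.25% and a default cache of roughly 3.1 GB.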
Note
You do not have to change/edit your repository configurations. The new cache will be used when you upgrade to the new version.
Configure Entity pool memory
By default, all entity pool structures reside on-heap, i.e., inside the regular JVM heap. The graphdb.engine.onheap.allocation property configures memory allocation not only for the entity pool but also for the other structures. It also specifies the entity pool on-heap allocation regardless of whether the deprecated property graphdb.epool.onheap is set to true.
Note
To activate the old behavior, i.e., the entity pool residing off-heap, you can enable off-heap allocation with -Dgraphdb.epool.onheap=false.
If you are concerned that the process will consume an unlimited amount of memory, you can specify a maximum size with -XX:MaxDirectMemorySize, which defaults to the value of the -Xmx parameter (at least in OpenJDK and Oracle JDK).
Sample memory configuration
This sample configuration demonstrates how to size a GraphDB server with a single repository. The loaded dataset is estimated at 500 million RDF statements and 150 million unique entities. As a rule of thumb, the ratio of unique entities to total statements in a standard dataset is about 1:3.
Configuration parameter | Description | Example value |
---|---|---|
Total OS memory | Total physical system memory | 16 GB |
On-heap JVM (-Xmx) configuration | Maximum heap memory allocated by the JVM process | 10 GB |
 | Global single cache shared between all internal structures of all repositories | 5 GB |
Remaining on-heap memory for query execution | Raw estimate of the memory for query execution; a higher value is required if many long-running analytical queries are expected | ~4.5 GB |
 | Size of the initial entity pool hash table; the recommended value is equal to the total number of unique entities | 150,000,000 |
Memory footprint of the entity pool stored on-heap by default | Calculated from | ~2.5 GB |
Remaining OS memory | Raw estimate of the memory left to the OS | ~3.5 GB |
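The arithmetic behind the sample figures can be checked with a short script. This is a rough sketch: the helper name memory_budget is hypothetical, and it makes one assumption that should be stated loudly. The table's ~3.5 GB of remaining OS memory equals 16 - 10 - 2.5, so the sketch follows the table and subtracts the entity pool footprint separately from the heap. If the entity pool is allocated on-heap, its footprint would instead come out of the heap budget.

```python
def memory_budget(total_os_gb, heap_gb, page_cache_gb, entity_pool_gb):
    """Return the heap left after the page cache and the memory left to the OS (GB).

    Hypothetical helper; it assumes, as the sample table's totals imply, that
    the entity pool footprint is counted in addition to the -Xmx heap.
    """
    if page_cache_gb > heap_gb:
        raise ValueError("the page cache cannot be larger than the heap")

    remaining_heap_gb = heap_gb - page_cache_gb
    remaining_os_gb = total_os_gb - heap_gb - entity_pool_gb
    if remaining_os_gb < 0:
        raise ValueError("the configuration exceeds the physical memory")

    return {
        "remaining_heap_gb": remaining_heap_gb,
        "remaining_os_gb": remaining_os_gb,
    }


# Values taken from the sample configuration table.
budget = memory_budget(total_os_gb=16, heap_gb=10, page_cache_gb=5, entity_pool_gb=2.5)
print(budget)
```

With the table's values, the heap left after the page cache is 5 GB, an upper bound that the table rounds down to a more conservative ~4.5 GB for query execution, and the memory left to the OS is 3.5 GB.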
Upper bounds for the memory consumed by the GraphDB process
To make sure that no OutOfMemoryError is thrown while working with an active GraphDB repository, set an upper bound for the memory consumed by all instances of the tupleSet/distinct collections. This is done with the -Ddefault.min.distinct.threshold parameter, whose default value of 250m can be changed. If this value is exceeded, a QueryEvaluationException is thrown to avoid running out of memory because of a memory-hungry distinct/group by operation.
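The JVM flags described on this page can be assembled in one place, which makes it easier to review a configuration before applying it. The sketch below only builds the option strings; the helper name build_jvm_options is hypothetical, the example values are illustrative, and how the resulting options are passed to the GraphDB start-up script depends on your installation and is not shown.

```python
def build_jvm_options(heap, page_cache=None, max_direct_memory=None,
                      distinct_threshold=None, epool_onheap=None):
    """Build a list of JVM options from the parameters named in this document.

    Hypothetical helper: it only formats strings and does not validate the
    values against what GraphDB or the JVM accepts.
    """
    options = [f"-Xmx{heap}"]
    if page_cache is not None:
        options.append(f"-Dgraphdb.page.cache.size={page_cache}")
    if max_direct_memory is not None:
        options.append(f"-XX:MaxDirectMemorySize={max_direct_memory}")
    if distinct_threshold is not None:
        options.append(f"-Ddefault.min.distinct.threshold={distinct_threshold}")
    if epool_onheap is not None:
        # Deprecated property; false selects the old off-heap entity pool.
        options.append(f"-Dgraphdb.epool.onheap={'true' if epool_onheap else 'false'}")
    return options


# Example: the sample heap and page cache sizes plus the default distinct threshold.
print(" ".join(build_jvm_options("10g", page_cache="5G", distinct_threshold="250m")))
```

Keeping unset parameters out of the list means GraphDB's defaults, such as the heap-based page cache size, still apply for anything you do not override.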