GraphDB Free 7.1
Table of contents
- General
- Quick start guide
- Installation
- Administration
- Administration tasks
- Administration tools
- Creating locations and repositories
- Configuring a repository
- Sizing guidelines
- Disk space requirements
- Configuring the Entity Pool
- Managing repositories
- Access rights and security
- Backing up and recovering a repository
- Query monitoring and termination
- Database health checks
- Diagnosing and reporting critical errors
- Usage
- Tools
- References
- Release notes
- FAQ
- Support
Configuring a repository
Before you add or change parameter values, plan your repository configuration: learn what each parameter does, what the configuration template is and how it works, which data structures GraphDB supports, and which configuration values are optimal for your setup.
Planning a repository configuration
To plan your repository configuration, check out the following sections:
Configuring a repository through the GraphDB Workbench
To configure a new repository through the Workbench, fill in the configuration page that opens when you click the ‘Create Repository’ button. The parameters are described in the Configuration parameters section.

Alternatively, you can create a .ttl configuration file using the template and specify the repository type, ID and configuration parameters. Click the triangle at the edge of the Create repository button and upload it.
Editing a repository
Some of the parameters you specify at repository creation time can be changed at any point. Click the edit icon next to a repository to edit it. Note that you have to restart GraphDB for the changes to take effect.
Configuring a repository programmatically
To configure a new repository programmatically, fill in the .ttl configuration template that can be found in the /templates folder of the GraphDB distribution. The parameters are described in the Configuration parameters section.
# Sesame configuration template for a GraphDB Free repository
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rep: <http://www.openrdf.org/config/repository#>.
@prefix sr: <http://www.openrdf.org/config/repository/sail#>.
@prefix sail: <http://www.openrdf.org/config/sail#>.
@prefix owlim: <http://www.ontotext.com/trree/owlim#>.
[] a rep:Repository ;
    rep:repositoryID "graphdb-test" ;
    rdfs:label "GraphDB Free repository" ;
    rep:repositoryImpl [
        rep:repositoryType "graphdb:FreeSailRepository" ;
        sr:sailImpl [
            sail:sailType "graphdb:FreeSail" ;
            owlim:base-URL "http://example.org/graphdb#" ;
            owlim:defaultNS "" ;
            owlim:entity-index-size "10000000" ;
            owlim:entity-id-size "32" ;
            owlim:imports "" ;
            owlim:repository-type "file-repository" ;
            owlim:ruleset "owl-horst-optimized" ;
            owlim:storage-folder "storage" ;
            owlim:enable-context-index "false" ;
            owlim:cache-memory "256m" ;
            owlim:tuple-index-memory "224m" ;
            owlim:enablePredicateList "true" ;
            owlim:predicate-memory "32m" ;
            owlim:in-memory-literal-properties "true" ;
            owlim:enable-literal-index "true" ;
            owlim:check-for-inconsistencies "false" ;
            owlim:disable-sameAs "false" ;
            owlim:transaction-mode "safe" ;
            owlim:transaction-isolation "true" ;
            owlim:query-timeout "0" ;
            owlim:query-limit-results "0" ;
            owlim:throw-QueryEvaluationException-on-timeout "false" ;
            owlim:read-only "false" ;
        ]
    ].
Tip
GraphDB uses a Sesame configuration template for configuring its repositories. Sesame 2.0 keeps the repository configurations with their parameters, modelled in RDF, in the SYSTEM repository. Therefore, in order to create a new repository, Sesame needs such an RDF file to populate the SYSTEM repository. For more information on how the configuration template works, see Repository configuration template - how it works.
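The template above can also be filled in from a script before it is uploaded. The following is a minimal sketch, not part of the GraphDB distribution: the function name `render_repository_config` and its arguments are hypothetical, and it only generates Turtle text that mirrors the structure of the template. It does no escaping, so values containing double quotes would need extra handling.

```python
# Hypothetical helper (an assumption, not a GraphDB API): renders a repository
# configuration in the same shape as the template shown above.
PREFIXES = """\
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rep: <http://www.openrdf.org/config/repository#>.
@prefix sr: <http://www.openrdf.org/config/repository/sail#>.
@prefix sail: <http://www.openrdf.org/config/sail#>.
@prefix owlim: <http://www.ontotext.com/trree/owlim#>.
"""


def render_repository_config(repo_id, label, params):
    """Return a Turtle repository configuration as a string.

    `params` maps parameter names (without the owlim: prefix) to string values.
    """
    lines = [PREFIXES, "[] a rep:Repository ;"]
    lines.append(f'    rep:repositoryID "{repo_id}" ;')
    lines.append(f'    rdfs:label "{label}" ;')
    lines.append("    rep:repositoryImpl [")
    lines.append('        rep:repositoryType "graphdb:FreeSailRepository" ;')
    lines.append("        sr:sailImpl [")
    lines.append('            sail:sailType "graphdb:FreeSail" ;')
    for name, value in params.items():
        lines.append(f'            owlim:{name} "{value}" ;')
    lines.append("        ]")
    lines.append("    ].")
    return "\n".join(lines) + "\n"


config = render_repository_config(
    "graphdb-test",
    "GraphDB Free repository",
    {
        "ruleset": "owl-horst-optimized",
        "storage-folder": "storage",
        "entity-id-size": "32",
    },
)
print(config)
```

The resulting text can be saved as a .ttl file and uploaded through the Workbench as described earlier.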
Configuration parameters
This is a list of all repository configuration parameters. Some of the parameters can be changed (effective after a restart), some cannot be changed (the change has no effect) and others need special attention once a repository has been created, as changing them will likely lead to inconsistent data (e.g., unsupported inferred statements, missing inferred statements, or inferred statements that cannot be deleted).
- base-URL
- defaultNS
- entity-index-size
- entity-id-size
- imports
- repository-type
- ruleset
- storage-folder
- enable-context-index
- cache-memory
- tuple-index-memory
- enablePredicateList
- predicate-memory
- in-memory-literal-properties
- enable-literal-index
- check-for-inconsistencies
- disable-sameAs
- transaction-mode
- transaction-isolation
- query-timeout
- query-limit-results
- throw-QueryEvaluationException-on-timeout
- useShutdownHooks (deprecated)
- index-compression-ratio (deprecated)
- enable-optimization (deprecated)
- read-only
- base-URL (Can be changed)
  Description: Specifies the default namespace for the main persistence file. Non-empty namespaces are recommended, because their use guarantees the uniqueness of the anonymous nodes that may appear within the repository.
  Default value: none
- defaultNS (Cannot be changed)
  Description: The default namespaces corresponding to each imported schema file, separated by semicolons. The number of namespaces must equal the number of schema files in the imports parameter.
  Default value: <empty>
  Example: owlim:defaultNS "http://www.w3.org/2002/07/owl#;http://example.org/owlim#"
  Warning: This parameter cannot be set via a command line argument.
- entity-index-size (Cannot be changed)
  Description: Defines the number of entity hash table index entries. The larger the size, the fewer the collisions in the hash table and the faster the entity retrieval. The entity hash table does not rehash, so its index size is constant throughout the life of the repository. The recommended value is the number of entities x 1.5.
  Default value: 10000000
- entity-id-size (Cannot be changed)
  Description: Defines the bit size of internal IDs used to index entities (URIs, blank nodes and literals). In most cases, this parameter can be left at its default value. However, if very large datasets containing more than 2^32 entities are used, set this parameter to 40. This can only be set when instantiating a new repository; converting an existing repository from 32-bit to 40-bit entity IDs is not possible.
  Default value: 32
  Possible values: 32 and 40
- imports (Cannot be changed)
  Description: A list of schema files that will be imported at start up. All the statements found in these files will be loaded into the repository and treated as read-only. The serialisation format is determined by the file extension:
    .brf => BinaryRDF
    .n3 => N3
    .nq => N-Quads
    .nt => N-Triples
    .owl => RDF/XML
    .rdf => RDF/XML
    .rdfs => RDF/XML
    .trig => TriG
    .trix => TriX
    .ttl => Turtle
    .xml => TriX
  Default value: none
  Example: owlim:imports "./ont/owl.rdfs;./ont/ex.rdfs"
  Tip: Schema files can be either a local path name, e.g., ./ontology/myfile.rdf, or a URL, e.g., http://www.w3.org/2002/07/owl.rdf. If this parameter is used, the default namespace for each imported schema file must be provided using the defaultNS parameter.
- repository-type (Cannot be changed)
  Default value: file-repository
  Possible values: file-repository, weighted-file-repository
- ruleset (Needs special attention)
  Description: Sets of axiomatic triples, consistency checks and entailment rules, which determine the applied semantics.
  Default value: owl-horst-optimized
  Possible values: empty, rdfs, owl-horst, owl-max and owl2-rl, and their optimised counterparts rdfs-optimized, owl-horst-optimized, owl-max-optimized and owl2-rl-optimized. A custom ruleset is chosen by setting the path to its rule file (.pie).
- storage-folder (Can be changed)
  Description: Specifies the folder where the index files will be stored.
  Default value: none
- cache-memory (Can be changed)
  Description: Specifies the total amount of memory to be given to all types of cache.
  Default value: <none>
- tuple-index-memory (Can be changed)
  Description: Specifies the amount of memory to be used for the statement storage cache.
  Default value: 224m
- enable-context-index (Can be changed)
  Default value: false
  Possible value: true, where GraphDB will build and use the context index/indices.
- enablePredicateList (Can be changed)
  Description: Enables or disables mappings from an entity (subject or object) to its predicates; switching this on can significantly speed up queries that use wildcard predicate patterns.
  Default value: false
- predicate-memory (Can be changed)
  Description: Specifies the amount of memory to be used for the predicate lists cache.
  Default value: 32m
- in-memory-literal-properties (Can be changed)
  Description: Turns caching of the literal languages and data-types on and off. If the caching is on and the entity pool is restored from persistence, but there is no such cache available on disk, it is created after the entity pool initialisation.
  Default value: false
- enable-literal-index (Can be changed)
  Description: Enables or disables the literal index. The literal index is always built as data is loaded/modified. This parameter only affects whether the index is used during query-answering.
  Default value: true
- check-for-inconsistencies (Can be changed)
  Description: Turns the mechanism for consistency checking on and off. Consistency checks are defined in the rule file and are applied at the end of every transaction if this parameter is true. If an inconsistency is detected when committing a transaction, the whole transaction will be rolled back.
  Default value: false
- disable-sameAs (Needs special attention)
  Description: Enables or disables the owl:sameAs optimisation.
  Default value: false
- transaction-mode (Can be changed)
  Description: Specifies the transaction mode. In fast mode, dirty pages are written to disk in the laziest fashion possible, i.e., pages are only swapped when a new page is requested and there is no more memory available. No guarantees about data security are given when operating in this mode, so in the event of an abnormal termination, the database must be considered corrupted and will need to be recreated from scratch. In safe mode, all updates are flushed to disk at the end of each transaction. Commit operations normally take a little longer, but recovery after an abnormal termination is instant. This mode also has much better concurrency characteristics.
  Default value: safe
- transaction-isolation (Can be changed)
  Description: This parameter only has an effect when transaction-mode=fast. In fast mode, updates lock the repository, preventing concurrent query answering.
  Default value: true
  Possible value: false; if set, concurrent queries are permitted with the loss of isolation.
- query-timeout (Can be changed)
  Description: Sets the number of seconds after which the evaluation of a query will be terminated; values less than or equal to zero mean no limit.
  Default value: 0 (no limit)
- query-limit-results (Can be changed)
  Description: Sets the maximum number of results returned from a query, after which the evaluation of the query will be terminated; values less than or equal to zero mean no limit.
  Default value: 0 (no limit)
- throw-QueryEvaluationException-on-timeout (Can be changed)
  Default value: false
  Possible value: true; if set, a QueryEvaluationException is thrown when the duration of a query execution exceeds the time-out parameter.
- useShutdownHooks (Can be changed) (deprecated)
  Default value: true. If set, the method OwlimSchemaRepository.shutdown() is called when the JVM exits (running GraphDB under Tomcat requires this parameter to be true, otherwise it cannot be guaranteed that the shutdown() method will be called at all).
- index-compression-ratio (Cannot be changed) (deprecated)
  Description: The compression ratio of paged index files as a percentage of their uncompressed size. The value indicates how much smaller the compressed page should be, so a value of 25 (percent) will attempt to make the index files one quarter of their uncompressed size. Any page that cannot be compressed to this size will be stored uncompressed in a separate overlay file.
  Default value: -1
  Possible values: -1 (off) and the range [10-50]
  Recommended value: 30
- enable-optimization (Can be changed) (deprecated)
  Description: Enables or disables query optimisation.
  Default value: true
  Warning: Disabling query optimisation is rarely needed, usually only for debugging purposes. Also, be aware that disabling query optimisation will also disable the correct behaviour of plugins (Full-text search, Geo-spatial extensions, RDF Rank, etc).
- read-only (Can be changed)
  Description: In this mode, no modifications are allowed to the data or namespaces.
  Default value: false
  Possible value: true, which puts the repository into read-only mode.
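Some of the rules in the parameter list can be checked mechanically before a repository is created. The sketch below is illustrative only: the function names are hypothetical, and each function simply encodes a rule stated in the descriptions above (the 2^32 threshold for entity-id-size, the matching counts for imports and defaultNS, and the "zero or less means no limit" convention).

```python
def choose_entity_id_size(expected_entities):
    """Return 40 when more than 2**32 entities are expected, else the default 32."""
    return 40 if expected_entities > 2**32 else 32


def imports_match_namespaces(imports, default_ns):
    """Check that imports and defaultNS list the same number of entries.

    Both values are semicolon-separated strings, as in the examples above.
    """
    files = [f for f in imports.split(";") if f]
    namespaces = [n for n in default_ns.split(";") if n]
    return len(files) == len(namespaces)


def is_unlimited(value):
    """query-timeout and query-limit-results treat values <= 0 as 'no limit'."""
    return int(value) <= 0


print(choose_entity_id_size(150_000_000))
print(imports_match_namespaces(
    "./ont/owl.rdfs;./ont/ex.rdfs",
    "http://www.w3.org/2002/07/owl#;http://example.org/owlim#",
))
print(is_unlimited("0"))
```

Because entity-id-size and defaultNS cannot be changed after creation, checks like these are most useful before the repository is first instantiated.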
Configuring GraphDB memory
Configuring Java heap memory
The following diagram offers a view of the memory use by the GraphDB structures and processes:

To specify the maximum amount of heap space used by a JVM, use the -Xmx virtual machine parameter.
The Xmx value should be about 2/3 of the system memory. For example, if a system has 8 GB of RAM in total, of which 1 GB is used by the operating system, services, etc. and 1 GB by the entity pool and the hash maps (which are off heap), then ideally the JVM that hosts the application using GraphDB should have a maximum heap size of 6 GB, which can be set using the JVM argument -Xmx6g.
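The arithmetic in the example above can be written down as a small calculation. This is only a sketch: the function names are hypothetical, and the reserved amounts (1 GB for the operating system, 1 GB for off-heap structures) are the figures from the example, not general recommendations.

```python
def max_heap_gb(total_ram_gb, os_reserved_gb, off_heap_gb):
    """Heap budget left after reserving memory for the OS and off-heap structures."""
    heap = total_ram_gb - os_reserved_gb - off_heap_gb
    if heap <= 0:
        raise ValueError("not enough memory left for the JVM heap")
    return heap


def xmx_flag(heap_gb):
    """Format the heap size as a JVM argument, rounding down to whole gigabytes."""
    return f"-Xmx{int(heap_gb)}g"


heap = max_heap_gb(total_ram_gb=8, os_reserved_gb=1, off_heap_gb=1)
print(xmx_flag(heap))  # -Xmx6g
```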
Cache memory
The cache memory parameter controls the number of pages stored in memory. A larger cache memory means fewer disk operations and faster repository performance. The cache is further distributed between tuple-index-memory and predicate-memory. The size of cache-memory should be equal to the sum of the tuple-index-memory and predicate-memory values.
Note
To avoid runtime out-of-memory errors, the sum of the configured cache memory for all repositories should not exceed 75% of the total heap memory, and at least 2 GB should be left for the normal execution of queries.
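The two cache rules above can be checked with a few lines of code. The following sketch rests on assumptions: the helper names are hypothetical, sizes are assumed to use k/m/g suffixes as in the template values such as "256m", and the note is read as two separate conditions (at most 75% of the heap, and at least 2 GB remaining).

```python
import re

# Assumed binary multipliers for the size suffixes.
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(text):
    """Convert a size such as '256m' or '7g' into bytes."""
    match = re.fullmatch(r"(\d+)([kmgKMG]?)", text.strip())
    if not match:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS.get(unit.lower(), 1)


def cache_is_consistent(cache, tuple_index, predicate):
    """cache-memory should equal tuple-index-memory plus predicate-memory."""
    return parse_size(cache) == parse_size(tuple_index) + parse_size(predicate)


def caches_fit_heap(heap, cache_sizes):
    """Check the 75% rule and the 2 GB headroom rule for all repositories."""
    heap_bytes = parse_size(heap)
    total = sum(parse_size(size) for size in cache_sizes)
    return total <= 0.75 * heap_bytes and heap_bytes - total >= 2 * 1024**3


print(cache_is_consistent("256m", "224m", "32m"))
print(caches_fit_heap("11g", ["7g"]))
print(caches_fit_heap("4g", ["3g"]))
```

The last call returns False because a 4 GB heap with 3 GB of cache satisfies the 75% rule but leaves only 1 GB for query execution.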
Configuring Entity pool memory
From GraphDB 7.1 on, you no longer have to account for the entity pool memory when setting the JVM max heap memory parameter for GraphDB. All entity pool structures now reside off-heap, i.e., outside of the normal JVM heap.
This means, however, that you need to leave some memory outside of the Xmx parameter.
To activate the old behaviour, you can still enable on-heap allocation with -Dgraphdb.epool.onheap=true.
If you are concerned that the process will consume an unlimited amount of memory, you can specify a maximum size with -XX:MaxDirectMemorySize, which defaults to the Xmx parameter (at least in OpenJDK and Oracle JDK).
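To see how these options combine on a command line, the sketch below assembles them into a list of JVM arguments. The function name and its parameters are hypothetical, and the size values are placeholders; only the three option names come from the text above.

```python
def jvm_memory_options(heap, max_direct=None, epool_on_heap=False):
    """Build the memory-related JVM arguments discussed above.

    heap and max_direct are size strings such as '11g'.
    """
    options = [f"-Xmx{heap}"]
    if max_direct is not None:
        # Caps off-heap (direct) memory, which otherwise defaults to the Xmx value.
        options.append(f"-XX:MaxDirectMemorySize={max_direct}")
    if epool_on_heap:
        # Restores the pre-7.1 behaviour of keeping the entity pool on the heap.
        options.append("-Dgraphdb.epool.onheap=true")
    return options


print(" ".join(jvm_memory_options("11g", max_direct="2g")))
# -Xmx11g -XX:MaxDirectMemorySize=2g
```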
Sample memory configuration
This is a sample configuration demonstrating how to correctly size a GraphDB server with a single repository. The loaded dataset is estimated at 500M RDF statements and 150M unique entities. As a rule of thumb, the average ratio of unique entities to the total number of statements in a standard dataset is 1:3.
Configuration parameter | Description | Example value |
---|---|---|
Total OS memory | Total physical system memory | 16 GB |
On heap JVM (-Xmx) configuration | Maximum heap memory allocated by the JVM process | 11 GB |
tuple-index-memory (“Tuple index memory”) | The number of cached pages; 1 million triples allocate between 90M and 120M of RAM, if in memory | 6 GB |
predicate-memory (“Predicate index memory” under “Use predicate indices”) | Use predicate indices if the number of unique predicates is more than a few tens | 1 GB |
cache-memory (“Total cache memory”) | Sum of tuple-index-memory and predicate-memory | 7 GB |
entity-index-size (“Entity index size”), stored off-heap by default | Size of the entity pool hashtable; the recommended value is equal to the total number of unique entities divided by 5 | 75000000 |
Memory footprint of the entity pool, stored off-heap by default | Calculated from entity-index-size and total number of entities; this memory will be taken after the repository initialisation | ~1.5 GB |
Remaining on-heap memory for query execution | Rough estimate of the memory for query execution; a higher value is required if many long-running analytical queries are expected | ~4 GB |
Remaining OS memory | Rough estimate of the memory left to the OS | ~3.5 GB |
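The derived rows of the table follow from the configured ones by simple subtraction. The sketch below reproduces that arithmetic using the example values from the table; the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Example values taken from the sample configuration table (in GB).
budget = {
    "total_os_gb": 16.0,
    "heap_gb": 11.0,
    "tuple_index_gb": 6.0,
    "predicate_gb": 1.0,
    "entity_pool_off_heap_gb": 1.5,
}


def derive_memory_rows(b):
    """Compute the derived rows: total cache, heap left for queries, memory left to the OS."""
    cache = b["tuple_index_gb"] + b["predicate_gb"]
    return {
        "cache_memory_gb": cache,
        "heap_left_for_queries_gb": b["heap_gb"] - cache,
        "os_memory_left_gb": b["total_os_gb"] - b["heap_gb"] - b["entity_pool_off_heap_gb"],
    }


for name, value in derive_memory_rows(budget).items():
    print(f"{name}: {value}")
```

With the table's values this gives 7 GB of cache, 4 GB of heap left for query execution, and 3.5 GB left to the operating system.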