Configuring GraphDB¶
GraphDB 9.x relies on several main directories for configuration, logging, and data.
Directories¶
GraphDB Home¶
The GraphDB home defines the root directory where GraphDB stores all of its data.
The home can be set through the system or config file property graphdb.home.
The default value for the GraphDB home directory depends on how you run GraphDB:
Running as a standalone server: the default is the same as the distribution directory.
All other types of installations: OS-dependent directory.
On Mac: ~/Library/Application Support/GraphDB.
On Windows: \Users\<username>\AppData\Roaming\GraphDB.
On Linux and other Unixes: ~/.graphdb.
Note
In the unlikely case of running GraphDB on an ancient Windows XP, the default directory is \Documents and Settings\<username>\Application Data\GraphDB.
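The OS-dependent defaults listed above can be summarized in a short sketch. It is an illustration only, not GraphDB's own code: the function name `default_graphdb_home` and its parameters are hypothetical, and the caller is assumed to pass in the user's home directory.

```python
from pathlib import PurePosixPath, PureWindowsPath


def default_graphdb_home(platform, user_home):
    """Return the documented default home for non-standalone installations.

    `platform` follows the naming of Python's sys.platform
    ("darwin", "win32", "linux", ...); `user_home` is the user's home directory.
    """
    if platform == "darwin":
        return PurePosixPath(user_home) / "Library" / "Application Support" / "GraphDB"
    if platform.startswith("win"):
        return PureWindowsPath(user_home) / "AppData" / "Roaming" / "GraphDB"
    # Linux and other Unixes
    return PurePosixPath(user_home) / ".graphdb"


print(default_graphdb_home("linux", "/home/alice"))
```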
GraphDB does not store any files directly in the home directory, but uses the following subdirectories for data or configuration:
Data directory¶
The GraphDB data directory defines where GraphDB stores repository data.
The data directory can be set through the system or config property graphdb.home.data.
The default value is the data subdirectory relative to the GraphDB home directory.
Config directory¶
The GraphDB config directory defines where GraphDB looks for user-definable configuration.
The config directory can be set through the system property graphdb.home.conf.
Note
It is not possible to set the config directory through a config property as the value needs to be set before the config properties are loaded.
The default value is the conf subdirectory relative to the GraphDB home directory.
Work directory¶
The GraphDB work directory defines where GraphDB stores non-user-definable configuration.
The work directory can be set through the system or config property graphdb.home.work.
The default value is the work subdirectory relative to the GraphDB home directory.
Logs directory¶
The GraphDB logs directory defines where GraphDB stores log files.
The logs directory can be set through the system or config property graphdb.home.logs.
The default value is the logs subdirectory relative to the GraphDB home directory.
Note
When running GraphDB as deployed .war files, the logs directory will be a subdirectory graphdb within Tomcat’s logs directory.
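The relationship between the home directory and its four subdirectories can be sketched as follows. This is a simplified model under stated assumptions, not GraphDB's implementation: `resolve_directories` is a hypothetical helper, it does not distinguish system properties from config properties (graphdb.home.conf can only be set as a system property), and it ignores the .war deployment case.

```python
from pathlib import PurePosixPath

# Subdirectory name -> property that overrides its location (names from this page).
OVERRIDES = {
    "data": "graphdb.home.data",
    "conf": "graphdb.home.conf",
    "work": "graphdb.home.work",
    "logs": "graphdb.home.logs",
}


def resolve_directories(properties):
    """Map each subdirectory to its override, or to <home>/<name> by default."""
    home = PurePosixPath(properties["graphdb.home"])
    return {
        name: PurePosixPath(properties.get(key, home / name))
        for name, key in OVERRIDES.items()
    }


dirs = resolve_directories({
    "graphdb.home": "/opt/graphdb-instance",
    "graphdb.home.logs": "/var/log/graphdb",  # hypothetical custom logs location
})
for name, path in dirs.items():
    print(name, path)
```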
Important
Even though GraphDB provides the means to specify separate custom directories for data, configuration and so on, it is recommended to specify the home directory only. This ensures that every piece of data, configuration, or logging, is within the specified location.
Step-by-step guide:
Choose a directory for GraphDB home, e.g., /opt/graphdb-instance.
Create the directory /opt/graphdb-instance.
(Optional) Copy the subdirectory conf from the distribution into /opt/graphdb-instance.
Start GraphDB with graphdb -Dgraphdb.home=/opt/graphdb-instance.
GraphDB creates the missing subdirectories data, conf (if you skipped that step), logs, and work.
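The steps above could be scripted roughly as follows. This is a minimal sketch, assuming the graphdb launcher is on the PATH; `prepare_home` is a hypothetical helper that prepares the directory and returns the start command instead of running it.

```python
import shutil
from pathlib import Path


def prepare_home(distribution_dir, home_dir):
    """Create the home directory, optionally copy conf, and return the start command."""
    distribution = Path(distribution_dir)
    home = Path(home_dir)
    home.mkdir(parents=True, exist_ok=True)

    # Optional step: copy the conf subdirectory from the distribution,
    # if it exists there and has not been copied already.
    source_conf = distribution / "conf"
    target_conf = home / "conf"
    if source_conf.is_dir() and not target_conf.exists():
        shutil.copytree(source_conf, target_conf)

    # Returned rather than executed; assumes "graphdb" is on the PATH.
    return ["graphdb", f"-Dgraphdb.home={home}"]
```

The returned list can be passed to `subprocess.run` once the paths have been checked.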
Checking the configured directories¶
When GraphDB starts, it logs the actual value for each of the above directories, e.g.:
GraphDB Home directory: /opt/test/graphdb-se-9.x.x
GraphDB Config directory: /opt/test/graphdb-se-9.x.x/conf
GraphDB Data directory: /opt/test/graphdb-se-9.x.x/data
GraphDB Work directory: /opt/test/graphdb-se-9.x.x/work
GraphDB Logs directory: /opt/test/graphdb-se-9.x.x/logs
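To check these values programmatically, the log lines can be parsed with a small script. This is a sketch that assumes the lines have the shape shown above, possibly preceded by a timestamp or log level; `configured_directories` is a hypothetical helper name.

```python
import re

# Matches lines such as "GraphDB Home directory: /opt/test/graphdb-se-9.x.x".
DIRECTORY_LINE = re.compile(r"GraphDB (Home|Config|Data|Work|Logs) directory: (.+)")


def configured_directories(log_text):
    """Return {"Home": path, "Config": path, ...} for the lines found in log_text."""
    found = {}
    for line in log_text.splitlines():
        match = DIRECTORY_LINE.search(line)
        if match:
            found[match.group(1)] = match.group(2).strip()
    return found
```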
Configuration¶
There is a single graphdb.properties config file for GraphDB. It is provided in the distribution under conf/graphdb.properties, where GraphDB loads it from.
This file contains a list of config properties defined in the following format: propertyName = propertyValue, i.e., using the standard Java properties file syntax.
Each config property can be overridden through a Java system property with the same name, provided in the environment variable GDB_JAVA_OPTS, or in the command line.
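The override rule can be illustrated with a simplified model. It is a sketch rather than GraphDB's actual loading code: the parser handles only plain `name = value` lines (a small subset of the Java properties syntax), and all function names are hypothetical.

```python
import shlex


def parse_properties(text):
    """Parse simple `name = value` lines, skipping blank lines and comments."""
    properties = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        name, _, value = line.partition("=")
        properties[name.strip()] = value.strip()
    return properties


def system_properties(java_opts):
    """Extract -Dname=value options from a GDB_JAVA_OPTS-style string."""
    overrides = {}
    for token in shlex.split(java_opts):
        if token.startswith("-D") and "=" in token:
            name, _, value = token[2:].partition("=")
            overrides[name] = value
    return overrides


def effective_properties(file_text, java_opts=""):
    """System properties take precedence over values from the config file."""
    merged = parse_properties(file_text)
    merged.update(system_properties(java_opts))
    return merged
```

For example, a file that sets graphdb.connector.port = 7200 combined with GDB_JAVA_OPTS containing -Dgraphdb.connector.port=7300 yields 7300 as the effective value in this model.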
Configuration properties¶
The properties are of four types and are detailed below.
General properties¶
The general properties define some basic configuration values that are shared with all GraphDB components and types of installation:
| Property name | Description |
|---|---|
| graphdb.home | Defines the GraphDB home directory |
| graphdb.home.data | Defines the GraphDB data directory |
| graphdb.home.conf | (only as a system property) Defines the GraphDB conf directory |
| graphdb.home.work | Defines the GraphDB work directory |
| graphdb.home.logs | Defines the GraphDB logs directory |
| | If |
| | The place where the source for GraphDB Workbench is located |
| | Sets a custom path to the license file to use |
| | The amount of memory to be taken by the page cache |
| | The full path to the file where the GraphDB process ID is stored |
| | Tells GraphDB not to close |
| | GraphDB can dump the heap on out-of-memory errors in order to provide insight into the cause of excessive memory usage. This property enables or disables the heap dump. Default is |
| | File to write the heap dump to. The default is the heapdump.hprof file in the configured logs directory. See also the properties graphdb.home and graphdb.home.logs. |
URL properties¶
Hint
Jump ahead to Typical use cases for a list of examples that cover URL properties usage.
In certain cases, GraphDB needs to construct a URL that refers to itself:
The repository list, where each repository provides a link that can be used to access the repository via the REST API.
Setting up a cluster, where the system needs the repository URLs to attach the cluster nodes correctly.
When a master node instructs a worker node to provide its data for backup.
When GraphDB is accessed directly (without a reverse proxy), it will figure out the correct URLs based on the URL of incoming requests. For example, if GraphDB is accessed using the URL http://graphdb.example.com:7200/, it will construct URLs like http://graphdb.example.com:7200/repositories/repoId.
When GraphDB is accessed via a reverse proxy, the server will not see the actual URL used to access the server and thus it cannot determine a valid external URL on its own. There are two specific setups:
The external URL as seen via the proxy uses / as its root, for example, http://rdf.example.com/.
GraphDB will map the external / to its own / automatically; there is no need to add or change any configuration.
GraphDB will still not know how to construct external URLs, so setting graphdb.external-url is recommended even though it might appear to work without setting it.
The external URL as seen via the proxy uses /something as its root (i.e., something in addition to the /), for example, http://example.com/rdf.
GraphDB cannot map this automatically and needs to be configured using the property graphdb.vhosts or graphdb.external-url (see below).
This will instruct GraphDB that URLs beginning with http://example.com/rdf/ map to the root path / of the GraphDB server.
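The effect of the two setups on self-referring URLs can be modeled in a few lines. This is a rough sketch of the behavior described above, not GraphDB's implementation: `repository_url` is a hypothetical function, and it assumes that a configured graphdb.external-url simply replaces the base URL taken from the incoming request.

```python
def repository_url(request_base_url, repository_id, external_url=None):
    """Build a repository URL from the external URL if set, else from the request URL."""
    base = external_url if external_url else request_base_url
    return base.rstrip("/") + "/repositories/" + repository_id


# Direct access: the base URL of the incoming request is used.
print(repository_url("http://graphdb.example.com:7200/", "repoId"))

# Behind a proxy rooted at /rdf: the configured external URL is used instead.
print(repository_url("http://graphdb.example.com:7200/", "repoId",
                     external_url="http://example.com/rdf/"))
```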
The URL properties determine how GraphDB constructs URLs that refer to itself, as well as what URLs are recognized as URLs to access the GraphDB installation. GraphDB will try to auto-detect those values based on URLs used to access it, and the network configuration of the machine running GraphDB. In certain setups involving virtualization or a reverse proxy, it may be necessary to set one or more of the following properties:
| Property | Description |
|---|---|
| graphdb.vhosts | A comma-delimited list of virtual host URLs that can be used to access GraphDB. Setting this property is necessary when GraphDB needs to be accessed behind a reverse proxy and the path of the external URL is different from |
| graphdb.external-url | Sets the canonical external URL. This property implies When a reverse proxy is in use and most users will access GraphDB through the proxy, it is recommended to set this property instead of, or in addition to Tip Prior to GraphDB 9.8, only the |
| graphdb.hostname | Overrides the hostname reported by the machine. |
Note
For remote locations, the URLs are always constructed using the base URL of the remote location as specified when the location was attached.
Typical use cases¶
GraphDB is behind a reverse proxy whose URL path is / and most clients will use the proxy URL.
This setup will appear to work out of the box without setting any of the URL properties, but it is recommended to set graphdb.external-url. Example URLs:
Internal URL: http://graphdb.example.com:7200/
External URL used by most clients: http://rdf.example.com/
The corresponding configuration is:
# Recommended even though it may appear to work without setting this property
graphdb.external-url = http://rdf.example.com/
GraphDB is behind a reverse proxy whose URL path is /something and most clients will use the proxy URL.
This configuration requires setting graphdb.external-url (recommended) or graphdb.vhosts to the correct URLs as seen externally through the proxy. Example URLs:
Internal URL: http://graphdb.example.com:7200/
External URL used by most clients: http://example.com/rdf/
The corresponding configuration is:
# Required and recommended
graphdb.external-url = http://example.com/rdf/
# Non-recommended alternative to the above
#graphdb.vhosts = http://example.com/rdf/
The GraphDB Workbench is used to set up a cluster and is accessed using a localhost URL.
The system will construct URLs using the hostname reported by the machine. This works well with consistent network configurations. When the configuration is inconsistent, for example the hostname is not resolvable from the other machines that need to join the cluster, you may need to set graphdb.hostname to the correct hostname value or avoid using localhost URLs.
Note that using localhost URLs is recommended only in limited scenarios, such as accessing GraphDB only from the machine where it is running.
Complex network configurations with GraphDB cluster
Some complex network configurations involve a reverse proxy used to access the master node of the GraphDB cluster but the communication between the cluster nodes does not use the proxy. In such cases, you may need to set more than one of the URL properties to match the specific needs.
This is also almost always the case with Docker, Kubernetes, or other virtualization or network isolation methods involving setting up a GraphDB cluster.
Let’s take the following example:
Master node is accessible at http://master.example.com:7200/ via a direct connection.
Worker node 1 is accessible at http://worker1.example.com:7200/ via a direct connection.
Worker node 2 is accessible at http://worker2.example.com:7200/ via a direct connection.
Cluster-internal communication needs to use the above addresses.
GraphDB users have no direct access to the master URL http://master.example.com:7200/ and instead must use a URL through the reverse proxy, for example http://example.com/graphdb/.
Matching configuration (on the master node):
# Configures access through the proxy and lists the internal URL explicitly
# as we need to use that URL as the value of graphdb.external-url below
graphdb.vhosts = http://example.com/graphdb/, http://master.example.com:7200/
# Sets the internal URL of the master as the canonical external URL so that
# cluster management and cluster backup will use the correct URLs
graphdb.external-url = http://master.example.com:7200/
No extra configuration is needed on the worker nodes.
Network properties¶
The network properties control how the standalone application listens on a network. These properties correspond to the attributes of the embedded Tomcat Connector. For more information, see Tomcat’s documentation.
Each property is composed of the prefix graphdb.connector. + the relevant Tomcat Connector attribute.
The most important property is graphdb.connector.port, which defines the port to be used. The default is 7200.
In addition, the sample config file provides an example for setting up SSL.
Note
The graphdb.connector.<xxx> properties are only relevant when running GraphDB as a standalone application.
Engine properties¶
You can configure the GraphDB Engine through a set of properties composed of the prefix graphdb.engine. + the relevant engine property. These properties correspond to the properties that can be set when creating a repository through the Workbench or through a .ttl file.
Note
The properties defined in the config override the properties for each repository, regardless of whether you created the repository before or after setting the global value of an engine property. As such, the global override should be used only in specific cases. For normal everyday needs, set the corresponding properties when you create a repository.
| Property name | Description | Default value |
|---|---|---|
| graphdb.engine.entity-pool-implementation | Defines the Entity Pool implementation for the whole installation. Possible values are | The default value is |
| | Since GraphDB 8.6.1, inferencers for our Parallel loader are shut down at the end of each transaction to minimize GraphDB’s memory footprint. When many small insertions are done in quick succession, this can be a problem, as inferencer initialization can be fairly slow. This setting reverts to the old behavior where inferencers are only shut down when the repository is released. | |
| | A global setting that ensures IRI validation in the entity pool. It is performed only when an IRI is seen for the first time (i.e., when being created in the entity pool). For consistency reasons, not only IRIs coming from RDF serializations, but also all new IRIs (via API or SPARQL), will be validated in the same way. This property can be turned off by setting its value to | |
Note
IRI validation makes the import of broken data more problematic: in such a case, you would have to change a config property and restart your GraphDB instance instead of changing the setting per import.
Configuring logging¶
GraphDB uses logback to configure logging. The default configuration is provided as logback.xml in the GraphDB conf directory.
Jolokia security policy¶
The GraphDB Jolokia security policy is provided as the jolokia-access.xml file in the GraphDB conf directory. Open it to see the default restrictions.
Overriding of the default settings is done as follows:
If graphdb.home.conf is not explicitly set, you can configure conf/jolokia-access.xml if necessary.
If graphdb.home.conf is explicitly set, but the jolokia-access.xml file is not placed in the respective directory, the default config will load.
If graphdb.home.conf is explicitly set, and jolokia-access.xml is placed in the respective directory, this file will load.
See more about the Jolokia security.