GraphDB Plugin API¶
What is the GraphDB Plugin API¶
The GraphDB Plugin API is a framework and a set of public classes and interfaces that allow developers to extend GraphDB in many useful ways. These extensions are bundled into plugins, which GraphDB discovers during its initialization phase and then uses to delegate parts of its query or update processing tasks. The plugins are given low-level access to the GraphDB repository data, which enables them to do their job efficiently. They are discovered via the Java service discovery mechanism, which enables dynamic addition/removal of plugins from the system without having to recompile GraphDB or change any configuration files.
Description of a GraphDB plugin¶
A GraphDB plugin is a Java class that implements the com.ontotext.trree.sdk.Plugin interface. All public classes and interfaces of the Plugin API are located in this Java package, i.e., com.ontotext.trree.sdk. Here is what the Plugin interface looks like in an abbreviated form:
/**
* The base interface for a GraphDB plugin. As a minimum a plugin must implement this interface.
* <p>
* Plugins also need to be listed in META-INF/services/com.ontotext.trree.sdk.Plugin so that Java's services
* mechanism may discover them automatically.
*/
public interface Plugin extends Service {
/**
* A method used by the plugin framework to configure each plugin's file system directory. This
* directory should be used by the plugin to store its files
*
* @param dataDir file system directory to be used for plugin related files
*/
void setDataDir(File dataDir);
/**
* A method used by the plugin framework to provide plugins with a {@link Logger} object
*
* @param logger {@link Logger} object to be used for logging
*/
void setLogger(Logger logger);
/**
* Plugin initialization method called once when the repository is being initialized, after the plugin has been
* configured and before it is actually used. It enables plugins to execute whatever
* initialization routines they consider appropriate, load resources, open connections, etc., based on the
* specific reason for initialization, e.g., backup.
* <p>
* The provided {@link PluginConnection} instance may be used to create entities needed by the plugin.
*
* @param reason the reason for initialization
* @param pluginConnection an instance of {@link PluginConnection}
*/
void initialize(InitReason reason, PluginConnection pluginConnection);
/**
* Sets a new plugin fingerprint.
* Every plugin should maintain a fingerprint of its data that could be used by GraphDB to determine if the
* data has changed or not. Initially, on system initialization the plugins are injected their
* fingerprints as they reported them before the last system shutdown
*
* @param fingerprint the last known plugin fingerprint
*/
void setFingerprint(long fingerprint);
/**
* Returns the fingerprint of the plugin.
* <p>
* Every plugin should maintain a fingerprint of its data that could be used by GraphDB to determine if the
* data has changed or not. The plugin fingerprint will become part of the repository fingerprint.
*
* @return the current plugin fingerprint based on its data
*/
long getFingerprint();
/**
* Plugin shutdown method that is called when the repository is being shutdown. It enables plugins to execute whatever
* finalization routines they consider appropriate, free resources, buffered streams, etc., based on the
* specific reason for shutdown.
*
* @param reason the reason for shutdown
*/
void shutdown(ShutdownReason reason);
}
As it derives from the Service interface, the plugin is automatically discovered at run-time, provided that the following conditions also hold:
The plugin class is located in the classpath.
It is mentioned in a META-INF/services/com.ontotext.trree.sdk.Plugin file in the classpath or in a .jar that is in the classpath. The fully qualified class name has to be written on a separate line in such a file.
The only method introduced by the Service interface is getName(), which provides the plugin’s (service’s) name. This name must be unique within a particular GraphDB repository, and serves as a plugin identifier that can be used at any time to retrieve a reference to the plugin instance.
/**
* Interface implemented by all run-time discoverable services (e.g. {@link Plugin} instances). Classes
* implementing this interface should furthermore be declared in the respective
* META-INF/services/<class.signature> file and will then be discoverable at run-time.
* <p>
* Plugins need not implement this interface directly but rather implement {@link Plugin}.
*/
public interface Service {
/**
* Gets the service name (serves as a key for discovering the service)
*
* @return service name
*/
String getName();
}
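For illustration, here is a minimal sketch of a plugin that satisfies both conditions. It extends the PluginBase helper class (described in List of plugin interfaces and classes) so that only getName() needs to be implemented; the package, class, and plugin names are hypothetical:
package com.example.plugins;

import com.ontotext.trree.sdk.PluginBase;

/**
 * A minimal do-nothing plugin. To be discovered it must also be listed in a classpath file named
 * META-INF/services/com.ontotext.trree.sdk.Plugin containing the single line:
 * com.example.plugins.HelloPlugin
 */
public class HelloPlugin extends PluginBase {
    @Override
    public String getName() {
        // Must be unique within the repository
        return "hello";
    }
}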
There are many more functions (interfaces) that a plugin could implement, but these are all optional and are declared in separate interfaces. Implementing any such complementary interface is the means to announce to the system what this particular plugin can do in addition to its mandatory plugin responsibilities. It is then automatically used as appropriate. See List of plugin interfaces and classes.
The life cycle of a plugin¶
A plugin’s life cycle consists of several phases:
Discovery¶
This phase is executed at repository initialization.
GraphDB searches the classpath for all plugin services registered in the META-INF/services/com.ontotext.trree.sdk.Plugin service registry files, and constructs a single instance of each plugin found.
Configuration¶
Every plugin instance discovered and constructed
during the previous phase is then configured. During this phase,
plugins are injected with a Logger
object, which they use for
logging (setLogger(Logger logger)
), and the path to their own
data directory (setDataDir(File dataDir)
), which they create, if
needed, and then use to store their data. If a plugin does not need
to store anything to the disk, it can skip the creation of its data
directory. However, if it needs to use it, it is guaranteed that this
directory will be unique and available only to the particular plugin
that it was assigned to.
This phase is also called when a plugin is enabled after repository initialization.
Initialization¶
After a plugin has been configured, the framework calls its initialize(InitReason reason, PluginConnection pluginConnection) method so that it gets the chance to do whatever initialization work it needs. The passed instance of PluginConnection provides access to various other structures and interfaces, such as Statements and Entities instances (see Repository internals), and a SystemProperties instance, which gives the plugins access to the system-wide configuration options and settings. Plugins typically use this phase to create the IRIs that will be used to communicate with the plugin.
This phase is also called when a plugin is enabled after repository initialization.
Request processing¶
The plugin participates in the request processing. The request phase applies to the evaluation of SPARQL queries, getStatements() calls, the transaction stages, and the execution of SPARQL updates. Various event notifications can also be part of this phase.
This phase is optional for plugins, but no plugin is useful without implementing at least one of its interfaces.
Request processing can be divided roughly into query processing and update processing.
Query processing¶
Query processing includes several sub-phases that can be used on their own or combined together:
- Pre-processing
Plugins are given the chance to modify the request before it is processed. In this phase, they can also initialize a context object that remains visible until the end of the request processing (Pre-processing).
- Pattern interpretation
Plugins can choose to provide results for requested statement patterns (Pattern interpretation). This sub-phase applies only to queries.
- Post-processing
Before the request results are returned to the client, plugins are given a chance to modify them, filter them out, or even insert new results (Post-processing).
Update processing¶
Update processing includes several layers of processing:
- Transaction events
Plugins are notified about the beginning and end of a transaction.
- Update handling
Plugins can choose to handle certain updates (additions or removals) instead of letting the repository handle the updates as regular data.
- Entities and statements notifications
Plugins can be notified about the creation of entities, the addition and removal of statements.
Repository internals¶
The repository internals are accessed via an instance of PluginConnection:
/**
* The {@link PluginConnection} interface provides access to various objects that can be used to query data
* or get the properties of the current transaction. An instance of {@link PluginConnection} will be passed to almost
* all methods that a plugin may implement.
*/
public interface PluginConnection {
/**
* Returns an instance of {@link Entities} that can be used to retrieve or create RDF entities.
*
* @return an {@link Entities} instance
*/
Entities getEntities();
/**
* Returns an instance of {@link Statements} that can be used to retrieve RDF statements.
*
* @return a {@link Statements} instance
*/
Statements getStatements();
/**
* Returns the transaction ID of the current transaction or 0 if no explicit transaction is available.
*
* @return the transaction ID
*/
long getTransactionId();
/**
* Returns the update testing status. In a multi-node GraphDB configuration (currently only GraphDB EE) an update
* will be sent to multiple nodes. The first node that receives the update will be used to test if the update is
* successful and only if so, it will be sent to other nodes. Plugins may use the update test status to perform
* certain operations only when the update is tested (e.g. indexing data via an external service). The method will
* return true if this is a GraphDB EE worker node testing the update or this is GraphDB Free or SE. The method will
* return false only if this is a GraphDB EE worker node that is receiving a copy of the original update
* (after successful testing on another node).
*
* @return true if this update is sent for the first time (testing the update), false otherwise
*/
boolean isTesting();
/**
* Returns an instance of {@link SystemProperties} that can be used to retrieve various properties that identify
* the current GraphDB installation and repository.
*
* @return an instance of {@link SystemProperties}
*/
SystemProperties getProperties();
/**
* Returns the repository fingerprint. Note that during an active transaction the fingerprint will be updated
* at the very end of the transaction. Call it in {@link com.ontotext.trree.sdk.PluginTransactionListener#transactionCompleted(PluginConnection)}
* if you want to get the updated fingerprint for the just-completed transaction.
*
* @return the repository fingerprint
*/
String getFingerprint();
/**
* Returns whether the current GraphDB instance is part of a cluster. This is useful in cases where a plugin may modify
* the fingerprint via a query. To protect cluster integrity, the fingerprint can be changed only via an update.
*
* @return true if the current instance is in cluster group, false otherwise
*/
boolean isInCluster();
/**
* Creates a thread-safe instance of this {@link PluginConnection} that can be used by other threads.
* Note that every {@link ThreadsafePluginConnection} must be explicitly closed when no longer needed.
*
* @return an instance of {@link ThreadsafePluginConnection}
*/
ThreadsafePluginConnection getThreadsafeConnection();
}
PluginConnection instances passed to the plugin are not thread-safe and are not guaranteed to operate normally once the called method returns. If the plugin needs to process data asynchronously in another thread, it must get an instance of ThreadsafePluginConnection via PluginConnection.getThreadsafeConnection(). Once the allocated thread-safe connection is no longer needed, it should be closed.
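A minimal sketch of this pattern, inside a plugin method that has received pluginConnection and assuming ThreadsafePluginConnection exposes a close() method (implied by the requirement to close it explicitly):
// Obtain a connection that remains usable after the current plugin method returns
ThreadsafePluginConnection safeConnection = pluginConnection.getThreadsafeConnection();
new Thread(() -> {
    try {
        // ... asynchronous work using safeConnection.getStatements(), safeConnection.getEntities(), etc.
    } finally {
        // Every thread-safe connection must be explicitly closed when no longer needed
        safeConnection.close();
    }
}).start();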
PluginConnection provides access to various other interfaces that access the repository’s data (Statements and Entities), the current transaction’s properties, the repository fingerprint, and various system and repository properties (SystemProperties).
Statements and Entities¶
In order to enable efficient request processing, plugins are given low-level access to the repository data and internals. This is done through the Statements and Entities interfaces.
The Entities interface represents a set of RDF objects (IRIs, blank nodes, literals, and RDF-star embedded triples). All such objects are termed entities and are given unique long identifiers. The Entities instance is responsible for resolving these objects from their identifiers and, inversely, for looking up the identifier of a given entity. Most plugins process entities using their identifiers, because dealing with integer identifiers is a lot more efficient than working with the actual RDF entities they represent. The Entities interface is the single entry point available to plugins for entity management. It supports the addition of new entities, look-up of entity type and properties, resolving entities, etc.
It is possible to declare two RDF objects to be equivalent in a GraphDB repository, e.g., by using the owl:sameAs optimization. In order to provide a way to use such declarations, the Entities interface assigns a class identifier to each entity. For newly created entities, this class identifier is the same as the entity identifier. When two entities are declared equivalent, one of them adopts the class identifier of the other, and thus they become members of the same equivalence class. The Entities interface exposes the entity class identifier so that plugins can determine which entities are equivalent.
Entities within an Entities instance have a certain scope. There are three entity scopes:
Default – entities are persisted on the disk and can be used in statements that are also physically stored on disk. They have positive (non-zero) identifiers, and are often referred to as physical or data entities.
System – system entities have negative identifiers and are not persisted on the disk. They can be used, for example, for system (or magic) predicates that provide configuration to a plugin or request something to be handled by a plugin. They are available throughout the whole repository lifetime, but after a restart they have to be recreated.
Request – entities are not persisted on disk and have negative identifiers. They only live in the scope of a particular request and are not visible to other concurrent requests. These entities disappear immediately after the request processing finishes. The request scope is useful for temporary entities, such as those returned by a plugin as a response to a particular query.
The Statements interface represents a set of RDF statements, where ‘statement’ means a quadruple of subject, predicate, object, and context RDF entity identifiers. Statements can be searched for but not modified.
Consuming or returning statements¶
An important abstract class related to GraphDB internals is StatementIterator. It has a boolean next() method, which attempts to scroll the iterator onto the next available statement and returns true only if it succeeds. In case of success, its subject, predicate, object, and context fields are initialized with the respective components of the next statement. Furthermore, some properties of each statement are available via the following methods:
boolean isReadOnly() – returns true if the statement is in the Axioms part of the rule-file or is imported at initialization;
boolean isExplicit() – returns true if the statement is explicitly asserted;
boolean isImplicit() – returns true if the statement is produced by the inferencer (raw statements can be both explicit and implicit).
Here is a brief example that puts Statements, Entities, and StatementIterator together in order to output all literals that are related to a given URI:
// resolve the URI identifier
long id = entities.resolve(SimpleValueFactory.getInstance().createIRI("http://example/uri"));
// retrieve all statements with this identifier in subject position
StatementIterator iter = statements.get(id, 0, 0, 0);
while (iter.next()) {
// only process literal objects
if (entities.getType(iter.object) == Entities.Type.LITERAL) {
// resolve the literal and print out its value
Value literal = entities.get(iter.object);
System.out.println(literal.stringValue());
}
}
StatementIterator is also used to return statements via one of the pattern interpretation interfaces.
Each GraphDB transaction has several properties accessible via PluginConnection:
- Transaction ID (PluginConnection.getTransactionId())
An integer value. Bigger values indicate newer transactions.
- Testing (PluginConnection.isTesting())
A boolean value indicating the testing status of the transaction. In GraphDB EE, the testing transaction is the first execution of a given transaction, which determines whether the transaction can be executed successfully before it is propagated to the entire cluster. Despite the testing name, it is a full-featured transaction that will modify the data. In GraphDB Free and SE, a transaction is always executed only once, so it is always a testing transaction there.
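For example, a plugin that pushes data to an external service could guard that call with isTesting() so that the side effect happens only once per transaction; the externalIndexService call below is a hypothetical placeholder:
// Inside any plugin method that receives a PluginConnection during an update
if (pluginConnection.isTesting()) {
    // Runs on the node that tests the update (and always on GraphDB Free/SE),
    // so the external side effect is performed exactly once per transaction.
    externalIndexService.index(subject, predicate, object); // hypothetical external call
}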
System properties¶
PluginConnection provides access to various static repository and system properties via getProperties(). The values of these properties are set at repository initialization time and will not change while the repository is operating.
The getProperties() method returns an instance of SystemProperties:
/**
* This interface represents various properties for the running GraphDB instance and the repository as seen by the Plugin API.
*/
public interface SystemProperties {
/**
* Returns the read-only status of the current repository.
*
* @return true if read-only, false otherwise
*/
boolean isReadOnly();
/**
* Returns the number of bits needed to represent an entity id
*
* @return the number of bits as an integer
*/
int getEntityIdSize();
/**
* Returns the type of the current repository.
*
* @return one of {@link RepositoryType#FREE}, {@link RepositoryType#SE} or {@link RepositoryType#EE}
*/
RepositoryType getRepositoryType();
/**
* Returns the full GraphDB version string.
*
* @return a string describing the GraphDB version
*/
String getVersion();
/**
* Returns the GraphDB major version component.
*
* @return the major version as an integer
*/
int getVersionMajor();
/**
* Returns the GraphDB minor version component.
*
* @return the minor version as an integer
*/
int getVersionMinor();
/**
* Returns the GraphDB patch version component.
*
* @return the patch version as an integer
*/
int getVersionPatch();
/**
* Returns the number of cores in the currently set license up to the physical number of cores on the machine.
*
* @return the number of cores as an integer
*/
int getNumberOfLicensedCores();
/**
* The possible editions for GraphDB repositories.
*/
enum RepositoryType {
/**
* GraphDB Free repository
*/
FREE,
/**
* GraphDB SE repository
*/
SE,
/**
* GraphDB EE worker repository
*/
EE
}
}
Repository properties¶
There are some dynamic repository properties that may change once a repository has been initialized. These properties are:
- Repository fingerprint (PluginConnection.getFingerprint())
The repository fingerprint. Note that the fingerprint is updated at the very end of a transaction, so the fingerprint for a just-completed transaction should be accessed within PluginTransactionListener.transactionCompleted().
- Whether the repository is part of a cluster (PluginConnection.isInCluster())
When the current GraphDB instance is part of a cluster, the method returns true and the plugin may use this to refuse to perform actions that may cause the fingerprint to change outside of a transaction (e.g., via a query). For standalone instances, i.e., GraphDB Free and SE, the method always returns false.
Query processing¶
As already mentioned, a plugin’s interaction with each of the request-processing phases is optional. The plugin declares if it plans to participate in any phase by implementing the appropriate interface.
Pre-processing¶
A plugin that will be participating in request pre-processing must implement the Preprocessor interface. It looks like this:
/**
* Interface that should be implemented by all plugins that need to maintain per-query context.
*/
public interface Preprocessor {
/**
* Pre-processing method called once for every SPARQL query or getStatements() request before it is
* processed.
*
* @param request request object
* @return context object that will be passed to all other plugin methods in the future stages of the
* request processing
*/
RequestContext preprocess(Request request);
}
The preprocess(Request request) method receives the request object and returns a RequestContext instance. The passed request parameter is an instance of one of the interfaces extending Request, depending on the type of the request (QueryRequest for a SPARQL query or StatementRequest for “get statements”). The plugin changes the request object accordingly, initializes its context object, and returns it; this context object is passed back to the plugin in every other method during the request processing phase. The returned request context may be null, but regardless of whether it is, it is only visible to the plugin that initialized it. It can be used to store data visible for (and only for) this whole request, e.g., to pass data related to two different statement patterns recognized by the plugin. The request context gives further request processing phases access to the Request object reference. Plugins that opt to skip this phase do not have a request context, and are not able to get access to the original Request object.
Plugins may create their own RequestContext implementation or use the default one, RequestContextImpl.
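As a minimal sketch, a Preprocessor could record whether the request is a SPARQL query, assuming RequestContextImpl’s setRequest() and key-value setAttribute() methods (see Query request support classes); the plugin and attribute names are hypothetical:
import com.ontotext.trree.sdk.PluginBase;
import com.ontotext.trree.sdk.Preprocessor;
import com.ontotext.trree.sdk.QueryRequest;
import com.ontotext.trree.sdk.Request;
import com.ontotext.trree.sdk.RequestContext;
import com.ontotext.trree.sdk.RequestContextImpl;

public class MyPreprocessingPlugin extends PluginBase implements Preprocessor {
    @Override
    public String getName() {
        return "myPreprocessing";
    }

    @Override
    public RequestContext preprocess(Request request) {
        // Keep per-request data in the default RequestContext implementation
        RequestContextImpl context = new RequestContextImpl();
        context.setRequest(request);
        // Store an arbitrary value by key (assumed key-value API)
        context.setAttribute("isSparqlQuery", request instanceof QueryRequest);
        return context;
    }
}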
Pattern interpretation¶
This is one of the most important phases in the life cycle of a plugin. In fact, most plugins need to participate in exactly this phase. This is the point where the statement patterns of a request are evaluated and statement results are returned.
For example, consider the following SPARQL query:
SELECT * WHERE {
?s <http://example.com/predicate> ?o
}
There is just one statement pattern inside this query: ?s <http://example.com/predicate> ?o. All plugins that have implemented the PatternInterpreter interface (thus declaring that they intend to participate in the pattern interpretation phase) are asked if they can interpret this pattern. The first one to accept it and return results will be used. If no plugin interprets the pattern, it will be evaluated against the repository’s physical statements, i.e., the ones persisted on disk.
Here is the PatternInterpreter interface:
/**
* Interface implemented by plugins that want to interpret basic triple patterns
*/
public interface PatternInterpreter {
/**
* Estimate the number of results that could be returned by the plugin for the given parameters
*
* @param subject subject ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param predicate predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param object object ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param context context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param pluginConnection an instance of {@link PluginConnection}
* @param requestContext context object as returned by {@code Preprocessor.preprocess()} or null
* @return approximate number of results that could potentially be returned for these parameters by the
* interpret() method
*/
double estimate(long subject, long predicate, long object, long context, PluginConnection pluginConnection,
RequestContext requestContext);
/**
* Interpret basic triple pattern and return {@link StatementIterator} with results
*
* @param subject subject ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param predicate predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param object object ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param context context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param pluginConnection an instance of {@link PluginConnection}
* @param requestContext context object as returned by {@code Preprocessor.preprocess()} or null
* @return statement iterator of results
*/
StatementIterator interpret(long subject, long predicate, long object, long context,
PluginConnection pluginConnection, RequestContext requestContext);
}
The estimate() and interpret() methods take the same arguments and are used in the following way:
Given a statement pattern (e.g., the one in the SPARQL query above), all plugins that implement PatternInterpreter are asked to interpret() the pattern. The subject, predicate, object and context values are either the identifiers of the values in the pattern or 0, if any of them is an unbound variable. The statements and entities objects represent respectively the statements and entities that are available for this particular request. For instance, if the query contains any FROM <http://some/graph> clauses, the statements object will only provide access to the statements in the defined named graphs. Similarly, the entities object contains entities that might be valid only for this particular request. The plugin’s interpret() method must return a StatementIterator if it intends to interpret this pattern, or null if it refuses.
In case the plugin signals that it will interpret the given pattern (returns a non-null value), GraphDB’s query optimizer will call the plugin’s estimate() method, in order to get an estimate on how many results will be returned by the StatementIterator returned by interpret(). This estimate does not need to be precise, but the more precise it is, the more likely the optimizer will make an efficient optimization. There is a slight difference in the values that will be passed to estimate(). The statement components (e.g., subject) might not only be entity identifiers, but they can also be set to two special values:
Entities.BOUND – the pattern component is said to be bound, but its particular binding is not yet known;
Entities.UNBOUND – the pattern component will not be bound.
These values must be treated as hints to the estimate() method to provide a better approximation of the result set size, although its precise value cannot be determined before the query is actually run.
After the query has been optimized, the interpret() method of the plugin might be called again should any variable become bound due to the pattern reordering applied by the optimizer. Plugins must be prepared to expect different combinations of bound and unbound statement pattern components, and return appropriate iterators.
The requestContext parameter is the value returned by the preprocess() method if one exists, or null otherwise.
Results are returned as statements.
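As an illustration, here is a hedged sketch of a PatternInterpreter fragment that recognizes a single system predicate (whose ID, predicateId, was stored during initialize()) and returns one constant literal per matching pattern; the StatementIterator.create(...) factory method and the REQUEST entity scope are assumed here:
@Override
public StatementIterator interpret(long subject, long predicate, long object, long context,
        PluginConnection pluginConnection, RequestContext requestContext) {
    // Refuse patterns whose predicate is not the one registered by this plugin
    if (predicate != predicateId) {
        return null;
    }
    // Create a request-scoped literal to bind in the object position
    long literalId = pluginConnection.getEntities().put(
            SimpleValueFactory.getInstance().createLiteral("hello"), Entities.Scope.REQUEST);
    // Return a single statement as the result
    return StatementIterator.create(subject, predicate, literalId, 0);
}

@Override
public double estimate(long subject, long predicate, long object, long context,
        PluginConnection pluginConnection, RequestContext requestContext) {
    // Exactly one result is produced for a recognized pattern
    return 1;
}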
The plugin framework also supports the interpretation of an extended type of a list pattern.
Consider the following SPARQL queries:
SELECT * WHERE {
?s <http://example.com/predicate> (?o1 ?o2)
}
SELECT * WHERE {
(?s1 ?s2) <http://example.com/predicate> ?o
}
Internally, the object or subject list will be converted to a series of triples conforming to rdf:List. These triples can be handled with PatternInterpreter, but the whole list semantics will have to be implemented by the plugin. In order to make this task easier, the Plugin API defines two additional interfaces very similar to the PatternInterpreter interface – ListPatternInterpreter and SubjectListPatternInterpreter.
ListPatternInterpreter handles lists in the object position:
/**
* Interface implemented by plugins that want to interpret list-like triple patterns
*/
public interface ListPatternInterpreter {
/**
* Estimate the number of results that could be returned by the plugin for the given parameters
*
* @param subject subject ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param predicate predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param objects object IDs (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param context context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param pluginConnection an instance of {@link PluginConnection}
* @param requestContext context object as returned by {@code Preprocessor.preprocess()} or null
* @return approximate number of results that could potentially be returned for these parameters by the
* interpret() method
*/
double estimate(long subject, long predicate, long[] objects, long context, PluginConnection pluginConnection,
RequestContext requestContext);
/**
* Interpret list-like triple pattern and return {@link StatementIterator} with results
*
* @param subject subject ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param predicate predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param objects object IDs (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param context context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param pluginConnection an instance of {@link PluginConnection}
* @param requestContext context object as returned by {@code Preprocessor.preprocess()} or null
* @return statement iterator of results
*/
StatementIterator interpret(long subject, long predicate, long[] objects, long context,
PluginConnection pluginConnection, RequestContext requestContext);
}
It differs from PatternInterpreter by having multiple objects, passed as an array of long, instead of a single long object. The semantics of both methods is equivalent to the one in the basic pattern interpretation case.
SubjectListPatternInterpreter handles lists in the subject position:
/**
* Interface implemented by plugins that want to interpret list-like triple patterns
*/
public interface SubjectListPatternInterpreter {
/**
* Estimate the number of results that could be returned by the plugin for the given parameters
*
* @param subjects subject IDs (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param predicate predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param object object ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param context context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param pluginConnection an instance of {@link PluginConnection}
* @param requestContext context object as returned by {@code Preprocessor.preprocess()} or null
* @return approximate number of results that could potentially be returned for these parameters by the
* interpret() method
*/
double estimate(long[] subjects, long predicate, long object, long context, PluginConnection pluginConnection,
RequestContext requestContext);
/**
* Interpret list-like triple pattern and return {@link StatementIterator} with results
*
* @param subjects subject IDs (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param predicate predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param object object ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param context context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
* @param pluginConnection an instance of {@link PluginConnection}
* @param requestContext context object as returned by {@code Preprocessor.preprocess()} or null
* @return statement iterator of results
*/
StatementIterator interpret(long[] subjects, long predicate, long object, long context,
PluginConnection pluginConnection, RequestContext requestContext);
}
It differs from PatternInterpreter by having multiple subjects, passed as an array of long, instead of a single long subject. The semantics of both methods is equivalent to the one in the basic pattern interpretation case.
Post-processing¶
There are cases when a plugin would like to modify or otherwise filter the final results of a request. This is where the Postprocessor interface comes into play:
/**
* Interface that should be implemented by plugins that need to post-process results from queries.
*/
public interface Postprocessor {
/**
* A query method that is used by the framework to determine if a {@link Postprocessor} plugin really wants to
* post-process the request results.
*
* @param requestContext the request context reference
* @return boolean value
*/
boolean shouldPostprocess(RequestContext requestContext);
/**
* Method called for each {@link BindingSet} in the query result set. Each binding set is processed in
* sequence by all plugins that implement the {@link Postprocessor} interface, piping the result returned
* by each plugin into the next one. If any of the post-processing plugins returns null the result is
* deleted from the result set.
*
* @param bindingSet binding set object to be post-processed
* @param requestContext context objected as returned by {@link Preprocessor#preprocess(Request)} (in case this plugin
* implemented this interface)
* @return binding set object that should be post-processed further by next post-processing plugins or
* null if the current binding set should be deleted from the result set
*/
BindingSet postprocess(BindingSet bindingSet, RequestContext requestContext);
/**
* Method called after all post-processing has been finished for each plugin. This is the point where
* every plugin could introduce its results even if the original result set was empty
*
* @param requestContext context objected as returned by {@link Preprocessor#preprocess(Request)} (in case this plugin
* implemented this interface)
* @return iterator for resulting binding sets that need to be added to the final result set
*/
Iterator<BindingSet> flush(RequestContext requestContext);
}
The postprocess() method is called for each binding set that is to be returned to the repository client. This method may modify the binding set and return it, or alternatively return null, in which case the binding set is removed from the result set. After a binding set is processed by a plugin, the possibly modified binding set is passed to the next plugin having post-processing functionality enabled. After the binding set is processed by all plugins (in the case where no plugin deletes it), it is returned to the client. Finally, after all results are processed and returned, each plugin’s flush() method is called to introduce new binding set results in the result set. These in turn are finally returned to the client.
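A minimal sketch of a Postprocessor that simply drops binding sets rejected by a hypothetical shouldDrop() check and adds nothing at flush time:
import java.util.Collections;
import java.util.Iterator;

import org.eclipse.rdf4j.query.BindingSet;

import com.ontotext.trree.sdk.PluginBase;
import com.ontotext.trree.sdk.Postprocessor;
import com.ontotext.trree.sdk.RequestContext;

public class FilteringPlugin extends PluginBase implements Postprocessor {
    @Override
    public String getName() {
        return "filtering";
    }

    @Override
    public boolean shouldPostprocess(RequestContext requestContext) {
        // Post-process every request (could be limited to requests with a plugin-created context)
        return true;
    }

    @Override
    public BindingSet postprocess(BindingSet bindingSet, RequestContext requestContext) {
        // Returning null removes the binding set from the result set
        return shouldDrop(bindingSet) ? null : bindingSet;
    }

    @Override
    public Iterator<BindingSet> flush(RequestContext requestContext) {
        // No additional results to add
        return Collections.emptyIterator();
    }

    private boolean shouldDrop(BindingSet bindingSet) {
        // Hypothetical filtering criterion
        return false;
    }
}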
Update processing¶
Updates involving specific predicates¶
As well as query/read processing, plugins are able to process update operations for statement patterns containing specific predicates. In order to intercept updates, a plugin must implement the UpdateInterpreter interface. During initialization, getPredicatesToListenFor() is called once by the framework, so that the plugin can indicate which predicates it is interested in.
From then onwards, the plugin framework filters updates for statements using these predicates and notifies the plugin. The plugin may do whatever processing is required and must return a boolean value indicating whether the statement should be skipped. Skipped statements are not processed further by GraphDB, so the insert or delete will have no effect on the actual data in the repository.
/**
* An interface that should be implemented by the plugins that want to be notified for particular update
* events. The getPredicatesToListenFor() method should return the predicates of interest to the plugin. This
* method will be called once only immediately after the plugin has been initialized. After that point the
* plugin's interpretUpdate() method will be called for each inserted or deleted statement sharing one of the
* predicates of interest to the plugin (those returned by getPredicatesToListenFor()).
*/
public interface UpdateInterpreter {
/**
* Returns the predicates for which the plugin needs to get notified when a statement with such a predicate is added or removed.
*
* @return array of predicates as entity IDs
*/
long[] getPredicatesToListenFor();
/**
* Hook that is called whenever a statement containing one of the registered predicates
* (see {@link #getPredicatesToListenFor()} is added or removed.
*
* @param subject subject value of the updated statement
* @param predicate predicate value of the updated statement
* @param object object value of the updated statement
* @param context context value of the updated statement
* @param isAddition true if the statement was added, false if it was removed
* @param isExplicit true if the updated statement was explicit one
* @param pluginConnection an instance of {@link PluginConnection}
* @return true - when the statement was handled by the plugin only and should <i>NOT</i> be added to/removed from the repository,
* false - when the statement should be added to/removed from the repository
*/
boolean interpretUpdate(long subject, long predicate, long object, long context, boolean isAddition,
boolean isExplicit, PluginConnection pluginConnection);
}
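A hedged sketch of how a plugin might implement this interface for a single control predicate registered as a system entity during initialize(); the controlPredicateId field and the doSomething() call are hypothetical:
// Fragment of a plugin class extending PluginBase and implementing UpdateInterpreter

@Override
public long[] getPredicatesToListenFor() {
    // controlPredicateId was obtained in initialize() via
    // pluginConnection.getEntities().put(iri, Entities.Scope.SYSTEM)
    return new long[] {controlPredicateId};
}

@Override
public boolean interpretUpdate(long subject, long predicate, long object, long context, boolean isAddition,
        boolean isExplicit, PluginConnection pluginConnection) {
    // React to the control statement, e.g., reconfigure or reindex something
    doSomething(subject, object, isAddition); // hypothetical plugin-specific action
    // Returning true means the statement is handled by the plugin only and will
    // not be added to/removed from the repository data.
    return true;
}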
Removal of entire contexts¶
Statement deletion in GraphDB is specified as a quadruple (subject, predicate, object, context), where each position can be a concrete value or null. A null position matches all subjects, predicates, objects, or contexts, depending on where it was specified.
When at least one of the positions is non-null, the plugin framework fires individual events for each matching removed statement. When all positions are null (i.e., delete everything in the repository), the operation is optimized internally and individual events are not fired. This means that UpdateInterpreter and StatementListener will not be called.
ClearInterpreter is an interface that allows plugins to detect the removal of entire contexts or the removal of all data in the repository:
/**
* This interface can be implemented by plugins that want to be notified on clear()
* or remove() (all statements in any context).
*/
public interface ClearInterpreter {
/**
* Notification called before the statements are removed from the given context.
*
* @param context the ID of the context or 0 if all contexts
* @param pluginConnection an instance of {@link PluginConnection}
*/
void beforeClear(long context, PluginConnection pluginConnection);
/**
* Notification called after the statements have been removed from the given context.
*
* @param context the ID of the context or 0 if all contexts
* @param pluginConnection an instance of {@link PluginConnection}
*/
void afterClear(long context, PluginConnection pluginConnection);
}
Intercepting data for specific contexts¶
The Plugin API provides a way to intercept data inserted into or removed from a particular predefined context via the ContextUpdateHandler interface:
/**
* This interface provides a mechanism for plugins to handle updates to certain contexts.
* When a plugin requests handling of a context, all data for that context will forwarded to the plugin
* and not inserted into any GraphDB collections.
* <p>
* Note that unlike other plugin interfaces, {@link ContextUpdateHandler} does not use entity IDs but works directly
* with the RDF values. Data handled by this interface does not reach the entity pool and so no entity IDs are created.
*/
public interface ContextUpdateHandler {
/**
* Returns the contexts for which the plugin will handle the updates.
*
* @return array of {@link Resource}
*/
Resource[] getUpdateContexts();
/**
* Hook that handles updates for the configured contexts.
*
* @param subject subject value of the updated statement
* @param predicate predicate value of the updated statement
* @param object object value of the updated statement
* @param context context value of the updated statement (can be null when not an addition, then it means remove from all contexts)
* @param isAddition true if statement is being added, false if statement is being removed
* @param pluginConnection an instance of {@link PluginConnection}
*/
void handleContextUpdate(Resource subject, IRI predicate, Value object, Resource context, boolean isAddition,
PluginConnection pluginConnection);
}
This is similar to Updates involving specific predicates, with some important differences:
ContextUpdateHandler
Configured via a list of contexts specified as IRI objects.
Statements with these contexts are passed to the plugin as RDF Value objects and never enter any of the database collections.
The plugin is assumed to always handle the update.
UpdateInterpreter
Configured via a list of predicates specified as integer IDs.
Statements with these predicates are passed to the plugin as integer IDs after their RDF values are converted to integer IDs in the entity pool.
The plugin decides whether to handle the statement or pass it on to other plugins and eventually to the database.
This mechanism is especially useful for the creation of virtual contexts (graphs) whose data is stored within a plugin and never pollutes any of the database collections with unnecessary values. Unlike the rest of the Plugin API, this interface uses RDF values directly, bypassing the use of integer IDs.
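A minimal sketch of a ContextUpdateHandler that captures all data written to one virtual graph; the graph IRI and the in-memory virtualStore are illustrative:
// Fragment of a plugin class extending PluginBase and implementing ContextUpdateHandler

private static final IRI VIRTUAL_GRAPH =
        SimpleValueFactory.getInstance().createIRI("http://example.com/virtual-graph");

@Override
public Resource[] getUpdateContexts() {
    // All updates targeting this context are diverted to the plugin
    return new Resource[] {VIRTUAL_GRAPH};
}

@Override
public void handleContextUpdate(Resource subject, IRI predicate, Value object, Resource context,
        boolean isAddition, PluginConnection pluginConnection) {
    // The statement never reaches the GraphDB collections; store or discard it here
    if (isAddition) {
        virtualStore.add(subject, predicate, object); // hypothetical in-plugin storage
    } else {
        virtualStore.remove(subject, predicate, object);
    }
}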
Transactions¶
A plugin may need to participate in the transaction workflow, e.g., because it needs to update certain data structures so that they reflect the actual data in the repository. Without being part of the transaction, the plugin would not know when to persist or discard a given state.
Transactions can be easily tracked by implementing the PluginTransactionListener interface:
/**
* The {@link PluginTransactionListener} allows plugins to be notified about transactions (start, commit+completed or abort)
*/
public interface PluginTransactionListener {
/**
* Notifies the listener about the start of a transaction.
*
* @param pluginConnection an instance of {@link PluginConnection}
*/
void transactionStarted(PluginConnection pluginConnection);
/**
* Notifies the listener about the commit phase of a transaction. Plugins should use this event to perform their own
* commit work if needed or to abort the transaction if needed.
*
* @param pluginConnection an instance of {@link PluginConnection}
*/
void transactionCommit(PluginConnection pluginConnection);
/**
* Notifies the listener about the completion of a transaction. This will be the last event in a successful transaction.
* The plugin is not allowed to throw any exceptions here and if so they will be ignored. If a plugin needs to abort
* a transaction it should be done in {@link #transactionCommit(PluginConnection)}.
*
* @param pluginConnection an instance of {@link PluginConnection}
*/
void transactionCompleted(PluginConnection pluginConnection);
/**
* Notifies the listener about the abortion of a transaction. This will be the last event in an aborted transaction.
* <p>
* Plugins should revert any modifications caused by this transaction, including the fingerprint.
*
* @param pluginConnection an instance of {@link PluginConnection}
*/
void transactionAborted(PluginConnection pluginConnection);
/**
* Notifies the listener about a user abort request. A user abort request is a request by an end-user to abort the
* transaction. Unlike the other events this will be called asynchronously whenever the request is received.
* <p>
* Plugins may react and terminate any long-running computation or ignore the request. This is just a handy way
* to speed up abortion when a user requests it. For example, this event may be received asynchronously while
* the plugin is indexing data (in {@link #transactionCommit(PluginConnection)} running in the main thread).
* The plugin may notify itself that the indexing should stop. Regardless of the actions taken by the plugin
* the transaction may still be aborted and {@link #transactionAborted(PluginConnection)} will be called.
* All clean up of the abortion should be handled in {@link #transactionAborted(PluginConnection)}.
*
* @param pluginConnection an instance of {@link PluginConnection}
*/
default void transactionAbortedByUser(PluginConnection pluginConnection) {
}
}
Each transaction has a beginning, signalled by a call to transactionStarted(). Then the transaction can proceed in several ways:
Commit and completion: transactionCommit() is called, then transactionCompleted() is called.
Commit followed by abortion (typically because another plugin aborted the transaction in its own transactionCommit()): transactionCommit() is called, then transactionAborted() is called.
Abortion before entering commit: transactionAborted() is called.
Plugins should strive to do all heavy transaction work in transactionCommit(), in such a way that a call to transactionAborted() can revert the changes. Plugins may throw exceptions in transactionCommit() in order to abort the transaction, e.g., if some constraint was violated. Plugins should do no heavy processing in transactionCompleted() and are not allowed to throw exceptions there. Such exceptions will be logged and ignored, and the transaction will still go through normally.
transactionAbortedByUser() will be called asynchronously (e.g., while the plugin is executing transactionCommit() in the main update thread) when a user requests the transaction to be aborted. The plugin may use this to signal its other thread to abort processing at the earliest convenience, or simply ignore the request.
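A hedged sketch showing the recommended split of work between the stages; prepareChanges(), persistChanges(), and discardChanges() are hypothetical plugin-internal methods:
// Fragment of a plugin class extending PluginBase and implementing PluginTransactionListener

@Override
public void transactionStarted(PluginConnection pluginConnection) {
    // Nothing to do at the start for this plugin
}

@Override
public void transactionCommit(PluginConnection pluginConnection) {
    // Heavy work goes here; throwing an exception at this point aborts the transaction
    prepareChanges(pluginConnection);
}

@Override
public void transactionCompleted(PluginConnection pluginConnection) {
    // Keep this light and never throw; the transaction has already succeeded
    persistChanges();
}

@Override
public void transactionAborted(PluginConnection pluginConnection) {
    // Revert everything done in transactionCommit(), including fingerprint changes
    discardChanges();
}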
Exceptions¶
Plugins may throw exceptions on invalid input, constraint violations, or unexpected events (e.g., out of disk space). It is possible to throw such exceptions almost everywhere, with the notable exception of PluginTransactionListener.transactionCompleted().
A good practice is to construct an instance of PluginException or one of its subclasses:
ClientErrorException – for example, when the user provided invalid input.
ServerErrorException – for example, when an unexpected server error occurred, such as lack of disk permissions.
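For example, an UpdateInterpreter implementation might reject bad input like this; a String-message constructor is assumed and the isValidConfiguration() check is a hypothetical placeholder:
@Override
public boolean interpretUpdate(long subject, long predicate, long object, long context, boolean isAddition,
        boolean isExplicit, PluginConnection pluginConnection) {
    Value value = pluginConnection.getEntities().get(object);
    if (!isValidConfiguration(value)) { // hypothetical validation
        // Reported to the client as a user error
        throw new ClientErrorException("Invalid configuration value: " + value.stringValue());
    }
    return true;
}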
Accessing other plugins¶
Plugins can make use of the functionality of other plugins. For example, the Lucene-based full-text search plugin can make use of the rank values provided by the RDF Rank plugin to facilitate query result scoring and ordering. This is not a matter of re-using program code (e.g., in a .jar with common classes), but rather it is about re-using data. The mechanism to do this allows plugins to obtain references to other plugin objects by knowing their names. To achieve this, they only need to implement the PluginDependency interface:
/**
* Interface that should be implemented by plugins that depend on other plugins and want to be able to
* retrieve references to them at runtime.
*/
public interface PluginDependency {
/**
* Method used by the plugin framework to inject a {@link PluginLocator} instance in the plugin.
*
* @param locator a {@link PluginLocator} instance
*/
void setLocator(PluginLocator locator);
}
They are then injected with an instance of the PluginLocator interface (during the configuration phase), which does the actual plugin discovery for them:
/**
* Interface that supports obtaining of a plugin instance by plugin name. An object implementing this
* interface is injected into plugins that implement the {@link PluginDependency} interface.
*/
public interface PluginLocator {
/**
* Retrieves a {@link Plugin} instance by plugin name
*
* @param name name of the plugin
* @return a {@link Plugin} instance or null if a plugin with that name is not available
*/
Plugin locate(String name);
/**
* Retrieves a {@link RDFRankProvider} instance.
*
* @return a {@link RDFRankProvider} instance or null if no {@link RDFRankProvider} is available
*/
RDFRankProvider locateRDFRankProvider();
}
Having a reference to another plugin is all that is needed to call its methods directly and make use of its services.
An important interface related to accessing other plugins is the RDFRankProvider interface. The sole implementation is the RDF Rank plugin, but it can be easily replaced by another implementation. By having a dedicated interface, it is easy for plugins to get access to RDF ranks without relying on a specific implementation.
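A minimal sketch of a plugin fragment that depends on the RDF rank provider; the locator is stored during the configuration phase and resolved lazily because the provider may not be available:
// Fragment of a plugin class extending PluginBase and implementing PluginDependency

private PluginLocator pluginLocator;

@Override
public void setLocator(PluginLocator locator) {
    // Injected by the framework during the configuration phase
    this.pluginLocator = locator;
}

private RDFRankProvider rankProvider() {
    // Resolve lazily; may return null if no RDF rank provider is installed
    return pluginLocator.locateRDFRankProvider();
}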
List of plugin interfaces and classes¶
Basics¶
Plugin
The basic interface that defines a plugin.
PluginBase
A reference abstract implementation of Plugin that can serve as the base for implementing plugins.
There are a couple of extensions of the Plugin interface that add additional configuration or behavior to plugins:
ParallelPlugin
- Marks a plugin as aware of parallel processing. The plugin will be injected with an instance of PluginExecutorService via setExecutorService(PluginExecutorService executorService). PluginExecutorService is a simplified version of Java’s ExecutorService and provides an easy mechanism for plugins to schedule parallel tasks safely.
No open-source plugins use ParallelPlugin.
StatelessPlugin
- Marks a plugin as stateless. Stateless plugins do not contribute to the repository fingerprint and their fingerprint will not be queried. It is suitable for plugins that are unimportant for query results or update executions, e.g., plugins that are not typically used in the normal data flow.
Open-source plugins using StatelessPlugin:
On initialize() and shutdown(), plugins receive an enum value, InitReason and ShutdownReason respectively, describing the reason why the plugin is being initialized or shut down.
InitReason
DEFAULT: initialized as part of the repository initialization or because the plugin was enabled;
CREATED_BACKUP: initialized after a shutdown for backup;
RESTORED_FROM_BACKUP: initialized after a shutdown for restore.
ShutdownReason
DEFAULT: shut down as part of the repository shutdown or because the plugin was disabled;
CREATE_BACKUP: shut down before backup;
RESTORE_FROM_BACKUP: shut down before restore.
Plugins may use the reason to handle their own backup scenarios. In most cases this is unnecessary, since the plugin’s files will be backed up or restored together with the rest of the repository data.
Data structures¶
For more information, see Repository internals.
PluginConnection
The main entry point to the repository internals. Passed to almost all methods in the Plugin API interfaces.
ThreadsafePluginConnection
Thread-safe version of PluginConnection. Requested explicitly from PluginConnection and must be explicitly closed when no longer needed.
Open-source plugins using ThreadsafePluginConnection:
Entities
Provides access to the repository’s entities. Entities are mappings from integer IDs to RDF values (IRIs, blank nodes, literals, and RDF-star embedded triples).
Statements
Provides access to the repository’s statements. Results are returned as StatementIterator instances.
StatementIterator
Interface for returning statements. Used both by Statements to list repository data and by plugins to return data via Pattern interpretation.
SystemProperties
Provides access to static repository and system properties such as the GraphDB version and repository type.
All open-source plugins use the repository internals.
Query request handlers¶
For more information, see Query processing.
Pattern interpretation handlers¶
The pattern interpretation handlers take part in the evaluation of triple patterns. Each triple pattern will be sent to the plugins that implement the respective interface.
For more information, see Pattern interpretation.
PatternInterpreter
- Interprets a simple triple pattern, where the subject, predicate, object, and context are single values. This interface handles all triple patterns: subject predicate object context.
Open-source plugins using PatternInterpreter:
ListPatternInterpreter
- Interprets a triple pattern where the subject, predicate, and context are single values, while the object is a list of values. This interface handles triple patterns of this form: subject predicate (object1 object2 ...) context.
Open-source plugins using ListPatternInterpreter:
SubjectListPatternInterpreter
- Interprets a triple pattern where the predicate, object, and context are single values, while the subject is a list of values. This interface handles triple patterns of this form: (subject1 subject2 ...) predicate object context.
No open-source plugins use SubjectListPatternInterpreter, but the usage is similar to ListPatternInterpreter.
Pre- and postprocessing handlers¶
For more information, see Pre-processing and Post-processing.
Preprocessor
Allows plugins to maintain a per-query context and have access to query/getStatements() properties.
Open-source plugins using Preprocessor:
Postprocessor
Allows plugins to modify the final result of a query/getStatements() request.
No open-source plugins use Postprocessor, but the example plugins do.
Query request support classes¶
Request
A basic read request. Passed to Preprocessor.preprocess(). Provides access to the isIncludeInferred property.
QueryRequest
An extension of Request for SPARQL queries. It provides access to the various constituents of the query, such as the FROM clauses and the parsed query.
StatementsRequest
An extension of Request for RepositoryConnection.getStatements(). It provides access to each of the individual constituents of the request quadruple (subject, predicate, object, and context).
RequestContext
Plugins may create an instance of this interface in Preprocessor.preprocess() to keep track of request-global data. The instance will be passed to PatternInterpreter, ListPatternInterpreter, SubjectListPatternInterpreter, and Postprocessor.
RequestContextImpl
A default implementation of RequestContext that provides a way to keep arbitrary values by key.
Update request handlers¶
The update request handlers are responsible for processing updates. Unlike the query request handlers, the update handlers will be called only for statements that match a predefined pattern.
For more information, see Update processing.
UpdateInterpreter
- Handles the addition or removal of statements. Only statements that have one of a set of predefined predicates will be passed to the handler. The return value determines whether the statement will be added or deleted as real data (in the repository) or processed only by the plugin. Note that this handler will not be called for each individual statement when removing all statements from all contexts.
Open-source plugins using UpdateInterpreter:
ClearInterpreter
- Handles the removal of all statements in a given context or in all contexts. This handler is especially useful when all statements in all contexts are removed, since UpdateInterpreter will not be called in this case.
No open-source plugins use ClearInterpreter.
ContextUpdateHandler
- Handles the addition or removal of statements in a set of predefined contexts. This can be used to implement virtual contexts and is the only part of the Plugin API that does not use integer identifiers but RDF values directly.
No open-source plugins use ContextUpdateHandler.
Notification listeners¶
In general the listeners are used as simple notifications about a certain event, such as the beginning of a new transaction or the creation of a new entity.
EntityListener
Notified about the creation of a new data entity (IRI, blank node, or literal).
Open-source plugins using
EntityListener
:StatementListener
- Notifications about the addition or removal of a statement.Unlike
UpdateInterpreter
, this listener will be notified about all statements and not just statements with a predefined predicate. The statement will be added or removed regardless of the return value.Open-source plugins using
StatementListener
: PluginTransactionListener
andParallelTransactionListener
- Notifications about the different stages of a transaction (started, followed by either commit + completed or aborted).Plugins should do the bulk of their transaction work within the commit stage.
ParallelTransactionListener
is a marker extensions ofPluginTransactionListener
whose commit stage is safe to call in parallel with the commit stage of other plugins.If the plugin does not perform any lengthy operations in the commit stage, it is better to stick toPluginTransactionListener
.Open-source plugins using
PluginTransactionListener
orParallelTransactionListener
:
Plugin dependencies¶
For more information, see Accessing other plugins.
PluginDependency
Plugins that need to use other plugins directly must implement this interface. They will be injected with an instance of PluginLocator.
PluginLocator
Provides access to other plugins by name or to the default implementation of RDFRankProvider.
RDFRankProvider
A plugin that provides an RDF rank. The only implementation is the RDF Rank plugin.
Health checks¶
The health check classes can be used to include a plugin in the repository health check.
HealthCheckable
Marks a component (a plugin or part of a plugin) as able to provide health checks. If a plugin implements this interface, it will be included in the repository health check.
HealthResult
The result from a health check. In general, health results can be green (everything OK), yellow (needs attention), or red (something broken).
CompositeHealthResult
A composite health result that aggregates several HealthResult instances into a single HealthResult.
No open-source plugins implement health checks.
Exceptions¶
A set of predefined exception classes that can be used by plugins.
PluginException
Generic plugin exception. Extends RuntimeException.
ClientErrorException
User (client) error, e.g., invalid input. Extends PluginException.
ServerErrorException
Server error, e.g., something unexpected such as lack of disk permissions. Extends PluginException.
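For illustration, a plugin might distinguish the two error types as in the sketch below. Only single-argument String constructors are used, as with ClientErrorException in the ExamplePlugin example further below; the ServerErrorException constructor shown here is an assumption to be checked against the Javadoc. The two helper methods are hypothetical and would live inside a plugin class.
// Bad user input is a client error
private int parseOffset(String value) {
    try {
        return Integer.parseInt(value);
    } catch (NumberFormatException e) {
        throw new ClientErrorException("Invalid integer value: " + value);
    }
}

// An unexpected problem on the server side (e.g., lack of disk permissions) is a server error
private void storeState(File stateFile, byte[] state) {
    try (OutputStream out = new FileOutputStream(stateFile)) {
        out.write(state);
    } catch (IOException e) {
        throw new ServerErrorException("Could not persist plugin state: " + e.getMessage());
    }
}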
Adding external plugins to GraphDB¶
With the graphdb.extra.plugins property, you can attach a directory with external plugins when starting GraphDB. It is set the following way:
graphdb -Dgraphdb.extra.plugins=path/to/directory/with/external/plugins
If the property is omitted when starting GraphDB, you need to load external plugins by placing them in the dist/lib/plugins directory and then restarting GraphDB.
Tip
This property is useful in situations when, for example, GraphDB is used in an environment such as Kubernetes, where the database cannot be restarted and the dist folder cannot be persisted.
Putting it all together: example plugins¶
A project containing two example plugins, ExampleBasicPlugin and ExamplePlugin, can be found here.
ExampleBasicPlugin¶
ExampleBasicPlugin has the following functionality:
It interprets the pattern ?s <http://example.com/now> ?o and binds the object to a literal containing the system date/time of the machine running GraphDB. The subject position is not used and its value does not matter.
The plugin implements the PatternInterpreter interface. A date/time literal is created as a request-scope entity to avoid cluttering the repository with extra literals.
The plugin extends the PluginBase class that provides a default implementation of the Plugin interface:
public class ExampleBasicPlugin extends PluginBase {
    // The predicate we will be listening for
    private static final String TIME_PREDICATE = "http://example.com/now";

    private IRI predicate;     // The predicate IRI
    private long predicateId;  // ID of the predicate in the entity pool

    // Service interface methods
    @Override
    public String getName() {
        return "exampleBasic";
    }

    // Plugin interface methods
    @Override
    public void initialize(InitReason reason, PluginConnection pluginConnection) {
        // Create an IRI to represent the predicate
        predicate = SimpleValueFactory.getInstance().createIRI(TIME_PREDICATE);
        // Put the predicate in the entity pool using the SYSTEM scope
        predicateId = pluginConnection.getEntities().put(predicate, Entities.Scope.SYSTEM);

        getLogger().info("ExampleBasic plugin initialized!");
    }
}
In this basic implementation, the plugin name is defined and during initialization, a single system-scope predicate is registered.
Note
It is important not to forget to register the plugin in the
META-INF/services/com.ontotext.trree.sdk.Plugin
file in the
classpath.
The next step is to implement the first of the plugin’s requirements – the pattern interpretation part:
public class ExampleBasicPlugin extends PluginBase implements PatternInterpreter {
    // ... initialize() and getName()

    // PatternInterpreter interface methods
    @Override
    public StatementIterator interpret(long subject, long predicate, long object, long context,
                                       PluginConnection pluginConnection, RequestContext requestContext) {
        // Ignore patterns with a predicate different from the one we are interested in. We want to return the
        // system date only when we detect the <http://example.com/now> predicate.
        if (predicate != predicateId)
            // This tells the PluginManager that we cannot interpret the statement, so the statement can be passed
            // to another plugin.
            return null;

        // Create the date/time literal. Here it is important to create the literal in the entities instance of the
        // request and NOT in getEntities(). If you create it in the entities instance returned by getEntities(), it
        // will not be visible in the current request.
        long literalId = createDateTimeLiteral(pluginConnection.getEntities());

        // Return a StatementIterator with a single statement to be iterated. The object of this statement will be the
        // current timestamp.
        return StatementIterator.create(subject, predicate, literalId, 0);
    }

    @Override
    public double estimate(long subject, long predicate, long object, long context,
                           PluginConnection pluginConnection, RequestContext requestContext) {
        // We always return a single statement, so we return a constant 1. This value will be used by the QueryOptimizer
        // when creating the execution plan.
        return 1;
    }

    private long createDateTimeLiteral(Entities entities) {
        // Create a literal for the current timestamp.
        Value literal = SimpleValueFactory.getInstance().createLiteral(new Date());

        // Add the literal to the entity pool with REQUEST scope. This makes the literal accessible only for the
        // current request; it will be disposed once the request is completed. Return its ID.
        return entities.put(literal, Entities.Scope.REQUEST);
    }
}
The interpret()
method only processes patterns with a predicate
matching the desired predicate identifier. Further on, it simply creates
a new date/time literal (in the request scope) and places its identifier
in the object position of the returned single result. The estimate()
method always returns 1, because this is the exact size of the result
set.
ExamplePlugin¶
ExamplePlugin has the following functionality:
If a FROM <http://example.com/time> clause is detected in the query, the result is a single binding set in which all projected variables are bound to a literal containing the system date/time of the machine running GraphDB.
If a triple with the subject http://example.com/time and one of the predicates http://example.com/goInFuture or http://example.com/goInPast is inserted, its object is set as a positive or negative offset for all future requests querying the system date/time via the plugin.
The plugin extends the PluginBase class that provides a default implementation of the Plugin interface:
public class ExamplePlugin extends PluginBase implements UpdateInterpreter, Preprocessor, Postprocessor {
    private static final String PREFIX = "http://example.com/";
    private static final String TIME_PREDICATE = PREFIX + "time";
    private static final String GO_FUTURE_PREDICATE = PREFIX + "goInFuture";
    private static final String GO_PAST_PREDICATE = PREFIX + "goInPast";

    private int timeOffsetHrs = 0;

    private IRI timeIri;

    // IDs of the entities in the entity pool
    private long timeID;
    private long goFutureID;
    private long goPastID;

    // Service interface methods
    @Override
    public String getName() {
        return "example";
    }

    // Plugin interface methods
    @Override
    public void initialize(InitReason reason, PluginConnection pluginConnection) {
        // Create IRIs to represent the entities
        timeIri = SimpleValueFactory.getInstance().createIRI(TIME_PREDICATE);
        IRI goFutureIRI = SimpleValueFactory.getInstance().createIRI(GO_FUTURE_PREDICATE);
        IRI goPastIRI = SimpleValueFactory.getInstance().createIRI(GO_PAST_PREDICATE);

        // Put the entities in the entity pool using the SYSTEM scope
        timeID = pluginConnection.getEntities().put(timeIri, Entities.Scope.SYSTEM);
        goFutureID = pluginConnection.getEntities().put(goFutureIRI, Entities.Scope.SYSTEM);
        goPastID = pluginConnection.getEntities().put(goPastIRI, Entities.Scope.SYSTEM);

        getLogger().info("Example plugin initialized!");
    }
}
In this implementation, the plugin name is defined and during initialization, three system-scope predicates are registered.
To implement the first functional requirement, the plugin must inspect the query and detect the FROM clause in the pre-processing phase. Then, the plugin must hook into the post-processing phase where, if the pre-processing phase detected the desired FROM clause, it deletes all query results (in postprocess()) and returns a single result (in flush()) containing the binding set specified by the requirements. Since this happens as part of pre- and post-processing, we can pass the literals without going through the entity pool and using integer IDs.
To do this, the plugin must implement Preprocessor and Postprocessor:
public class ExamplePlugin extends PluginBase implements Preprocessor, Postprocessor {
    // ... initialize() and getName()

    // Preprocessor interface methods
    @Override
    public RequestContext preprocess(Request request) {
        // We are interested only in QueryRequests
        if (request instanceof QueryRequest) {
            QueryRequest queryRequest = (QueryRequest) request;
            Dataset dataset = queryRequest.getDataset();

            // Check if the time IRI is included in the default graph. This means that we have a
            // "FROM <http://example.com/time>" clause in the SPARQL query.
            if (dataset != null && dataset.getDefaultGraphs().contains(timeIri)) {
                // Create a date/time literal
                Value literal = createDateTimeLiteral();

                // Prepare a binding set with all projected variables set to the date/time literal value
                MapBindingSet result = new MapBindingSet();
                for (String bindingName : queryRequest.getTupleExpr().getBindingNames()) {
                    result.addBinding(bindingName, literal);
                }

                // Create a Context object which will be available during the other phases of the request processing
                // and set the created result as an attribute.
                RequestContextImpl context = new RequestContextImpl();
                context.setAttribute("bindings", result);

                return context;
            }
        }
        // If we are not interested in the request, there is no need to create a Context.
        return null;
    }

    // Postprocessor interface methods
    @Override
    public boolean shouldPostprocess(RequestContext requestContext) {
        // Postprocess only if we have created a RequestContext in the preprocess phase. Here the requestContext object
        // is the same one that we created in the preprocess(...) method.
        return requestContext != null;
    }

    @Override
    public BindingSet postprocess(BindingSet bindingSet, RequestContext requestContext) {
        // Filter out all results. Returning null removes the binding set from the returned query result.
        // We will add the result we want in the flush() phase.
        return null;
    }

    @Override
    public Iterator<BindingSet> flush(RequestContext requestContext) {
        // Get the BindingSet we created in the preprocess phase and return it.
        // This will be returned as the query result.
        BindingSet result = (BindingSet) ((RequestContextImpl) requestContext).getAttribute("bindings");
        return new SingletonIterator<>(result);
    }

    private Literal createDateTimeLiteral() {
        // Create a literal for the current timestamp, adjusted by the configured offset.
        Calendar calendar = Calendar.getInstance();
        calendar.add(Calendar.HOUR, timeOffsetHrs);
        return SimpleValueFactory.getInstance().createLiteral(calendar.getTime());
    }
}
The plugin creates an instance of RequestContext using the default implementation RequestContextImpl.
It can hold attributes of any type referenced by a name. Then the plugin creates a BindingSet
with the date/time literal bound to every variable name in the query projection, and sets it as an attribute
with the name “bindings”. The postprocess() method filters out all results if the requestContext
is non-null (i.e., if the FROM clause was detected by preprocess()). Finally, flush() returns
a singleton iterator containing the desired binding set when the FROM clause was detected; otherwise nothing is returned.
To implement the second functional requirement, which allows setting an offset into the future or the past,
the plugin must react to specific update statements. This is achieved by implementing UpdateInterpreter:
public class ExamplePlugin extends PluginBase implements UpdateInterpreter, Preprocessor, Postprocessor {
    // ... initialize() and getName()
    // ... Pre- and Postprocessor methods

    // UpdateInterpreter interface methods
    @Override
    public long[] getPredicatesToListenFor() {
        // We can filter the tuples we are interested in by their predicate. We are interested only
        // in tuples that have one of the predicates we are listening for.
        return new long[] {goFutureID, goPastID};
    }

    @Override
    public boolean interpretUpdate(long subject, long predicate, long object, long context, boolean isAddition,
                                   boolean isExplicit, PluginConnection pluginConnection) {
        // Make sure that the subject is the time entity
        if (subject == timeID) {
            final String intString = pluginConnection.getEntities().get(object).stringValue();
            int step;
            try {
                step = Integer.parseInt(intString);
            } catch (NumberFormatException e) {
                // Invalid input, propagate the error to the caller
                throw new ClientErrorException("Invalid integer value: " + intString);
            }

            if (predicate == goFutureID) {
                timeOffsetHrs += step;
            } else if (predicate == goPastID) {
                timeOffsetHrs -= step;
            }

            // We handled the statement.
            // Return true so the statement will not be interpreted by other plugins or inserted into the DB.
            return true;
        }

        // Tell the PluginManager that we cannot interpret the tuple so further processing can continue.
        return false;
    }
}
UpdateInterpreter implementations must specify the predicates the plugin is interested in via getPredicatesToListenFor(). Then, whenever a statement with one of those predicates is inserted or removed, the plugin framework calls interpretUpdate(). The plugin then checks whether the subject value is http://example.com/time and, if so, handles the update and returns true to the plugin framework to signal that the plugin has processed the update and it need not be inserted as regular data.