GraphDB Plugin API

What is the GraphDB Plugin API

The GraphDB Plugin API is a framework and a set of public classes and interfaces that allow developers to extend GraphDB in many useful ways. These extensions are bundled into plugins, which GraphDB discovers during its initialization phase and then uses to delegate parts of its query or update processing tasks. The plugins are given low-level access to the GraphDB repository data, which enables them to do their job efficiently. They are discovered via the Java service discovery mechanism, which enables dynamic addition/removal of plugins from the system without having to recompile GraphDB or change any configuration files.

Description of a GraphDB plugin

A GraphDB plugin is a Java class that implements the com.ontotext.trree.sdk.Plugin interface. All public classes and interfaces of the plugin API are located in this Java package, i.e., com.ontotext.trree.sdk. Here is what the plugin interface looks like in an abbreviated form:

/**
 * The base interface for a GraphDB plugin. As a minimum a plugin must implement this interface.
 * <p>
 * Plugins also need to be listed in META-INF/services/com.ontotext.trree.sdk.Plugin so that Java's services
 * mechanism may discover them automatically.
 */
public interface Plugin extends Service {
    /**
     * A method used by the plugin framework to configure each plugin's file system directory. This
     * directory should be used by the plugin to store its files
     *
     * @param dataDir file system directory to be used for plugin related files
     */
    void setDataDir(File dataDir);

    /**
     * A method used by the plugin framework to provide plugins with a {@link Logger} object
     *
     * @param logger {@link Logger} object to be used for logging
     */
    void setLogger(Logger logger);

    /**
     * Plugin initialization method called once when the repository is being initialized, after the plugin has been
     * configured and before it is actually used. It enables plugins to execute whatever
     * initialization routines they consider appropriate, load resources, open connections, etc., based on the
     * specific reason for initialization, e.g., backup.
     * <p>
     * The provided {@link PluginConnection} instance may be used to create entities needed by the plugin.
     *
     * @param reason           the reason for initialization
     * @param pluginConnection an instance of {@link PluginConnection}
     */
    void initialize(InitReason reason, PluginConnection pluginConnection);

    /**
     * Sets a new plugin fingerprint.
     * Every plugin should maintain a fingerprint of its data that could be used by GraphDB to determine if the
     * data has changed or not. Initially, on system initialization, the plugins are injected with the
     * fingerprints they reported before the last system shutdown.
     *
     * @param fingerprint the last known plugin fingerprint
     */
    void setFingerprint(long fingerprint);

    /**
     * Returns the fingerprint of the plugin.
     * <p>
     * Every plugin should maintain a fingerprint of its data that could be used by GraphDB to determine if the
     * data has changed or not. The plugin fingerprint will become part of the repository fingerprint.
     *
     * @return the current plugin fingerprint based on its data
     */
    long getFingerprint();

    /**
     * Plugin shutdown method that is called when the repository is being shutdown. It enables plugins to execute whatever
     * finalization routines they consider appropriate, free resources, buffered streams, etc., based on the
     * specific reason for shutdown.
     *
     * @param reason the reason for shutdown
     */
    void shutdown(ShutdownReason reason);
}

As it derives from the Service interface, the plugin is automatically discovered at run-time, provided that the following conditions also hold:

  • The plugin class is located in the classpath.

  • It is mentioned in a META-INF/services/com.ontotext.trree.sdk.Plugin file in the classpath or in a .jar that is in the classpath. The fully qualified class name of the plugin must be written on a separate line in such a file.

The only method introduced by the Service interface is getName(), which provides the plugin’s (service’s) name. This name must be unique within a particular GraphDB repository, and serves as a plugin identifier that can be used at any time to retrieve a reference to the plugin instance.

/**
 * Interface implemented by all run-time discoverable services (e.g. {@link Plugin} instances). Classes
 * implementing this interface should furthermore be declared in the respective
 * META-INF/services/&lt;class.signature&gt; file and will then be discoverable at run-time.
 * <p>
 * Plugins need not implement this interface directly but rather implement {@link Plugin}.
 */
public interface Service {
    /**
     * Gets the service name (serves as a key for discovering the service)
     *
     * @return service name
     */
    String getName();
}

A plugin can implement many more functions, all of which are optional and declared in separate interfaces. Implementing such a complementary interface is the way to announce to the system what this particular plugin can do in addition to its mandatory responsibilities; the plugin is then used automatically as appropriate. See List of plugin interfaces and classes.
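For illustration, here is a minimal sketch of a plugin that fulfills only the mandatory responsibilities. It assumes the PluginBase helper class described under List of plugin interfaces and classes (including its getLogger() accessor); the class and plugin names are purely illustrative:

public class ExamplePlugin extends PluginBase {
    // The plugin name serves as its unique identifier within the repository
    @Override
    public String getName() {
        return "example";
    }

    @Override
    public void initialize(InitReason reason, PluginConnection pluginConnection) {
        // Nothing to set up in this minimal sketch
        getLogger().info("Example plugin initialized, reason: " + reason);
    }
}

As noted above, the plugin's fully qualified class name (e.g., com.example.ExamplePlugin) must also be listed on a separate line in META-INF/services/com.ontotext.trree.sdk.Plugin for the discovery to work.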

The life cycle of a plugin

A plugin’s life cycle consists of several phases:

Discovery

This phase is executed at repository initialization. GraphDB searches the classpath for all plugin services registered in META-INF/services/com.ontotext.trree.sdk.Plugin service registry files, and constructs a single instance of each plugin found.

Configuration

Every plugin instance discovered and constructed during the previous phase is then configured. During this phase, plugins are injected with a Logger object, which they use for logging (setLogger(Logger logger)), and with the path to their own data directory (setDataDir(File dataDir)), which they create, if needed, and then use to store their data. If a plugin does not need to store anything on disk, it can skip creating its data directory. If it does use it, the directory is guaranteed to be unique and assigned exclusively to that plugin.

This phase is also called when a plugin is enabled after repository initialization.

Initialization

After a plugin has been configured, the framework calls its initialize(InitReason reason, PluginConnection pluginConnection) method so it gets the chance to do whatever initialization work it needs to do. The passed instance of PluginConnection provides access to various other structures and interfaces, such as Statements and Entities instances (Repository internals), and a SystemProperties instance, which gives the plugins access to the system-wide configuration options and settings. Plugins typically use this phase to create IRIs that will be used to communicate with the plugin.

This phase is also called when a plugin is enabled after repository initialization.
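Continuing the minimal plugin sketch from above, a plugin would typically create the system (magic) predicates it needs in initialize(). A minimal sketch, assuming Entities.put(Value, Scope) and the Entities.Scope.SYSTEM constant as the entity-creation API described under Repository internals; the IRI and field names are illustrative:

private long magicPredicateId;

@Override
public void initialize(InitReason reason, PluginConnection pluginConnection) {
    // register a "magic" predicate in the SYSTEM scope so queries can address the plugin
    IRI magicPredicate = SimpleValueFactory.getInstance()
            .createIRI("http://example.com/example#magic");
    magicPredicateId = pluginConnection.getEntities().put(magicPredicate, Entities.Scope.SYSTEM);
}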

Request processing

The plugin participates in the request processing. The request phase applies to the evaluation of SPARQL queries, getStatements calls, the transaction stages and the execution of SPARQL updates. Various event notifications can also be part of this phase.

This phase is optional for plugins, but a plugin that does not implement at least one of its interfaces is of little use.

Request processing can be divided roughly into query processing and update processing.

Query processing

Query processing includes several sub-phases that can be used on their own or combined together:

Pre-processing

Plugins are given the chance to modify the request before it is processed. In this phase, they can also initialize a context object that remains visible until the end of the request processing (Pre-processing).

Pattern interpretation

Plugins can choose to provide results for requested statement patterns (Pattern interpretation). This sub-phase applies only to queries.

Post-processing

Before the request results are returned to the client, plugins are given a chance to modify them, filter them out, or even insert new results (Post-processing).

Update processing

Update processing includes several layers of processing:

Transaction events

Plugins are notified about the beginning and end of a transaction.

Update handling

Plugins can choose to handle certain updates (additions or removals) instead of letting the repository handle the updates as regular data.

Entities and statements notifications

Plugins can be notified about the creation of entities, the addition and removal of statements.

Shutdown

During repository shutdown, each plugin is prompted to execute its own shutdown routines, free resources, flush data to disk, etc. This must be done in the shutdown(ShutdownReason reason) method.

This phase is also called when a plugin is disabled after repository initialization.

Repository internals

The repository internals are accessed via an instance of PluginConnection:

/**
 * The {@link PluginConnection} interface provides access to various objects that can be used to query data
 * or get the properties of the current transaction. An instance of {@link PluginConnection} will be passed to almost
 * all methods that a plugin may implement.
 */
public interface PluginConnection {
    /**
     * Returns an instance of {@link Entities} that can be used to retrieve or create RDF entities.
     *
     * @return an {@link Entities} instance
     */
    Entities getEntities();

    /**
     * Returns an instance of {@link Statements} that can be used to retrieve RDF statements.
     *
     * @return a {@link Statements} instance
     */
    Statements getStatements();

    /**
     * Returns the transaction ID of the current transaction or 0 if no explicit transaction is available.
     *
     * @return the transaction ID
     */
    long getTransactionId();

    /**
     * Returns the update testing status. In a multi-node GraphDB configuration (currently only GraphDB EE) an update
     * will be sent to multiple nodes. The first node that receives the update will be used to test if the update is
     * successful and only if so, it will be sent to other nodes. Plugins may use the update test status to perform
     * certain operations only when the update is tested (e.g. indexing data via an external service). The method will
     * return true if this is a GraphDB EE worker node testing the update or this is GraphDB Free or SE. The method will
     * return false only if this is a GraphDB EE worker node that is receiving a copy of the original update
     * (after successful testing on another node).
     *
     * @return true if this update is sent for the first time (testing the update), false otherwise
     */
    boolean isTesting();

    /**
     * Returns an instance of {@link SystemProperties} that can be used to retrieve various properties that identify
     * the current GraphDB installation and repository.
     *
     * @return an instance of {@link SystemProperties}
     */
    SystemProperties getProperties();

    /**
     * Returns the repository fingerprint. Note that during an active transaction the fingerprint will be updated
     * at the very end of the transaction. Call it in {@link com.ontotext.trree.sdk.PluginTransactionListener#transactionCompleted(PluginConnection)}
     * if you want to get the updated fingerprint for the just-completed transaction.
     *
     * @return the repository fingerprint
     */
    String getFingerprint();

    /**
     * Returns whether the current GraphDB instance is part of a cluster. This is useful in cases where a plugin may modify
     * the fingerprint via a query. To protect cluster integrity, the fingerprint can be changed only via an update.
     *
     * @return true if the current instance is in cluster group, false otherwise
     */
    boolean isInCluster();


    /**
     * Creates a thread-safe instance of this {@link PluginConnection} that can be used by other threads.
     * Note that every {@link ThreadsafePluginConnection} must be explicitly closed when no longer needed.
     *
     * @return an instance of {@link ThreadsafePluginConnection}
     */
    ThreadsafePluginConnection getThreadsafeConnection();
}

PluginConnection instances passed to the plugin are not thread-safe and not guaranteed to operate normally once the called method returns. If the plugin needs to process data asynchronously in another thread it must get an instance of ThreadsafePluginConnection via PluginConnection.getThreadsafeConnection(). Once the allocated thread-safe connection is no longer needed it should be closed.
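A minimal sketch of this pattern, assuming a plain Java ExecutorService owned by the plugin and a close() method on ThreadsafePluginConnection (the exact name of the closing method is an assumption); the magic predicate ID from the earlier initialization sketch is reused:

ThreadsafePluginConnection connection = pluginConnection.getThreadsafeConnection();
executorService.submit(() -> {
    try {
        // iterate statements with the magic predicate in a background thread
        StatementIterator iterator = connection.getStatements().get(0, magicPredicateId, 0, 0);
        while (iterator.next()) {
            // process iterator.subject, iterator.object, ...
        }
    } finally {
        // the thread-safe connection must be explicitly closed when no longer needed
        connection.close();
    }
});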

PluginConnection provides access to various other interfaces that access the repository’s data (Statements and Entities), the current transaction’s properties, the repository fingerprint and various system and repository properties (SystemProperties).

Statements and Entities

In order to enable efficient request processing, plugins are given low-level access to the repository data and internals. This is done through the Statements and Entities interfaces.

The Entities interface represents a set of RDF objects (IRIs, blank nodes, literals, and RDF-star embedded triples). All such objects are termed entities and are given unique long identifiers. The Entities instance is responsible for resolving these objects from their identifiers and inversely for looking up the identifier of a given entity. Most plugins process entities using their identifiers, because dealing with integer identifiers is a lot more efficient than working with the actual RDF entities they represent. The Entities interface is the single entry point available to plugins for entity management. It supports the addition of new entities, look-up of entity type and properties, resolving entities, etc.

It is possible to declare two RDF objects to be equivalent in a GraphDB repository, e.g., by using the owl:sameAs optimization. In order to provide a way to use such declarations, the Entities interface assigns a class identifier to each entity. For newly created entities, this class identifier is the same as the entity identifier. When two entities are declared equivalent, one of them adopts the class identifier of the other, and thus they become members of the same equivalence class. The Entities interface exposes the entity class identifier so that plugins can determine which entities are equivalent.

Entities within an Entities instance have a certain scope. There are three entity scopes:

  • Default – entities are persisted on the disk and can be used in statements that are also physically stored on disk. They have positive (non-zero) identifiers, and are often referred to as physical or data entities.

  • System – system entities have negative identifiers and are not persisted on the disk. They can be used, for example, for system (or magic) predicates that provide configuration to a plugin or request something to be handled by a plugin. They are available throughout the whole repository lifetime, but have to be recreated after a restart.

  • Request – entities are not persisted on disk and have negative identifiers. They live only in the scope of a particular request, and are not visible to other concurrent requests. These entities disappear immediately after the request processing finishes. The request scope is useful for temporary entities, such as the entities returned by a plugin in response to a particular query.

The Statements interface represents a set of RDF statements, where ‘statement’ means a quadruple of subject, predicate, object, and context RDF entity identifiers. Statements can be searched for but not modified.

Consuming or returning statements

An important abstract class, which is related to GraphDB internals, is StatementIterator. It has a boolean next() method, which attempts to scroll the iterator onto the next available statement and returns true only if it succeeds. In case of success, its subject, predicate, object, and context fields are initialized with the respective components of the next statement. Furthermore, some properties of each statement are available via the following methods:

  • boolean isReadOnly() – returns true if the statement is in the Axioms part of the rule-file or is imported at initialization;

  • boolean isExplicit() – returns true if the statement is explicitly asserted;

  • boolean isImplicit() – returns true if the statement is produced by the inferencer (raw statements can be both explicit and implicit).

Here is a brief example that puts Statements, Entities, and StatementIterator together in order to output all literals that are related to a given URI:

// obtain the Entities and Statements instances from the PluginConnection
Entities entities = pluginConnection.getEntities();
Statements statements = pluginConnection.getStatements();

// resolve the IRI to its internal identifier
long id = entities.resolve(SimpleValueFactory.getInstance().createIRI("http://example/uri"));

// retrieve all statements with this identifier in subject position
StatementIterator iter = statements.get(id, 0, 0, 0);
while (iter.next()) {
    // only process literal objects
    if (entities.getType(iter.object) == Entities.Type.LITERAL) {
        // resolve the literal and print out its value
        Value literal = entities.get(iter.object);
        System.out.println(literal.stringValue());
    }
}

StatementIterator is also used to return statements via one of the pattern interpretation interfaces.

Each GraphDB transaction has several properties accessible via PluginConnection:

Transaction ID (PluginConnection.getTransactionId())

An integer value. Bigger values indicate newer transactions.

Testing (PluginConnection.isTesting())

A boolean value indicating the testing status of the transaction. In GraphDB EE, the testing transaction is the first execution of a given transaction, used to determine whether the transaction can be executed successfully before it is propagated to the entire cluster. Despite the name, a testing transaction is a full-featured transaction that will modify the data. In GraphDB Free and SE, a transaction is always executed only once, so it is always a testing transaction there.

System properties

PluginConnection provides access to various static repository and system properties via getProperties(). The values of these properties are set at repository initialization time and will not change while the repository is operating.

The getProperties() method returns an instance of SystemProperties:

/**
 * This interface represents various properties for the running GraphDB instance and the repository as seen by the Plugin API.
 */
public interface SystemProperties {
    /**
     * Returns the read-only status of the current repository.
     *
     * @return true if read-only, false otherwise
     */
    boolean isReadOnly();

    /**
     * Returns the number of bits needed to represent an entity id
     *
     * @return the number of bits as an integer
     */
    int getEntityIdSize();

    /**
     * Returns the type of the current repository.
     *
     * @return one of {@link RepositoryType#FREE}, {@link RepositoryType#SE} or {@link RepositoryType#EE}
     */
    RepositoryType getRepositoryType();

    /**
     * Returns the full GraphDB version string.
     *
     * @return a string describing the GraphDB version
     */
    String getVersion();

    /**
     * Returns the GraphDB major version component.
     *
     * @return the major version as an integer
     */
    int getVersionMajor();

    /**
     * Returns the GraphDB minor version component.
     *
     * @return the minor version as an integer
     */
    int getVersionMinor();

    /**
     * Returns the GraphDB patch version component.
     *
     * @return the patch version as an integer
     */
    int getVersionPatch();

    /**
     * Returns the number of cores in the currently set license up to the physical number of cores on the machine.
     *
     * @return the number of cores as an integer
     */
    int getNumberOfLicensedCores();

    /**
     * The possible editions for GraphDB repositories.
     */
    enum RepositoryType {
        /**
         * GraphDB Free repository
         */
        FREE,
        /**
         * GraphDB SE repository
         */
        SE,
        /**
         * GraphDB EE worker repository
         */
        EE
    }
}

Repository properties

There are some dynamic repository properties that may change once a repository has been initialized. These properties are:

Repository fingerprint (PluginConnection.getFingerprint())

The repository fingerprint. Note that the fingerprint will be updated at the very end of a transaction so the updated fingerprint after a transaction should be accessed within PluginTransactionListener.transactionCompleted().

Whether the repository is part of a cluster (PluginConnection.isInCluster())

GraphDB EE worker repositories are typically attached to a master repository and not accessed directly. When this is the case, the method returns true and the plugin may use it to refuse actions that would change the fingerprint outside of a transaction. In GraphDB Free and SE, the method always returns false.

Query processing

As already mentioned, a plugin’s interaction with each of the request-processing phases is optional. The plugin declares if it plans to participate in any phase by implementing the appropriate interface.

Pre-processing

A plugin that will be participating in request pre-processing must implement the Preprocessor interface. It looks like this:

/**
 * Interface that should be implemented by all plugins that need to maintain per-query context.
 */
public interface Preprocessor {
    /**
     * Pre-processing method called once for every SPARQL query or getStatements() request before it is
     * processed.
     *
     * @param request request object
     * @return context object that will be passed to all other plugin methods in the future stages of the
     * request processing
     */
    RequestContext preprocess(Request request);
}

The preprocess(Request request) method receives the request object and returns a RequestContext instance. The passed request parameter is an instance of one of the interfaces extending Request, depending on the type of the request (QueryRequest for a SPARQL query or StatementRequest for “get statements”). The plugin changes the request object accordingly, initializes its context object, and returns it; this object is then passed back to the plugin in every other method during the request processing phase. The returned request context may be null, but either way it is visible only to the plugin that created it. It can be used to store data for the whole request, e.g., to pass data between two different statement patterns recognized by the plugin. The request context also gives later request processing phases access to the Request object. Plugins that skip this phase have no request context and therefore cannot access the original Request object.

Plugins may create their own RequestContext implementation or use the default one, RequestContextImpl.
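A minimal Preprocessor sketch that attaches per-request data to a RequestContextImpl follows; the no-argument constructor, setRequest(), setAttribute(), and isIncludeInferred() accessors used below are assumptions based on the descriptions in Query request support classes, and the attribute key is illustrative:

@Override
public RequestContext preprocess(Request request) {
    if (request instanceof QueryRequest) {
        RequestContextImpl context = new RequestContextImpl();
        // keep the request reference so later phases can access it
        context.setRequest(request);
        // remember something for the later phases of this request
        context.setAttribute("example.includeInferred", request.isIncludeInferred());
        return context;
    }
    // no context needed for other request types
    return null;
}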

Pattern interpretation

This is one of the most important phases in the life cycle of a plugin. In fact, most plugins need to participate in exactly this phase. This is the point where request statement patterns need to get evaluated and statement results are returned.

For example, consider the following SPARQL query:

SELECT * WHERE {
    ?s <http://example.com/predicate> ?o
}

There is just one statement pattern inside this query: ?s <http://example.com/predicate> ?o. All plugins that have implemented the PatternInterpreter interface (thus declaring that they intend to participate in the pattern interpretation phase) are asked if they can interpret this pattern. The first one to accept it and return results will be used. If no plugin interprets the pattern, GraphDB evaluates it against the repository’s physical statements, i.e., the ones persisted on disk.

Here is the PatternInterpreter interface:

/**
 * Interface implemented by plugins that want to interpret basic triple patterns
 */
public interface PatternInterpreter {
    /**
     * Estimate the number of results that could be returned by the plugin for the given parameters
     *
     * @param subject          subject ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param predicate        predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param object           object ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param context          context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param pluginConnection an instance of {@link PluginConnection}
     * @param requestContext   context object as returned by {@code Preprocessor.preprocess()} or null
     * @return approximate number of results that could potentially be returned for these parameters by the
     * interpret() method
     */
    double estimate(long subject, long predicate, long object, long context, PluginConnection pluginConnection,
                    RequestContext requestContext);

    /**
     * Interpret basic triple pattern and return {@link StatementIterator} with results
     *
     * @param subject        subject ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param predicate      predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param object         object ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param context        context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param pluginConnection an instance of {@link PluginConnection}
     * @param requestContext context object as returned by {@code Preprocessor.preprocess()} or null
     * @return statement iterator of results
     */
    StatementIterator interpret(long subject, long predicate, long object, long context,
                                PluginConnection pluginConnection, RequestContext requestContext);
}

The estimate() and interpret() methods take the same arguments and are used in the following way:

  • Given a statement pattern (e.g., the one in the SPARQL query above), all plugins that implement PatternInterpreter are asked to interpret() the pattern. The subject, predicate, object, and context values are either the identifiers of the values in the pattern or 0 if the respective position is an unbound variable. The Statements and Entities objects accessible through pluginConnection represent, respectively, the statements and entities that are available for this particular request. For instance, if the query contains any FROM <http://some/graph> clauses, the Statements object will only provide access to the statements in the defined named graphs. Similarly, the Entities object contains entities that might be valid only for this particular request. The plugin’s interpret() method must return a StatementIterator if it intends to interpret this pattern, or null if it refuses.

  • If the plugin signals that it will interpret the given pattern (i.e., returns a non-null value), GraphDB’s query optimizer calls the plugin’s estimate() method to get an estimate of how many results the StatementIterator returned by interpret() will produce. This estimate does not need to be precise, but the more accurate it is, the more likely the optimizer is to produce an efficient execution plan. There is a slight difference in the values passed to estimate(): the statement components (e.g., subject) may not only be entity identifiers, but can also be set to two special values:

    • Entities.BOUND – the pattern component is said to be bound, but its particular binding is not yet known;

    • Entities.UNBOUND – the pattern component will not be bound. These values must be treated as hints to the estimate() method to provide a better approximation of the result set size, although its precise value cannot be determined before the query is actually run.

  • After the query has been optimized, the interpret() method of the plugin might be called again should any variable become bound due to the pattern reordering applied by the optimizer. Plugins must be prepared to expect different combinations of bound and unbound statement pattern components, and return appropriate iterators.

The requestContext parameter is the value returned by the preprocess() method if one exists, or null otherwise.

Results are returned as statements.
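To make this concrete, here is a sketch of a PatternInterpreter that answers only patterns using the magic predicate registered in the earlier initialization sketch. StatementIterator.create(...) is assumed here to be a convenience factory for a single-statement iterator; everything else follows the interfaces above:

@Override
public StatementIterator interpret(long subject, long predicate, long object, long context,
                                   PluginConnection pluginConnection, RequestContext requestContext) {
    if (predicate != magicPredicateId) {
        // not our pattern - decline and let other plugins or the repository handle it
        return null;
    }
    // create a request-scoped literal to return as the object of the single result statement
    long literalId = pluginConnection.getEntities().put(
            SimpleValueFactory.getInstance().createLiteral("hello from the plugin"),
            Entities.Scope.REQUEST);
    return StatementIterator.create(subject, predicate, literalId, 0);
}

@Override
public double estimate(long subject, long predicate, long object, long context,
                       PluginConnection pluginConnection, RequestContext requestContext) {
    // exactly one statement is returned for the magic predicate
    return 1;
}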

The plugin framework also supports the interpretation of an extended type of a list pattern.

Consider the following SPARQL queries:

SELECT * WHERE {
    ?s <http://example.com/predicate> (?o1 ?o2)
}
SELECT * WHERE {
    (?s1 ?s2) <http://example.com/predicate> ?o
}

Internally, the object or subject list is converted to a series of triples conforming to rdf:List. These triples could be handled with PatternInterpreter, but then the plugin would have to implement the whole list semantics itself.

In order to make this task easier the Plugin API defines two additional interfaces very similar to the PatternInterpreter interface – ListPatternInterpreter and SubjectListPatternInterpreter.

ListPatternInterpreter handles lists in the object position:

/**
 * Interface implemented by plugins that want to interpret list-like triple patterns
 */
public interface ListPatternInterpreter {
    /**
     * Estimate the number of results that could be returned by the plugin for the given parameters
     *
     * @param subject          subject ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param predicate        predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param objects          object IDs (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param context          context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param pluginConnection an instance of {@link PluginConnection}
     * @param requestContext   context object as returned by {@code Preprocessor.preprocess()} or null
     * @return approximate number of results that could potentially be returned for these parameters by the
     * interpret() method
     */
    double estimate(long subject, long predicate, long[] objects, long context, PluginConnection pluginConnection,
                    RequestContext requestContext);

    /**
     * Interpret list-like triple pattern and return {@link StatementIterator} with results
     *
     * @param subject          subject ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param predicate        predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param objects          object IDs (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param context          context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param pluginConnection an instance of {@link PluginConnection}
     * @param requestContext   context object as returned by {@code Preprocessor.preprocess()} or null
     * @return statement iterator of results
     */
    StatementIterator interpret(long subject, long predicate, long[] objects, long context,
                                PluginConnection pluginConnection, RequestContext requestContext);
}

It differs from PatternInterpreter by having multiple objects passed as an array of long values instead of a single long object. The semantics of both methods are the same as in the basic pattern interpretation case.

SubjectListPatternInterpreter handles lists in the subject position:

/**
 * Interface implemented by plugins that want to interpret list-like triple patterns
 */
public interface SubjectListPatternInterpreter {
    /**
     * Estimate the number of results that could be returned by the plugin for the given parameters
     *
     * @param subjects         subject IDs (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param predicate        predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param object           object ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param context          context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param pluginConnection an instance of {@link PluginConnection}
     * @param requestContext   context object as returned by {@code Preprocessor.preprocess()} or null
     * @return approximate number of results that could potentially be returned for these parameters by the
     * interpret() method
     */
    double estimate(long[] subjects, long predicate, long object, long context, PluginConnection pluginConnection,
                    RequestContext requestContext);

    /**
     * Interpret list-like triple pattern and return {@link StatementIterator} with results
     *
     * @param subjects       subject IDs (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param predicate      predicate ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param object         object ID (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param context        context value (alternatively {@link Entities#BOUND} or {@link Entities#UNBOUND})
     * @param pluginConnection an instance of {@link PluginConnection}
     * @param requestContext context object as returned by {@code Preprocessor.preprocess()} or null
     * @return statement iterator of results
     */
    StatementIterator interpret(long[] subjects, long predicate, long object, long context,
                                PluginConnection pluginConnection, RequestContext requestContext);
}

It differs from PatternInterpreter by having multiple subjects passed as an array of long values instead of a single long subject. The semantics of both methods are the same as in the basic pattern interpretation case.

Post-processing

There are cases when a plugin would like to modify or otherwise filter the final results of a request. This is where the Postprocessor interface comes into play:

/**
 * Interface that should be implemented by plugins that need to post-process results from queries.
 */
public interface Postprocessor {
    /**
     * A query method that is used by the framework to determine if a {@link Postprocessor} plugin really wants to
     * post-process the request results.
     *
     * @param requestContext the request context reference
     * @return boolean value
     */
    boolean shouldPostprocess(RequestContext requestContext);

    /**
     * Method called for each {@link BindingSet} in the query result set. Each binding set is processed in
     * sequence by all plugins that implement the {@link Postprocessor} interface, piping the result returned
     * by each plugin into the next one. If any of the post-processing plugins returns null the result is
     * deleted from the result set.
     *
     * @param bindingSet     binding set object to be post-processed
     * @param requestContext context objected as returned by {@link Preprocessor#preprocess(Request)} (in case this plugin
     *                       implemented this interface)
     * @return binding set object that should be post-processed further by next post-processing plugins or
     * null if the current binding set should be deleted from the result set
     */
    BindingSet postprocess(BindingSet bindingSet, RequestContext requestContext);

    /**
     * Method called after all post-processing has been finished for each plugin. This is the point where
     * every plugin could introduce its results even if the original result set was empty
     *
     * @param requestContext context object as returned by {@link Preprocessor#preprocess(Request)} (in case this plugin
     *                       implemented this interface)
     * @return iterator for resulting binding sets that need to be added to the final result set
     */
    Iterator<BindingSet> flush(RequestContext requestContext);
}

The postprocess() method is called for each binding set that is to be returned to the repository client. The method may modify the binding set and return it, or return null, in which case the binding set is removed from the result set. After a binding set is processed by one plugin, the possibly modified binding set is passed to the next plugin that has post-processing enabled. Once a binding set has been processed by all plugins (and none of them deleted it), it is returned to the client. Finally, after all results have been processed, each plugin’s flush() method is called so that it can add new binding sets to the result set; these are then also returned to the client.
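A small sketch of a Postprocessor that filters solutions; BindingSet is the standard RDF4J interface, and the variable name "secret" is purely illustrative:

@Override
public boolean shouldPostprocess(RequestContext requestContext) {
    // only post-process requests for which preprocess() created a context
    return requestContext != null;
}

@Override
public BindingSet postprocess(BindingSet bindingSet, RequestContext requestContext) {
    // drop solutions that bind the variable "secret"; all others pass through unchanged
    return bindingSet.hasBinding("secret") ? null : bindingSet;
}

@Override
public Iterator<BindingSet> flush(RequestContext requestContext) {
    // nothing extra to add to the result set
    return Collections.emptyIterator();
}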

Update processing

Updates involving specific predicates

In addition to query/read processing, plugins can process update operations for statement patterns containing specific predicates. In order to intercept updates, a plugin must implement the UpdateInterpreter interface. During initialization, the getPredicatesToListenFor() method is called once by the framework, so that the plugin can indicate which predicates it is interested in.

From then onwards, the plugin framework filters updates for statements using these predicates and notifies the plugin. The plugin may do whatever processing is required and must return a boolean value indicating whether the statement should be skipped. Skipped statements are not processed further by GraphDB, so the insert or delete will have no effect on actual data in the repository.

/**
 * An interface that should be implemented by the plugins that want to be notified for particular update
 * events. The getPredicatesToListenFor() method should return the predicates of interest to the plugin. This
 * method will be called once only immediately after the plugin has been initialized. After that point the
 * plugin's interpretUpdate() method will be called for each inserted or deleted statement sharing one of the
 * predicates of interest to the plugin (those returned by getPredicatesToListenFor()).
 */
public interface UpdateInterpreter {
    /**
     * Returns the predicates for which the plugin needs to get notified when a statement with such a predicate is added or removed.
     *
     * @return array of predicates as entity IDs
     */
    long[] getPredicatesToListenFor();

    /**
     * Hook that is called whenever a statement containing one of the registered predicates
     * (see {@link #getPredicatesToListenFor()}) is added or removed.
     *
     * @param subject          subject value of the updated statement
     * @param predicate        predicate value of the updated statement
     * @param object           object value of the updated statement
     * @param context          context value of the updated statement
     * @param isAddition       true if the statement was added, false if it was removed
     * @param isExplicit       true if the updated statement was an explicit one
     * @param pluginConnection an instance of {@link PluginConnection}
     * @return true - when the statement was handled by the plugin only and should <i>NOT</i> be added to/removed from the repository,
     * false - when the statement should be added to/removed from the repository
     */
    boolean interpretUpdate(long subject, long predicate, long object, long context, boolean isAddition,
                            boolean isExplicit, PluginConnection pluginConnection);
}
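For example, a plugin that registered a control predicate during initialization might handle such updates itself and keep them out of the repository data. A sketch, assuming the PluginBase getLogger() accessor; the controlPredicateId field is illustrative:

@Override
public long[] getPredicatesToListenFor() {
    return new long[] {controlPredicateId};
}

@Override
public boolean interpretUpdate(long subject, long predicate, long object, long context,
                               boolean isAddition, boolean isExplicit, PluginConnection pluginConnection) {
    if (isAddition) {
        // resolve the object value and use it, e.g., to reconfigure the plugin
        Value value = pluginConnection.getEntities().get(object);
        getLogger().info("Control update received: " + value.stringValue());
    }
    // true = handled by the plugin only; the statement is not added to/removed from the repository
    return true;
}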

Removal of entire contexts

Statement deletion in GraphDB is specified as a quadruple (subject, predicate, object, context), where each position can be an explicit value or null. Null acts as a wildcard meaning all subjects, predicates, objects, or contexts, depending on the position in which it was specified.

When at least one of the positions is non-null, the plugin framework will fire individual events for each matching and removed statement.

When all positions are null (i.e., delete everything in the repository) the operation will be optimized internally and individual events will not be fired. This means that UpdateInterpreter and StatementListener will not be called.

ClearInterpreter is an interface that allows plugins to detect the removal of entire contexts or removal of all data in the repository:

/**
 * This interface can be implemented by plugins that want to be notified on clear()
 * or remove() (all statements in any context).
 */
public interface ClearInterpreter {
    /**
     * Notification called before the statements are removed from the given context.
     *
     * @param context          the ID of the context or 0 if all contexts
     * @param pluginConnection an instance of {@link PluginConnection}
     */
    void beforeClear(long context, PluginConnection pluginConnection);

    /**
     * Notification called after the statements have been removed from the given context.
     *
     * @param context          the ID of the context or 0 if all contexts
     * @param pluginConnection an instance of {@link PluginConnection}
     */
    void afterClear(long context, PluginConnection pluginConnection);
}

Intercepting data for specific contexts

The Plugin API provides a way to intercept data inserted into or removed from a particular predefined context. The ContextUpdateHandler interface:

/**
 * This interface provides a mechanism for plugins to handle updates to certain contexts.
 * When a plugin requests handling of a context, all data for that context will be forwarded to the plugin
 * and not inserted into any GraphDB collections.
 * <p>
 * Note that unlike other plugin interfaces, {@link ContextUpdateHandler} does not use entity IDs but works directly
 * with the RDF values. Data handled by this interface does not reach the entity pool and so no entity IDs are created.
 */
public interface ContextUpdateHandler {
    /**
     * Returns the contexts for which the plugin will handle the updates.
     *
     * @return array of {@link Resource}
     */
    Resource[] getUpdateContexts();

    /**
     * Hook that handles updates for the configured contexts.
     *
     * @param subject          subject value of the updated statement
     * @param predicate        predicate value of the updated statement
     * @param object           object value of the updated statement
     * @param context          context value of the updated statement (can be null for removals, meaning remove from all contexts)
     * @param isAddition       true if statement is being added, false if statement is being removed
     * @param pluginConnection an instance of {@link PluginConnection}
     */
    void handleContextUpdate(Resource subject, IRI predicate, Value object, Resource context, boolean isAddition,
                             PluginConnection pluginConnection);
}

This is similar to Updates involving specific predicates with some important differences:

  • ContextUpdateHandler

    • Configured via a list of contexts specified as IRI objects.

    • Statements with these contexts are passed to the plugin as Value objects and never enter any of the database collections.

    • The plugin is assumed to always handle the update.

  • UpdateInterpreter

    • Configured via a list of predicates specified as integer IDs.

    • Statements with these predicates are passed to the plugin as integer IDs after their RDF values are converted to integer IDs in the entity pool.

    • The plugin decides whether to handle the statement or pass it on to other plugins and eventually to the database.

This mechanism is especially useful for the creation of virtual contexts (graphs) whose data is stored within a plugin and never pollutes any of the database collections with unnecessary values.

Unlike the rest of the Plugin API, this interface works directly with RDF values, bypassing the use of integer IDs.
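A sketch of a handler that keeps all data for one virtual context in memory; the context IRI and the in-memory set are illustrative choices, not part of the API:

private final IRI virtualContext = SimpleValueFactory.getInstance()
        .createIRI("http://example.com/virtual-context");
private final Set<Statement> virtualData = ConcurrentHashMap.newKeySet();

@Override
public Resource[] getUpdateContexts() {
    // all updates targeting this context are routed to the plugin
    return new Resource[] {virtualContext};
}

@Override
public void handleContextUpdate(Resource subject, IRI predicate, Value object, Resource context,
                                boolean isAddition, PluginConnection pluginConnection) {
    Statement statement = SimpleValueFactory.getInstance()
            .createStatement(subject, predicate, object, virtualContext);
    if (isAddition) {
        virtualData.add(statement);
    } else {
        virtualData.remove(statement);
    }
}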

Transactions

A plugin may need to participate in the transaction workflow, e.g., because it maintains data structures that must reflect the actual data in the repository. Without being part of the transaction, the plugin would not know when to persist or discard a given state.

Transactions can be easily tracked by implementing the PluginTransactionListener interface:

/**
 * The {@link PluginTransactionListener} allows plugins to be notified about transactions (start, commit+completed or abort)
 */
public interface PluginTransactionListener {
    /**
     * Notifies the listener about the start of a transaction.
     *
     * @param pluginConnection an instance of {@link PluginConnection}
     */
    void transactionStarted(PluginConnection pluginConnection);

    /**
     * Notifies the listener about the commit phase of a transaction. Plugins should use this event to perform their own
     * commit work if needed or to abort the transaction if needed.
     *
     * @param pluginConnection an instance of {@link PluginConnection}
     */
    void transactionCommit(PluginConnection pluginConnection);

    /**
     * Notifies the listener about the completion of a transaction. This will be the last event in a successful transaction.
     * The plugin is not allowed to throw any exceptions here; if it does, they will be ignored. If a plugin needs to abort
     * a transaction it should be done in {@link #transactionCommit(PluginConnection)}.
     *
     * @param pluginConnection an instance of {@link PluginConnection}
     */
    void transactionCompleted(PluginConnection pluginConnection);

    /**
     * Notifies the listener about the abortion of a transaction. This will be the last event in an aborted transaction.
     * <p>
     * Plugins should revert any modifications caused by this transaction, including the fingerprint.
     *
     * @param pluginConnection an instance of {@link PluginConnection}
     */
    void transactionAborted(PluginConnection pluginConnection);

    /**
     * Notifies the listener about a user abort request. A user abort request is a request by an end-user to abort the
     * transaction. Unlike the other events this will be called asynchronously whenever the request is received.
     * <p>
     * Plugins may react and terminate any long-running computation or ignore the request. This is just a handy way
     * to speed up abortion when a user requests it. For example, this event may be received asynchronously while
     * the plugin is indexing data (in {@link #transactionCommit(PluginConnection)} running in the main thread).
     * The plugin may notify itself that the indexing should stop. Regardless of the actions taken by the plugin
     * the transaction may still be aborted and {@link #transactionAborted(PluginConnection)} will be called.
     * All clean up of the abortion should be handled in {@link #transactionAborted(PluginConnection)}.
     *
     * @param pluginConnection an instance of {@link PluginConnection}
     */
    default void transactionAbortedByUser(PluginConnection pluginConnection) {

    }
}

Each transaction has a beginning signalled by a call to transactionStarted(). Then the transaction can proceed in several ways:

  • Commit and completion:

    • transactionCommit() is called;

    • transactionCompleted() is called.

  • Commit followed by abortion (typically because another plugin aborted the transaction in its own transactionCommit()):

    • transactionCommit() is called;

    • transactionAborted() is called.

  • Abortion before entering commit:

    • transactionAborted() is called.

Plugins should strive to do all heavy transaction work in transactionCommit(), in such a way that a call to transactionAborted() can revert the changes. Plugins may throw exceptions in transactionCommit() in order to abort the transaction, e.g., if some constraint was violated.

Plugins should do no heavy processing in transactionCompleted() and are not allowed to throw exceptions there. Such exceptions will be logged and ignored, and the transaction will still go through normally.

transactionAbortedByUser() will be called asynchronously (e.g., while the plugin is executing transactionCommit() in the main update thread) when a user requests that the transaction be aborted. The plugin may use this to signal its other threads to abort processing at the earliest convenience, or simply ignore the request.
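A sketch of a listener that follows these guidelines; the preparedIndex and activeIndex fields and the buildIndexFromPendingChanges() helper are illustrative placeholders for plugin-specific state:

@Override
public void transactionStarted(PluginConnection pluginConnection) {
    // nothing to do yet; changes accumulate while the transaction runs
}

@Override
public void transactionCommit(PluginConnection pluginConnection) {
    // do the heavy work here, in a way that can still be reverted
    preparedIndex = buildIndexFromPendingChanges();
}

@Override
public void transactionCompleted(PluginConnection pluginConnection) {
    // cheap and exception-free: atomically swap in the prepared state
    activeIndex = preparedIndex;
    preparedIndex = null;
}

@Override
public void transactionAborted(PluginConnection pluginConnection) {
    // revert everything prepared during commit, including any fingerprint changes
    preparedIndex = null;
}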

Exceptions

Plugins may throw exceptions on invalid input, constraint violations, or unexpected events (e.g., running out of disk space). Such exceptions can be thrown almost everywhere, with the notable exception of PluginTransactionListener.transactionCompleted().

A good practice is to construct an instance of PluginException or one of its subclasses:

  • ClientErrorException – for example when the user provided invalid input.

  • ServerErrorException – for example when an unexpected server error occurred, such as lack of disk permissions.
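For instance, a plugin might reject an invalid state in its commit stage. A sketch, assuming ClientErrorException accepts a message string; the pendingConfiguration field and its isValid() check are illustrative:

@Override
public void transactionCommit(PluginConnection pluginConnection) {
    if (!pendingConfiguration.isValid()) {
        // aborts the transaction and reports the problem to the client
        throw new ClientErrorException("Invalid plugin configuration: " + pendingConfiguration);
    }
}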

Accessing other plugins

Plugins can make use of the functionality of other plugins. For example, the Lucene-based full-text search plugin can make use of the rank values provided by the RDF Rank plugin, to facilitate query result scoring and ordering. This is not a matter of re-using program code (e.g., in a .jar with common classes), but rather it is about re-using data. The mechanism to do this allows plugins to obtain references to other plugin objects by knowing their names. To achieve this, they only need to implement the PluginDependency interface:

/**
 * Interface that should be implemented by plugins that depend on other plugins and want to be able to
 * retrieve references to them at runtime.
 */
public interface PluginDependency {
    /**
     * Method used by the plugin framework to inject a {@link PluginLocator} instance in the plugin.
     *
     * @param locator a {@link PluginLocator} instance
     */
    void setLocator(PluginLocator locator);
}

During the configuration phase, they are injected with an instance of the PluginLocator interface, which does the actual plugin discovery for them:

/**
 * Interface that supports obtaining of a plugin instance by plugin name. An object implementing this
 * interface is injected into plugins that implement the {@link PluginDependency} interface.
 */
public interface PluginLocator {
    /**
     * Retrieves a {@link Plugin} instance by plugin name
     *
     * @param name name of the plugin
     * @return a {@link Plugin} instance or null if a plugin with that name is not available
     */
    Plugin locate(String name);

    /**
     * Retrieves a {@link RDFRankProvider} instance.
     *
     * @return a {@link RDFRankProvider} instance or null if no {@link RDFRankProvider} is available
     */
    RDFRankProvider locateRDFRankProvider();
}

Having a reference to another plugin is all that is needed to call its methods directly and make use of its services.

An important interface related to accessing other plugins is RDFRankProvider. Its sole implementation is the RDF Rank plugin, but it can easily be replaced by another implementation. Having a dedicated interface makes it easy for plugins to access RDF ranks without relying on a specific implementation.
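A sketch of the dependency mechanism; the class name, plugin name, and helper method are illustrative:

public class DependentPlugin extends PluginBase implements PluginDependency {
    private PluginLocator locator;

    @Override
    public String getName() {
        return "dependent-example";
    }

    @Override
    public void setLocator(PluginLocator locator) {
        // injected by the framework during the configuration phase
        this.locator = locator;
    }

    private RDFRankProvider rankProvider() {
        // null if no RDF rank provider is available in this repository
        return locator.locateRDFRankProvider();
    }
}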

List of plugin interfaces and classes

Basics

Plugin

The basic interface that defines a plugin.

PluginBase

A reference abstract implementation of Plugin that can serve as the base for implementing plugins.

There are a couple of extensions of the Plugin interface that add additional configuration or behavior to plugins:

ParallelPlugin
Marks a plugin as aware of parallel processing. The plugin will be injected with an instance of PluginExecutorService via setExecutorService(PluginExecutorService executorService).
PluginExecutorService is a simplified version of Java’s ExecutorService and provides an easy mechanism for plugins to schedule parallel tasks safely.

No open-source plugins use ParallelPlugin.

StatelessPlugin
Marks a plugin as stateless. Stateless plugins do not contribute to the repository fingerprint and their fingerprint will not be queried.
It is suitable for plugins whose data is not important for query results or update execution, e.g., plugins that are not typically part of the normal data flow.

On initialize() and shutdown() plugins receive an enum value, InitReason and ShutdownReason respectively, describing the reason why the plugin is being initialized or shut down.

InitReason
  • DEFAULT: initialized as part of the repository initialization or the plugin was enabled;

  • CREATED_BACKUP: initialized after a shutdown for backup;

  • RESTORED_FROM_BACKUP: initialized after a shutdown for restore.

ShutdownReason
  • DEFAULT: shutdown as part of the repository shutdown or the plugin was disabled;

  • CREATE_BACKUP: shutdown before backup;

  • RESTORE_FROM_BACKUP: shutdown before restore.

Plugins may use the reason to handle their own backup scenarios. In most cases it is unnecessary since the plugin’s files will be backed up or restored together with the rest of the repository data.

Data structures

For more information, see Repository internals.

PluginConnection

The main entry to repository internals. Passed to almost all methods in Plugin API interfaces.

ThreadsafePluginConnection

Thread-safe version of PluginConnection. Requested explicitly from PluginConnection and must be explicitly closed when no longer needed.

Entities

Provides access to the repository’s entities. Entities are mappings from integer IDs to RDF values (IRIs, blank nodes, literals, and RDF-star embedded triples).

Statements

Provides access to the repository’s statements. Results are returned as StatementIterator instances.

StatementIterator

Interface for returning statements. Used both by Statements to list repository data and by plugins to return data via Pattern interpretation.

SystemProperties

Provides access to static repository and system properties such as the GraphDB version and repository type.

All open-source plugins use the repository internals.

Query request handlers

For more information, see Query processing.

Pattern interpretation handlers

The pattern interpretation handlers take over the evaluation of triple patterns. Each triple pattern is offered to the plugins that implement the respective interface.

For more information, see Pattern interpretation.

PatternInterpreter
Interprets a simple triple pattern, where the subject, predicate, object and context are single values.
This interface handles all triple patterns: subject predicate object context.

ListPatternInterpreter
Interprets a triple pattern, where the subject, predicate and context are single values while the object is a list of values.
This interface handles triple patterns of this form: subject predicate (object1 object2 ...) context.

SubjectListPatternInterpreter
Interprets a triple pattern, where the predicate, object and context are single values while the subject is a list of values.
This interface handles triple patterns of this form: (subject1 subject2 ...) predicate object context.

No open-source plugins use SubjectListPatternInterpreter but the usage is similar to ListPatternInterpreter.

Pre- and postprocessing handlers

For more information, see Pre-processing and Post-processing.

Preprocessor

Allows plugins to maintain a per-query context and have access to query/getStatements() properties.

Postprocessor

Allows plugins to modify the final result of a query/getStatements() request.

No open-source plugins use Postprocessor but the example plugins do.

Query request support classes

Request

A basic read request. Passed to Preprocessor.preprocess(). Provides access to the isIncludeInferred property.

QueryRequest

An extension of Request for SPARQL queries. It provides access to the various constituents of the query such as the FROM clauses and the parsed query.

StatementRequest

An extension of Request for RepositoryConnection.getStatements(). It provides access to each of the individual constituents of the request quadruple (subject, predicate, object, and context).

RequestContext

Plugins may create an instance of this interface in Preprocessor.preprocess() to keep track of request-global data. The instance will be passed to PatternInterpreter, ListPatternInterpreter, SubjectListPatternInterpreter, and Postprocessor.

RequestContextImpl

A default implementation of RequestContext that provides a way to keep arbitrary values by key.

Update request handlers

The update request handlers are responsible for processing updates. Unlike the query request handlers, the update handlers will be called only for statements that match a predefined pattern.

For more information, see Update processing.

UpdateInterpreter
Handles the addition or removal of statements. Only statements that have one of a set of predefined predicates will be passed to the handler.
The return value determines if the statement will be added or deleted as real data (in the repository) or processed only by the plugin.
Note that this handler will not be called for each individual statement when removing all statements from all contexts.

Open-source plugins using UpdateInterpreter:

ClearInterpreter
Handles the removal of all statements in a given context or in all contexts.
This handler is especially useful when all statements in all contexts are removed, since UpdateInterpreter will not be called in that case.

No open-source plugins use ClearInterpreter.

ContextUpdateHandler
Handles the addition or removal of statements in a set of predefined contexts.
This can be used to implement virtual contexts and is the only part of the Plugin API that works with RDF values directly instead of integer identifiers.

No open-source plugins use ContextUpdateHandler.

Notification listeners

In general, the listeners serve as simple notifications about events such as the beginning of a new transaction or the creation of a new entity.

EntityListener

Notified about the creation of a new data entity (IRI, blank node, or literal).

Open-source plugins using EntityListener:

StatementListener
Notifications about the addition or removal of a statement.
Unlike UpdateInterpreter, this listener will be notified about all statements and not just statements with a predefined predicate. The statement will be added or removed regardless of the return value.

Open-source plugins using StatementListener:

PluginTransactionListener and ParallelTransactionListener
Notifications about the different stages of a transaction (started, followed by either commit + completed or aborted).
Plugins should do the bulk of their transaction work within the commit stage.
ParallelTransactionListener is a marker extension of PluginTransactionListener whose commit stage is safe to call in parallel with the commit stages of other plugins.
If the plugin does not perform any lengthy operations in the commit stage, it is better to stick to PluginTransactionListener.
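
A minimal skeleton is sketched below; the hypothetical AuditPlugin name is illustrative, and the callback names mirror the transaction stages listed above rather than being taken from this document, so confirm them against the PluginTransactionListener javadoc before relying on them:

public class AuditPlugin extends PluginBase implements PluginTransactionListener {
    @Override
    public String getName() {
        return "audit";
    }

    @Override
    public void transactionStarted(PluginConnection pluginConnection) {
        // Prepare per-transaction state if needed
    }

    @Override
    public void transactionCommit(PluginConnection pluginConnection) {
        // Do the bulk of the plugin's transaction work here
    }

    @Override
    public void transactionCompleted(PluginConnection pluginConnection) {
        // The transaction is durable; release per-transaction state
    }

    @Override
    public void transactionAborted(PluginConnection pluginConnection) {
        // Discard any per-transaction state
    }
}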

Open-source plugins using PluginTransactionListener or ParallelTransactionListener:

Plugin dependencies

For more information, see Accessing other plugins.

PluginDependency

Plugins that need to use other plugins directly must implement this interface. A PluginLocator instance will be injected into them.

PluginLocator

Provides access to other plugins by name or to the default implementation of RDFRankProvider.

RDFRankProvider

A plugin that provides an RDF rank. The only implementation is the RDF Rank plugin.

Health checks

The health check classes can be used to include a plugin in the repository health check.

HealthCheckable

Marks a component (a plugin or part of a plugin) as able to provide health checks. If a plugin implements this interface it will be included in the repository health check.

HealthResult

The result from a health check. In general, health results can be green (everything is OK), yellow (needs attention), or red (something is broken).

CompositeHealthResult

A composite health result that aggregates several HealthResult instances into a single HealthResult.

No open-source plugins implement health checks.

Exceptions

A set of predefined exception classes that can be used by plugins.

PluginException

Generic plugin exception. Extends RuntimeException.

ClientErrorException

User (client) error, e.g. invalid input. Extends PluginException.

ServerErrorException

Server error, e.g. something unexpected such as lack of disk permissions. Extends PluginException.
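
The ExamplePlugin walkthrough below throws a ClientErrorException in interpretUpdate() when it receives a non-integer offset value.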

Adding external plugins to GraphDB

With the graphdb.extra.plugins property, you can attach a directory with external plugins when starting GraphDB. It is set as follows:

graphdb -Dgraphdb.extra.plugins=path/to/directory/with/external/plugins

If the property is omitted when starting GraphDB, then you need to load external plugins by placing them in the dist/lib/plugins directory and then restarting GraphDB.

Tip

This property is useful in situations when, for example, GraphDB is used in an environment such as Kubernetes, where the database cannot be restarted and the dist folder cannot be persisted.

Putting it all together: example plugins

A project containing two example plugins, ExampleBasicPlugin and ExamplePlugin, can be found here.

ExampleBasicPlugin

ExampleBasicPlugin has the following functionality:

  • It interprets the pattern ?s <http://example.com/now> ?o and binds the object to a literal containing the system date/time of the machine running GraphDB. The subject position is not used and its value does not matter.

The plugin implements the PatternInterpreter interface. A date/time literal is created as a request-scope entity to avoid cluttering the repository with extra literals.

The plugin extends the PluginBase class, which provides a default implementation of the Plugin interface:

public class ExampleBasicPlugin extends PluginBase {
    // The predicate we will be listening for
    private static final String TIME_PREDICATE = "http://example.com/now";

    private IRI predicate; // The predicate IRI
    private long predicateId; // ID of the predicate in the entity pool

    // Service interface methods
    @Override
    public String getName() {
        return "exampleBasic";
    }

    // Plugin interface methods
    @Override
    public void initialize(InitReason reason, PluginConnection pluginConnection) {
        // Create an IRI to represent the predicate
        predicate = SimpleValueFactory.getInstance().createIRI(TIME_PREDICATE);
        // Put the predicate in the entity pool using the SYSTEM scope
        predicateId = pluginConnection.getEntities().put(predicate, Entities.Scope.SYSTEM);

        getLogger().info("ExampleBasic plugin initialized!");
    }
}

In this basic implementation, the plugin name is defined and during initialization, a single system-scope predicate is registered.

Note

Do not forget to register the plugin in the META-INF/services/com.ontotext.trree.sdk.Plugin file on the classpath.
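
The file contains one fully qualified plugin class name per line. Assuming, for illustration, that the class lives in a hypothetical com.example package, it would contain:

com.example.ExampleBasicPlugin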

The next step is to implement the plugin’s requirement – the pattern interpretation part:

public class ExampleBasicPlugin extends PluginBase implements PatternInterpreter {

    // ... initialize() and getName()

    // PatternInterpreter interface methods
    @Override
    public StatementIterator interpret(long subject, long predicate, long object, long context,
                                       PluginConnection pluginConnection, RequestContext requestContext) {
        // Ignore patterns with a predicate different from the one we are interested in. We want to return the
        // system date/time only when we detect the <http://example.com/now> predicate.
        if (predicate != predicateId)
            // This will tell the PluginManager that we cannot interpret the statement so the statement can be passed
            // to another plugin.
            return null;

        // Create the date/time literal. It is important to create the literal via the Entities instance obtained
        // from the PluginConnection passed to this method, so that it is visible in the current request.
        long literalId = createDateTimeLiteral(pluginConnection.getEntities());

        // return a StatementIterator with a single statement to be iterated. The object of this statement will be the
        // current timestamp.
        return StatementIterator.create(subject, predicate, literalId, 0);
    }

    @Override
    public double estimate(long subject, long predicate, long object, long context,
                           PluginConnection pluginConnection, RequestContext requestContext) {
        // We always return a single statement so we return a constant 1. This value will be used by the QueryOptimizer
        // when creating the execution plan.
        return 1;
    }

    private long createDateTimeLiteral(Entities entities) {
        // Create a literal for the current timestamp.
        Value literal = SimpleValueFactory.getInstance().createLiteral(new Date());

        // Add the literal in the entity pool with REQUEST scope. This will make the literal accessible only for the
        // current Request and will be disposed once the request is completed. Return its ID.
        return entities.put(literal, Entities.Scope.REQUEST);
    }

}

The interpret() method only processes patterns with a predicate matching the desired predicate identifier. Further on, it simply creates a new date/time literal (in the request scope) and places its identifier in the object position of the returned single result. The estimate() method always returns 1, because this is the exact size of the result set.
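
Once the plugin is on the classpath, a query such as SELECT ?time WHERE { ?s <http://example.com/now> ?time } should therefore return a single binding holding the current date/time of the machine running GraphDB.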

ExamplePlugin

ExamplePlugin has the following functionality:

  • If a FROM <http://example.com/time> clause is detected in the query, the result is a single binding set in which all projected variables are bound to a literal containing the system date/time of the machine running GraphDB.

  • If a triple with the subject http://example.com/time and one of the predicates http://example.com/goInFuture or http://example.com/goInPast is inserted, its object is interpreted as an offset in hours that is added to (goInFuture) or subtracted from (goInPast) the system date/time returned by the plugin for all future requests.

The plugin extends the PluginBase class, which provides a default implementation of the Plugin interface:

public class ExamplePlugin extends PluginBase implements UpdateInterpreter, Preprocessor, Postprocessor {

    private static final String PREFIX = "http://example.com/";

    private static final String TIME_PREDICATE = PREFIX + "time";
    private static final String GO_FUTURE_PREDICATE = PREFIX + "goInFuture";
    private static final String GO_PAST_PREDICATE = PREFIX + "goInPast";

    private int timeOffsetHrs = 0;

    private IRI timeIri;

    // IDs of the entities in the entity pool
    private long timeID;
    private long goFutureID;
    private long goPastID;


    // Service interface methods
    @Override
    public String getName() {
        return "example";
    }

    // Plugin interface methods
    @Override
    public void initialize(InitReason reason, PluginConnection pluginConnection) {
        // Create IRIs to represent the entities
        timeIri = SimpleValueFactory.getInstance().createIRI(TIME_PREDICATE);
        IRI goFutureIRI = SimpleValueFactory.getInstance().createIRI(GO_FUTURE_PREDICATE);
        IRI goPastIRI = SimpleValueFactory.getInstance().createIRI(GO_PAST_PREDICATE);

        // Put the entities in the entity pool using the SYSTEM scope
        timeID = pluginConnection.getEntities().put(timeIri, Entities.Scope.SYSTEM);
        goFutureID = pluginConnection.getEntities().put(goFutureIRI, Entities.Scope.SYSTEM);
        goPastID = pluginConnection.getEntities().put(goPastIRI, Entities.Scope.SYSTEM);

        getLogger().info("Example plugin initialized!");
    }
}

In this implementation, the plugin name is defined and, during initialization, three system-scope entities are registered.

To implement the first functional requirement, the plugin must inspect the query and detect the FROM clause in the pre-processing phase. The plugin must then hook into the post-processing phase where, if the pre-processing phase detected the desired FROM clause, it removes all query results (in postprocess()) and returns a single result (in flush()) containing the binding set specified by the requirements. Since this happens as part of pre- and post-processing, the plugin can work with the literals directly without going through the entity pool and its integer IDs.

To do this the plugin must implement Preprocessor and Postprocessor:

public class ExamplePlugin extends PluginBase implements Preprocessor, Postprocessor {
    // ... initialize() and getName()

    // Preprocessor interface methods
    @Override
    public RequestContext preprocess(Request request) {
        // We are interested only in QueryRequests
        if (request instanceof QueryRequest) {
            QueryRequest queryRequest = (QueryRequest) request;
            Dataset dataset = queryRequest.getDataset();

            // Check if the time IRI is included in the default graph. This means that we have a
            // "FROM <http://example.com/time>" clause in the SPARQL query.
            if ((dataset != null && dataset.getDefaultGraphs().contains(timeIri))) {
                // Create a date/time literal
                Value literal = createDateTimeLiteral();

                // Prepare a binding set with all projected variables set to the date/time literal value
                MapBindingSet result = new MapBindingSet();
                for (String bindingName : queryRequest.getTupleExpr().getBindingNames()) {
                    result.addBinding(bindingName, literal);
                }

                // Create a Context object which will be available during the other phases of the request processing
                // and set the created result as an attribute.
                RequestContextImpl context = new RequestContextImpl();
                context.setAttribute("bindings", result);

                return context;
            }
        }
        // If we are not interested in the request there is no need to create a Context.
        return null;
    }

    // Postprocessor interface methods
    @Override
    public boolean shouldPostprocess(RequestContext requestContext) {
        // Postprocess only if we have created RequestContext in the Preprocess phase. Here the requestContext object
        // is the same one that we created in the preprocess(...) method.
        return requestContext != null;
    }

    @Override
    public BindingSet postprocess(BindingSet bindingSet, RequestContext requestContext) {
        // Filter all results. Returning null will remove the binding set from the returned query result.
        // We will add the result we want in the flush() phase.
        return null;
    }

    @Override
    public Iterator<BindingSet> flush(RequestContext requestContext) {
        // Get the BindingSet we created in the Preprocess phase and return it.
        // This will be returned as the query result.
        BindingSet result = (BindingSet) ((RequestContextImpl) requestContext).getAttribute("bindings");
        return new SingletonIterator<>(result);
    }

    private Literal createDateTimeLiteral() {
        // Create a literal for the current timestamp.
        Calendar calendar = Calendar.getInstance();
        calendar.add(Calendar.HOUR, timeOffsetHrs);

        return SimpleValueFactory.getInstance().createLiteral(calendar.getTime());
    }
}

The plugin creates an instance of RequestContext using the default implementation RequestContextImpl, which can hold attributes of any type referenced by name. The plugin then creates a BindingSet with the date/time literal bound to every variable name in the query projection and stores it as an attribute named “bindings”. The postprocess() method filters out all results when the requestContext is non-null (i.e., when the FROM clause was detected by preprocess()). Finally, flush() returns a singleton iterator containing the desired binding set, which becomes the sole result of the query.
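
In practice this means that a query such as SELECT * FROM <http://example.com/time> WHERE { ?s ?p ?o } should return exactly one row in which every projected variable is bound to the current date/time literal.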

To implement the second functional requirement, which allows setting an offset into the future or the past, the plugin must react to specific update statements. This is achieved by implementing UpdateInterpreter:

public class ExamplePlugin extends PluginBase implements UpdateInterpreter, Preprocessor, Postprocessor {
    // ... initialize() and getName()

    // ... Pre- and Postprocessor methods

    // UpdateInterpreter interface methods
    @Override
    public long[] getPredicatesToListenFor() {
        // We can filter the tuples we are interested in by their predicate. We are interested only
        // in tuples that have one of the predicates we are listening for.
        return new long[] {goFutureID, goPastID};
    }

    @Override
    public boolean interpretUpdate(long subject, long predicate, long object, long context, boolean isAddition,
                                   boolean isExplicit, PluginConnection pluginConnection) {
        // Make sure that the subject is the time entity
        if (subject == timeID) {
            final String intString = pluginConnection.getEntities().get(object).stringValue();
            int step;
            try {
                step = Integer.parseInt(intString);
            } catch (NumberFormatException e) {
                // Invalid input, propagate the error to the caller
                throw new ClientErrorException("Invalid integer value: " + intString);
            }

            if (predicate == goFutureID) {
                timeOffsetHrs += step;
            } else if (predicate == goPastID) {
                timeOffsetHrs -= step;
            }

            // We handled the statement.
            // Return true so the statement will not be interpreted by other plugins or inserted in the DB
            return true;
        }

        // Tell the PluginManager that we cannot interpret the tuple so further processing can continue.
        return false;
    }
}

An UpdateInterpreter must specify the predicates it is interested in via getPredicatesToListenFor(). Whenever a statement with one of those predicates is inserted or removed, the plugin framework calls interpretUpdate(). The plugin then checks whether the subject is http://example.com/time and, if so, handles the update and returns true to signal that it has processed the statement and that the statement need not be inserted as regular data.
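
For example, inserting the triple <http://example.com/time> <http://example.com/goInFuture> "5" should shift the date/time reported by the plugin five hours into the future for subsequent requests, while the same triple with <http://example.com/goInPast> should shift it five hours into the past.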