Rules optimizations

GraphDB includes a rule profiling feature that allows you to profile and debug rule performance.

How to enable rule profiling

Rule profiling prints out statistics about rule execution.

To enable rule profiling, start GraphDB with the following Java option:

-Denable-debug-rules=true

This enables the collection of rule statistics (various counters).

Warning

Rule Profiling Limitations

  • Must use a custom ruleset, since built-in rulesets do not have the required instrumentation (counters);

  • The debug rule statistics are available only when importing data in serial mode. They are not collected during parallel inferencing, which is the default. Check Force serial pipeline in the Import settings dialog to enable the serial mode.

Warning

Rule profiling slows down the rule execution (the leading premise checking part) by 10-30%, so do not use it in production.

Log file

When rule profiling is enabled:

  • Complete rule statistics are printed every million statements, every 5 minutes, or on shutdown, whichever occurs first.

  • They are written to graphdb-folder/logs/main-<date>.log;

  • The descriptive rule stats format looks like this:

    ----------rs start----------
    Rule statistics for repository <name> :
    RULE: ...
    
    Time overall (all rules): ... ns.
    ----------rs end----------
    
  • A tabular format of the descriptive rule stats is also supported, e.g.:

    ----------rs csv start----------
    Repository name: university
    RULE;TIME;INVOKED;ITERNEXTS;FIRED;FIRETIME;INFERRED
    prp_spo1_1;36973700;66;81;81;36917600;28
    prp_spo1_0;34354600;17;83;83;34310500;25
    ----------rs csv end----------
    
  • Stats are printed for each active repository.

  • Stats are cumulative, so find the last rs start … rs end section for your repository of interest.

  • Rule variants are ordered by total time (descending).

For example, consider the following rule:

Id: ptop_PropRestr
  t <ptop:premise>     p
  t <ptop:restriction> r
  t <ptop:conclusion>  q
  t <rdf:type>         <ptop:PropRestr>
  x p y
  x r y
  ----------------
  x q y

This rule is a conjunction of two properties. It is declared with axiomatic (A-Box) triples involving t. Whenever both the premise p and the restriction r hold between two resources, the rule infers the conclusion q between the same resources, i.e., p & r => q.

The corresponding log for variant 4 of this rule may look like the following:

RULE ptop_PropRestr_4 invoked 163,475,763 times.
ptop_PropRestr_4:
e b f
a ptop_premise b
a rdf_type ptop_PropRestr
e c f
a ptop_restriction c
a ptop_conclusion d
------------------------------------
e d f

a ptop_conclusion d invoked 1,456,793 times and took 1,814,710,710 ns.
a rdf_type ptop_PropRestr invoked 7,261,649 times and took 9,409,794,441 ns.
a ptop_restriction c invoked 1,456,793 times and took 1,901,987,589 ns.
e c f invoked 17,897,752 times and took 635,785,943,152 ns.
a ptop_premise b invoked 10,175,697 times and took 9,669,316,036 ns.
Fired 1,456,793 times and took 157,163,249,304 ns.
Inferred 1,456,793 statements.
Time overall: 815,745,001,232 ns.

Note

Variable names are renamed due to the compilation to Java bytecode.

Understanding the output:

  • The premises are checked in the order given in RULE. (The premise statistics printed after the blank line are not in any particular order.)

  • Invoked is the number of times the rule variant or specific premise was checked successfully. Tracing through the rule:

    • ptop_PropRestr_4 checked successfully 163 million times: for each incoming triple, since the lead premise (e b f = x p y) is a free pattern.

    • a ptop_premise b checked successfully 10 million times: for each b=p that has an axiomatic triple involving ptop_premise.

      This premise was selected because it has only 1 unbound variable a and it is first in the rule text.

    • a rdf_type ptop_PropRestr checked successfully 7 million times: for each ptop_premise that has type ptop_PropRestr.

      This premise was selected because it has 0 unbound variables (after the previous premise binds a).

  • The time to check each premise is printed in ns.

  • Fired is the number of times all premises matched, so the rule variant was fired.

  • Inferred is the number of inferred triples.

    • It may be greater than fired if there are multiple conclusions.
    • It may be less than fired, since a duplicate triple is not inferred a second time.
  • Time overall is the total time that this rule variant took.

Excel format

The log records detailed information about each rule and premise, which is very useful when you are trying to understand which rules are too time-consuming. However, this level of detail can also be overwhelming.

Therefore, we have developed the rule-stats.pl script that outputs a TSV file such as the following:

rule            ver  tried      time   patts  checks     time   fired     time   triples  speed
ptop_PropChain  4    163475763  776.3  5      117177482  185.3  15547176  590.9  9707142  12505

Parameters:

Parameter     Description
rule          the rule ID (name)
ver           the rule version (variant), or “T” for overall rule totals
tried, time   the number of times the rule/variant was tried, and the overall time in seconds
patts         the number of triple patterns (premises) in the rule, not counting the leading premise
checks, time  the number of times premises were checked, and the time in seconds
fired         the number of times all premises matched, so the rule was fired
triples       the number of inferred triples
speed         inference speed, in triples/sec

Run the script the following way:

perl rule-stats.pl main-2014-07-28.log > main-2014-07-28.xls

Investigating performance

The following is an example of using the Excel format to investigate where time is spent during rule execution.

Download the time-spent-during-rule.xlsx example file, and use it as a template.

[Image: main-2014-08-08.png]

Note

These formulas are dynamic, and they are updated every time you change the filters.

To perform your investigation:

  1. Open the results in Excel.

  2. Set a filter “ver=T”, first looking at the rules in their entirety instead of rule variants.

  3. Sort by “time” (fourth column) in descending order.

  4. Check which rules are highlighted in red (those that take a significant amount of time and whose speed is substantially lower than average).

  5. Pick a rule (for example, PropRestr).

  6. Filter by “rule=PropRestr” and “ver<>T” to see its variants.

    [Image: main-PropRestr.png]
  7. Focus on a variant to investigate the reasons for its poorer time and speed performance.

In this example, the first variant to investigate is ptop_PropRestr_5, as it accounts for 30% of the rule's time and has a very low speed. The reason is that it fired 1.4 million times but produced only 238 triples, so most of the inferred triples were duplicates.

You can find the definition of this variant in the log file:

RULE ptop_PropRestr_5 invoked 163,475,763 times.
ptop_PropRestr_5:
e c f
a ptop_restriction c
a rdf_type ptop_PropRestr
e b f
a ptop_premise b
a ptop_conclusion d
------------------------------------
e d f

It is very similar to the productive variant ptop_PropRestr_4 (see Log file above):

  • one checks e b f. a ptop_premise b first,

  • the other checks e c f. a ptop_restriction c first.

Still, the function of these premises in the rule is the same and therefore the variant ptop_PropRestr_5 (which is checked after 4) is unproductive.

The most likely way to improve performance is to make the two premises use the same axiomatic property ptop:premise (emphasizing that they have the same role), and to introduce a Cut:

Id: ptop_PropRestr_SYM
  t <ptop:premise>     p
  t <ptop:premise>     r
  t <ptop:conclusion>  q
  t <rdf:type>         <ptop:PropRestr>
  x p y
  x r y                [Cut]
  ----------------
  x q y

The Cut eliminates the rule variant with x r y as leading premise. It is legitimate to do this, since the two variants are the same, up to substitution p<->r.

Note

Introducing a Cut in the original version of the rule would not be legitimate:

Id: ptop_PropRestr_CUT
  t <ptop:premise>     p
  t <ptop:restriction> r
  t <ptop:conclusion>  q
  t <rdf:type>         <ptop:PropRestr>
  x p y
  x r y                [Cut]
  ----------------
  x q y

since it would omit some potential inferences (in the case above, 238 triples), changing the semantics of the rule (see the example below).

Assume these axiomatic triples:

:t_CUT a ptop:PropRestr; ptop:premise :p; ptop:restriction :r; ptop:conclusion :q. # for ptop_PropRestr_CUT
:t_SYM a ptop:PropRestr; ptop:premise :p; ptop:premise     :r; ptop:conclusion :q. # for ptop_PropRestr_SYM

Now consider a sequence of inserted triples :x :p :y. :x :r :y.

  • ptop_PropRestr_CUT will not infer :x :q :y, since no variant is fired by the second incoming triple :x :r :y: it is matched against x p y, but there is no axiomatic triple t ptop:premise :r.

  • ptop_PropRestr_SYM will infer :x :q :y, since the second incoming triple :x :r :y will match x p y and t ptop:premise :r, then the previously inserted :x :p :y will match t ptop:premise :p and the rule will fire.
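
A quick way to verify this behavior, assuming the ptop_PropRestr_SYM rule is loaded in a custom ruleset (the :x, :y, :p, :r, :q resources are the hypothetical ones from the example above and need a real namespace):

PREFIX ptop: <http://www.ontotext.com/proton/protontop#>
PREFIX :     <http://example.com/>

INSERT DATA {
  :t_SYM a ptop:PropRestr ;
         ptop:premise    :p ;
         ptop:premise    :r ;
         ptop:conclusion :q .

  :x :p :y .
  :x :r :y .
}

# Run as a separate query; it should return true with ptop_PropRestr_SYM,
# and false if only ptop_PropRestr_CUT is used:
ASK { :x :q :y }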

Tip

Rule execution is often non-intuitive, so we recommend keeping a detailed history of loading speed and comparing the performance after each change.

Hints on optimizing GraphDB’s rulesets

The complexity of the ruleset has a significant effect on the loading performance, the number of inferred statements, and the overall size of the repository after inferencing. The complexity of the standard rulesets increases as follows:

  • no inference (lowest complexity, best performance)

  • RDFS-Optimized

  • RDFS

  • RDFS-Plus-Optimized

  • RDFS-Plus

  • OWL-Horst-Optimized

  • OWL-Horst

  • OWL-Max-Optimized

  • OWL-Max

  • OWL2-QL-Optimized

  • OWL2-QL

  • OWL2-RL-Optimized

  • OWL2-RL (highest complexity, worst performance)

Note that OWL2-RL and OWL2-QL do a lot of heavy work that is often not required by applications. For more details, see OWL Compliance.

Know what you want to infer

Check the expansion ratio (total/explicit statements) for your dataset to get an idea of whether this is the result you are expecting. For example, if your ruleset infers 4 times more statements over a large number of explicit statements, loading will take time regardless of how you try to optimize the rules.
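
For instance, you can estimate the ratio with two counting queries using GraphDB's onto:explicit pseudo-graph (a sketch; see Other special GraphDB query behavior for the exact semantics):

PREFIX onto: <http://www.ontotext.com/>

# Total statements, explicit plus inferred (the default dataset includes inferred ones)
SELECT (COUNT(*) AS ?total) WHERE { ?s ?p ?o }

# Explicit statements only; run as a separate query
SELECT (COUNT(*) AS ?explicit)
FROM onto:explicit
WHERE { ?s ?p ?o }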

Minimize the number of rules

The number of rules and their complexity affect inferencing performance, even for rules that never infer any new statements. The reason is that every incoming statement is passed through every variant of every rule to check whether something can be inferred. This often results in many checks and joins, even if the rule never fires.

So, start with a minimal ruleset and only add the rules that you need. The default ruleset (RDFS-Plus-Optimized) works for many users, but you might even consider starting from RDFS. For example, if you need owl:SymmetricProperty and owl:inverseOf on top of RDFS, you can copy only these rules from OWL-Horst to RDFS and leave out the rest.

Conversely, you can start with a bigger standard ruleset and remove the rules that you do not need.

Note

To deploy a custom ruleset, set the ruleset configuration parameter to the full pathname of your custom .pie file.

Write your rules carefully

  • Be careful with recursive rules, as they can lead to an explosion in the number of inferred statements.

  • Always check your spelling:

    • A misspelled variable in a premise leads to a Cartesian explosion in the number of triple joins the rule must consider, which quickly becomes intractable.

    • A misspelled variable in a conclusion (or the use of an unbound variable) leads to the creation of new blank nodes. This is almost never what you really want.

  • Order premises by specificity. GraphDB first checks premises with the least number of unbound variables; if there is a tie, it follows the order you have given. Since you may know the cardinalities of triples in your data, you may be in a better position to determine which premise has better specificity (selectivity).

  • Use Cut for premises that have the same role (for an example, see Investigating performance), but be careful not to remove any necessary inferences by mistake.

Avoid duplicate statements

Avoid inserting explicit statements in a named graph if the same statements are inferable. GraphDB always stores inferred statements in the default graph, so this leads to duplicate statements, which increase the repository size and slow down query answering.

You can eliminate duplicates from query results using DISTINCT or FROM onto:skip-redundant-implicit (see Other special GraphDB query behavior). However, these are slow operations, so it is better not to produce duplicate statements in the first place.
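
For example, a sketch of both approaches, assuming the onto: prefix resolves to <http://www.ontotext.com/>:

PREFIX onto: <http://www.ontotext.com/>

# Option 1: suppress duplicate rows in the result set
SELECT DISTINCT ?s ?p ?o WHERE { ?s ?p ?o }

# Option 2 (run separately): skip implicit statements that duplicate explicit ones
SELECT ?s ?p ?o
FROM onto:skip-redundant-implicit
WHERE { ?s ?p ?o }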

Know the implications of ontology mapping

People often use owl:equivalentProperty, owl:equivalentClass (and less often rdfs:subPropertyOf, rdfs:subClassOf) to map ontologies. However, every such assertion means that many more statements are inferred (owl:equivalentProperty works as a pair of rdfs:subPropertyOf, and owl:equivalentClass works as a pair of rdfs:subClassOf).

A good example is DCTerms (DCT): almost every DC property has a declared DCT subproperty, and there is also a hierarchy amongst the DCT properties, for instance:

dcterms:created rdfs:subPropertyOf dc:date, dcterms:date.
dcterms:date rdfs:subPropertyOf dc:date.

This means that every dcterms:created statement will expand to 3 statements. So, do not load the DC ontology unless you really need these inferred dc:date statements.
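
To illustrate, a sketch with a hypothetical :doc resource (the date literal is arbitrary):

PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX :        <http://example.com/>

INSERT DATA {
  :doc dcterms:created "2014-07-28" .
}

# With the DC/DCT subproperty hierarchy loaded, GraphDB additionally infers:
#   :doc dcterms:date "2014-07-28" .
#   :doc dc:date      "2014-07-28" .
# so one explicit statement is stored as three.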

Consider avoiding inverse statements

Inverse properties (e.g., :p owl:inverseOf :q) offer some convenience in querying, but are never necessary:

  • SPARQL natively has bidirectional data access: instead of ?x :q ?y, you can always query for ?y :p ?x.

  • You can even invert the direction in a property path: instead of ?x :p1/:q ?y, use ?x :p1/(^:p) ?y.
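
A minimal sketch, assuming a hypothetical :memberOf property whose inverse (:hasMember) is neither asserted nor inferred:

PREFIX : <http://example.com/>

# Instead of querying an inverse property (?team :hasMember ?person),
# traverse the original property in the opposite direction:
SELECT ?team ?person WHERE {
  ?team ^:memberOf ?person .    # same solutions as ?person :memberOf ?team
}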

If an ontology defines inverses but you skip inverse reasoning, you have to check which of the two properties is used in a particular dataset, and write your queries carefully.

The Provenance Ontology (PROV-O) has considered this dilemma thoroughly, and has abstained from defining inverses to “avoid the need for OWL reasoning, additional code, and larger queries” (see http://www.w3.org/TR/prov-o/#inverse-names).

Consider avoiding long transitive chains

A chain of \(n\) transitive relations (e.g., rdfs:subClassOf) causes GraphDB to infer and store a further \((n^2 - n) / 2\) statements; for example, a chain of 100 relations yields 4,950 additional statements. If the relationship is also symmetric (e.g., in a family ontology with a predicate such as relatedTo), then there will be \(n^2 - n\) inferred statements.

Consider removing the transitivity and/or symmetry of relations that make long chains. Or, if you must have them, consider the implementation of TransitiveProperty through step property, which can be faster than the standard implementation of owl:TransitiveProperty.

Consider specialized property constructs

While OWL2 has very powerful class constructs, its property constructs are quite weak. Some widely used OWL2 property constructs can be implemented faster with specialized rules.

See this draft for some ideas and clear illustrations. Below, we describe three of these ideas.

Tip

To learn more, see a detailed account of applying some of these ideas in a real-world setting. Here are the respective rule implementations.

PropChain

Consider 2-place PropChain instead of general owl:propertyChainAxiom.

owl:propertyChainAxiom needs to use intermediate nodes and edges in order to unroll the rdf:List representing the chain. Since most chains found in practice are 2-place chains (and a chain of any length can be implemented as a sequence of 2-place chains), consider a rule such as the following:

Id: ptop_PropChain
  t <ptop:premise1>   p1
  t <ptop:premise2>   p2
  t <ptop:conclusion> q
  t <rdf:type> <ptop:PropChain>
  x p1 y
  y p2 z
  ----------------
  x q z

It is used with axiomatic triples as in the following:

@prefix ptop: <http://www.ontotext.com/proton/protontop#>.
:t a ptop:PropChain; ptop:premise1 :p1; ptop:premise2 :p2; ptop:conclusion :q.
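
For illustration, a sketch with hypothetical :hasParent, :hasBrother, and :hasUncle properties:

PREFIX ptop: <http://www.ontotext.com/proton/protontop#>
PREFIX :     <http://example.com/>

INSERT DATA {
  # axiomatic triples: hasParent followed by hasBrother implies hasUncle
  :t a ptop:PropChain ;
     ptop:premise1   :hasParent ;
     ptop:premise2   :hasBrother ;
     ptop:conclusion :hasUncle .

  :john :hasParent  :mary .
  :mary :hasBrother :bob .
}

# With the ptop_PropChain rule active, GraphDB infers:  :john :hasUncle :bob .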

transitiveOver

psys:transitiveOver has been part of Ontotext’s PROTON ontology since 2008. It is defined as follows:

Id: psys_transitiveOver
  p <psys:transitiveOver> q
  x p y
  y q z
  ---------------
  x p z

It is a specialized PropChain, where premise1 and conclusion coincide. It allows you to chain p with q on the right, yielding p. For example, the inferencing of types along the class hierarchy can be expressed as:

rdf:type psys:transitiveOver rdfs:subClassOf
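
For example, a sketch assuming the psys: prefix resolves to the PROTON system namespace used by your ruleset:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX psys: <http://proton.semanticweb.org/protonsys#>
PREFIX :     <http://example.com/>

INSERT DATA {
  rdf:type psys:transitiveOver rdfs:subClassOf .   # the axiomatic triple from above

  :Dog rdfs:subClassOf :Animal .
  :rex rdf:type :Dog .
}

# psys_transitiveOver then infers:  :rex rdf:type :Animal .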

TransitiveProperty through step property

owl:TransitiveProperty is widely used and is usually implemented as follows:

Id: owl_TransitiveProperty
  p <rdf:type> <owl:TransitiveProperty>
  x p y
  y p z
  ----------
  x p z

You may recognize this as a self-chain, thus a specialization of psys:transitiveOver, i.e.:

?p rdf:type owl:TransitiveProperty <=> ?p psys:transitiveOver ?p

Most transitive properties comprise transitive closure over a basic ‘step’ property. For example, skos:broaderTransitive is based on skos:broader and is implemented as:

skos:broader rdfs:subPropertyOf skos:broaderTransitive.
skos:broaderTransitive a owl:TransitiveProperty.

Now consider a chain of N skos:broader relations between two nodes. The owl_TransitiveProperty rule has to consider every split of the chain, thus deriving the same closure statement between the two end nodes about N times; summed over all node pairs, this leads to cubic inference complexity over the length of the chain.

This can be optimized by looking for the step property s and extending the chain only at the right end:

Id: TransitiveUsingStep
  p <rdf:type> <owl:TransitiveProperty>
  s <rdfs:subPropertyOf> p
  x p y
  y s z
  ----------
  x p z

However, this would not make the same inferences as owl_TransitiveProperty if someone inserts the transitive property explicitly (which is a bad practice).

A more robust approach is to declare the step and transitive properties together using psys:transitiveOver, for instance:

skos:broader rdfs:subPropertyOf skos:broaderTransitive.
skos:broaderTransitive psys:transitiveOver skos:broader.
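
Here is a sketch of this setup with a couple of hypothetical concepts (again assuming the psys: namespace of your ruleset):

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX psys: <http://proton.semanticweb.org/protonsys#>
PREFIX :     <http://example.com/>

INSERT DATA {
  skos:broader rdfs:subPropertyOf skos:broaderTransitive .
  skos:broaderTransitive psys:transitiveOver skos:broader .

  :cat    skos:broader :mammal .
  :mammal skos:broader :animal .
}

# Inferred closure, each statement derived by extending the chain only at the right end:
#   :cat    skos:broaderTransitive :mammal , :animal .
#   :mammal skos:broaderTransitive :animal .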

Translating OWL constructs to specialized property constructs

Other options for optimizing your rulesets to make them faster:

  • ptop:transitiveOver is faster than owl:TransitiveProperty: quadratic vs cubic complexity over the length of transitive chains.

  • ptop:PropChain (a 2-place chain) is faster than general owl:propertyChainAxiom (n-place chain) because it does not need to unroll the rdf:List underlying the representation of owl:propertyChainAxiom.

Under some conditions, you can translate the standard OWL constructs to these custom constructs to have both standards compliance and faster speed:

  • use rule TransitiveUsingStep if every TransitiveProperty p (e.g., skos:broaderTransitive) is defined over a step property s (e.g., skos:broader) and you do not insert p directly;

  • if you use only 2-step owl:propertyChainAxiom axioms, translate them to the custom construct using the following rule, and then infer using rule ptop_PropChain (see the example after the rule):

    Id: ptop_PropChain_from_propertyChainAxiom
      q <owl:propertyChainAxiom> l1
      l1 <rdf:first> p1
      l1 <rdf:rest> l2
      l2 <rdf:first> p2
      l2 <rdf:rest> <rdf:nil>
      ----------------------
      t <ptop:premise1>   p1
      t <ptop:premise2>   p2
      t <ptop:conclusion> q
      t <rdf:type> <ptop:PropChain>
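
    For example, a sketch with a hypothetical 2-step chain (the property names and the _:t node in the comment are purely illustrative):

    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX :    <http://example.com/>

    INSERT DATA {
      # hasParent followed by hasBrother implies hasUncle
      :hasUncle owl:propertyChainAxiom ( :hasParent :hasBrother ) .
    }

    # The translation rule matches the two-element rdf:List and asserts
    # (t is unbound in the conclusion, so it becomes a fresh blank node):
    #   _:t ptop:premise1 :hasParent ; ptop:premise2 :hasBrother ;
    #       ptop:conclusion :hasUncle ; rdf:type ptop:PropChain .
    # after which ptop_PropChain performs the actual chain inference.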
    

Additional ruleset usage optimization

GraphDB applies special processing to the following rules so that inferred statements such as <P a rdf:Property>, <P rdfs:subPropertyOf P> and <X a rdfs:Resource> can appear in the repository without slowing down inference:

/*partialRDFS*/
Id: rdf1_rdfs4a_4b
x a y
-------------------------------
a <rdf:type> <rdf:Property>
x <rdf:type> <rdfs:Resource>
a <rdf:type> <rdfs:Resource>
y <rdf:type> <rdfs:Resource>
/*partialRDFS*/

Id: rdfs6
a <rdf:type> <rdf:Property>
-------------------------------
a <rdfs:subPropertyOf> a

According to these rules, for every statement that comes into the repository, its subject, predicate, and object are resources, and its predicate is an rdf:Property, which the second rule then makes a subPropertyOf itself (the reflexivity of subPropertyOf). If executed for every statement, however, these rules present a challenge similar to that of owl:sameAs. To avoid the performance drop, GraphDB asserts these statements through code, so that <P a rdf:Property> and <X a rdfs:Resource> are asserted only once, when a property or a resource is encountered for the first time (except in the ‘optimized’ rulesets, where rdfs:Resource is omitted because such inference is of very limited use).
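
For example, after inserting a single statement with a previously unseen predicate, you can check the auto-asserted statements (a sketch with hypothetical resources; in the ‘optimized’ rulesets the rdfs:Resource statements will be missing):

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX :     <http://example.com/>

INSERT DATA { :x :p :y . }

# Run as a separate query; both statements should now be present:
ASK { :p rdf:type rdf:Property . :p rdfs:subPropertyOf :p . }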

If you start with the empty ruleset, <P a rdf:Property>, <P rdfs:subPropertyOf P> and <X a rdfs:Resource> statements are not inferred until you switch to a non-empty ruleset. After the switch, the inference takes place only for newly encountered properties and resources.

Conversely, if you start with a non-empty ruleset and switch to the empty one, the <P a rdf:Property>, <P rdfs:subPropertyOf P> and <X a rdfs:Resource> statements inferred so far will remain. This is true even if you delete statements or recompute the inferred closure.