Rules optimisations

GraphDB includes useful feature that allows you to profile and debug rule performance.

How to enable rule profiling

Rule profiling prints out statistics about rule execution.

To enable rule profiling, start GraphDB with the following Java option:

-Denable-debug-rules=true

This enables the collection of rule statistics (various counters).

Important

Rule Profiling Limitations

  • Must use a custom ruleset, since builtin rulesets don’t have the required instrumentation (counters);
  • The debug rules statistics are available only for importing data in serial mode. It doesn’t work for Parallel Inferencing, which is default. Check “Force serial pipeline” in Import settings dialog to enable it.

Warning

Rule profiling slows down the rule execution (the leading premise checking part) by 10-30%, so do not use it in production.

Log file

When rule profiling is enabled:

  • Complete rule statistics are printed every 1M statements, every 5 minutes, or on shutdown (whichever comes first).
  • They are written to graphdb-folder/logs/main-<date>.log;
  • The descriptive rule stats format looks like this:
----------rs start----------
Rule statistics for repository <name> :
RULE: ...

Time overall (all rules): ... ns.
----------rs end----------
  • Stats are printed for each active repository
  • Stats are cumulative, so find the last section rs start ... rs end for your repo of interest.
  • Rule variants are ordered by total time (descending).

For example, consider the following rule :

Id: ptop_PropRestr
  t <ptop:premise>     p
  t <ptop:restriction> r
  t <ptop:conclusion>  q
  t <rdf:type>         <ptop:PropRestr>
  x p y
  x r y
  ----------------
  x q y

This is a conjunction of two props. It is declared with the axiomatic (A-Box) triples involving t. Whenever the premise p and restriction r hold between two resources, the rule infers the conclusion q between the same resources, i.e., p & r => q

The corresponding log for variant 4 of this rule may look like the following:

RULE ptop_PropRestr_4 invoked 163,475,763 times.
ptop_PropRestr_4:
e b f
a ptop_premise b
a rdf_type ptop_PropRestr
e c f
a ptop_restriction c
a ptop_conclusion d
------------------------------------
e d f

a ptop_conclusion d invoked 1,456,793 times and took 1,814,710,710 ns.
a rdf_type ptop_PropRestr invoked 7,261,649 times and took 9,409,794,441 ns.
a ptop_restriction c invoked 1,456,793 times and took 1,901,987,589 ns.
e c f invoked 17,897,752 times and took 635,785,943,152 ns.
a ptop_premise b invoked 10,175,697 times and took 9,669,316,036 ns.
Fired 1,456,793 times and took 157,163,249,304 ns.
Inferred 1,456,793 statements.
Time overall: 815,745,001,232 ns.

Note

Variable names are renamed due to the compilation to Java bytecode.

Understanding the output:

  • The premises are checked in the order given in RULE. (The premise statistics printed after the blank line are not in any particular order.)

  • Invoked is the number of times the rule variant or specific premise was checked successfully. Tracing through the rule:

    • ptop_PropRestr_4 checked successfully 163M times: for each incoming triple, since the lead premise (e b f = x p y) is a free pattern.

    • a ptop_premise b checked successfully 10M times: for each b=p that has an axiomatic triple involving ptop_premise.

      This premise was selected because it has only 1 unbound variable a and it is first in the rule text.

    • a rdf_type ptop_PropRestr checked successfully 7M times: for each ptop_premise that has type ptop_PropRestr.

      This premise was selected because it has 0 unbound variables (after the previous premise binds a).

  • The time to check each premise is printed in ns.

  • Fired is the number of times all premises matched, so the rule variant was fired.

  • Inferred is the number of inferred triples.

    It may be more than fired if there are multiple conclusions.
    It may be less then fired since a duplicate triple is not inferred a second time.
  • Time overall is the total time that this rule variant took.

Excel format

The log records detailed information about each rule and premise, which is indispensable when you are trying to understand which rule is spending too much time. However, it can still be overwhelming because of this level of detail.

Therefore, we have developed the script rule-stats.pl that outputs a TSV file such as the following:

rule ver tried time patts checks time fired time triples speed
ptop_PropChain 4 163475763 776.3 5 117177482 185.3 15547176 590.9 9707142 12505

Parameters:

  • rule: the rule ID (name);
  • ver: the rule version (variant) or “T” for overall rule totals;
  • tried, time: the number of times the rule/variant was tried, the overall time in sec;
  • patts: the number of triple patterns (premises) in the rule, not counting the leading premise;
  • checks, time: the number of times premises were checked, time in sec;
  • fired: the number of times all premises matched, so the rule was fired;
  • triples: the number of inferred triples;
  • speed: inference speed, triples/sec.

Run the script in the following way:

perl rule-stats.pl main-2014-07-28.log > main-2014-07-28.xls

Investigating performance

The following is an example of using the Excel format to investigate where time is spent during rule execution.

Download the example file time-spent-during-rule.xlsx and use it as a template.

_images/main-2014-08-08.png

Note

These formulas are dynamic and they refresh every time you change the filters.

To perform your investigation:

  1. Open the results in Excel.

  2. Set a filter “ver=T” (to first look at rules as a whole, not rule variants).

  3. Sort in a descending order by total “time” (third column).

  4. Check out which rules are highlighted in red (the rules that spend substantial time and whose speed is significantly lower than average).

  5. Pick up a rule (for example, PropRestr).

  6. Filter on “rule=PropRestr” and “ver<>T” to look at its variants.

    _images/main-PropRestr.png
  7. Focus on a variant to investigate the reasons for time and speed performance.

In this example, first you have to focus on the variant ptop_PropRestr_5, which spends 30% of the time of this rule, and has very low “speed”. The reason is that it fired 1.4M times but produced only 238 triples, so most of the inferred triples were duplicates.

You can find the definition of this variant in the log file:

RULE ptop_PropRestr_5 invoked 163,475,763 times.
ptop_PropRestr_5:
e c f
a ptop_restriction c
a rdf_type ptop_PropRestr
e b f
a ptop_premise b
a ptop_conclusion d
------------------------------------
e d f

It is very similar to the productive variant ptop_PropRestr_4 (see Log file above):

  • one checks e b f. a ptop_premise b first,
  • the other checks e c f. a ptop_restriction c first.

Still, the function of these premises in the rule is the same and therefore the variant ptop_PropRestr_5 (which is checked after 4) is unproductive.

The most likely way to improve performance would be if you make the two premises use the same axiomatic triple ptop:premise (emphasising they have the same role), and introduce a Cut:

Id: ptop_PropRestr_SYM
  t <ptop:premise>     p
  t <ptop:premise>     r
  t <ptop:conclusion>  q
  t <rdf:type>         <ptop:PropRestr>
  x p y
  x r y                [Cut]
  ----------------
  x q y

The Cut eliminates the rule variant with x r y as leading premise. It is legitimate to do this, since the two variants are the same, up to substitution p<->r.

Note

Introducing a Cut in the original version of the rule would not be legitimate:

Id: ptop_PropRestr_CUT
  t <ptop:premise>     p
  t <ptop:restriction> r
  t <ptop:conclusion>  q
  t <rdf:type>         <ptop:PropRestr>
  x p y
  x r y                [Cut]
  ----------------
  x q y

since it would omit some potential inferences (in the case above, 238 triples), changing the semantics of the rule (see the example below).

Assume these axiomatic triples:

:t_CUT a ptop:PropRestr; ptop:premise :p; ptop:restriction :r; ptop:conclusion :q. # for ptop_PropRestr_CUT
:t_SYM a ptop:PropRestr; ptop:premise :p; ptop:premise     :r; ptop:conclusion :q. # for ptop_PropRestr_SYM

Now consider a sequence of inserted triples :x :p :y. :x :r :y.

  • ptop_PropRestr_CUT will not infer :x :q :y since no variant is fired by the second incoming triple :x :r :y: it is matched against x p y, but there is no axiomatic triple t ptop:premise :r
  • ptop_PropRestr_SYM will infer :x :q :y since the second incoming triple :x :r :y will match x p y and t ptop:premise :r, then the previously inserted :x :p :y will match t ptop:premise :p and the rule will fire.

Tip

Rule execution is often non-intuitive, therefore we recommend that you detailed the speed history and compare the performance after each change.

Hints on optimising GraphDB’s rulesets

The complexity of the ruleset has a large effect on the loading performance, the number of inferred statements and the overall size of the repository after inferencing. The complexity of the standard rulesets increases as follows:

  • no inference (lowest complexity, best performance)
  • rdfs-optimized
  • rdfs
  • rdfs-plus-optimized
  • rdfs-plus
  • owl-horst-optimized
  • owl-horst
  • owl-max-optimized
  • owl-max
  • owl2-ql-optimized
  • owl2-ql
  • owl2-rl-optimized
  • owl2-rl (highest complexity, worst performance)

OWL RL and OWL QL do a lot of heavy work that is often not required by applications. For more details, see OWL compliance.

Know what you want to infer

Check the ‘expansion ratio’ (total/explicit statements) for your dataset and get an idea whether this is what you expect. If your ruleset infers, for example, 4 times more statements over a large number of explicit statements, this will take time, no matter how you try to optimise the rules.

Minimise the number of rules

The number of rules and their complexity affects inferencing performance, even for rules that never infer any new statements. This is because every incoming statement is passed through every variant of every rule to check whether something can be inferred. This often results in many checks and joins, even if the rule never fires.

So, start with a minimal ruleset and add only the additional rules that you require. The default ruleset (rdfs-plus-optimized) works for many people, but you could even consider starting from RDFS. For example, if you need owl:Symmetric and owl:inverseOf on top of RDFS, you can copy only these rules from OWL Horst to RDFS and leave the rest aside.

Conversely, you can start with a bigger standard ruleset and remove the rules that you do not need.

Note

To deploy a custom ruleset, set the ruleset configuration parameter to the full pathname of your custom .pie file.

Write your rules carefully

  • Be careful with the recursive rules as they can lead to an explosion in the number of inferred statements.
  • Always check your spelling:
    • A misspelled variable in a premise leads to a Cartesian explosion of the number of triple joins to be considered by the rule.
    • A misspelled variable in a conclusion (or use an unbound variable) causes new blank nodes to be created. This is almost never what you really want.
  • Order premises by specificity. GraphDB first checks premises with the least number of unbound variables. But if there is a tie, it follows the order given by you. Since you may know the cardinalities of triples in your data, you may be in a better position to determine which premise has better specificity (selectivity).
  • Use [Cut] for premises that have the same role (for an example, see Investigating performance), but be careful not to remove some needed inferences by mistake.

Avoid duplicate statements

Avoid inserting explicit statements in a named graph if the same statements are inferable. GraphDB always stores inferred statements in the default graph, so this will lead to duplicating statements. This will increase the repository size and will slow down query answering.

You can eliminate duplicates from query results using DISTINCT or FROM onto:skip-redundant-implicit (see Other special GraphDB query behaviour). But these are slow operations and it is better not to produce duplicate statements in the first place.

Know the implications of ontology mapping

People often use owl:equivalentProperty, owl:equivalentClass (and less often rdfs:subPropertyOf, rdfs:subClassOf) to map ontologies. But every such assertion means that many more statements are inferred (owl:equivalentProperty works as a pair of rdfs:subPropertyOf, and owl:equivalentClass works as a pair of rdfs:subClassOf).

A good example is DCTerms (DCT): almost each DC property has a declared DCT subproperty and there is also a hierarchy amongst DCT properties, for instance:

dcterms:created rdfs:subPropertyOf dc:date, dcterms:date .
dcterms:date rdfs:subPropertyOf dc:date.

This means that every dcterms:created statement will expand to 3 statements. So, do not load the DC ontology unless you really need these inferred dc:date.

Consider avoiding inverse statements

Inverse properties (e.g., :p owl:inverseOf :q) offer some convenience in querying, but are never necessary:

  • SPARQL natively has bidirectional data access: instead of ?x :q ?y, you can always query for ?y :p ?x.
  • You can even invert the direction in a property path: instead of ?x :p1/:q ?y, use ?x :p1/(^:p) ?y

If an ontology defines inverses but you skip inverse reasoning, you have to check which of the two properties is used in a particular dataset, and write your queries carefully.

The Provenance Ontology (PROV-O) has considered this dilemma carefully and have abstained from defining inverses, to “avoid the need for OWL reasoning, additional code, and larger queries” (see http://www.w3.org/TR/prov-o/#inverse-names).

Consider avoiding long transitive chains

A chain of N transitive relations (e.g., rdfs:subClassOf) causes GraphDB to infer and store a further \((n^2 - n) / 2\) statements. If the relationship is also symmetric (e.g., in a family ontology with a predicate such as relatedTo), then there will be \(n^2 - n\) inferred statements.

Consider removing the transitivity and/or symmetry of relations that make long chains. Or if you must have them, consider the implementation of TransitiveProperty through step property, which can be faster than the standard implementation of owl:TransitiveProperty.

Consider specialised property constructs

While OWL2 has very powerful class constructs, its property constructs are quite weak. Some widely used OWL2 property constructs could be done faster.

See this draft for some ideas and clear illustrations. Below we describe 3 of these ideas.

Tip

To learn more, see a detailed account of applying some of these ideas in a real-world setting. The link provides the respective rule implementations.

PropChain

Consider 2-place PropChain instead of general owl:propertyChainAxiom.

owl:propertyChainAxiom needs to use intermediate nodes and edges in order to unroll the rdf:List representing the chain. Since most chains found in practice are 2-place chains (and a chain of any length can be implemented as a sequence of 2-place chains), consider a rule such as the following:

Id: ptop_PropChain
  t <ptop:premise1>   p1
  t <ptop:premise2>   p2
  t <ptop:conclusion> q
  t <rdf:type> <ptop:PropChain>
  x p1 y
  y p2 z
  ----------------
  x q z

It is used with axiomatic triples as in the following:

@prefix ptop: <http://www.ontotext.com/proton/protontop#>.
:t a ptop:PropChain; ptop:premise1 :p1; ptop:premise2 :p2; ptop:conclusion :q.

transitiveOver

psys:transitiveOver has been part of Ontotext’s PROTON ontology since 2008. It is defined as follows:

Id: psys_transitiveOver
  p <psys:transitiveOver> q
  x p y
  y q z
  ---------------
  x p z

It is a specialised PropChain, where premise1 and conclusion coincide. It allows you to chain p with q on the right, yielding p. For example, the inferencing of types along the class hierarchy can be expressed as:

rdf:type psys:transitiveOver rdfs:subClassOf

TransitiveProperty through step property

owl:TransitiveProperty is widely used and is usually implemented as follows:

Id: owl_TransitiveProperty
  p <rdf:type> <owl:TransitiveProperty>
  x p y
  y p z
  ----------
  x p z

You may recognise this as a self-chain, thus a specialisation of psys:transitiveOver, i.e.:

?p rdf:type owl:TransitiveProperty <=> ?p psys:transitiveOver ?p

Most transitive properties comprise transitive closure over a basic ‘step’ property. For example, skos:broaderTransitive is based on skos:broader and is implemented as:

skos:broader rdfs:subPropertyOf skos:broaderTransitive.
skos:broaderTransitive a owl:TransitiveProperty.

Now consider a chain of N skos:broader between two nodes. The owl_TransitiveProperty rule has to consider every split of the chain, thus inferring the same closure between the two nodes N times, leading to quadratic inference complexity.

This can be optimised by looking for the step property s and extending the chain only at the right end:

Id: TransitiveUsingStep
  p <rdf:type> <owl:TransitiveProperty>
  s <rdfs:subPropertyOf> p
  x p y
  y s z
  ----------
  x p z

However, this would not make the same inferences as owl_TransitiveProperty if someone inserts the transitive property explicitly (which is a bad practice).

It is more robust to declare the step and transitive properties together using psys:transitiveOver, for instance:

skos:broader rdfs:subPropertyOf skos:broaderTransitive.
skos:broaderTransitive psys:transitiveOver skos:broader.