# RDF rank¶

## What is RDF Rank¶

RDF Rank is an algorithm that identifies the more important or more popular entities in the repository by examining their interconnectedness. The popularity of entities can then be used to order the query results in a similar way to the internet search engines, the way Google orders search results using PageRank.

The RDF Rank component computes a numerical weighting for all nodes in the entire RDF graph stored in the repository, including URIs, blank nodes and literals. The weights are floating point numbers with values between 0 and 1 that can be interpreted as a measure of a node’s relevance/popularity.

Since the values range from 0 to 1, the weights can be used for sorting a result set (the lexicographical order works fine even if the rank literals are interpreted as plain strings).

Here is an example SPARQL query that uses the RDF rank for sorting results by their popularity:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
PREFIX opencyc-en: <http://sw.opencyc.org/2008/06/10/concept/en/>
SELECT * WHERE {
?Person a opencyc-en:Entertainer .
?Person rank:hasRDFRank ?rank .
}
ORDER BY DESC(?rank) LIMIT 100


As seen in the example query, RDF Rank weights are made available via a special system predicate. GraphDB handles triple patterns with the predicate http://www.ontotext.com/owlim/RDFRank#hasRDFRank in a special way, where the object of the statement pattern is bound to a literal containing the RDF Rank of the subject.

rank#hasRDFRank returns the rank with precision of 0.01. You can as well retrieve the rank with precision of 0.001, 0.0001 and 0.00001 using respectively rank#hasRDFRank3, rank#hasRDFRank4 and rank#hasRDFRank5.

In order to use this mechanism, the RDF ranks for the whole repository must be computed in advance. This is done by committing a series of SPARQL updates that use special vocabulary to parameterise the weighting algorithm, followed by an update that triggers the computation itself.

## Parameters¶

RDF Rank is fully controllable from Setup -> RDF Rank.

Parameter Maximum iterations
Predicate http://www.ontotext.com/owlim/RDFRank#maxIterations
Description Sets the maximum number of iterations of the algorithm over all entities in the repository.
Default 20
Example
PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA { rank:maxIterations rank:setParam “16” . }
Parameter Epsilon
Predicate http://www.ontotext.com/owlim/RDFRank#epsilon
Description Terminates the weighting algorithm early when the total change of all RDF Rank scores has fallen below this value.
Default 0.01
Example
PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA { rank:epsilon rank:setParam “0.05” . }

## Full computation¶

To trigger the computation of the RDF Rank values for all resources, use the following update:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA { _:b1 rank:compute _:b2. }


The full computation of RDF Rank values for all resources can be relatively expensive. When new resources have been added to the repository after a previous full computation of the RDF Rank values, you can either have a full re-computation for all resources (see above) or compute only the RDF Rank values for the new resources (an incremental update).

The following control update:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA {_:b1 rank:computeIncremental "true"}


computes RDF Rank values for the resources that do not have an associated value, i.e., the ones that have been added to the repository since the last full RDF Rank computation.

Note

The incremental computation uses a different algorithm, which is lightweight (in order to be fast), but is not as accurate as the proper ranking algorithm. As a result, ranks assigned by the proper and the lightweight algorithms will be slightly different.

## Exporting RDF Rank values¶

The computed weights can be exported to an external file using an update of this form:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA { _:b1 rank:export "/home/user1/rdf_ranks.txt" . }


If the export fails, the update throws an exception and an error message is recorded in the log file.

## Checking the RDF Rank status¶

The RDF Rank plugin can be in one of the following statuses:

/**
* The ranks computation has been canceled
*/
CANCELED,

/**
* The ranks are computed and up-to-date
*/
COMPUTED,

/**
* A computing task is currently in progress
*/
COMPUTING,

/**
* There are no calculated ranks
*/
EMPTY,

/**
* Exception has been thrown during computation
*/
ERROR,

/**
* The ranks are outdated and need computing
*/
OUTDATED,

/**
* The filtering is enabled and it's configuration has been changed since the last full computation
*/
CONFIG_CHANGED


You can get the current status of the plugin by running the following query:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
SELECT ?o WHERE { ?s rank:status ?o }


## Rank Filtering¶

By default the RDF Rank is calculated over the whole repository. This is useful when you want to find the most interconnected and important entities in general.

However, there are times when you are interested only in entities in certain graphs or entities related to a particular predicate. This is why the RDF Rank has a filtered mode - to filter the statements in the repository which are taken under account when calculating the rank.

You can enable the filtered mode with the following query:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA { rank:filtering rank:setParam true }


The filtering of the statements can be performed based on predicate, graph or type - explicit or implicit (inferred). You can make both inclusion and exclusion rules.

In order to include only statements having a particular predicate or being in a particular named graph you should include the predicate / graph IRI in one of the following lists: includedPredicates / includedGraphs. Empty lists are treated as wildcards. Below you can find how to control the lits with SPARQL queries:

Get the content of a list:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
SELECT ?s WHERE { ?s rank:includedPredicates ?o }


Add an IRI to a list:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA { <http:predicate> rank:includedPredicates "add" }


Remove an IRI from a list:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA { <http:predicate> rank:includedPredicates "remove" }


The filtering can be done not only by including statements of interest but by removing ones as well. In order to do so there are 2 additional lists: excludedPredicates and excludedGraphs. These lists take precedence over their inclusion alternatives so if for instance you have the same predicate in both inclusion and exclusion lists it will be treated as excluded. These lists can be controlled in exactly the same way as the inclusion ones.

There is a convenient way to include/exclude all explicit/implicit statements. This is done with 2 parameters - includeExplicit and includeImplicit which are set to true by default. When these parameters are set to true they are just disregarded - do not take part in the filtering. However, if you set them to false they start acting as exclusion rules - this means they take precedence over the inclusion lists.

You can get the status of these parameters using:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
ASK { _:b1 rank:includeExplicit _:b2 . }


You can set value of the parameters with:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA { rank:includeExplicit rank:setParam true }