Graph path search¶
What’s in this document?
The challenge¶
The SPARQL 1.1 property paths feature supported by GraphDB enables graph traversal, allowing you to discover relationships between resources through arbitrary length patterns. However, property paths uncover the start and end nodes of a path, but not the intermediate ones, meaning that traceability remains a challenge. With it as a key factor in a variety of use cases and fields, such as data fabric analysis of supply chains, clinical trials in drug research, or social media management, discovering connections between resources must come hand in hand with the ability to explain them to key stakeholders.
What the extension does¶
GraphDB solves this problem by extending SPARQL with the Graph Path Search functionality that allows you to not only find complex relationships between resources but also explore them and use them as filters to identify graph patterns. It includes features for Shortest path and All paths search, which enable you to explore the connecting edges (RDF statements) between resources for the shortest property paths and subsequently for all connecting paths. Other supported features include finding the shortest distance between resources and discovering cyclical dependencies in a graph.
It also supports wildcard property search and more targeted graph pattern search. A graph pattern is an edge abstraction that can be used to define more complex relationships between resources in a graph. It targets specific types of relationships in order to filter and limit the amount of paths returned. For example, it can define indirect relationships such as N-ary relations that rely on another resource and that cannot be expressed using a standard subject-predicate-object directional relationship.
The graph path search extension is compatible with the GraphDB service plugin syntax, which allows for easy integration into queries.
For the examples included further down in this page, we have used a dataset containing Marvel Studios-related data combined with some information from DBpedia. To try them out yourself, download
it and load it into a GraphDB repository via .
Graph path search features¶
Four graph path search features are supported: Shortest path, All paths, Shortest distance, and Cyclic path.
For Shortest path and All paths, the following is valid:
All of the shortest paths with the shortest length are returned. If, when searching for the shortest path between two nodes, there are several different paths that meet this requirement, all of them will be returned as results.
Bindings for at least the source and/or destination (preferably both) must be provided.
The
startNode
andendNode
properties are unbound prior to path evaluation and are bound by the path search for each edge returned by the query. If a graph pattern is used, they show the relation between the two nodes, and are bound by the path search dynamically and recursively.Edges can be returned as RDF-star statements.
Each binding can also be returned separately.
When using a wildcard predicate pattern, the edge label (predicate) can be accessed as well.
All of the graph path search features support using a literal as a destination. Both source and destination can be literals (e.g., N-ary relations).
path:findPath
is a required property that defines the type of search function.
The features are defined by three type of properties described in detail below.
Path Search Kinds
Property name |
Description |
---|---|
|
Required property that computes the shortest path between two input nodes or between one bound and one unbound node. If, when searching for the shortest path between two nodes, there are several different paths that meet this requirement, all of them will be returned as results. |
|
Required property that finds all paths between two nodes or between all nodes and the starting node. |
|
Required property that finds the distance of the shortest path between two resources, which is the number of edges that connect the resources. |
|
Required property that finds cyclic dependencies for a given resource, meaning that a resource points back to itself. |
Variable Bindings
Property name |
Description |
Supported in feature |
---|---|---|
|
Required variable binding that specifies the source node from which the path search commences. If a destination is selected, this variable can be optional. |
Shortest path
All paths
Shortest distance
Cyclic path
|
|
Required variable binding that specifies the destination node where the path traversal completes. If a source is selected, this variable can be optional. |
Shortest path
All paths
Shortest distance
|
|
Required variable binding that returns the value of |
Shortest distance
|
|
Optional variable binding used to view the path edges as RDF-star statements.
If a wildcard predicate pattern is used, the actual properties connecting the
resources inside the path would be fetched as well. If a graph pattern is used,
they would not be accessible, and the magic predicate
|
Shortest path
All paths
Cyclic path
|
|
Optional variable binding that specifies the starting resource in the recursive graph pattern. This variable should only be used when defining a graph pattern rather than using a wildcard predicate pattern. It will return the interim source node for each edge inside the path. Equally good to use with and without graph pattern in cases where we do not care about the entire edge. |
Shortest path
All paths
Shortest distance
Cyclic path
|
|
Optional variable binding that specifies the ending resource in the recursive graph pattern. This variable should only be used when defining a graph pattern rather than using a wildcard predicate pattern. It will return the interim destination node for each edge inside the path. Equally good to use with and without pattern in cases where we do not care about the entire edge. |
Shortest path
All paths
Shortest distance
Cyclic path
|
|
Optional variable binding used to view the properties connecting the resources inside a path at each step. This variable can only be used with a wildcard predicate search. |
Shortest path
All paths
Cyclic path
|
|
Optional variable binding that returns the index of each edge inside a path in incremental order. It follows the Java array indexing notation that starts from 0. |
Shortest path
All paths
Cyclic path
|
|
Optional variable binding that returns the index of each path returned by the graph path search in incremental order. For each path, all edges that constitute it will have the same path index. Path indexing follows the Java array indexing notation that starts from 0. |
Shortest path
All paths
Cyclic path
|
|
Optional variable binding that returns bindings from the graph pattern query service. Can also be specified in optional blocks, unions etc. Has to be defined inside the main service of the path search query, and the names defined in the parameters of the search query have to be present in the nested graph pattern service. Cannot be used with a wildcard predicate. |
Shortest path
All paths
Cyclic path
|
|
Optional variable binding that traverses adjacent nodes both in the S-P-O order and the O-P-S order where the subject and object are the recursively evaluated start and end nodes. Can also be specified together with export bindings. |
Shortest path
All paths
Shortest distance
Cyclic path
|
Filtering Parameters
Property name |
Description |
Supported in feature |
---|---|---|
|
Optional variable binding that specifies the minimal path length returned by
an all paths search. This property is inclusive (meaning that a min length of 3
edges or edge abstractions for graph patterns would fetch all paths with length 3
and higher) and requires a value of type
xsd:int .The default value is -1, meaning that there is no minimal requirement.
|
All paths
Cyclic path
|
|
Optional variable binding that specifies the maximum path length returned by
an all paths or shortest path search. This property is inclusive (meaning that a
max length of 3 edges or edge abstractions for graph patterns would fetch all
paths with length 3 and less) and requires a value of type
xsd:int .The default value is 8, but if set to -1, the graph path search
would fetch all the paths with no limit in terms of length.
|
Shortest path
All paths
Shortest distance
Cyclic path
|
Required properties include a binding for source and/or destination, as well as the type of the search.
Optional properties include min/max path length, edge bindings, or path indexing. Setting a maximum path length can be useful, for instance, when you are querying a large repository of over several hundred million statements and want to limit the results so as to not strain the database.
Shortest path¶
The feature finds the shortest path between two input nodes or between one bound and one unbound node. It recursively evaluates the graph pattern in the query and replaces the start variable with the binding of the end variable in the previous execution. If we have specified a start node in the query, its value is used for the first evaluation of the graph pattern. If we have specified an end node, the query execution will stop when that end node is reached.
The shortest path feature can be used with a wildcard predicate as well as a graph pattern that is used as an edge abstraction. With it, we can impose filtering through property negation or selection, define indirect relationships, specify named graphs, etc.
Note
Inside the graph pattern, we cannot define other sub-queries or use federated queries for performance reasons. The variables bound as objects to the path:startNode
and path:endNode
properties are required to be present at least once inside the graph pattern.
Examples¶
Let’s try it out with queries that we will run against the Marvel Studios dataset that we loaded into GraphDB earlier.
1. Shortest path search with wildcard predicate
Suppose we want to find the shortest path between source node - the movie “The Black Panther (1977)”, and destination node - Marvel Comics’ creative leader Stan Lee.
In the Workbench SPARQL editor, run the following query:
PREFIX path: <http://www.ontotext.com/path#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?pathIndex ?edgeIndex ?edge
WHERE {
VALUES (?src ?dst) {
( dbr:The_Black_Panther_\(1977_film\) dbr:Stan_Lee )
}
SERVICE path:search {
[] path:findPath path:shortestPath ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:pathIndex ?pathIndex ;
path:resultBindingIndex ?edgeIndex ;
path:resultBinding ?edge ;
.
}
}
Here, the path traversal is done by using a wildcard predicate. This is because we want to explore the predicates connecting the resources inside the path, and we do not know the relationships within the data.
The path:resultBinding
property returns path edges as RDF-star statements. Each edge is indexed with the path:resultBindingIndex
property, and each of the shortest paths is indexed with the path:pathIndex
property.
The results show that there are ten shortest paths between Stan Lee and the 1977 “Black Panther” movie (paths 0-9), each consisting of four edges. The first one, for example, reveals the following relationship:
“The Black Panther (1977)” is a different movie from “Black Panther”. The studio that made “Black Panther” is Marvel Studios, founded by Marvel Entertainment, where Stan Lee is a key person.

We can also trace the path in the Visual graph of the Workbench.
Go to
to enable it.From
, search for the resource The Black Panther (1977) (the resource view will autocomplete the IRI).Trace the identified path.
Note
Due to the large number of connections in the dataset and for better readability, in this and the following examples, the relationships in the Visual graph are filtered to display only the resources connected by preferred predicates. (In our case here: differentFrom
, studio
, founder
, and keyPerson
)

2. Shortest path search with graph pattern
In this query, we will again be searching for the shortest path between source node “The Black Panther (1977)” and destination node Stan Lee, but this time excluding any properties of the type http://dbpedia.org/property/keyPerson
. The path traversal will be executed using a graph pattern specifying the exclusion of this property type through property negation with the SPARQL 1.1 property paths syntax.
PREFIX path: <http://www.ontotext.com/path#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?start ?end ?index ?path
WHERE {
VALUES (?src ?dst) {
( dbr:The_Black_Panther_\(1977_film\) dbr:Stan_Lee )
}
SERVICE <http://www.ontotext.com/path#search> {
<urn:path> path:findPath path:shortestPath ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:pathIndex ?path ;
path:startNode ?start;
path:endNode ?end;
path:resultBindingIndex ?index .
SERVICE <urn:path> {
?start !dbp:keyPerson ?end
}
}
}
The paths are “served” by the nested SERVICE <urn:path>
sub-clause where the service IRI coincides with the subject node invoking path:findPath
. The paths connect the nodes specified by the path:startNode
and path:endNode
bindings.
As we are using a graph pattern to specify the relation, we cannot view the predicates connecting the resources, i.e., path:resultBinding
is not applicable, but we can still view the nodes.
As in the previous example, we can index the edge bindings with the path:resultBindingIndex
property, and index each of the shortest paths with the path:pathIndex
property.
After excluding the DBpedia keyPerson
property from the search, two shortest paths between these resources are returned as results:
first path is: “The Black Panther (1977)” - “Black Panther” - Marvel Studios - Marvel Entertainment - Stan Lee
second path is: “The Black Panther (1977)” - “Black Panther” - Marvel Studios - Marvel Productions - Stan Lee

In the Visual graph, it will look like this:

All paths¶
This feature finds all paths between two nodes or between all nodes and the starting/destination node. It can be used with a wildcard predicate, as well as with more complex graph patterns and relationships. With it, we can also impose filtering with min/max number of edges, and can include or exclude inferred edges.
Examples¶
1. All paths search with unbound source
The next query will find all resources and their respective paths that can reach resource Stan Lee with a minimum of five edges using a wildcard predicate pattern.
PREFIX path: <http://www.ontotext.com/path#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?edge ?index ?path
WHERE {
VALUES (?dst) {
( dbr:Stan_Lee )
}
SERVICE <http://www.ontotext.com/path#search> {
<urn:path> path:findPath path:allPaths ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:pathIndex ?path ;
path:minPathLength 5 ;
path:resultBinding ?edge ;
path:resultBindingIndex ?index .
}
}
As with Shortest path, path edges are returned as RDF-star statements through the path:resultBinding
property. Each edge is indexed with the path:resultBindingIndex
property.
The first returned path will be:

Visualizing path search results is possible through the CONSTRUCT query where you can propagate bindings from each edge through the path:startNode
, path:endNode
, path:exportBinding
(for more complex traversals), and path:propertyBinding
(when not specifying graph patterns) to the CONSTRUCT query projection.
PREFIX path: <http://www.ontotext.com/path#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbr: <http://dbpedia.org/resource/>
CONSTRUCT {
?start ?edgeLabel ?end
} WHERE {
VALUES (?dst) {
( dbr:Stan_Lee )
}
SERVICE <http://www.ontotext.com/path#search> {
<urn:path> path:findPath path:allPaths ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:minPathLength 5 ;
path:startNode ?start ;
path:propertyBinding ?edgeLabel ;
path:endNode ?end ;
}
}
With the Visual button now visible at the bottom right of the SPARQL editor, you can see the results in the visual graph:

Warning
The graph visualization tool is not fully compatible with the graph path search functionality and in most cases would not display every path returned by the path search query.
2. All paths search with unbound destination
Now, let’s find all resources and their respective paths that can be reached by the resource “Guardians of the Galaxy (TV series)” with a minimum of four and a maximum of five edges using a wildcard predicate pattern.
PREFIX path: <http://www.ontotext.com/path#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?start ?property ?end ?index ?path
WHERE {
VALUES (?src) {
( dbr:Guardians_of_the_Galaxy_\(TV_series\))
}
SERVICE <http://www.ontotext.com/path#search> {
<urn:path> path:findPath path:allPaths ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:minPathLength 4 ;
path:maxPathLength 5 ;
path:startNode ?start;
path:propertyBinding ?property ;
path:endNode ?end;
path:resultBindingIndex ?index ;
path:pathIndex ?path .
}
}
All edge nodes as well as predicates connecting them are viewed through the path:startNode
, path:propertyBinding
, and path:endNode
properties.
Tip
There is more than one way to return results – for example, path edges returned as RDF-star statements through the path:resultBinding
property.
These will be the first four paths returned:

Which will be visualized like this:

3. All paths search with graph pattern - bound source & destination
Similarly to the example for shortest path search with graph pattern from earlier, we will be searching for all paths between source node “The Black Panther (1977)” and destination node Stan Lee, but this time excluding any properties of the type http://dbpedia.org/property/keyPerson
. The path traversal will be executed using a graph pattern specifying the exclusion of this property type through property negation with the SPARQL 1.1 property paths syntax.
PREFIX path: <http://www.ontotext.com/path#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?edge ?index ?path
WHERE {
VALUES (?src ?dst) {
( dbr:The_Black_Panther_\(1977_film\) dbr:Stan_Lee )
}
SERVICE <http://www.ontotext.com/path#search> {
<urn:path> path:findPath path:allPaths ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:pathIndex ?path ;
path:resultBinding ?edge;
path:startNode ?start;
path:endNode ?end;
path:resultBindingIndex ?index .
SERVICE <urn:path> {
?start !dbp:keyPerson ?end
}
}
}
Path edges are returned as RDF-star statements through the path:resultBinding
property, and each edge is indexed with the path:resultBindingIndex
property.
We can see that the first identified path excluding the DBpedia keyPerson
property traverses the following nodes:
The movie “The Black Panther (1977)” - Marvel Studios - Marvel Entertainment - Stan Lee.

Note
Keep in mind that when using graph patterns, we cannot view the predicates connecting the nodes. Thus, when exploring the path edges as RDF-star statements, the predicate http://www.ontotext.com/path#connectedTo
is generated.
4. All paths search with N-ary relation
You might be familiar with the Six Degrees of Kevin Bacon parlor game where players arbitrarily choose an actor and then connect them to another actor via a film that both actors have starred in, repeating this process to try and find the shortest path that ultimately leads to famous US actor Kevin Bacon. The game is a reference to the six degrees of separation concept based on the assumption that any two people on Earth are six or fewer acquaintance links apart.
In this context, let’s find all paths between source node Chris Evans and destination node Chris Hemsworth where the relationship between nodes is defined through an N-ary graph pattern based on actors co-starring in movies. The path search is limited with a minimum of two edges.
PREFIX path: <http://www.ontotext.com/path#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?edge ?index ?path
WHERE {
VALUES (?src ?dst) {
( dbr:Chris_Evans_\(actor\) dbr:Chris_Hemsworth )
}
SERVICE <http://www.ontotext.com/path#search> {
<urn:path> path:findPath path:allPaths ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:pathIndex ?path ;
path:minPathLength 2 ;
path:startNode ?start;
path:resultBinding ?edge ;
path:endNode ?end;
path:resultBindingIndex ?index .
SERVICE <urn:path> {
?film a dbo:Film .
?film dbp:starring ?start .
?film dbp:starring ?end .
}
}
}
The first two returned paths are:

Shortest distance¶
The feature finds the distance of the shortest path between two resources, which is the number of edges that connect the resources. This is done through the path:distanceBinding
property. The nodes themselves will not be returned as results, only the distance.
Example¶
The next query finds the shortest distance between source node Marvel Studios and a date literal which represents Marvel Studios President Kevin Feige’s birthday.
PREFIX path: <http://www.ontotext.com/path#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?dist
WHERE {
VALUES (?src ?dst) {
( dbr:Marvel_Studios "1973-06-02"^^xsd:date )
}
SERVICE <http://www.ontotext.com/path#search> {
<urn:path> path:findPath path:distance ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:distanceBinding ?dist;
}
}
We can see that the shortest path connecting them consists of two edges.

Cyclic path¶
With the cyclic path search, we can explore self-referring relationships between resources. Similarly to the All paths search, this one can also be limited with min/max values.
Example¶
The following query finds all paths that begin and end with source node Marvel Studios.
PREFIX path: <http://www.ontotext.com/path#>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?edge ?index ?path
WHERE {
VALUES (?src) {
(dbr:Marvel_Studios)
}
SERVICE <http://www.ontotext.com/path#search> {
<urn:path> path:findPath path:cycle ;
path:sourceNode ?src ;
path:resultBinding ?edge ;
path:pathIndex ?path ;
path:resultBindingIndex ?index .
}
}
The first three returned paths will be:

In the visual graph:

Exportable graph pattern bindings¶
Export bindings allow you to project any number of bindings from the graph pattern query service. The power of SPARQL graph pattern-matching property paths is combined with GraphDB’s path search algorithm, enabling the user to restrict the start and the end nodes of the path search to those pairs that match a particular graph pattern defined as SPARQL property path. You can “export” bindings from such graph patterns and this way get additional details about the found paths.
The export bindings as parameters have to be defined inside the main service of the path search query with the magic predicate <http://www.ontotext.com/path#exportBinding>
(or simply path:exportBinding
). Keep in mind that the binding names defined in the parameters of the search query have to be present in the nested graph pattern service.
Example¶
This query finds all paths between source node Chris Evans and destination node Chris Hemsworth where the relationship between nodes is defined through an N-ary graph pattern based on actors co-starring in movies. The path search is limited to a minimum of two edges. We also want to see the movies and their labels as part of the returned path.
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX path: <http://www.ontotext.com/path#>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?start ?end ?index ?path ?label ?film
WHERE {
VALUES (?src ?dst) {
( dbr:Chris_Evans_\(actor\) dbr:Chris_Hemsworth )
}
SERVICE path:search {
<urn:path> path:findPath path:allPaths ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:pathIndex ?path ;
path:resultBindingIndex ?index ;
path:minPathLength 2 ;
path:startNode ?start;
path:exportBinding ?film ;
path:exportBinding ?label ;
path:endNode ?end;
SERVICE <urn:path> {
?film a dbo:Film .
?film rdfs:label ?label .
?film dbp:starring ?start .
?film dbp:starring ?end .
}
}
}
The first six returned paths will be:

Which in the visual graph would look like this:

Bidirectional graph path search¶
The bidirectional search functionality can be used to traverse paths as if the graph is undirected, i.e., as if the edges between the nodes have no direction. Technically, bidirectional search traverses adjacent nodes both in S-P-O and O-P-S order, where the subject and object are the recursively evaluated start and end nodes. It can be used with all functions and can be combined with wildcard and graph pattern search as well as with exportable graph pattern bindings.
In order to do bidirectional search, you can use the magic predicate <http://www.ontotext.com/path#bidirectional>
(or simply path:bidirectional
) followed by value true
of type xsd:boolean
.
Example¶
This query finds the shortest bidirectional path between source node The Black Panther movie from 1977 and destination node Marvel Studios.
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX path: <http://www.ontotext.com/path#>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?edge ?index ?path
WHERE {
VALUES (?src ?dst) {
( dbr:The_Black_Panther_\(1977_film\) dbr:Marvel_Studios )
}
SERVICE path:search {
<urn:path> path:findPath path:allPaths ;
path:sourceNode ?src ;
path:destinationNode ?dst ;
path:resultBinding ?edge ;
path:pathIndex ?path ;
path:maxPathLength 4 ;
path:bidirectional true ;
path:resultBindingIndex ?index ;
}
}
The first two returned bidirectional paths are:

In the visual graph:
