LUBM

What is LUBM

The Lehigh University Benchmark (LUBM) is a benchmark developed for evaluating the performance of Semantic Web repositories with respect to extensional queries over a large dataset that commits to a single ontology. It consists of a university domain ontology, customisable data, a set of test queries, and several performance metrics.

Configuring GraphDB and Sesame

To run LUBM, you may need to modify the setvars script (setvars.cmd on Windows, setvars.sh on Mac/Linux) in the scripts folder of the GraphDB distribution. The most important setting there is JAVA_HOME, the environment variable that specifies which Java virtual machine to use.
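Before writing a path into setvars, it can be useful to check that the path really contains a JVM. The following is a minimal sketch, not part of the GraphDB distribution: the helper name find_java is hypothetical, and it assumes the usual JDK/JRE layout in which the launcher lives at bin/java (or bin/java.exe on Windows).

```python
import os
from pathlib import Path
from typing import Optional


def find_java(java_home: Optional[str]) -> Optional[Path]:
    """Return the java launcher under java_home, or None if it is not there.

    Assumption: a standard JDK/JRE layout with the launcher in <java_home>/bin.
    """
    if not java_home:
        return None
    bin_dir = Path(java_home) / "bin"
    # Windows installs ship java.exe; Mac/Linux installs ship a plain "java" file.
    for name in ("java", "java.exe"):
        candidate = bin_dir / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Check the value currently visible in this shell's environment.
    java = find_java(os.environ.get("JAVA_HOME"))
    print(java if java else "JAVA_HOME is unset or does not point at a JVM")
```

If the function returns None for the path you intend to use, the benchmark scripts are unlikely to find a JVM there either.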

To tune the performance of the benchmark to the particular hardware on which it runs, modify the repository specification file named lubm.ttl located in the benchmark/lubm subfolder of the GraphDB distribution.

Running the benchmark

  1. Generate the test file set with the lubm-generate script (.cmd or .sh) in the benchmark/lubm subfolder of the GraphDB distribution. The distribution includes a pre-built library of the benchmark source code, lubm.jar, located in the ext2 folder. This library also contains the wrapper classes that the benchmark code uses to run against a Sesame repository (configured with GraphDB).

    Note

    The lubm-generate script accepts a single numeric argument, the target number of universities. It creates a subfolder for the test run named univer followed by that number, e.g., univer1000, and places the generated OWL files in this folder.

  2. Edit the lubm.config file: uncomment the benchmark configuration you want to run and comment out the others.

    By default, a single-university dataset is configured. Its configuration section is:

    # 1-university
    [GraphDB_1]
    class=owlim.OwlimWrapper
    data=./univer1
    database=jdbc:ignore
    ontology=http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl
    

    Note

    Use the # symbol at the beginning of a line to comment it out.

  3. Launch the test via the lubm-benchmark script.

    Note

    Before generating datasets or executing the test, make sure that GraphDB is configured by editing lubm.ttl, as discussed in Configuring a Repository.

    The lubm.ttl file provided with the distribution is suitable for running LUBM with small datasets.

    To find out more about this benchmark, visit the LUBM web site.
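Before launching the benchmark against pre-generated data, it can help to confirm that the folder produced in step 1 exists and is not empty. The sketch below follows the naming rule described above (univer plus the number of universities). The helper names are hypothetical, and the assumption that generated files carry the .owl extension should be checked against your own output.

```python
from pathlib import Path


def data_dir_name(universities: int) -> str:
    """Folder name that lubm-generate is described as creating, e.g. univer1000."""
    if universities < 1:
        raise ValueError("the number of universities must be at least 1")
    return f"univer{universities}"


def count_owl_files(base: Path, universities: int) -> int:
    """Count files ending in .owl in the data folder; 0 if the folder is missing.

    Assumption: the generator writes files with the .owl extension.
    """
    folder = base / data_dir_name(universities)
    if not folder.is_dir():
        return 0
    return sum(1 for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".owl")


if __name__ == "__main__":
    # Run from the benchmark/lubm subfolder, where the data folders are created.
    n = 1
    print(f"{data_dir_name(n)}: {count_owl_files(Path.cwd(), n)} OWL file(s)")
```

A count of zero suggests that the generation step was skipped or was run for a different number of universities than the one referenced in lubm.config.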

Alternative to generating the LUBM datasets

It is also possible to generate LUBM datasets on the fly during the loading stage, instead of generating them in advance.

To do this, edit the lubm.config file and set data to a value of the form GENERATE-n, where n is the number of universities, e.g.:

# 2-university
[GraphDB_1]
class=owlim.OwlimWrapper
data=GENERATE-2
database=jdbc:ignore
ontology=http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl

Note

This approach skips the separate generation step and saves the disk space that the generated data files would otherwise occupy.
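The two configuration styles differ only in the data line, which the following sketch makes explicit by rendering a lubm.config section for either case. It is an illustration rather than a tool shipped with GraphDB: render_section is a hypothetical helper, and the GraphDB_n section name and the "# n-university" comment simply extrapolate the pattern of the examples above.

```python
ONTOLOGY = "http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl"


def render_section(universities: int, on_the_fly: bool = False) -> str:
    """Render one lubm.config section as text.

    Assumption: section names follow the GraphDB_<n> pattern of the examples.
    """
    if universities < 1:
        raise ValueError("the number of universities must be at least 1")
    # Pre-generated data lives in ./univer<n>; GENERATE-<n> builds it while loading.
    data = f"GENERATE-{universities}" if on_the_fly else f"./univer{universities}"
    lines = [
        f"# {universities}-university",
        f"[GraphDB_{universities}]",
        "class=owlim.OwlimWrapper",
        f"data={data}",
        "database=jdbc:ignore",
        f"ontology={ONTOLOGY}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_section(2, on_the_fly=True))
```

Calling render_section(1) produces the pre-generated variant with data=./univer1, while render_section(2, on_the_fly=True) produces a section with data=GENERATE-2.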