GraphDB Free 7.1
Table of contents
- General
- Quick start guide
- Installation
- Administration
- Administration tasks
- Administration tools
- Creating locations and repositories
- Configuring a repository
- Sizing guidelines
- Disk space requirements
- Configuring the Entity Pool
- Managing repositories
- Access rights and security
- Backing up and recovering a repository
- Query monitoring and termination
- Database health checks
- Diagnosing and reporting critical errors
- Usage
- Tools
- References
- Release notes
- FAQ
- Support
Disk space requirements
GraphDB disk space requirements for loading a dataset
The disk space required to load a dataset depends on the reasoning complexity (the number of inferred triples), the length of the URIs, the additional indices used, etc. For example, the following table shows the required disk space in bytes per explicit statement when loading the Wordnet dataset with various GraphDB configurations:
Configuration | Bytes per explicit statement |
---|---|
owl2-rl + all optional indices | 366 |
owl2-rl | 236 |
owl-horst + all optional indices | 290 |
owl-horst | 196 |
empty + all optional indices | 240 |
empty | 171 |
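As a rough illustration, the per-statement figures above can be turned into a quick estimate. The sketch below is not part of GraphDB; the helper name `estimate_load_bytes` and its parameters are hypothetical, and it assumes your data behaves like the Wordnet dataset used for the measurements.

```python
# Bytes per explicit statement, copied from the Wordnet table above.
# Key: (ruleset, all optional indices enabled?)
BYTES_PER_EXPLICIT_STATEMENT = {
    ("owl2-rl", True): 366,
    ("owl2-rl", False): 236,
    ("owl-horst", True): 290,
    ("owl-horst", False): 196,
    ("empty", True): 240,
    ("empty", False): 171,
}


def estimate_load_bytes(explicit_statements, ruleset, all_optional_indices=False):
    """Hypothetical helper: rough disk estimate in bytes for Wordnet-like data."""
    per_statement = BYTES_PER_EXPLICIT_STATEMENT[(ruleset, all_optional_indices)]
    return explicit_statements * per_statement


# 10 million explicit statements with owl-horst and default indices only.
print(estimate_load_bytes(10_000_000, "owl-horst"))  # 1960000000 bytes, about 1.96 GB
```

Datasets with longer URIs or more inference than Wordnet are likely to need more space than this estimate suggests.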
When planning storage capacity based on the input RDF file size, the required disk space depends not only on the GraphDB configuration, but also on the RDF file format used and the complexity of its contents. The following table gives a rough estimate of the expected expansion from an input RDF file to GraphDB storage, expressed as gigabytes of storage per gigabyte of input. For example, when using OWL2-RL with all optional indices turned on, GraphDB needs about 6.7 GB of storage space to load a one-gigabyte N3 file. With no inference (‘empty’) and no optional indices, GraphDB needs about 0.7 GB of storage space to load a one-gigabyte TriX file. Again, these results were obtained with the Wordnet dataset:
Configuration | N3 | N-Triples | RDF/XML | TriG | TriX | Turtle |
---|---|---|---|---|---|---|
owl2-rl + all optional indices | 6.7 | 2.2 | 4.8 | 6.6 | 1.5 | 6.7 |
owl2-rl | 4.3 | 1.4 | 3.1 | 4.2 | 1.0 | 4.3 |
owl-horst + all optional indices | 5.3 | 1.7 | 3.8 | 5.2 | 1.2 | 5.3 |
owl-horst | 3.6 | 1.2 | 2.6 | 3.5 | 0.8 | 3.6 |
empty + all optional indices | 4.4 | 1.4 | 3.1 | 4.3 | 1.0 | 4.4 |
empty | 3.1 | 1.0 | 2.2 | 3.1 | 0.7 | 3.1 |
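The expansion factors above can be applied to an input file size in the same way. This is a minimal sketch under the same Wordnet assumption; the function name `estimate_storage_gb` and the dictionary layout are hypothetical and only mirror the table.

```python
# Expansion factors (GB of storage per GB of input), copied from the table above.
FORMATS = ["N3", "N-Triples", "RDF/XML", "TriG", "TriX", "Turtle"]
ROWS = {
    "owl2-rl + all optional indices": [6.7, 2.2, 4.8, 6.6, 1.5, 6.7],
    "owl2-rl": [4.3, 1.4, 3.1, 4.2, 1.0, 4.3],
    "owl-horst + all optional indices": [5.3, 1.7, 3.8, 5.2, 1.2, 5.3],
    "owl-horst": [3.6, 1.2, 2.6, 3.5, 0.8, 3.6],
    "empty + all optional indices": [4.4, 1.4, 3.1, 4.3, 1.0, 4.4],
    "empty": [3.1, 1.0, 2.2, 3.1, 0.7, 3.1],
}
EXPANSION = {config: dict(zip(FORMATS, factors)) for config, factors in ROWS.items()}


def estimate_storage_gb(input_gb, configuration, rdf_format):
    """Hypothetical helper: rough storage estimate in GB for Wordnet-like data."""
    return input_gb * EXPANSION[configuration][rdf_format]


# A 20 GB N-Triples file loaded with owl-horst and default indices only.
print(round(estimate_storage_gb(20, "owl-horst", "N-Triples"), 1))  # 24.0
```

Compact formats such as Turtle or N3 expand more than verbose ones such as TriX because the same statements occupy fewer bytes in the input file.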
GraphDB disk space requirements per statement
GraphDB computes inferences when new explicit statements are committed to the repository. The number of inferred statements can be zero, when using the ‘empty’ ruleset, or many multiples of the number of explicit statements (depending on the chosen ruleset and the complexity of the data).
The disk space required for each statement further depends on the size of the URIs and literals. Typical datasets require around 200 bytes per statement with only the default indices, and up to about 300 bytes per statement when all optional indices are turned on.
So, when using the default indices, a good estimate for the amount of disk space you will need is 200 bytes per statement (explicit and inferred), i.e.:
- 1 million statements => ~200 Megabytes storage;
- 1 billion statements => ~200 Gigabytes storage;
- 10 billion statements => ~2 Terabytes storage.
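The arithmetic behind these figures can be sketched as follows. The function and its `inferred_per_explicit` parameter are hypothetical; the ratio of inferred to explicit statements depends on your ruleset and data and has to be measured or guessed, and 200 bytes per statement is only the rule of thumb given above.

```python
def estimate_repository_bytes(explicit_statements,
                              inferred_per_explicit=0.0,
                              bytes_per_statement=200):
    """Hypothetical helper: rule-of-thumb disk estimate in bytes.

    inferred_per_explicit is an assumed ratio of inferred to explicit
    statements (0.0 for the 'empty' ruleset).
    Use bytes_per_statement=300 when all optional indices are turned on.
    """
    total_statements = explicit_statements * (1 + inferred_per_explicit)
    return int(total_statements * bytes_per_statement)


# 1 billion statements in total, default indices: about 200 GB.
print(estimate_repository_bytes(1_000_000_000) / 1e9)  # 200.0

# 1 million explicit statements that each yield 0.5 inferred statements on average.
print(estimate_repository_bytes(1_000_000, inferred_per_explicit=0.5))  # 300000000
```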