Laura Morales wrote on 02.11.2017 at 15:54:
>> The tool is in the hdt-jena package (not hdt-java-cli where the other
>> command line tools reside), since it uses parts of Jena (e.g. ARQ).
>> There is a wrapper script called hdtsparql.sh for executing it with the
>> proper Java environment.
>
> Does this tool work nicely with large HDT files such as Wikidata? Or does it need to load the whole graph+index into memory?
I haven't tested it with huge datasets like Wikidata. But for the moderately sized data (40M triples) that I use it with, it runs pretty fast and without using lots of memory. So I think it just memory-maps the HDT and index files and reads only what it needs to answer the query.
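
To illustrate what I mean by memory mapping, here is a generic Python sketch. It is not the HDT library's code and the file is just a hypothetical stand-in for a large data file. The point is only that a mapped file can be sliced at an arbitrary offset and the OS pages in just the parts that are touched, instead of the program reading everything up front.

```python
import mmap
import os
import tempfile


def read_slice(path, offset, length):
    """Return `length` bytes starting at `offset` via a read-only memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily on access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


# Hypothetical stand-in for a big file: padding, a marker, more padding.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 1_000_000 + b"needle" + b"\0" * 1_000_000)
    demo_path = tmp.name

# Only the pages around the requested offset need to be read from disk.
result = read_slice(demo_path, 1_000_000, 6)
os.remove(demo_path)
print(result)
```

Whether hdtsparql really works this way for very large files is still my assumption; I have not verified it.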
-Osma