Interesting use case, Laura! Your UC is rather "special" :) Let me try to understand...

You are a "data consumer" with the following needs:
- Latest version of the data
- Quick access to the data
- You don't want to use the publisher's current ways of accessing the data (endpoint, TTL dumps, LDFragments)

However, you ask for a binary format (HDT), but you don't have enough memory to set up your own environment/endpoint. For that reason, you are asking the publisher to provide both the .hdt and .hdt.index files.
Do you think there are many users with your current UC?
On Tue, 31 Oct 2017 at 14:56, Laura Morales (lauretas@mail.com) wrote:
@Laura: I suspect Wouter wants to know whether he should "ignore" the previous
errors and offer a rather incomplete dump (just for you), or wait for Stas' feedback.
OK. I wonder, though, whether it would be possible to set up a regular HDT dump alongside the existing regular dumps. Looking at the dumps page, https://dumps.wikimedia.org/wikidatawiki/entities/, it looks like a new dump is generated roughly once a week. So if an HDT dump could be added to the schedule, it would show up with the next dump and with every future dump after that. Right now even the Turtle dump contains the bad triples, so adding an HDT file now would not introduce any additional inconsistencies. The problem will be fixed automatically in future dumps once the Turtle is fixed (because the HDT is generated from the .ttl file anyway).
Btw, why don't you use the older version on the HDT website?
- I have downloaded it and I'm trying to use it, but the HDT tools (e.g.
query) require building an index before I can use the HDT file. I've tried to create the index, but I ran out of memory again (even though the index is smaller than the .hdt file itself). So any Wikidata dump should contain both the .hdt file and the .hdt.index file, unless there is another way to generate the index on commodity hardware.
- because it's 1 year old :)
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata