Hola, please don't get me wrong, and don't read too much into my question. Since the beginning of this thread I have also been trying to push the use of HDT here. For example, I was the one who contacted the HDT gurus on Twitter to get the dataset error fixed, and so on...
Sorry if Laura or anyone else thought I was trying to "give some lessons here". I don't have a supercomputer either, nor am I a member of the Wikidata team. I'm just a "data consumer", like many here.
Best, Ghislain
Sent from my iPhone, may include typos
On 31 Oct 2017, at 20:44, Luigi Assom itsawesome.yes@gmail.com wrote:
Doh, what's wrong with asking for support for one's own use case ("UC")?
I think it is a totally legit question to ask, and that's why this thread exists.
Also, I do support making it possible to access data that would otherwise be hard to process on "common" hardware, especially in the case of open data. It exists so that anyone can take it and build on it; it's amazing if you can prototype locally, right?
I don't like the use case where one data scientist or IT person shows their own work to other data scientists or IT people just looking for emotional support or praise. I've seen that, not here, and I hope that attitude indeed stays out of here.
I do like it when the work of a data scientist or IT person ignites someone else's creativity, someone who is completely external, so that they say: hey, your work is cool and I want to use it for... my use case! That's how ideas circulate and help other people build on them, without erecting unnecessary borders.
As for a local version of compressed, indexed RDF: I think that if it were available, more people probably would use it.
On Tue, Oct 31, 2017 at 4:03 PM, Laura Morales lauretas@mail.com wrote:
I feel like you are misrepresenting my request, and possibly trying to offend me as well.
My "UC" as you call it, is simply that I would like to have a local copy of wikidata, and query it using SPARQL. Everything that I've tried so far doesn't seem to work on commodity hardware since the database is so large. But HDT could work. So I asked if a HDT dump could, please, be added to other dumps that are periodically generated by wikidata. I also told you already that *I AM* trying to use the 1 year old dump, but in order to use the HDT tools I'm told that I *MUST* generate some other index first which unfortunately I can't generate for the same reasons that I can convert the Turtle to HDT. So what I was trying to say is, that if wikidata were to add any HDT dump, this dump should contain both the .hdt file and .hdt.index in order to be useful. That's about it, and it's not just about me. Anybody who wants to have a local copy of wikidata could benefit from this, since setting up a .hdt file seems much easier than a Turtle dump. And I don't understand why you're trying to blame me for this?
If you are part of the Wikidata dev team, I'd greatly appreciate a "can/can't" or "don't care" response rather than the passive-aggressive game you played in your last email.
Let me try to understand ... You are a "data consumer" with the following needs:
- Latest version of the data
- Quick access to the data
- You don't want to use the publisher's current ways of accessing the data: the public SPARQL endpoint (a sketch of that route is below), TTL dumps, Linked Data Fragments
However, you are asking for a binary format (HDT), yet you don't have enough memory to set up your own environment/endpoint. For that reason, you are asking the publisher to provide both the .hdt and the .hdt.index files.
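For reference, the endpoint route from the list above looks roughly like this (a minimal sketch using the SPARQLWrapper library against the public Wikidata Query Service; the cat query, wd:Q146, is only an illustration):

    # Minimal sketch: querying the public Wikidata SPARQL endpoint remotely,
    # i.e. the existing access route that needs no local copy of the data.
    from SPARQLWrapper import SPARQLWrapper, JSON

    endpoint = SPARQLWrapper("https://query.wikidata.org/sparql")
    endpoint.setQuery("""
        PREFIX wd:  <http://www.wikidata.org/entity/>
        PREFIX wdt: <http://www.wikidata.org/prop/direct/>
        SELECT ?item WHERE { ?item wdt:P31 wd:Q146 } LIMIT 5
    """)
    endpoint.setReturnFormat(JSON)

    # Run the query and print the matching item URIs.
    results = endpoint.query().convert()
    for binding in results["results"]["bindings"]:
        print(binding["item"]["value"])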
Do you think there are many users with your current UC?
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata