Hi Jérémie, Thanks for this info. In the meantime, what about splitting the dump into chunks of 3.5 billion triples (or any size below 4 billion) and writing a script to convert the dataset chunk by chunk? Would that be possible?
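A minimal sketch of the splitting step, assuming the dump is available as uncompressed N-Triples (one triple per line). The function name `split_ntriples`, the `chunk-NNNN.nt` file naming, and the default chunk size are hypothetical choices for illustration, not part of any HDT tooling:

```python
from pathlib import Path


def split_ntriples(src, out_dir, chunk_size=3_500_000_000):
    """Split an N-Triples file into files holding at most chunk_size triples.

    The input is streamed line by line, so memory use stays constant
    regardless of the size of the dump. Returns the list of chunk paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    out = None
    count = 0
    try:
        with open(src, "r", encoding="utf-8") as f:
            for line in f:
                stripped = line.strip()
                # Blank lines and comment lines are not triples: skip them.
                if not stripped or stripped.startswith("#"):
                    continue
                # Start a new chunk file when none is open or the current one is full.
                if out is None or count == chunk_size:
                    if out is not None:
                        out.close()
                    path = out_dir / f"chunk-{len(paths):04d}.nt"
                    out = open(path, "w", encoding="utf-8")
                    paths.append(path)
                    count = 0
                out.write(line)
                count += 1
    finally:
        if out is not None:
            out.close()
    return paths
```

Each chunk could then be converted to HDT on its own. Two caveats with this approach: blank node labels in N-Triples are scoped to a single document, so a blank node that appears in two chunks would no longer be recognised as the same node, and each chunk would end up as a separate HDT file with its own dictionary.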
Best, Ghislain
Sent from Mail for Windows 10
From: Jérémie Roquet
Sent: Tuesday, 7 November 2017 15:25
To: Discussion list for the Wikidata project.
Subject: Re: [Wikidata] Wikidata HDT dump
Hi everyone,
I'm afraid the current implementation of HDT is not ready to handle more than 4 billion triples, as it is limited to 32-bit indexes. I've opened an issue upstream: https://github.com/rdfhdt/hdt-cpp/issues/135
Until this is addressed, don't waste your time trying to convert the entire Wikidata to HDT: it can't work.
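For reference, the arithmetic behind the limit can be sketched as follows. This assumes the indexes are unsigned 32-bit integers, which matches the "4 billion" figure above but has not been checked against the hdt-cpp source. The helper names are hypothetical, and the total in the usage comment is a made-up number, not the actual size of Wikidata:

```python
# Largest value an unsigned 32-bit index can hold (assumption: unsigned).
MAX_U32 = 2**32 - 1  # 4,294,967,295


def fits_in_u32(n_triples):
    """Return True if n_triples can be addressed with an unsigned 32-bit index."""
    return 0 <= n_triples <= MAX_U32


def chunks_needed(total_triples, chunk_size):
    """Number of chunks of at most chunk_size triples needed to hold total_triples."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Ceiling division without going through floating point.
    return -(-total_triples // chunk_size)


# Illustrative usage with a hypothetical total of 10 billion triples:
# chunks_needed(10_000_000_000, 3_500_000_000) gives 3
```

A chunk of 3.5 billion triples stays below this bound, while any dataset of 2^32 triples or more does not.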