You shouldn't have to keep anything in RAM to HDT-ize a dataset: you could build the dictionary by sorting on disk, and likewise do the joins that look each term up against the dictionary by sorting.
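
For example, here is a rough Python sketch of the dictionary-building step (file names are hypothetical, the N-Triples parsing is deliberately naive, and it ignores HDT's split into shared/subject/predicate/object dictionary sections; it is only meant to illustrate the external-sort idea):

import heapq, os

CHUNK = 1_000_000  # terms held in memory at once; tune to available RAM

def external_sort(lines, tmpdir):
    """Classic external merge sort: sort fixed-size chunks, spill them to
    disk, then stream the sorted chunk files back through heapq.merge."""
    paths, chunk = [], []
    for line in lines:
        chunk.append(line)
        if len(chunk) >= CHUNK:
            chunk.sort()
            path = os.path.join(tmpdir, "chunk%d.txt" % len(paths))
            with open(path, "w") as f:
                f.writelines(chunk)
            paths.append(path)
            chunk = []
    chunk.sort()
    files = [open(p) for p in paths]
    yield from heapq.merge(*files, chunk)
    for f in files:
        f.close()

def terms(nt_path):
    """Yield every term of a simple N-Triples file, one per line
    (naive: splits each line on its first two spaces only)."""
    with open(nt_path) as f:
        for line in f:
            s, p, o = line.rstrip(" .\n").split(" ", 2)
            yield s + "\n"
            yield p + "\n"
            yield o + "\n"

def build_dictionary(nt_path, dict_path, tmpdir):
    """Assign consecutive IDs to the sorted, de-duplicated terms."""
    prev, next_id = None, 0
    with open(dict_path, "w") as out:
        for term in external_sort(terms(nt_path), tmpdir):
            if term != prev:
                out.write("%d\t%s" % (next_id, term))
                next_id += 1
                prev = term

The term-to-ID substitution can be done the same way: sort the triples by subject, merge-join them against the sorted dictionary, and repeat for predicates and objects, so nothing ever has to fit in RAM at once.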

------ Original Message ------
From: "Ettore RIZZA" <ettorerizza@gmail.com>
To: "Discussion list for the Wikidata project." <wikidata@lists.wikimedia.org>
Sent: 10/1/2018 5:03:59 PM
Subject: Re: [Wikidata] Wikidata HDT dump

> what computer did you use for this? IIRC it required >512GB of RAM to function.  

Hello Laura,

Sorry for my confusing message; I am not at all a member of the HDT team. But according to its creator, 100 GB "with an optimized code" could be enough to produce an HDT file like this one.

On Mon, 1 Oct 2018 at 18:59, Laura Morales <lauretas@mail.com> wrote:
> a new dump of Wikidata in HDT (with index) is available [http://www.rdfhdt.org/datasets/].

Thank you very much! Keep it up!
Out of curiosity, what computer did you use for this? IIRC it required >512GB of RAM to function.

> You will see how Wikidata has become huge compared to other datasets. It contains about twice the 4B-triple limit discussed above.

There is a 64-bit version of HDT that doesn't have this limitation of 4B triples.

> In this regard, what is, in 2018, the most user-friendly way to use this format?

Speaking for myself at least: Fuseki with an HDT store. But I know there are also some CLI tools from the HDT folks.
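
For anyone who prefers scripting over a SPARQL endpoint, a minimal sketch with the Python HDT bindings (pyHDT's "hdt" package; the file name is hypothetical):

from hdt import HDTDocument

# Open the dump (this may build or load an additional index file on first use).
doc = HDTDocument("wikidata.hdt")

# Empty strings act as wildcards; the result is a lazy iterator plus a cardinality estimate.
triples, cardinality = doc.search_triples("", "http://www.w3.org/2000/01/rdf-schema#label", "")
print("estimated matches:", cardinality)
for s, p, o in triples:
    print(s, p, o)
    break  # just show the first hit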

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata