You shouldn't have to keep anything in RAM to HDT-ize something: you
could build the dictionary by sorting on disk, and likewise do the
joins that look up every term against the dictionary by sorting.
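Roughly: stream the N-Triples dump once and spill every term to disk
(plus a "term, triple-id" pair per S/P/O position), external-sort and
deduplicate the term file to get the dictionary (a term's ID is just its
rank), then sort the occurrences by term and resolve them against the
dictionary in a single merge pass. A rough Python sketch of those three
primitives (not the actual HDT code; the tab-separated layout, names and
chunk size are just illustrative):

import heapq
import itertools
import tempfile


def external_sort(in_path, out_path, chunk_lines=1_000_000):
    # Classic external merge sort: sort fixed-size chunks in memory, spill
    # each chunk to a temp file, then k-way merge the chunks back out.
    chunks = []
    with open(in_path) as src:
        while True:
            lines = list(itertools.islice(src, chunk_lines))
            if not lines:
                break
            lines = [l if l.endswith("\n") else l + "\n" for l in lines]
            lines.sort()
            tmp = tempfile.TemporaryFile("w+")
            tmp.writelines(lines)
            tmp.seek(0)
            chunks.append(tmp)
    with open(out_path, "w") as dst:
        dst.writelines(heapq.merge(*chunks))
    for tmp in chunks:
        tmp.close()


def build_dictionary(sorted_terms_path, dict_path):
    # Deduplicate the sorted term list; a term's ID is simply its rank
    # (line number) in the resulting file, so no ID table lives in RAM.
    with open(sorted_terms_path) as src, open(dict_path, "w") as dst:
        previous = None
        for line in src:
            if line != previous:
                dst.write(line)
                previous = line


def resolve_ids(sorted_occurrences_path, dict_path, out_path):
    # Sort-merge join: both inputs are sorted by term, so one linear pass
    # rewrites each "term<TAB>triple_id" line as "term_id<TAB>triple_id".
    # (Every occurrence term exists in the dictionary by construction.)
    with open(sorted_occurrences_path) as occurrences, \
            open(dict_path) as dictionary, \
            open(out_path, "w") as dst:
        dict_iter = enumerate(dictionary)
        term_id, term = next(dict_iter)
        for line in occurrences:
            occ_term, triple_id = line.rstrip("\n").rsplit("\t", 1)
            while term.rstrip("\n") < occ_term:
                term_id, term = next(dict_iter)
            dst.write(f"{term_id}\t{triple_id}\n")

Afterwards you would re-sort the resolved output by triple ID to
reassemble the ID-encoded triples; memory use stays bounded by the
chunk size throughout.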
------ Original Message ------
From: "Ettore RIZZA" <ettorerizza(a)gmail.com>
To: "Discussion list for the Wikidata project."
<wikidata(a)lists.wikimedia.org>
Sent: 10/1/2018 5:03:59 PM
Subject: Re: [Wikidata] Wikidata HDT dump
>what computer did you use for this? IIRC it required >512GB of RAM to
>function.
Hello Laura,
Sorry for my confusing message; I am not a member of the HDT team at
all. But according to its creator
<https://twitter.com/ciutti/status/1046849607114936320>, 100 GB "with
an optimized code" could be enough to produce an HDT like that.
On Mon, 1 Oct 2018 at 18:59, Laura Morales <lauretas(a)mail.com> wrote:
> > a new dump of Wikidata in HDT (with index) is
>available [http://www.rdfhdt.org/datasets/].
>
>Thank you very much! Keep it up!
>Out of curiosity, what computer did you use for this? IIRC it
>required >512GB of RAM to function.
>
> > You will see how Wikidata has become huge compared to other
>datasets. It contains about twice the limit of 4B triples discussed
>above.
>
>There is a 64-bit version of HDT that doesn't have this limitation of
>4B triples.
>
> > In this regard, what is, in 2018, the most user-friendly way to use
>this format?
>
>Speaking for myself at least, Fuseki with an HDT store. But I know there
>are also some CLI tools from the HDT folks.
>