You shouldn't have to keep anything in RAM to HDT-ize something: you
could build the dictionary by sorting on disk, and likewise do the
joins that look up every term against the dictionary by sorting.
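Roughly: stream the N-Triples dump once and spill every term to disk
(plus a "term, triple-id" pair per S/P/O position), external-sort and
deduplicate the term file to get the dictionary (a term's ID is just its
rank), then sort the occurrences by term and resolve them against the
dictionary in a single merge pass. A rough Python sketch of those three
primitives (not the actual HDT code; the tab-separated layout, names and
chunk size are just illustrative):

import heapq
import itertools
import tempfile


def external_sort(in_path, out_path, chunk_lines=1_000_000):
    # Classic external merge sort: sort fixed-size chunks in memory, spill
    # each chunk to a temp file, then k-way merge the chunks back out.
    chunks = []
    with open(in_path) as src:
        while True:
            lines = list(itertools.islice(src, chunk_lines))
            if not lines:
                break
            lines = [l if l.endswith("\n") else l + "\n" for l in lines]
            lines.sort()
            tmp = tempfile.TemporaryFile("w+")
            tmp.writelines(lines)
            tmp.seek(0)
            chunks.append(tmp)
    with open(out_path, "w") as dst:
        dst.writelines(heapq.merge(*chunks))
    for tmp in chunks:
        tmp.close()


def build_dictionary(sorted_terms_path, dict_path):
    # Deduplicate the sorted term list; a term's ID is simply its rank
    # (line number) in the resulting file, so no ID table lives in RAM.
    with open(sorted_terms_path) as src, open(dict_path, "w") as dst:
        previous = None
        for line in src:
            if line != previous:
                dst.write(line)
                previous = line


def resolve_ids(sorted_occurrences_path, dict_path, out_path):
    # Sort-merge join: both inputs are sorted by term, so one linear pass
    # rewrites each "term<TAB>triple_id" line as "term_id<TAB>triple_id".
    # (Every occurrence term exists in the dictionary by construction.)
    with open(sorted_occurrences_path) as occurrences, \
            open(dict_path) as dictionary, \
            open(out_path, "w") as dst:
        dict_iter = enumerate(dictionary)
        term_id, term = next(dict_iter)
        for line in occurrences:
            occ_term, triple_id = line.rstrip("\n").rsplit("\t", 1)
            while term.rstrip("\n") < occ_term:
                term_id, term = next(dict_iter)
            dst.write(f"{term_id}\t{triple_id}\n")

Afterwards you would re-sort the resolved output by triple ID to
reassemble the ID-encoded triples; memory use stays bounded by the
chunk size throughout.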
------ Original Message ------
From: "Ettore RIZZA" <ettorerizza(a)gmail.com>
To: "Discussion list for the Wikidata project."
<wikidata(a)lists.wikimedia.org>
Sent: 10/1/2018 5:03:59 PM
Subject: Re: [Wikidata] Wikidata HDT dump
>what computer did you use for this? IIRC it required >512GB of RAM to
>function.
Hello Laura,
Sorry for my confusing message; I am not a member of the HDT team at
all. But according to its creator
<https://twitter.com/ciutti/status/1046849607114936320>, 100 GB "with
an optimized code" could be enough to produce an HDT like that.
On Mon, 1 Oct 2018 at 18:59, Laura Morales <lauretas(a)mail.com> wrote:
> > a new dump of Wikidata in HDT (with index) is
>available [http://www.rdfhdt.org/datasets/].
>
>Thank you very much! Keep it up!
>Out of curiosity, what computer did you use for this? IIRC it
>required >512GB of RAM to function.
>
> > You will see how Wikidata has become huge compared to other
>datasets. It contains about twice the limit of 4B triples discussed
>above.
>
>There is a 64-bit version of HDT that doesn't have this limitation of
>4B triples.
>
> > In this regard, what is, in 2018, the most user-friendly way to use
>this format?
>
>Speaking for myself at least, Fuseki with an HDT store. But I know there
>are also some CLI tools from the HDT folks.
>