On 04.11.2014 18:18, Cristian Consonni wrote:
Hi Markus,
2014-11-01 0:29 GMT+01:00 Markus Krötzsch <markus(a)semantic-mediawiki.org>:
Nice. We are running the RDF generation on a
shared cloud environment and I
am not sure we can really use a lot of RAM there. Do you have any guess how
much RAM you needed to get this done?
I didn't take any stats (my bad), but I would say that for the combined
dump, starting from the compressed (gz) file, it took around 50 GB of RAM.
I don't have time to re-run this experiment now, but next time I
will take some measurements.
Ok, thanks, this is already a good indicator for us. I don't think we
could use up that much memory on Wikimedia Labs ...
Btw., there was a bug in our RDF exports that made them bigger than they
should have been (no wrong triples, but many duplicates). I have
corrected the issue now and uploaded new versions. Maybe this will also
make processing faster next time.
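As an aside, if anyone still has one of the older dumps, the duplicates can be stripped without loading the whole file into memory, since each N-Triples line encodes one complete triple. A minimal sketch (the function name and file layout are my own assumptions, not part of the official export tooling):

```python
import gzip

def dedupe_ntriples(src_path, dst_path):
    """Stream a gzipped N-Triples dump and drop exact duplicate lines.

    Hypothetical helper: line-level deduplication is safe here because
    in N-Triples every line is one self-contained triple, so identical
    lines are identical triples.
    """
    seen = set()  # memory use grows with the number of distinct triples
    with gzip.open(src_path, "rt", encoding="utf-8") as src, \
         gzip.open(dst_path, "wt", encoding="utf-8") as dst:
        for line in src:
            if line not in seen:
                seen.add(line)
                dst.write(line)
```

For dumps too large to keep a set of all distinct lines in RAM, `sort -u` on the decompressed stream achieves the same result on disk, at the cost of losing the original triple order.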
Markus