But 3 MB of uncompressed string data does not seem that big in
absolute terms, or are you referring to something else? (I got this
number from Special:LongPages.) Parsing a 3 MB string may need some
extra memory, but the data structure you get in the end should not be
much bigger than the original string, should it?
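
For what it is worth, this would be easy to measure. Below is a
minimal, hypothetical sketch in Java with Jackson (which is, I
believe, also what Wikidata Toolkit uses for JSON); the file name
"big-entity.json" is just a placeholder for one large entity saved
from the dump:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ParseOverhead {
    public static void main(String[] args) throws Exception {
        // Placeholder: one large entity saved from the JSON dump.
        String json = new String(
                Files.readAllBytes(Paths.get("big-entity.json")),
                StandardCharsets.UTF_8);

        Runtime rt = Runtime.getRuntime();
        System.gc();
        long before = rt.totalMemory() - rt.freeMemory();

        JsonNode tree = new ObjectMapper().readTree(json);

        System.gc();
        long after = rt.totalMemory() - rt.freeMemory();

        // A Java string stores 2 bytes per char; the parsed tree adds
        // object headers and boxed values on top of that.
        System.out.printf(
                "raw string: ~%d bytes, parsed tree: ~%d bytes extra, %d properties%n",
                json.length() * 2L, after - before,
                tree.path("claims").size());
    }
}

I would expect the tree to take a low single-digit multiple of the raw
string size on a typical JVM, which is still harmless at 3 MB.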
Markus
On Fri, Nov 27, 2015 at 2:12 PM Markus Krötzsch
<markus@semantic-mediawiki.org> wrote:
On 25.11.2015 16:05, Lydia Pintscher wrote:
On Mon, Nov 23, 2015 at 10:54 PM, Magnus Manske
<magnusmanske@googlemail.com> wrote:
> Well, my import code chokes on the last two JSON dumps (16th and
> 23rd). As it fails about half an hour or so in, debugging is ...
> inefficient. Unless there is something that has changed with the
> dump itself (new data type or so), and someone tells me, it will be
> quite some time (days, weeks) until I figure it out.
To update everyone here as well: Magnus has been able to pinpoint the
problem and fix the tools. They're catching up again. The issue was
one of the extremely big pages that have recently been created for
research papers:
https://www.wikidata.org/wiki/Special:LongPages
Thanks for the explanation. That also clarifies why we did not see any
problems or unusual behaviour in Wikidata Toolkit. I guess Java simply
does not care how long pages are, as long as they are not very big in
absolute terms.
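
To illustrate what I mean: the JSON dumps put one entity per line of
one big JSON array, so a reader that goes line by line only ever holds
the current entity in memory, no matter how long individual pages get.
A rough sketch, again assuming Jackson, with the dump file name as a
placeholder:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

public class DumpScan {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(
                        Paths.get("wikidata-all.json.gz"))),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                // The dump is a JSON array with one entity per line.
                if (line.equals("[") || line.equals("]") || line.isEmpty()) {
                    continue;
                }
                if (line.endsWith(",")) {
                    line = line.substring(0, line.length() - 1);
                }
                JsonNode entity = mapper.readTree(line);
                // Only the current entity is ever in memory; a 3 MB page
                // costs a few MB transiently and is then garbage-collected.
                if (line.length() > 3_000_000) {
                    System.out.println("large entity: "
                            + entity.path("id").asText());
                }
            }
        }
    }
}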
Markus
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata