On 15.09.2017 at 19:49, Erik Zachte wrote:
Computing the hashes on the fly for offline analysis
doesn’t work for Wikistats 1.0, as it only parses the stub dumps, which contain
just metadata and no article content.
Parsing the full archive dumps is quite expensive, time-wise.
We can always compute the hash when outputting XML dumps that contain the full
content (it's already loaded, so no big deal), and then generate the
metadata-only XML dump from the full dump.
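
To make the idea concrete, here is a rough sketch, not the actual dump code. It assumes a simplified, namespace-free dump layout (real dumps use an XML namespace and store the SHA-1 in base 36 rather than hex), and the function name stub_from_full is made up for illustration. It streams the full dump, hashes each revision's text while it is in memory anyway, and writes out a copy that keeps only the metadata plus the hash:

```python
import hashlib
import io
import xml.etree.ElementTree as ET


def stub_from_full(full_dump, stub_out):
    """Stream a (simplified) full dump and write a metadata-only copy.

    full_dump: binary file-like object holding the full XML dump.
    stub_out:  text file-like object that receives the stub dump.
    """
    stub_out.write("<mediawiki>\n")
    for _event, elem in ET.iterparse(full_dump, events=("end",)):
        if elem.tag == "revision":
            text = elem.find("text")
            if text is not None:
                # The content is already loaded, so hashing it is cheap.
                content = (text.text or "").encode("utf-8")
                elem.remove(text)
                # Keep only the size and the hash, not the content itself.
                ET.SubElement(elem, "text", bytes=str(len(content)))
                # Assumption: hex digest for readability.
                ET.SubElement(elem, "sha1").text = hashlib.sha1(content).hexdigest()
        elif elem.tag == "page":
            # Write each page as soon as it is complete, then free it.
            stub_out.write(ET.tostring(elem, encoding="unicode"))
            elem.clear()
    stub_out.write("</mediawiki>\n")
```

With something along these lines, the stub dump would carry the hashes without Wikistats ever having to parse the full content.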
--
Daniel Kinzler
Principal Platform Engineer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.