On Tue, Jul 21, 2009 at 12:47 PM, Aryeh Gregor
<Simetrical+wikilist(a)gmail.com> wrote:
On Tue, Jul 21, 2009 at 11:22 AM, Chengbin Zheng
<chengbinzheng(a)gmail.com> wrote:
On a side note, if parsing the XML gets you the static HTML version of
Wikipedia, why can't Wikimedia just parse it for us and save a lot of our
time (parsing and learning), and use that as the static HTML dump version?
I'd assume it was a performance issue: parsing all the pages for every
dump, as often as dumps are generated, might just have used too much CPU
to be worth it at the time. Parsing some individual pages can take 20
seconds or more, and there are millions of them (although most are much
faster to parse than that). I'm sure it could be reinstituted with some
effort, though.
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l