On 8/17/07, Tim Starling <tstarling@wikimedia.org> wrote:
Anthony wrote:
On 8/15/07, Tim Starling <tstarling@wikimedia.org> wrote:
It would have been simpler if he had used the static HTML dump instead of the XML. It's not hard to make a desktop reader out of it. I wrote a proof of concept a while back.
Unless I'm misreading your idea, the static HTML dump seems to be about 4 times as large, and in 7-Zip format instead of bzip2.
It's 1.5 times larger than pages-meta-current.xml.bz2, which is the equivalent XML dump, or 2.7 times larger than pages-articles.xml.bz2.
Well, most significantly, it won't fit on a single-layer DVD-R(W). And it's from April, rather than August.
pages-meta-current doesn't fit on a single-layer DVD either, though.
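The size comparison above can be sketched as a quick check. This is only an illustration: the two ratios (1.5 and 2.7) come from the thread, while the function names and the nominal 4.7 GB single-layer capacity are assumptions added here, and no actual dump sizes are implied.

```python
# Nominal single-layer DVD capacity in bytes (assumed: 4.7 GB, decimal).
DVD_SINGLE_LAYER_BYTES = 4_700_000_000


def fits_on_single_layer_dvd(size_bytes: int) -> bool:
    """Return True if a file of size_bytes fits within the nominal capacity."""
    return size_bytes <= DVD_SINGLE_LAYER_BYTES


def implied_ratio(html_vs_a: float, html_vs_b: float) -> float:
    """Given HTML = html_vs_a * A and HTML = html_vs_b * B, return A / B."""
    return html_vs_b / html_vs_a


# Ratios quoted in the thread: the HTML dump is 1.5x pages-meta-current
# and 2.7x pages-articles, so pages-meta-current is about 1.8x pages-articles.
meta_vs_articles = implied_ratio(1.5, 2.7)
print(f"pages-meta-current / pages-articles = {meta_vs_articles:.1f}")
```

To test a real dump, pass the file size in bytes (for example from `os.path.getsize`) to `fits_on_single_layer_dvd`.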
Got a link to your proof of concept?
http://noc.wikimedia.org/~tstarling/static-dump-reader.php.html
Haven't had time to try it out yet, but it looks pretty nice.