[Foundation-l] [Wikipedia-l] moving forward on article validation
Magnus Manske
magnus.manske at web.de
Mon Jun 26 12:57:59 UTC 2006
I have made some new toys that might be of interest:
Instead of waiting for the next database dump and downloading the whole
thing, you can now get an XML dump for all articles in a category and its
subcategories:
http://tools.wikimedia.de/~magnus/cat2xml.php?category=History_Version_0.5_articles
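For illustration, here is a minimal sketch of how such a request could be scripted. It assumes only what the example URL shows: a single "category" query parameter with underscores in place of spaces. The function names are hypothetical, and the response is treated as opaque bytes.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address taken from the example URL above.
BASE_URL = "http://tools.wikimedia.de/~magnus/cat2xml.php"


def cat2xml_url(category: str) -> str:
    """Build a cat2xml request URL for a category name.

    Assumption: spaces become underscores, as in the example URL.
    """
    query = urlencode({"category": category.replace(" ", "_")})
    return BASE_URL + "?" + query


def fetch_category_dump(category: str, timeout: float = 60.0) -> bytes:
    """Download the XML dump for a category and return the raw bytes.

    Hypothetical helper: the tool's response format is not documented
    here, so nothing is parsed at this stage.
    """
    with urlopen(cat2xml_url(category), timeout=timeout) as response:
        return response.read()
```

Calling cat2xml_url("History Version 0.5 articles") rebuilds the example URL given above.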
Sadly, the toolserver is somewhat broken for use on en, so it doesn't
even know "Category:Wikipedia Version 0.5", but that is someone else's
problem (and has been for ages, I might add). So until this is fixed, you
can't get the complete article texts for 0.5 in one swift motion, but once
it is, you can! ;-)
I have also found a web server (Server2Go) which runs directly from CD
(or the main directory of any drive, including USB sticks). I have
managed to integrate MediaWiki into it. While a MySQL variant is
available, I went for a new MediaWiki database type supporting
plain-text files (read-only at the moment). Back to the roots, I guess ;-)
Anyway, I have created what I hope is a standalone, zero-install
"History Wikipedia 0.5" bundle, currently consisting of a pitiful 19
articles. This was done by converting the above-mentioned
custom-XML-dump into plain-text articles using a script from my wiki2xml
toy.
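The wiki2xml script itself is not shown here, so the following is only a rough sketch of the general idea, not the actual conversion. It assumes the dump resembles the standard MediaWiki export layout (<page> elements containing a <title> and a <text>), and the one-file-per-article output layout is purely hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]


def extract_articles(xml_text: str) -> dict:
    """Map each page title to its wikitext.

    Assumption: pages are <page> elements with <title> and <text>
    descendants, as in MediaWiki's export format.
    """
    articles = {}
    root = ET.fromstring(xml_text)
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title, text = None, ""
        for element in page.iter():
            name = local_name(element.tag)
            if name == "title":
                title = element.text
            elif name == "text":
                text = element.text or ""
        if title:
            articles[title] = text
    return articles


def write_articles(articles: dict, out_dir: str) -> None:
    """Write one UTF-8 text file per article (hypothetical layout)."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    for title, text in articles.items():
        # Replace characters that are awkward in file names.
        safe = title.replace(" ", "_").replace("/", "_")
        (target / (safe + ".txt")).write_text(text, encoding="utf-8")
```

The file naming and directory structure expected by the plain-text database type may well differ from this.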
Categories don't work, and I was too lazy to set up interlanguage links.
No images either.
Still interested? Try
http://tools.wikimedia.de/~magnus/history05.zip (23MB)
Best to burn the whole thing to a CD, as it /has/ to be in the root
directory. I *did* mention that it's currently Windows-only, right?
Magnus