I have a decent server that is dedicated to a Wikipedia project that
depends on the fresh dumps. Can it be used in any way to speed up the
process of generating the dumps?
bilal
On Tue, Jan 27, 2009 at 2:24 PM, Christian Storm <storm(a)iparadigms.com> wrote:
> On 1/4/09 6:20 AM, yegg at alum.mit.edu wrote:
> The current enwiki database dump
> (http://download.wikimedia.org/enwiki/20081008/) has been crawling
> along since 10/15/2008.
>
> The current dump system is not sustainable on very large wikis and
> is being replaced. You'll hear about it when we have the new one in
> place. :)
>
> -- brion
Following up on this thread:
http://lists.wikimedia.org/pipermail/wikitech-l/2009-January/040841.html
Brion,
Can you offer any general timeline estimate (weeks, months, half a
year)? Are there any alternatives for retrieving the article data
beyond directly crawling the site? I know crawling is verboten, but we
are in dire need of this data and don't know of any alternatives. The
current estimate of end of year is too long for us to wait.
Unfortunately, Wikipedia is a favored source for students to
plagiarize from, which makes out-of-date content a real issue for us.
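In case it helps others in the same position, one interim option
appears to be the MediaWiki API (api.php), which returns raw wikitext
directly rather than rendered HTML; Special:Export can do the same for
batches of titles. A minimal sketch of the API approach, assuming the
current en.wikipedia.org endpoint (the User-Agent string and example
article title below are placeholders, not anything official):

    import json
    import urllib.parse
    import urllib.request

    API = "https://en.wikipedia.org/w/api.php"

    def fetch_wikitext(title):
        # Ask the MediaWiki API for the latest revision's raw wikitext.
        params = urllib.parse.urlencode({
            "action": "query",
            "prop": "revisions",
            "rvprop": "content",
            "rvslots": "main",
            "titles": title,
            "format": "json",
            "formatversion": "2",
        })
        # Wikimedia asks API clients to send a descriptive User-Agent;
        # this one is a placeholder.
        req = urllib.request.Request(
            API + "?" + params,
            headers={"User-Agent": "dump-fallback-sketch/0.1 (you@example.org)"},
        )
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        page = data["query"]["pages"][0]
        return page["revisions"][0]["slots"]["main"]["content"]

    # Example usage; "Plagiarism" is an arbitrary article title.
    if __name__ == "__main__":
        print(fetch_wikitext("Plagiarism")[:300])

Polling individual titles this way is obviously no substitute for a
full dump, but it could keep frequently-plagiarized pages fresh while
the dump process is stalled.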
Is there any way to help this process along? Could we donate disk
drives, developer time, ...? There is another possibility that we
could offer, but I would need to talk with someone at the Wikimedia
Foundation offline. Is there anyone I could contact?
Thanks for any information and/or direction you can give.
Christian
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l