Agreed! I also propose scheduling dumps according to their importance, for example by article count.
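A rough sketch of what I mean, in Python. The wiki names and article counts are made-up placeholders, not real statistics, and `dump_order` is only an illustrative helper, not part of any existing dump tooling:

```python
# Placeholder article counts per wiki (hypothetical values, for illustration only).
article_counts = {
    "wiki_a": 1000,
    "wiki_b": 5,
    "wiki_c": 250000,
    "wiki_d": 5,
}


def dump_order(counts):
    """Return wiki names ordered by article count, largest first.

    Ties are broken alphabetically so the schedule is deterministic.
    """
    return sorted(counts, key=lambda wiki: (-counts[wiki], wiki))


if __name__ == "__main__":
    for position, wiki in enumerate(dump_order(article_counts), start=1):
        print(position, wiki, article_counts[wiki])
```

Large wikis would then be dumped first, and the tiny ones last (or less often).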
All the best,
On Sat, Jan 17, 2015 at 7:59 PM, Richard Jelinek rj@petamem.com wrote:
Hi,
We basically mirror all the generated dumps, extract them, harvest data, etc. Lately I examined some of the more exotic languages, and to my surprise they were even more exotic than I thought. I propose to ditch them.
Afar (aa) Wikipedia
The latest on our servers is aar-20141223.xml.bz2 at 22974 bytes (we convert the language codes to ISO 639-3).
It seems the wiki has been closed or moved to the Incubator:
http://meta.wikimedia.org/wiki/Proposals_for_closing_projects/Closure_of_Afa...
Nevertheless, this wiki keeps showing up in the xmldumps, pretending something is there. I believe we'd all be better off if dumps of it ceased.
Basically the same applies to the Ndonga Wikipedia:
http://meta.wikimedia.org/wiki/Proposals_for_closing_projects/Closure_of_Ndo...
But the xmldumps keep pouring in:
ndo-20141223.xml.bz2
etc. It is the same story with several other Wikimedia projects in other languages.
So in general: Could we stop dumping closed projects?
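A minimal sketch of such a filter, in Python. The set of closed wikis is hard-coded here purely for illustration (using the ISO 639-3 codes from the file names above); where the real dump scripts would get that list from is an assumption left open, and `wikis_to_dump` is a hypothetical helper:

```python
# Hypothetical set of closed wikis, keyed by the ISO 639-3 codes used above.
CLOSED_WIKIS = {"aar", "ndo"}


def wikis_to_dump(all_wikis, closed=CLOSED_WIKIS):
    """Return the wikis that should still be dumped, preserving input order."""
    return [wiki for wiki in all_wikis if wiki not in closed]


if __name__ == "__main__":
    # "deu" and "eng" stand in for any wikis that are still open.
    print(wikis_to_dump(["aar", "deu", "ndo", "eng"]))
```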
Kind regards,
Dipl.-Inf. Univ. Richard C. Jelinek
PetaMem GmbH - www.petamem.com
Language Technology - We Mean IT!
2.58921 * 10^8 Mind Units
Managing director: Richard Jelinek
Registered office: Fürth
Register court: AG Fürth, HRB-9201
Xmldatadumps-l mailing list Xmldatadumps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l