Look at it this way: you can't get enwiki dumps more than once every six weeks.
Each one TAKES SIX WEEKS. (modulo lots of stuff, I'm simplifying a bit ;-)
The example I have used before is going into my bank: in the main Queensway
office, there will be 50-100 people on the queue. When there are 8-10
tellers, it will go well; except that some transactions (depositing some
cash) take a minute or so, and some take many, many minutes. If there are 8
tellers, and 8 people in front of you with 20-30 minute transactions, you
are toast. (They handle this by having fast lines for deposits and such ;-)
In general, one queue feeding multiple servers/threads works very nicely if
the tasks are about the same size.
But what we have here is projects that take less than a minute, in the same
queue with projects that take weeks. That is 5 orders of magnitude: in the
time it takes to do the enwiki dump, the same thread could do ONE HUNDRED
THOUSAND small projects.
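A quick back-of-the-envelope check of that figure, as a rough sketch. The 36-second duration for a small project is my own assumption, picked only because it is "less than a minute"; the six weeks is the enwiki figure above.

```python
import math

# Six weeks, expressed in seconds (the enwiki dump duration quoted above).
SIX_WEEKS_SECONDS = 6 * 7 * 24 * 60 * 60

# ASSUMPTION: a "small" project takes 36 seconds (illustrative only).
SMALL_PROJECT_SECONDS = 36

# How many small projects fit into one enwiki-sized slot on a single thread.
ratio = SIX_WEEKS_SECONDS // SMALL_PROJECT_SECONDS

# The same ratio expressed in orders of magnitude.
orders = math.log10(ratio)

print(ratio, round(orders, 2))
```

With those numbers the ratio comes out at roughly one hundred thousand, i.e. about 5 orders of magnitude.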
Imagine walking into your bank with a 30 second transaction, and being told
it couldn't be completed for 6 weeks because there were 3 officers
available, and 5 people who needed complicated loan approvals on the queue
in front of you.
That's the way the dumps are set up right now.
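To put numbers on the bank example, here is a minimal sketch of one first-come-first-served queue feeding several identical threads. It is not the actual dump scheduler; the function name wait_times and the job durations are hypothetical, and every job is assumed to be in the queue at time zero.

```python
import heapq

def wait_times(durations, workers):
    """Return how long each job waits before it starts.

    Simplified model (an assumption, not the real dump scripts):
    all jobs are queued at time 0 and are handed out strictly in
    order to whichever worker becomes free first.
    """
    free_at = [0.0] * workers      # time at which each worker is next free
    heapq.heapify(free_at)
    waits = []
    for d in durations:
        start = heapq.heappop(free_at)   # earliest available worker
        waits.append(start)              # job waited from time 0 until start
        heapq.heappush(free_at, start + d)
    return waits

SIX_WEEKS = 6 * 7 * 24 * 60        # minutes
SMALL = 0.5                        # a 30 second job, in minutes

# Three threads, five six-week jobs queued ahead of one 30 second job.
shared = wait_times([SIX_WEEKS] * 5 + [SMALL], workers=3)

# Same small job on a thread reserved for small jobs (the "fast line").
fast_line = wait_times([SMALL], workers=1)

print(shared[-1] / (7 * 24 * 60), "weeks vs", fast_line[0], "minutes")
```

In the shared queue the 30 second job does not even start until the first six-week job finishes; on a reserved thread it starts immediately.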
On Sat, Oct 11, 2008 at 2:49 AM, Thomas Dalton <thomas.dalton(a)gmail.com> wrote:
I'm trying to work out if it is actually desirable to separate the
larger projects onto one thread. The only way you can have a smaller
project dumped more often is to have the larger ones dumped less
often, but do we really want less frequent enwiki dumps? By
separating them and sharing them fairly between the threads you can
get more regular dumps, but the significant number is surely the
amount of time between one dump of your favourite project and the
next, which will only change if you share the projects unfairly. Why
do we want small projects to be dumped more frequently than large
projects?
I guess the answer, really, is to get more servers doing dumps - I'm
sure that will come in time.