2008/10/11 Robert Ullmann
Look at it this way: you can't get enwiki dumps
more than once every six weeks.
Each one TAKES SIX WEEKS. (modulo lots of stuff, I'm simplifying a bit ;-)
The example I have used before is going into my bank: in the main Queensway
office, there will be 50-100 people in the queue. When there are 8-10
tellers, it will go well; except that some transactions (depositing some
cash) take a minute or so, and some take many, many minutes. If there are 8
tellers, and 8 people in front of you with 20-30 minute transactions, you
are toast. (They handle this by having fast lines for deposits and such ;-)
Your analogy is flawed. In that analogy the goal is to minimise latency:
the time between walking in the door and completing your transaction. In
our case the goal is to minimise cycle time: the time between a person
completing one transaction and that same person completing their next, in
an ever-repeating loop. The circumstances are not the same.
And you can have enwiki dumps less than six weeks apart; it just means
having more than one running at a time.
AIUI (but please correct me if I'm wrong), you can't. At least not
without throwing more hardware at it. Otherwise, if you try to run two
enwiki dumps concurrently on the same hardware, you'll find that they
both finish in _twelve_ weeks instead of six.
If not, then let's just run _all_ the dumps in parallel, and the problem
is solved! ...right?
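The arithmetic behind that objection can be sketched with a toy model. It assumes (as the thread does) that the hardware's total throughput is fixed, so k dumps running concurrently each take k times as long as one dump running alone; the function name and numbers are illustrative, not from any real dump scheduler.

```python
# Toy model: fixed total hardware throughput, so k concurrent dumps
# each run k times slower than a single dump would (simplifying assumption).

def dump_schedule(base_weeks, concurrency):
    """Return (weeks between completed dumps, age of each dump when it completes)
    for `concurrency` dumps running at once with evenly staggered starts."""
    duration = base_weeks * concurrency   # each dump slows down k-fold
    interval = duration / concurrency     # staggered starts: one finishes per interval
    return interval, duration

# One dump at a time: a dump finishes every 6 weeks, and is 6 weeks old when done.
print(dump_schedule(6, 1))

# Two concurrent dumps with staggered starts: still one finishing every
# 6 weeks, but each took 12 weeks to produce.
print(dump_schedule(6, 2))
```

Under this model, concurrency on the same hardware leaves the completion interval (throughput) unchanged while doubling how stale each dump is when it lands (latency), which is why parallelism alone doesn't solve the problem without more hardware.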