Let's start with the minimum (1 thread?), with images spread as far apart from each other as possible during the day, and see how it goes. We'll keep an eye on the server load every day and see if there's room for increasing the rate. Weekdays would be highly preferable for us.
On Tue, May 20, 2014 at 2:27 PM, Fæ faewik@gmail.com wrote:
On 20 May 2014 13:12, Gilles Dubuc gilles@wikimedia.org wrote:
I have some rather nice >100MB tiffs
How large is that batch?
We're still working on technical changes, nothing has been merged since the last outage.
It should be small, as it is the exception that is over 50MB, let alone 100MB. Some of it is tidying up where I skipped 100MB files previously (the 19thC. British Cartoons collection). I would *guess* no more than 100 or 200 in a day. I can actually choose my xml to limit the overall daily number if that is a concern and you would like to suggest a number. (Sidenote - preparing the xml metadata to discover which files to upload is slow due to LoC API limits of 15 requests per minute for "security" reasons - I was unaware of this until I contacted the LoC a couple of days ago. This is not a project that can be rushed through.)
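To put a rough number on the LoC limit mentioned above: 15 requests per minute is one request every 4 seconds. The sketch below only does that arithmetic. The function names are hypothetical, and the number of API requests needed per file is not stated here, so the request count is left as an input rather than guessed.

```python
# Arithmetic sketch only; not part of any real upload tooling.
LOC_REQUESTS_PER_MINUTE = 15  # limit quoted above


def min_interval_seconds(requests_per_minute: int = LOC_REQUESTS_PER_MINUTE) -> float:
    """Smallest average gap between requests that stays within the limit."""
    return 60.0 / requests_per_minute


def min_hours_for(requests: int, requests_per_minute: int = LOC_REQUESTS_PER_MINUTE) -> float:
    """Lower bound on the hours needed to issue `requests` API calls."""
    return requests * min_interval_seconds(requests_per_minute) / 3600.0


if __name__ == "__main__":
    print(min_interval_seconds())  # seconds between requests
    print(min_hours_for(900))      # hours for 900 requests
```

At this rate, 900 requests take at least an hour, which is why the metadata preparation cannot be hurried.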
I am happy to kick these off on 2 threads maximum, which should cap the throughput for large files at something under c.500 in a day. 1 thread would presumably be half that.
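For a feel of what these figures mean in practice, here is a small back-of-the-envelope sketch. It assumes uploads are spread evenly over a full 24 hours, which is an assumption rather than how the upload tool necessarily schedules them, and the function names are hypothetical.

```python
# Back-of-the-envelope figures only; assumes an even spread over 24 hours.
SECONDS_PER_DAY = 24 * 60 * 60


def even_spacing_seconds(files_per_day: int) -> float:
    """Gap between upload starts if files are spread evenly across the day."""
    return SECONDS_PER_DAY / files_per_day


def implied_seconds_per_file(files_per_day: int, threads: int) -> float:
    """Average time each thread spends per file at a given daily total."""
    return threads * SECONDS_PER_DAY / files_per_day


if __name__ == "__main__":
    print(even_spacing_seconds(100))         # gap at 100 files a day
    print(even_spacing_seconds(200))         # gap at 200 files a day
    print(implied_seconds_per_file(500, 2))  # per-file time at c.500 a day on 2 threads
```

So 100 to 200 files a day corresponds to one upload roughly every 7 to 14 minutes, and a ceiling of c.500 a day on 2 threads corresponds to a little under 6 minutes per file per thread.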
I will put aside the small number of remaining NYPL map files - there is no hurry, and it would be good to use these "troublemaking" files to test out the technical changes once they are implemented.
PS The beta cluster is still not working for me today; I get the standard server-down page every time I run GWT there. I have not tried the production environment in the last week.
Fae
faewik@gmail.com https://commons.wikimedia.org/wiki/User:Fae