these only contain >50MP images, and if I understand correctly, thumbnail requests for those are currently refused, so they can't generate any load. (They used to be refused only after sending the original through Swift, and that might have contributed to the scaler outage, but Bawolff fixed that in 135101.)

I would suggest continuing the NYPL upload. It is no problem if most of the images are small, since I can just filter out the large ones and build a gallery out of them.
Or if you don't have time for that, and don't plan on working in the near future on any uploads that are affected, we can just do the whole thing another time.
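
As a minimal sketch of that filtering step, assuming the file titles and pixel dimensions have already been collected somehow (for instance from the MediaWiki API's image info; that part is not shown), something like the following could pick out the >50MP files and emit gallery wikitext in growing batches of 10, 20 and 50. The function names and the input format are made up for illustration and are not part of GWToolset or MediaWiki.

```python
# Hypothetical helper: names and input format are assumptions for illustration.
MEGAPIXEL_LIMIT = 50_000_000  # the 50 MP threshold discussed in this thread


def is_large(width, height, limit=MEGAPIXEL_LIMIT):
    """Return True if the image has more pixels than the limit."""
    return width * height > limit


def build_gallery(titles):
    """Return <gallery> wikitext listing the given file titles."""
    lines = ["<gallery>"]
    lines += [f"File:{title}" for title in titles]
    lines.append("</gallery>")
    return "\n".join(lines)


def ramp_up_galleries(files, sizes=(10, 20, 50)):
    """Build one gallery per batch size from the large files only.

    files: iterable of (title, width, height) tuples.
    Each gallery uses files not used in an earlier batch, so every
    thumbnail in it should be a fresh request rather than a cached one.
    Stops early if there are not enough large files left for a batch.
    """
    large = [title for title, w, h in files if is_large(w, h)]
    galleries = []
    offset = 0
    for size in sizes:
        batch = large[offset:offset + size]
        if len(batch) < size:
            break
        galleries.append((size, build_gallery(batch)))
        offset += size
    return galleries
```

Each returned wikitext string could then be pasted into a sandbox page one at a time while watching scaler load, moving on to the next batch size only if the previous one went fine.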

On Thu, Jul 10, 2014 at 3:36 PM, Fæ <faewik@gmail.com> wrote:


On 10/07/2014, Gergo Tisza <gtisza@wikimedia.org> wrote:
> Hi all,
>
> with bug 65691 <https://bugzilla.wikimedia.org/show_bug.cgi?id=65691> fixed
> (the last patch was deployed today), now might be a good time to test large
> TIFF uploads again (the patch is limited to TIFF files for now). I was
> thinking of the following schedule:
>
> - wait until Monday (no breaking the site on the weekends)
> - launch an upload with large TIFF files (preferably the same one that
> caused issues earlier, i.e. Fae's NYPL maps project)
> - make sure that the images are initially not categorized to avoid someone
> triggering 200 new thumbnail requests in parallel (GWToolset could add an
> empty template instead, and that template can be replaced with the category
> later).
> - initially use the minimum speed allowed by GWToolset (a single thread),
> to make sure Special:NewFiles and co. will also not be the source of many
> concurrent requests.
> - after a bunch of images have been uploaded, generate a gallery with 10
> thumbnails and monitor imagescaler load and Swift traffic in the process.
> Repeat with 20, 50 etc until we are satisfied that the scalers are
> resilient to many concurrent requests for large files.
> - if all works out, the upload project can continue with normal speed (20
> threads or whatever), and we can also relax throttle limits on GWToolset a
> bit.
>
> Does this sound reasonable? Fae, are you interested in doing this?

Two test sets suggested below. Is this list of TIFFs needing
thumbnails rendered enough? I'm hesitant to commit to putting together
a set of brand new files to test myself in the next few days as there
are several other things I need to get on with (as well as "RL"
stuff), for example I ought to test out a sample of the Wellcome
uploads I have been sent (the disk will be ready next week if the WMF
give me details of where to send it). Unfortunately the Library of
Congress TIFFs that I am in the middle of putting through GWT do not
tell me their resolution before I upload them - this makes it
impossible for me to suggest a set of 50MP+ files to upload from
scratch. The next couple on my backlog after the HABS collection
(Wellcome and Rijksmuseum) are all jpegs rather than tiffs.

SET ONE (HABS)

I have been uploading many TIFFs from the Library of Congress, many
well over 130 MP, however they are much smaller file sizes than the
NYPL collections, for example:
* https://commons.wikimedia.org/wiki/File:Eastburn-Jeanes_Limekilns,_On_Papermill_Road_and_on_Pike_Creek_Road,_Newark,_New_Castle_County,_DE_HAER_DEL,2-CORNK.V,2-_%28sheet_1_of_2%29.tif

These have yet to be rendered with thumbnails and there are over 4,000
of them you can find listed at:
* https://commons.wikimedia.org/wiki/Category:Uploads_by_F%C3%A6_%28over_50_MP%29

They might be good as a speed test to blast through and to create the
thumbnails. I don't know if this means overwriting the current files
or if there is some trick to forcing the thumbnail recreation. If
these work well, then I'll stop creating PNGs for the TIFF drawings
I'm uploading (there are going to be something like another 20,000+ of
these to come as part of HABS uploading).

SET TWO (NYPL)

The un-rendered NYPL maps collection numbers around 1,100 and is at:
https://commons.wikimedia.org/wiki/Category:NYPL_maps_%28over_50_megapixels%29

These are both large in resolution and many are *very* large in filesize.