https://commons.wikimedia.org/wiki/Category:NYPL_maps_%28over_50_megapixels%... https://commons.wikimedia.org/wiki/Category:Uploads_by_F%C3%A6_%28over_50_MP...
these only contain >50MP images, and if I understand correctly, thumbnail requests for those are currently refused, so they can't generate any load. (They used to be refused only after sending the original through Swift, and that might have contributed to the scaler outage, but Bawolff fixed that in 135101 https://gerrit.wikimedia.org/r/#/c/135101/.)
I would suggest continuing the NYPL upload. It's no problem if most of the images are small; I can just pick out the large ones and build a gallery out of them. Or if you don't have time for that, and don't plan on working in the near future on any uploads that are affected, we can just do the whole thing another time.
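As a rough illustration of picking the large files out of a mixed upload, here is a minimal sketch. It assumes the width and height of each file are already known (for example from the file's metadata after upload); the `FileInfo` record, the function names, and the sample titles are all hypothetical, and the 50 MP threshold is simply the figure used in this thread.

```python
from typing import List, NamedTuple


class FileInfo(NamedTuple):
    """Hypothetical record: a file title plus its pixel dimensions."""
    title: str
    width: int
    height: int


def megapixels(info: FileInfo) -> float:
    """Image area in megapixels (millions of pixels)."""
    return info.width * info.height / 1_000_000


def over_threshold(files: List[FileInfo], threshold_mp: float = 50.0) -> List[FileInfo]:
    """Keep only the files whose area exceeds the threshold."""
    return [f for f in files if megapixels(f) > threshold_mp]


if __name__ == "__main__":
    # Made-up sample data, not real uploads.
    sample = [
        FileInfo("File:Small example.tif", 4000, 3000),
        FileInfo("File:Large example.tif", 12000, 9000),
    ]
    for f in over_threshold(sample):
        print(f.title, megapixels(f))
```

The titles returned this way could then be pasted into a gallery page for the scaler test.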
On Thu, Jul 10, 2014 at 3:36 PM, Fæ faewik@gmail.com wrote:
On 10/07/2014, Gergo Tisza gtisza@wikimedia.org wrote:
Hi all,
with bug 65691 https://bugzilla.wikimedia.org/show_bug.cgi?id=65691 fixed (the last patch was deployed today), now might be a good time to test large TIFF uploads again (the patch is limited to TIFF files for now). I was thinking of the following schedule:
- wait until Monday (no breaking the site on the weekends)
- launch an upload with large TIFF files (preferably the same one that caused issues earlier, i.e. Fae's NYPL maps project)
- make sure that the images are initially not categorized, to avoid someone triggering 200 new thumbnail requests in parallel (GWToolset could add an empty template instead, and that template can be replaced with the category later).
- initially use the minimum speed allowed by GWToolset (a single thread), to make sure Special:NewFiles and co. will also not be the source of many concurrent requests.
- after a bunch of images have been uploaded, generate a gallery with 10 thumbnails and monitor imagescaler load and Swift traffic in the process. Repeat with 20, 50 etc. until we are satisfied that the scalers are resilient to many concurrent requests for large files.
- if all works out, the upload project can continue with normal speed (20 threads or whatever), and we can also relax throttle limits on GWToolset a bit.
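The escalating gallery step above could be sketched roughly as follows. This is only an illustration under assumptions: the function names and sample titles are hypothetical, and the batches are kept non-overlapping on the assumption that thumbnails which have already been rendered would be served from cache and so would not exercise the scalers again.

```python
from typing import Iterator, List, Sequence


def gallery_wikitext(titles: Sequence[str]) -> str:
    """Wrap file titles in a <gallery> block, one title per line."""
    return "<gallery>\n" + "\n".join(titles) + "\n</gallery>"


def escalating_batches(titles: Sequence[str],
                       sizes: Sequence[int] = (10, 20, 50)) -> Iterator[List[str]]:
    """Yield successive, non-overlapping batches of the given sizes.

    Stops early if the list of titles runs out.
    """
    start = 0
    for size in sizes:
        batch = list(titles[start:start + size])
        if not batch:
            break
        yield batch
        start += size


if __name__ == "__main__":
    # Made-up titles standing in for freshly uploaded large TIFFs.
    uploaded = [f"File:Example map {i}.tif" for i in range(100)]
    for batch in escalating_batches(uploaded):
        print(f"--- gallery with {len(batch)} thumbnails ---")
        print(gallery_wikitext(batch))
```

Each printed block would be saved to a test page in turn, with scaler load and Swift traffic checked before moving on to the next, larger one.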
Does this sound reasonable? Fae, are you interested in doing this?
Two test sets suggested below. Is this list of TIFFs needing thumbnails rendered enough? I'm hesitant to commit to putting together a set of brand new files to test myself in the next few days, as there are several other things I need to get on with (as well as "RL" stuff); for example, I ought to test out a sample of the Wellcome uploads I have been sent (the disk will be ready next week if the WMF give me details of where to send it). Unfortunately the Library of Congress TIFFs that I am in the middle of putting through GWT do not tell me their resolution before I upload them, which makes it impossible for me to suggest a set of 50MP+ files to upload from scratch. The next couple on my backlog after the HABS collection (Wellcome and Rijksmuseum) are all JPEGs rather than TIFFs.
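On the resolution question, a TIFF's pixel dimensions are stored in its first image file directory (tags 256 ImageWidth and 257 ImageLength), so they can in principle be read from the file bytes before upload. The sketch below is only that, under assumptions: it handles classic TIFF (not BigTIFF), expects the bytes containing the first directory to be at hand already (the directory is not always near the start of the file), and the function name is hypothetical. It says nothing about how the Library of Congress files would be fetched.

```python
import struct
from typing import Optional, Tuple


def tiff_dimensions(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (width, height) from the first IFD of a classic TIFF, else None."""
    if len(data) < 8:
        return None
    # "II" = little-endian, "MM" = big-endian.
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        return None
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42 or ifd_offset + 2 > len(data):
        return None
    (count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    width = height = None
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        entry = data[start:start + 12]
        if len(entry) < 12:
            break  # directory is truncated in the bytes we were given
        tag, field_type, _n = struct.unpack(order + "HHI", entry[:8])
        if tag not in (256, 257):
            continue
        if field_type == 3:    # SHORT: value sits in the first 2 bytes
            (value,) = struct.unpack(order + "H", entry[8:10])
        elif field_type == 4:  # LONG: value uses all 4 bytes
            (value,) = struct.unpack(order + "I", entry[8:12])
        else:
            continue
        if tag == 256:
            width = value
        else:
            height = value
    if width is None or height is None:
        return None
    return width, height
```

Multiplying the two numbers and dividing by 1,000,000 gives the megapixel figure to compare against the 50 MP cutoff.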
SET ONE (HABS)
I have been uploading many TIFFs from the Library of Congress, many well over 130 MP; however, they are much smaller in filesize than the NYPL collections. For example:
https://commons.wikimedia.org/wiki/File:Eastburn-Jeanes_Limekilns,_On_Paperm...
These have yet to be rendered with thumbnails and there are over 4,000 of them you can find listed at:
https://commons.wikimedia.org/wiki/Category:Uploads_by_F%C3%A6_%28over_50_MP...
They might be good as a speed test to blast through and to create the thumbnails. I don't know if this means overwriting the current files or if there is some trick to forcing the thumbnail recreation. If these work well, then I'll stop creating PNGs for the TIFF drawings I'm uploading (there are going to be something like another 20,000+ of these to come as part of HABS uploading).
SET TWO (NYPL)
The un-rendered NYPL maps collection numbers around 1,100 and is at:
https://commons.wikimedia.org/wiki/Category:NYPL_maps_%28over_50_megapixels%...
These are both large in resolution and many are *very* large in filesize.
Fae
faewik@gmail.com https://commons.wikimedia.org/wiki/User:Fae