On 12 May 2014 11:27, dan-nl dan.entous.wikimedia@gmail.com wrote:
first, i suggest that we put off all large image uploads, > 10mb ( unless we have a concrete value that would work ), until we resolve the thumbnail issue.
during the zürich hackathon i spoke with aaron schulz, faidon liambotis, and brion vibber regarding approaches to dealing with this issue. in summary, the idea aaron came up with is to create initial thumbnails on download of the original mediafile to the wiki. this should block the appearance of the title on the new files page and anywhere else until the thumbnails and title creation/edit have completed. aaron thought, and faidon and i agree, that further throttling of gwtoolset will not help resolve the issue.
i am currently looking into implementing this approach.
with kind regards, dan
If you can set up an illustrative example (maybe doing it "by hand") so we can see how the file history and so forth would look, then it might be easier to discuss on-wiki. In the case of the Library of Congress, their database has "webpage quality" jpgs available as well as larger jpgs and tiffs. It might be possible to pass the GWT an xml file with a link to a thumbnail image as well as the tiff rather than relying on automated generation somewhere else. The tricky part (I think) would be doing this for a mass tiff upload, which is actually the only example we have of stressing the WMF servers, as the initial file would have to be formatted as a tiff rather than a jpeg.
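To make the "thumbnail link as well as the tiff" idea concrete, here is a rough sketch of what such a metadata file might look like and how it could be read. The element names (record, original_url, thumbnail_url) and the example.org URLs are invented for illustration; they are not taken from the GWToolset mapping format or from the Library of Congress data.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: these element names are illustrative only
# and are NOT the actual GWToolset metadata schema.
SAMPLE = """\
<records>
  <record>
    <title>Example map, sheet 1</title>
    <original_url>https://example.org/maps/sheet1.tif</original_url>
    <thumbnail_url>https://example.org/maps/sheet1_web.jpg</thumbnail_url>
  </record>
  <record>
    <title>Example map, sheet 2</title>
    <original_url>https://example.org/maps/sheet2.tif</original_url>
  </record>
</records>
"""


def parse_records(xml_text):
    """Return one dict per <record>; 'thumbnail' is None when not supplied."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("record"):
        records.append({
            "title": rec.findtext("title"),
            "original": rec.findtext("original_url"),
            # findtext returns None when the element is absent
            "thumbnail": rec.findtext("thumbnail_url"),
        })
    return records


records = parse_records(SAMPLE)
# Records without a supplied thumbnail would still need server-side rendering
needs_rendering = [r["title"] for r in records if r["thumbnail"] is None]
```

In this sketch only the second record would fall back to automated thumbnail generation; whether the upload tool could actually make use of a supplied thumbnail is the open question.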
I agree, from what we have seen, this is not a simple throttling issue. I suspect even 1 file every 5 minutes could cause an issue if there is a backlog of thumbnail creation at peak times.
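A toy model of that suspicion, with made-up numbers: it treats thumbnail rendering as a single queue with a fixed capacity per minute, which is a simplification and not a description of how the WMF scalers actually work. All rates below are assumptions chosen only to show the shape of the problem.

```python
def backlog_over_time(background_per_min, upload_interval_min, capacity_per_min, minutes):
    """Toy queue model: thumbnail jobs still waiting at the end of each minute."""
    backlog = 0
    history = []
    for minute in range(minutes):
        arrivals = background_per_min
        if minute % upload_interval_min == 0:
            arrivals += 1  # one batch-upload file arrives this minute
        # Jobs beyond the per-minute capacity carry over to the next minute
        backlog = max(0, backlog + arrivals - capacity_per_min)
        history.append(backlog)
    return history


# Off-peak (assumed numbers): spare capacity absorbs one upload every 5 minutes
off_peak = backlog_over_time(background_per_min=8, upload_interval_min=5,
                             capacity_per_min=10, minutes=60)

# Peak (assumed numbers): background load already equals capacity,
# so every extra file stays in the queue and the backlog never drains
peak = backlog_over_time(background_per_min=10, upload_interval_min=5,
                         capacity_per_min=10, minutes=60)
```

Under these assumptions the off-peak queue stays empty, while at peak the backlog grows by one job per upload and never recovers, however slow the upload rate is.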
Let me know if you would like my last NYPL maps xml file to play around with as an example (it can be emailed as it is only 750k). These have yet to be uploaded and were the cause of the most recent problem.
Fae