Hi everyone.
I have just started uploading my first large batch of images to Commons using the toolset (about 5,000 low-res images). The first 10 uploaded fine, but for the last 30 minutes no more have uploaded. Is this a sign that there is a problem, or is it normal for there to be gaps in the upload?
Thanks
Jason
GWToolset tends to start and stop. I'd wait a little longer (and I think more have uploaded as of this writing).
Note that I am seeing HTTP timeout errors for the following files (records 56-58, 60):
http://dams.llgc.org.uk/behaviour/llgc-id:1128816/fedora-bdef:image/referenc...
http://dams.llgc.org.uk/behaviour/llgc-id:1128815/fedora-bdef:image/referenc...
http://dams.llgc.org.uk/behaviour/llgc-id:1128817/fedora-bdef:image/referenc...
http://dams.llgc.org.uk/behaviour/llgc-id:1128819/fedora-bdef:image/referenc...
Cheers, Brian
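One way to narrow down timeouts like these is to request each source URL yourself with an explicit timeout and note which ones fail. This is only a minimal sketch using Python's standard library; the helper names `fetch_first_byte` and `find_slow_urls` and the 10-second limit are assumptions for illustration and have nothing to do with how GWToolset itself fetches files.

```python
import socket
import urllib.error
import urllib.request


def fetch_first_byte(url, timeout):
    """Open the URL and read one byte; raises on timeout or HTTP error."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)


def find_slow_urls(urls, timeout=10.0, fetch=fetch_first_byte):
    """Return (url, reason) pairs for every URL that fails or times out."""
    failures = []
    for url in urls:
        try:
            fetch(url, timeout)
        except (socket.timeout, TimeoutError):
            # Timeout while reading from an already open connection.
            failures.append((url, "timeout"))
        except urllib.error.URLError as exc:
            # urlopen wraps connection-time timeouts in URLError.
            if isinstance(exc.reason, (socket.timeout, TimeoutError)):
                failures.append((url, "timeout"))
            else:
                failures.append((url, f"error: {exc.reason}"))
        except OSError as exc:
            failures.append((url, f"error: {exc}"))
    return failures
```

Running `find_slow_urls` over the full list of source URLs from the metadata file would show whether the same records fail repeatedly (which points at the source server) or whether the failures move around (which points at load).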
On 7/24/15, Jason J. Evans jason.evans@llgc.org.uk wrote:
Jason Evans Wicipediwr Preswyl / Wikipedian in Residence Llyfrgell Genedlaethol Cymru / National Library of Wales jason.evans@llgc.org.uk Ffon/Tel: +44 (0)1970 632405
Glamtools mailing list Glamtools@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/glamtools
Thanks Brian
Yes, more have now uploaded, but it will take days at the current speed. Does the process time out at all, or will it keep going as long as our servers are running?
Also, do you have any suggestions as to why some images are timing out?
Thanks for your help.
Jason
On 7/24/15, Jason J. Evans jason.evans@llgc.org.uk wrote:
Thanks Brian
Yes more have now uploaded, but it will take days at the current speed. Does the process time-out at all or will it keep going as long as our servers are running?
It should keep going forever.
Also, do you have any suggestions as to why some images are timing out?
Thanks for your help.
Jason
I don't really know. The rate-limiting mechanism for GWToolset isn't the smoothest. Perhaps it tries to load too many at one time (overloading things for that instant), and then does nothing for something like 20 minutes.
--bawolff
At the moment some uploads run fast in the foreground and of course win out over a background GWToolset job (~100s of times slower). Perhaps we should share the room more equally: slow uploads can endanger projects, because there is less to show to donors.
Regards, hans muller
Sorry to keep asking so many questions, but I have noticed with our upload of Welsh landscapes that several different images often have the same file name. Each upload is just replacing the previous file by that name. Is there some way to give each 'version' of the file its own page on Commons, or would I have to manually identify each one and re-upload with new file names?
Here is an example: https://commons.wikimedia.org/wiki/File:Brecon.jpeg
Thanks
Jason
It would be a good idea to do some testing on the beta server before using GWToolset in production, because it's a lot of work to make changes to large numbers of files on Commons. These types of mistakes will then be filtered out by the time you do a batch upload. Here is a write-up of my own workflow: http://www.beeldengeluid.nl/en/blogs/research-amp-development-en/201407/first-batch-upload-wikimedia-commons-using-gwtoolset. As for filenames, each name must be unique, so try to create a unique name by combining e.g. the title field, the identifier field and the date.
Good luck, Jesse
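The title-plus-identifier-plus-date idea can be sketched roughly as below, as a step applied to the metadata before it is handed to GWToolset. The function name `build_filename`, the ordering of the parts and the sample values are assumptions made for illustration; adjust them to whatever fields the metadata actually has.

```python
import re


def build_filename(title, identifier, date="", extension="jpeg"):
    """Combine title, date and identifier into a single file name."""
    identifier = identifier.strip()
    if not identifier:
        # The identifier is the part that keeps two "Brecon" images apart.
        raise ValueError("an identifier is needed to keep the name unique")
    parts = [title.strip(), date.strip(), f"({identifier})"]
    name = " ".join(part for part in parts if part)
    # Swap out characters that are not allowed, or are awkward, in file names.
    name = re.sub(r"[#<>\[\]|{}:/\\]", "-", name)
    # Collapse runs of whitespace left over from the metadata.
    name = re.sub(r"\s+", " ", name)
    return f"{name}.{extension}"


# Hypothetical record values, for illustration only.
print(build_filename("Brecon", "1234567", "1800"))
```

Two records that share the title "Brecon" but carry different identifiers then end up on different file pages instead of overwriting each other.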
Jesse
The test files for Beta that the Systems team gave me had no duplicate file names, so I assumed that there were none. It's my fault for not checking all the data.
I will simply have to spend time locating and re-uploading each one of these files.
Thanks
Jason
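For anyone checking their own data before an upload, a quick pass over the metadata can list every title that is used more than once. This is a minimal sketch, assuming the records have already been loaded as dictionaries; the key names `title` and `identifier` are placeholders for whatever the real metadata fields are called.

```python
from collections import defaultdict


def find_duplicate_titles(records, title_key="title", id_key="identifier"):
    """Map each title used more than once to the identifiers that share it."""
    ids_by_title = defaultdict(list)
    for record in records:
        # Titles are compared exactly, apart from surrounding whitespace.
        title = record[title_key].strip()
        ids_by_title[title].append(record[id_key])
    return {title: ids for title, ids in ids_by_title.items() if len(ids) > 1}


# Hypothetical records, for illustration only.
sample = [
    {"title": "Brecon", "identifier": "a1"},
    {"title": "Brecon", "identifier": "a2"},
    {"title": "Tenby", "identifier": "a3"},
]
print(find_duplicate_titles(sample))
```

An empty result means no two records would compete for the same file name within the batch; it says nothing about names already taken on Commons.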
On Monday, July 27, 2015 15:20 BST, Jesse de Vos jdvos@beeldengeluid.nl wrote:
--
Kind regards,
Jesse de Vos, Researcher Interactive and New Media
T 035 - 677 39 37, available Monday to Thursday
Nederlands Instituut voor Beeld en Geluid, Media Parkboulevard 1, 1217 WE Hilversum | Postbus 1060, 1200 BB Hilversum | beeldengeluid.nl http://www.beeldengeluid.nl/
I will simply have to spend time locating and re uploading each one of these files
Sysops can perform a history split [1] for you. If you send me a list of the files, I’d be happy to do it for you.
[1] https://commons.wikimedia.org/wiki/Commons:History_merging_and_splitting#His...
Jean Frederic
That's great, thank you for offering to split these files. Once the upload is finished I will prepare a list for you.
Regards
Jason
I would encourage you to include a unique ID number or accession number in a field so that Wikidata can link to it; including it in the file name also prevents duplicate uploads.
jim
On 27 July 2015 at 15:14, Jason J. Evans jason.evans@llgc.org.uk wrote:
Sorry to keep asking so many questions, but i have noticed with our upload of Welsh landscapes that several different images often have the same file name. Each upload is just replacing the previous file by that name. Is there some way to give each 'version' of the file its own page on commons or would i have to manually identify each one and re-upload with new file names?
You can find several years' worth of project pages and discussion at https://commons.wikimedia.org/wiki/Commons:Batch_uploading
Schemes for unique filenames and best practices are discussed in detail. If none of this is easy to see in the GWT documentation, then the batch uploading pages are a good source to crib from.
Fae
2015-07-27 15:46 GMT+01:00 Fæ faewik@gmail.com:
On 27 July 2015 at 15:14, Jason J. Evans jason.evans@llgc.org.uk wrote:
Sorry to keep asking so many questions
No worries about that :)
Each upload is just replacing the previous file by that name. Is there some way to give each 'version' of the file its own page on commons or would i have to manually identify each one and re-upload with new file names?
You can find several years' worth of project pages and discussion at https://commons.wikimedia.org/wiki/Commons:Batch_uploading
Schemes for unique filenames and best practices are discussed in detail. If none of this is easy to see in the GWT documentation, then the batch uploading pages are a good source to crib from.
Echoing this. In particular, several volunteers have spent quite some time writing up <https://commons.wikimedia.org/wiki/Commons:Guide_to_batch_uploading>. Probably not perfect, but certainly a good place to start.
It might be a good improvement to GWToolset to have the option, in the case of a duplicate file name, to automatically change the filename to something else instead of just uploading over the name that is already taken. But I don't think anyone is currently working on new features for GWToolset.
-- bawolff
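Until something like that exists in the tool, the same effect can be approximated on the metadata side before uploading. The sketch below is not a GWToolset feature, just one possible pre-processing step; the function name `disambiguate` and the " (2)" suffix style are assumptions. It only looks for clashes inside the batch and does not check which names already exist on Commons.

```python
def disambiguate(filenames):
    """Return the names in order, renaming repeats to 'name (2).ext', 'name (3).ext', ..."""
    used = set()
    result = []
    for filename in filenames:
        stem, dot, extension = filename.rpartition(".")
        if not dot:
            # No extension present, so the whole name is the stem.
            stem, extension = filename, ""
        candidate = filename
        counter = 1
        # Keep counting up until the candidate is free within this batch.
        while candidate in used:
            counter += 1
            candidate = f"{stem} ({counter}){dot}{extension}"
        used.add(candidate)
        result.append(candidate)
    return result


print(disambiguate(["Brecon.jpeg", "Tenby.jpeg", "Brecon.jpeg"]))
```

A numeric suffix keeps files from overwriting each other but tells a reader nothing about the image, so a name built from an accession number is usually the better choice where one is available.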