Dear all,
First off, congratulations to everybody on creating this tool, which is going to revolutionise uploading to WikiCommons.
Inevitably what follows is going to be largely a list of nit-picks (and I'm sorry if I haven't tried to find your project plans or bug-tracker first, in case some of the answers are already in the pipeline); but don't let any of the below take away from what is a great achievement.
So, what are some issues that struck me, when uploading the set now at
https://commons.wikimedia.org/wiki/Category:Images_released_by_British_Libra...
(cat name may move, but this is where it's at for the moment).
* Filenames -- already under discussion in a different thread, at least as regards character replacements.
I was a bit surprised to find the Artwork::title field automatically being built into the file name -- I hadn't expected this.
On the one hand, I can see that it's an important piece of WikiCommons culture to enforce: the name of the work comes first, because that is what people will first see. But in my case, I sometimes had some very long titles, so I wanted to be able to use a shortened version in the filename. To get that, I found that I was having to put the picture title into the first line of the description field instead -- not ideal. So you might want to consider adding an option to de-select this.
It would be nice for users to have a bit more information about how filenames will be created, but this will come.
* Staging area. -- I had had the impression that the initial 3 test uploads would be uploaded to a staging area, rather than the main live wiki. So I was a bit surprised when I found it was indeed the main live wiki they had been uploaded to.
Of course, this makes a lot of sense -- for example, seeing the effect of specialist templates etc. It's just about managing expectations -- and, perhaps, reassuring people that mistakes can be easily removed eg by tagging the wrongly named image with {{duplicate}}. (I ended up with the unexpected title duplication causing unwanted filenames, and then a ".jpg.jpg" set of uploads). I initially wasn't very comfortable with my mistakes happening on the live wiki for all to see, which made me feel quite stressed to start with; but then I relaxed, and started the full upload.
* Output. -- If outputting {{artwork}}, please include the standard fields in the standard order, even if some of them are empty. eg:
{{Artwork
 |artist =
 |title =
 |description =
 |date =
 |medium =
 |dimensions =
 |institution =
 |location =
 |references =
 |object history =
 |credit line =
 |inscriptions =
 |notes =
 |accession number =
 |source =
 |permission =
 |other_versions =
}}
& further fields have their standard places in the order; which pretty much corresponds to the sequence they are output in, *not* alphabetical order.
This is important, because WikiCommons is not a "write once" medium -- pages are there to be easily edited and updated, by humans.
It is useful to have all the basic fields in place, even if they are not populated, because it makes it so much easier to fill something in later -- for example, in my case, to move some of the 'description' back into the 'title'; or to add references; or transcriptions of inscriptions; or other versions, already on the Wiki.
The empty fields also help to give the edit page order and structure when you look at it; otherwise it can get messy and harder to process, if the 'description' and 'source' fields are allowed to dominate, which can get quite long and free-form.
And please keep the fields in the standard order above, so that experienced editors know exactly where to expect to look for particular information, and where to edit it.
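To make the request concrete, here is a minimal sketch of how a tool could emit the template with every standard field present and in the order listed above. It is only an illustration, not how GWToolset is actually written: the function name `render_artwork` and the dict-based input are assumptions.

```python
# Field order copied from the example template in this message.
STANDARD_FIELDS = [
    "artist", "title", "description", "date", "medium", "dimensions",
    "institution", "location", "references", "object history",
    "credit line", "inscriptions", "notes", "accession number",
    "source", "permission", "other_versions",
]


def render_artwork(values):
    """Return {{Artwork}} wikitext with every standard field, in order.

    Fields missing from `values` are emitted empty. Keys outside the
    standard list are appended afterwards, in the order they were given.
    (Hypothetical helper, not part of any real tool.)
    """
    lines = ["{{Artwork"]
    for name in STANDARD_FIELDS:
        # rstrip() avoids a trailing space after "=" on empty fields.
        lines.append(f" |{name} = {values.get(name, '')}".rstrip())
    for name, value in values.items():
        if name not in STANDARD_FIELDS:
            lines.append(f" |{name} = {value}".rstrip())
    lines.append("}}")
    return "\n".join(lines)
```

Calling `render_artwork({"title": "A view", "source": "BL"})` would still produce all seventeen lines, so a later editor finds an empty `|references =` or `|inscriptions =` slot waiting in its usual place.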
* GWtoolset fields.
The unexpected fields 'gwtoolset-title-identifier' and 'gwtoolset-url-to-the-media-file' are currently causing the template to throw warnings, which look unsightly.
If these are going to be placed in the artwork template, please edit that template, so that it doesn't throw warnings.
But is the artwork template actually the best place for these fields? They don't relate to a description of the artwork, rather a description of the upload process.
The standard place to describe the history of the upload process is in its own template, separate from the image description template -- compare for example the template left by the Flickr2Commons bot in the 'licensing' section of the page
https://commons.wikimedia.org/wiki/File:Furnival%27s_Inn,_Holborn_-_Shepherd...
The advantage of this is that the 'artwork' template can be kept to a very specific function, without having its code cluttered up by other stuff. Think what the effect would be if every upload process wanted to add its fields to the artwork template -- maintenance, or even reading the code, would become a nightmare. Instead, much better to put this content in your own template, to mark the GWtoolset upload process, perhaps with an additional master parameter to turn visible output from the template off or on.
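As a rough sketch of the separation suggested here, the upload-process fields could be split off by their prefix and rendered in a template of their own. The template name passed to `render_template` is up to the caller and purely hypothetical, and both helper functions are assumptions made for illustration.

```python
def split_upload_fields(fields, prefix="gwtoolset-"):
    """Separate upload-process fields from artwork-description fields.

    Returns (artwork_fields, upload_fields). The prefix matches the two
    field names mentioned above; both dicts keep their input order.
    """
    artwork = {k: v for k, v in fields.items() if not k.startswith(prefix)}
    upload = {k: v for k, v in fields.items() if k.startswith(prefix)}
    return artwork, upload


def render_template(name, fields):
    """Render a generic multi-line template call (hypothetical helper)."""
    body = "".join(f" |{k} = {v}\n" for k, v in fields.items())
    return "{{" + name + "\n" + body + "}}"
```

With this split, the artwork template only ever sees fields it knows about, so it has nothing to warn about, and the upload template can decide for itself whether to show anything.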
* Category section
This is one of the most important sections for hand-editing. Yes there are nice methods to add/remove categories now built right into the interface; but these still also get edited by hand, too. Readability is therefore important.
Therefore, can you add linefeed characters, so that each [[Category:...]] directive starts on a new line.
It's a small thing. But without it the output from last night's version is almost unreadable.
* Whitespace
I can see it's useful at the moment, in the present beta stage of the code, to add a debugging dump of the tool's run-state to the end of the page.
But please can you add several lines of whitespace before it.
Normally, the category section is very easy to find, being the last thing on the page. But without whitespace, it gets buried in a big heap of text. So, fine to keep the debugging information there, but please add a few lines of whitespace before it, to make it easier to find the categories section.
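The two layout requests above (one category per line, then blank lines before any trailing dump) could be sketched like this. The function name, its parameters and the default of three blank lines are assumptions for illustration only.

```python
import re

# Matches one [[Category:...]] directive, with or without a sort key.
CATEGORY = re.compile(r"\[\[Category:[^\]\n]*\]\]")


def layout_tail(categories_text, trailing_dump="", blank_lines=3):
    """Put each category on its own line, then pad before the dump.

    `trailing_dump` stands in for whatever debugging text the tool
    appends; `blank_lines` is how many empty lines separate it from
    the last category.
    """
    cats = CATEGORY.findall(categories_text)
    out = "\n".join(cats)
    if trailing_dump:
        # One newline ends the last category line, the rest are blank lines.
        out += "\n" * (blank_lines + 1) + trailing_dump
    return out
```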
* Markup
I wasn't sure how to get markup onto the page. For example, the <br /> tag can be useful if one only wants a newline, not a new paragraph. (It is only double newlines that the Wiki software treats as breaks, single newlines get rendered as spaces; so a <br /> tag is needed if you want to specify a linebreak).
However it appeared that <br /> tags were being eaten by the XML parser.
I also tried double single-quotes '' to indicate italicised text, but the software carefully turned these into Unicode escapes to preserve them. (I didn't try <i> or <em>, so maybe that would have been the way round this).
It can also be very useful to be able to add [[wikilinks]] at the offline, pre-upload stage. I presume the software will escape these as well. (Though there are workaround templates, which I presume may give a way to work round this, albeit at the expense of less readable wiki-pages).
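For what it's worth, the behaviour with <br /> is what any standard XML parser would do, and the usual ways round it are to escape the angle brackets or use a CDATA section. The sketch below shows this with Python's standard library. Whether GWToolset would then pass the recovered text through to the wiki page unchanged is not something this demonstrates, and the element name `description` is just a placeholder.

```python
import xml.etree.ElementTree as ET

# A bare <br /> is parsed as a child element, so it is not part of the text.
bare = ET.fromstring("<description>Line one<br />Line two</description>")
bare_text = bare.text          # only the text before the <br /> element
bare_tail = bare[0].tail       # the rest hangs off the <br /> element

# Escaping the angle brackets keeps the tag as literal character data.
escaped = ET.fromstring(
    "<description>Line one&lt;br /&gt;Line two</description>"
)

# A CDATA section has the same effect without escaping each character.
cdata = ET.fromstring(
    "<description><![CDATA[Line one<br />Line two]]></description>"
)
```

In both the escaped and the CDATA case the parser hands back the full string including the literal tag, which is what would need to reach the wikitext.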
* Enhancements
** {{DEFAULTSORT:}}
It would be nice to be able to specify a field in the XML to be put into a Defaultsort for the page.
For example, for anything over 100 years old, I tend to find that it's useful to specify a default sort-key of the form "DATE ITEM SEQ" -- where DATE is a 4-digit numerical date (perhaps with a suffix to indicate imprecision), ITEM is some identifier for the series or item, eg a book, that the images are drawn from; and SEQ is a padded number to indicate a sequence within that item.
Last night I got round this by smuggling my Defaultsort into one of the fields in the Artwork template; but really it ought to be placed immediately above the Category information, so it would be good to be able to load it there directly.
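A sort key of the "DATE ITEM SEQ" form described above could be built along these lines. This is only a sketch: the "c" suffix for an imprecise date, the default padding width and both function names are assumptions rather than any established convention.

```python
def default_sort_key(year, item, seq, circa=False, seq_width=3):
    """Build a 'DATE ITEM SEQ' sort key.

    year      -- integer year, zero-padded to 4 digits
    item      -- identifier for the book or series the image comes from
    seq       -- position within that item, zero-padded to seq_width
    circa     -- if True, append a 'c' suffix to mark an imprecise date
                 (the suffix letter is an assumption)
    """
    date = f"{year:04d}" + ("c" if circa else "")
    return f"{date} {item} {seq:0{seq_width}d}"


def defaultsort_line(key):
    """Wrap a sort key in the DEFAULTSORT magic word."""
    return "{{DEFAULTSORT:" + key + "}}"
```

Padding both the year and the sequence number matters because category sorting compares keys as text, so "12" would otherwise sort before "7".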
** Free text
It might be good to also be able to have the general ability to load text (eg arbitrary templates) from the XML file into the various other parts of the page outside the Artwork template. For example, particular credit templates or notes, or bespoke 'permissions' templates.
Of course it would be nice if the tool already knew about such templates; but for when it doesn't, it would be a useful option to be able to place free text in different parts of the standard page.
** Compound fields
As well as Defaultsort above, there were a number of other entries in my upload last night that were compound fields.
For example,
Description = Title + [line break] + Description
Filename = Short_Name - Short_Item_Name (Date), Page - Shelfmark
while 'Source' was built from two fields plus two further templates, each of which had various input fields.
Some of this is always going to be best pre-processed offline. But for simple cases, it would be nice to be able to specify multiple fields with separators, that could then be baked into the JSON file.
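One simple shape such a "fields plus separators" option could take is a list of parts, each either a field name or a literal string. The sketch below is hypothetical throughout (the `build_compound` helper, the `("field", ...)` / `("text", ...)` notation and the sample record values are all made up); only the filename pattern itself comes from the example above.

```python
def build_compound(record, parts):
    """Join record fields and literal separators into one string.

    `parts` is a list of ("field", name) or ("text", literal) pairs.
    Missing fields contribute an empty string.
    """
    out = []
    for kind, value in parts:
        if kind == "field":
            out.append(str(record.get(value, "")))
        elif kind == "text":
            out.append(value)
        else:
            raise ValueError(f"unknown part kind: {kind!r}")
    return "".join(out)


# Filename = Short_Name - Short_Item_Name (Date), Page - Shelfmark
FILENAME_PARTS = [
    ("field", "Short_Name"), ("text", " - "),
    ("field", "Short_Item_Name"), ("text", " ("),
    ("field", "Date"), ("text", "), "),
    ("field", "Page"), ("text", " - "),
    ("field", "Shelfmark"),
]

# Placeholder values, not real catalogue data.
sample = {
    "Short_Name": "Name", "Short_Item_Name": "Item",
    "Date": "1850", "Page": "p. 12", "Shelfmark": "Shelf.1",
}
filename = build_compound(sample, FILENAME_PARTS)
```

A list like `FILENAME_PARTS` is plain data, so it could be stored alongside the rest of a field mapping rather than requiring any code on the uploader's side.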
** Non-XML forms of input.
JSON seems increasingly popular; and might not have so many issues with escaped characters (and escapes for the escape mechanisms) as XML. Or perhaps it's just that I write simple XML by hand, but for JSON I tend to leave it to a library call to worry about...
So there are some issues. The (non-)allowed filename characters, and the presentation/layout of the final wikitext page were the ones that gave me actual unhappiness. The rest is there as a raw user's initial impressions.
But really I want to thank you for this tool, which makes batch uploading accessible really for anyone who can write an XML file, rather than having to write bespoke bots and get specific bot approval for each little thing.
Hope this is useful,
All best,
James.
Hi James, Great that the tool was useful to you! You're our first user "in the wild"!
And thanks for the constructive feedback. I'll go through it together with Dan Entous who developed the tool and create tickets based on them. The project doesn't have further funding right now but the fixes that are small we're happy to make. We'll get back to you on this list.
Cheers, David Haskiya
(who was the project manager for the GLAMwikitoolset project)
________________________________________
From: glamtools-bounces@lists.wikimedia.org [glamtools-bounces@lists.wikimedia.org] on behalf of James Heald [j.heald@ucl.ac.uk]
Sent: 05 March 2014 19:44
To: Conversations revolving around the development of GLAM Digital Tools
Subject: [Glamtools] Initial impressions after my first upload (longish)
[...]
_______________________________________________ Glamtools mailing list Glamtools@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/glamtools
David,
Thank you so much for this.
For me the most pressing issues are:
* allowing punctuation in the filenames
* the layout of the Artwork, so that the fields occur in their usual standard order & missing fields are included
* moving the 'gwtoolset-title-identifier' and 'gwtoolset-url-to-the-media-file' fields out of the artwork template, eg into a template of their own
I hope these are all fairly small changes, almost cosmetic, that can be sorted out quickly.
But they would make a huge difference -- I've already had a sharp note on my Commons talk page that the images have filled up the automatic "Artwork template with incorrect parameter" maintenance category,
https://commons.wikimedia.org/wiki/Category:Pages_using_Artwork_template_with_incorrect_parameter
making the category useless for identifying other users' genuine mistakes, because it's full of the 430 images that I uploaded.
As for the filenames, and the template fields, I really really want to get these sorted. Really the only sensible way for me to fix them is to re-run the entire upload, once the tool is patched.
But until I've done the re-upload, that's blocking me from doing a lot of essential plumbing -- eg properly categorising the images; wikilinking their subjects; adding them into articles (including swapping the new images in instead of a lot of existing inferior versions) -- which are exactly the things that are needed to make the upload look good, if the upload is going to be cited in the official release at the end of the month. But at the moment I'm blocked, because there is no point in doing any of those things if I know that I'm going to do a batch re-upload that will wipe all of them out.
So I hope these key things aren't big fixes, but if it could be possible to get a patched version of the tool up and running I'd be incredibly grateful.
All best,
James.
On 06/03/2014 16:58, David Haskiya wrote:
[...]
A quick update.
I've been able to find ways to help me clean up the layout of the wikitext on the description pages using the semi-automated AutoWikiBrowser tool; and also a less miserable approach to getting the page renaming done; so I am *not* now planning to do a full re-upload of the set, or indeed any re-uploading.
(In fact my hands were tied, because people were already starting to use and edit the pages, which a re-upload would have wiped).
So the wikitext layout on the pages is now all pretty much corrected, and I should have worked through renaming the remaining filenames by the end of Sunday.
A typical diff can be seen eg at:
https://commons.wikimedia.org/w/index.php?title=File%3ABodyguard_of_Ranjit_Singh-_1838-1839_-_BL_Add.Or.1385.jpg&diff=118380423&oldid=118137724
From the top I have made the following changes:
* Added an =={{int:filedesc}}== header above the Artwork template
* Re-ordered the fields in the artwork template, and added blank ones
* Moved the gwtoolset fields into a separate template, {{Uploaded with GWtoolset}}, which currently produces no output, but could be adjusted to output whatever you wanted.
* Added whitespace before and after the new {{Uploaded with GWtoolset}} template
* Split the categories directives each onto their own line
* Added whitespace before the commented-out Metadata sections
* I have also turned all instances of the escaped apostrophe back into plain apostrophes. (A single apostrophe has no significance for wiki-markup, and so does not need to be escaped. A double apostrophe may well be intentional).
I didn't get the change perfect -- an extra newline got in at the top that shouldn't have been there; and the artwork template is nicer with a single space before the pipe character, which I forgot. But it's good enough, and now feels to me like a proper WikiCommons page should.
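The apostrophe clean-up in the last bullet amounts to a plain text substitution. Here is a minimal sketch, assuming the escapes took the form of the numeric character references `&#39;` or `&#039;` (the exact form the tool produced is an assumption, as is the function name):

```python
def restore_apostrophes(wikitext):
    """Turn numeric apostrophe escapes back into literal apostrophes.

    Handles both &#39; and &#039;. Doubled apostrophes come back as '',
    which the wiki software will then treat as italic markup.
    """
    return wikitext.replace("&#39;", "'").replace("&#039;", "'")
```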
A final thought about the inclusion of all the commented-out metadata. It's not ideal, because it can lead to category information being split between two places. The natural place for categories is soon after the description, so that an editor can quickly read down in the wikitext from the description to the categories.
However, a lot of the visual tools that assist in adding and editing categories assume that these will be at the bottom of the page -- so they simply add new categories at the end of the page.
In this case, however, that would lead to the description page having category information in two different places -- some above the big metadata comment, some below it. It's not good for the information to be split in this way.
So -- if the metadata is useful (which it may well be), a better place to put it might be in a separate sub-page. On a separate page, it would also be safe from automated edits -- for example my edits with AWB here.
My apologies that I got into a bit of a state about all this last night (and my relief that it's not the blocker I thought it would be). These issues may seem trivial, but in my view they are important (to me, a difference between acceptable and unacceptable output), so IMO they are things that *need* to be tidied up before any big launch.
All best,
James.
On 06/03/2014 19:28, James Heald wrote:
[...]
nice work. one thing to think about is parts of the description field that could be broken out into the medium field, or title field (for example as i manually did in your example) https://commons.wikimedia.org/w/index.php?title=File%3ABodyguard_of_Ranjit_S...
it's probably going to be different for each institution, how they input their metadata, and how we structure it.
jim hayes
On Fri, Mar 7, 2014 at 8:30 AM, James Heald j.heald@ucl.ac.uk wrote:
[...]
James, awesome that you went through the whole process of documenting your whole experience and where things could be better. I think that's a great resource for anyone still starting out and noticing all the 'known bugs'. Maybe it would be good to have a page on Commons as well with all of these points? Not everyone that uses the tool is on this list.
-- Hay
On Fri, Mar 7, 2014 at 6:37 PM, Jim Hayes slowking4@gmail.com wrote:
nice work. one thing to think about is parts of the description field that could be broken out into the medium field, or title field (for example as i manually did in your example) https://commons.wikimedia.org/w/index.php?title=File%3ABodyguard_of_Ranjit_S...
it's probably going to be different for each institution, how they input their metadata, and how we structure it.
jim hayes
On Fri, Mar 7, 2014 at 8:30 AM, James Heald j.heald@ucl.ac.uk wrote:
A quick update.
I've been able to find ways to help me clean up the layout of the wikitext on the description pages using the semi-automated AutoWikiBrowser tool; and also a less miserable approach to getting the page renaming done; so that I am *not* now planning to do a full re-upload of the set, or indeed any re-uploading.
(In fact my hands were tied, because people were already starting to use and edit the pages, which a re-upload would have wiped).
So the wikitext layout on the pages is now all pretty much corrected, and I should have worked through renaming the remaining filenames by the end of Sunday.
A typical diff can be seen eg at:
https://commons.wikimedia.org/w/index.php?title=File%3ABodyguard_of_Ranjit_S...
From the top I have made the following changes:
- Added an =={{int:filedesc}}== header above the Artwork template
- Re-ordered the fields in the artwork template, and added blank ones
- Moved the gwtoolset fields into a separate template, {{Uploaded with GWtoolset}}, which currently produces no output, but could be adjusted to output whatever you wanted.
- Added whitespace before and after the new {{Uploaded with GWtoolset}} template
- Split the categories directives each onto their own line
- Added whitespace before the commented-out Metadata sections
I have also turned all instances of the HTML-escaped apostrophe back into plain apostrophes.
(A single apostrophe has no significance for wiki-markup, and so does not need to be escaped. A double apostrophe may well be intentional).
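For anyone wanting to script a similar clean-up, here is a minimal sketch in Python of two of the changes listed above: moving the gwtoolset fields into {{Uploaded with GWtoolset}}, and putting each category on its own line. It is not the AutoWikiBrowser set-up actually used for these edits. The function names are hypothetical, and it assumes that each template parameter sits on its own line and that the Artwork template closes with a line containing only "}}".

```python
import re

# Field names and the target template name come from the thread; everything
# else (one parameter per line, "}}" alone on the closing line) is assumed.
GWTOOLSET_FIELDS = ("gwtoolset-title-identifier", "gwtoolset-url-to-the-media-file")
PARAM_LINE = re.compile(r"\s*\|\s*([^=|]+?)\s*=")


def move_gwtoolset_fields(wikitext: str) -> str:
    """Move the gwtoolset parameters into a separate {{Uploaded with GWtoolset}}."""
    kept, moved = [], []
    for line in wikitext.splitlines():
        match = PARAM_LINE.match(line)
        if match and match.group(1) in GWTOOLSET_FIELDS:
            moved.append(line.strip())
        else:
            kept.append(line)
    if not moved:
        return wikitext  # nothing to move

    block = ["", "{{Uploaded with GWtoolset", *moved, "}}", ""]
    # Place the new template after the first line that is just "}}",
    # assumed to be the end of the Artwork template; otherwise append it.
    closers = [i for i, line in enumerate(kept) if line.strip() == "}}"]
    cut = closers[0] + 1 if closers else len(kept)
    result = kept[:cut] + block + kept[cut:]
    trailing = "\n" if wikitext.endswith("\n") else ""
    return "\n".join(result) + trailing


def split_categories(wikitext: str) -> str:
    """Start a new line before every [[Category:...]] that follows another link."""
    return re.sub(r"\]\][ \t]*(?=\[\[Category:)", "]]\n", wikitext)
```

Calling split_categories(move_gwtoolset_fields(text)) on a page's wikitext applies both steps. Nested templates or parameters that span several lines would need a real wikitext parser rather than these regular expressions.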
I didn't get the change perfect -- an extra newline got in at the top that shouldn't have been there; and the artwork template is nicer with a single space before the pipe character, which I forgot. But it's good enough, and now feels to me like a proper WikiCommons page should.
A final thought about the inclusion of all the commented-out metadata. It's not ideal, because it can lead to category information being split between two places. The natural place for categories is soon after the description, so that an editor can quickly read down in the wikitext from the description to the categories.
However, a lot of the visual tools that assist in adding and editing categories assume that these will be at the bottom of the page -- so they simply add new categories at the end of the page.
In this case, however, that would lead to the description page having category information in two different places -- some above the big metadata comment, some below it. It's not good for the information to be split in this way.
So -- if the metadata is useful (which it may well be), a better place to put it might be in a separate sub-page. On a separate page, it would also be safe from automated edits -- for example my edits with AWB here.
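To find pages where this has already happened, a check along the following lines might help. It is only a sketch: the function name is hypothetical, and it assumes the metadata sits in ordinary HTML comments and that categories are written as plain [[Category:...]] links.

```python
import re

COMMENT = re.compile(r"<!--.*?-->", flags=re.DOTALL)
CATEGORY = re.compile(r"\[\[Category:[^\]]+\]\]")


def categories_are_split(wikitext: str) -> bool:
    """True if category links occur both above the first comment and below the last one."""
    comments = list(COMMENT.finditer(wikitext))
    if not comments:
        return False
    above = wikitext[: comments[0].start()]
    below = wikitext[comments[-1].end():]
    return bool(CATEGORY.search(above)) and bool(CATEGORY.search(below))
```

A page flagged by this check has had categories appended after the commented-out metadata, so its category information is in two places.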
My apologies that I got into a bit of a state about all this last night (and my relief that it's not the blocker I thought it would be). These issues may seem trivial, but in my view they are important (to me, a difference between acceptable and unacceptable output), so IMO they are things that *need* to be tidied up before any big launch.
All best,
James.
Thanks from me too James, I hope to give the Glamtools upload a spin of my own soon with a mass upload, and reading about your experience has been very helpful.
Cheers, Craig
On 10 March 2014 20:07, Hay (Husky) huskyr@gmail.com wrote:
James, awesome that you went through the whole process of documenting your whole experience and where things could be better. I think that's a great resource for anyone still starting out and noticing all the 'known bugs'. Maybe it would be good to have a page on Commons as well with all of these points? Not everyone that uses the tool is on this list.
-- Hay
hi james,
below are the first two bugs and their associated patches. i will continue to work on the fantastic list you sent us as time becomes available.
* https://bugzilla.wikimedia.org/show_bug.cgi?id=62909
* https://bugzilla.wikimedia.org/show_bug.cgi?id=62870
with kind regards, dan
another bug/patch dealing with the wikitext format
* https://bugzilla.wikimedia.org/show_bug.cgi?id=63168
with kind regards, dan