David Gerard wrote:
I'll be leaving Commons categorisation until it's tags rather than ridiculously specific subcategories.
Commons has tags right now: they're called categories. Or is there a distinction you're making? :-)
Tim and I discussed this a few weeks ago and I was mostly on your side, but when he asked what would be different, I had difficulty articulating a great response. It seems to really come down to a social problem on Commons. Some Commoners seem to have very specific views of what categories should be for and how they should be constructed and named. But this isn't a technical problem, per se. Poor labeling or other interface design problems (or outright limitations) in MediaWiki may contribute to this problem, but is there a larger technical issue here? It seems to primarily be a social issue, from what I've seen, not a technical issue. I'd be interested in your thoughts on this.
There are specific features we'd like to have (such as built-in intersections), but is there a fundamental difference between categories and tags? Or perhaps put another way: what are we waiting for, exactly?
MZMcBride
Sure - ease of use for tagging and the sometimes complex hierarchical nature of categories. Tagging is also common web technology that a large proportion of users should be familiar with. At the moment, I suspect almost all new users to Wikimedia projects find categories difficult to navigate and to apply. There are many other differences off the top of my head, which I'm sure you grasp better than I do, so is there a deeper meaning to your question that I'm missing?
Nathan wrote:
Sure - ease of use for tagging and the sometimes complex hierarchical nature of categories.
For ease of use (adding and removing), I think most wikis have HotCat (https://commons.wikimedia.org/wiki/HotCat). Is that insufficient?
Regarding hierarchy, there's absolutely no technical reason, as far as I'm aware, that categories must be hierarchical. It's certainly an intended feature that categories have subcategories and the capability to be hierarchical (i.e., you can have subcategories), but you can concurrently use categories with a flat structure. Right?
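To make the flat-versus-hierarchical point concrete, here is a minimal sketch of a toy model. It is not how MediaWiki stores categories; the names `SUBCATS`, `MEMBERS` and `members_recursive`, and the file titles, are all made up for illustration. It only shows that the same labels can be read either by walking a tree or by looking at one flat label.

```python
# Toy model: a category is a label on a page; a subcategory is a category
# that itself sits under another category. Nothing forces the nesting.
SUBCATS = {  # parent category -> direct subcategories (hypothetical data)
    "Painters": {"French painters"},
    "French painters": {"18th-century French painters"},
}
MEMBERS = {  # category -> files placed directly in it (hypothetical data)
    "Painters": {"File:Easel.jpg"},
    "French painters": set(),
    "18th-century French painters": {"File:Chardin.jpg"},
}


def members_recursive(category, seen=None):
    """Collect files in a category and in everything below it."""
    seen = set() if seen is None else seen
    if category in seen:  # guard against category loops
        return set()
    seen.add(category)
    files = set(MEMBERS.get(category, set()))
    for sub in SUBCATS.get(category, set()):
        files |= members_recursive(sub, seen)
    return files


# Hierarchical use: the file sits only in the leaf, so a reader of
# "Painters" has to walk the tree to reach it.
deep = members_recursive("Painters")

# Flat use: the same file also carries the broad label directly.
MEMBERS["Painters"].add("File:Chardin.jpg")
flat = MEMBERS["Painters"]
```

In the flat case no tree walk is needed, which is roughly what people mean when they say "tags".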
Tagging is also common web technology that a large proportion of users should be familiar with.
It's a common term, sure. It's been suggested that this may simply be a user interface labeling problem, though. If we renamed "Categories" at the bottom of the page to "Tags", what else needs doing?
MZMcBride
I know I'm accustomed to the type of interaction that modern keyword tagging provides. Simple "Add a tag" or just "tag", predictable and easy to understand results when you click on the tag, etc. Right now HotCat isn't (I don't think?) enabled by default, and even the HotCat interface is sort of clunky and weird. Then when you click on a category you get lists of subcategories, thumbnails of media, and then a mess of links to pages with no organization whatsoever beyond alphabetization.
So perhaps the technical differences between cats and tags are not that great, but the larger point that tagging is better rests on the substantial implementation and interface differences between typical tagging and MediaWiki cats.
On 20 May 2014 02:44, MZMcBride z@mzmcbride.com wrote:
Regarding hierarchy, there's absolutely no technical reason, as far as I'm aware, that categories must be hierarchical. It's certainly an intended feature that categories have subcategories and the capability to be hierarchical (i.e., you can have subcategories), but you can concurrently use categories with a flat structure. Right?
Yeah, the software-culture divide around Wikimedia has always been fluid.
I may be complaining about culture rather than software. But either way, the current way Commons categorisation works is a source of endless frustration for me as a user. If it turns out what I'm asking for literally already exists, then it's the interface that's failed me. Aaaaand, perhaps I'm just weird or perhaps the present system is failing others. USER TEST TIME!!
- d.
Hoi, Easy and obvious when you look at it with eyes that do not expect English. A tag will be linked to Wikidata. Consequently it will show differently depending on the language you have selected for yourself.
It is exactly these other people, the ones who do not read English, who will be served. Another reason is that there are artificial constraints on categories that are no longer valid. Who cares that we have more than one million items in a category, for instance? One tag that would have over one million items is "human". Thanks, GerardM
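A minimal sketch of what "a tag linked to Wikidata" could mean in practice: the file stores an item ID, and the label is looked up in the reader's language at display time. Q5 is the Wikidata item for "human"; the label table below is typed in by hand rather than fetched, and `LABELS`, `display_tag` and the file name are hypothetical names chosen for this example.

```python
# Hand-written label table keyed by Wikidata item ID (not fetched from
# Wikidata; a real system would query the item's labels instead).
LABELS = {
    "Q5": {"en": "human", "de": "Mensch", "nl": "mens"},
}

# Tags are stored as language-neutral IDs, not as English words.
FILE_TAGS = {"File:Portrait.jpg": ["Q5"]}


def display_tag(item_id, user_lang, fallback="en"):
    """Return the tag's label in the reader's language.

    Falls back to the fallback language, then to the bare ID, when no
    label is available.
    """
    labels = LABELS.get(item_id, {})
    return labels.get(user_lang) or labels.get(fallback) or item_id


def tags_for(file_name, user_lang):
    """Labels to show under a file for a reader using user_lang."""
    return [display_tag(t, user_lang) for t in FILE_TAGS.get(file_name, [])]
```

The same stored tag would then read "Mensch" for a German-speaking reader and "human" for an English-speaking one, with no renaming of anything on the file page.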
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines Wikimedia-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
MZMcBride - Categories are hierarchical and people worry about them overlapping. Tags have no hierarchy.
The major problem is that labor is wasted because there is no easy way to search intersections of categories. Instead of having a category for 18th century French painters, it would be ideal to just have tags for "people in the 18th century" "French people" and "painters" and let the users remix those tags instead of being forced to look in only that branch.
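The "remix" idea can be sketched as plain set intersection over an inverted index. This is only an illustration under assumed data: the file names, the tag spellings and the helpers `build_index` and `with_all_tags` are hypothetical, and nothing here reflects how Commons or MediaWiki actually stores or searches categories.

```python
from collections import defaultdict

# Hypothetical files, each carrying broad tags instead of one leaf category.
FILE_TAGS = {
    "File:A.jpg": {"painters", "French people", "people in the 18th century"},
    "File:B.jpg": {"painters", "French people", "people in the 19th century"},
    "File:C.jpg": {"sculptors", "French people", "people in the 18th century"},
}


def build_index(file_tags):
    """Invert file -> tags into tag -> files."""
    index = defaultdict(set)
    for name, tags in file_tags.items():
        for tag in tags:
            index[tag].add(name)
    return index


def with_all_tags(index, *tags):
    """Files carrying every one of the given tags (set intersection)."""
    if not tags:
        return set()
    sets = [index.get(tag, set()) for tag in tags]
    return set.intersection(*sets)


INDEX = build_index(FILE_TAGS)

# "18th century French painters" is no longer a category anyone has to
# create or maintain; it falls out of three broad tags.
painters_18c_fr = with_all_tags(
    INDEX, "painters", "French people", "people in the 18th century"
)
```

Dropping any one tag from the call widens the result, which is the remixing a fixed leaf category cannot offer.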
A proposal is at https://meta.wikimedia.org/wiki/Beyond_categories
The problem has two parts - WMF cannot coordinate this kind of search with existing categories, and there is no need to reform categories unless there is a commitment to build this search capability.
I see this as a serious problem.
On Mon, May 19, 2014 at 9:57 PM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, Easy and obvious when you look at it with eyes that do not expect English. A tag will be linked to Wikidata. Consequently it will show differently depending on the language you have selected for yourself.
It is exactly these other people, the ones who do not read English, who will be served. Another reason is that there are artificial constraints on categories that are no longer valid. Who cares that we have more than one million items in a category, for instance? One tag that would have over one million items is "human". Thanks, GerardM
Lane Rasberry wrote:
MZMcBride - Categories are hierarchical and people worry about them overlapping. Tags have no hierarchy.
Categories _can be_ hierarchical, but categories can simultaneously be flat. People worry about a lot of things, but that doesn't mean there are substantive issues to be addressed. Sometimes there are just worries. :-)
The major problem is that labor is wasted because there is no easy way to search intersections of categories. Instead of having a category for 18th century French painters, it would be ideal to just have tags for "people in the 18th century" "French people" and "painters" and let the users remix those tags instead of being forced to look in only that branch.
So why not put the image in "Category:Painters" and "Category:French people" and "Category:People in the 18th century"? I don't think there's any technical reason not to. It would certainly be nice to have built-in category intersections (previously mentioned in this mailing list thread), but for now Commons has external tools, I believe? Why not use categories as tags today?
A proposal is at https://meta.wikimedia.org/wiki/Beyond_categories
The problem has two parts - WMF cannot coordinate this kind of search with existing categories, and there is no need to reform categories unless there is a commitment to build this search capability.
I see this as a serious problem.
Thanks for the link!
It lists "Over- and under-categorization" and links to https://commons.wikimedia.org/wiki/Commons:Categories which states (in part):
--- The general rule is always place an image in the most specific categories, and not in the levels above those. ---
The rest of "Commons:Categories" is somewhat illuminating... this really does seem to be a social issue with Commons, not a problem in MediaWiki. It's possible there's ongoing miscommunication here: people on Commons are waiting around for tags to magically appear one day, while MediaWiki developers are looking at the current categories system and wondering why Commoners have developed such an oddly rigid categorization scheme.
I'm not sure what you see as a serious problem. Is it just the lack of built-in intersections? We already have the ability to add categories/tags to media. Are intersections the missing piece before it's acceptable to have "Category:Red" to tag every image that contains the color red? Or put "Category:Man" on every picture that contains a picture of a man? Why can't we do this right now?
MZMcBride
I think intersection is the most significant cause of the current categorisation system.
My understanding of the current reasoning behind categorisation as seen on Commons and elsewhere is that:
1) the lack of category intersection causes the very specific categories, which are essentially saved category intersections.
and
2) the category list at the bottom of the page being very literally the category names as listed in the page wikitext is why pages are only included in non-overlapping leaf node categories.
On en.wp there are many useful specific categories which are deleted because they would only 'clutter' the category section of the page content footer. E.g. Spanish Paralympic competitors at the 2012 Summer Games. Most other multisport cohorts/intersections can have a category, but 'by country; by games' cohort is currently not permitted.
Fair enough. I've seen stubs on en.wp where the category section of the page is larger than the page prose.
IMO the 'future' of categories in wikimedia projects would be to replace the very specific 'intersection categories' with saved queries in wikidata, with the category list at the bottom of the page dynamically including the list of saved wikidata search query results that the page is a member of, if the local project has more than <d> pages that are members of the query.
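As a rough sketch of that idea, assuming nothing about how Wikidata queries would really be stored or run: each saved query is a set of required broad tags, and a page's footer lists a saved query only when the page matches it and more than d local pages do. The data, `SAVED_QUERIES`, `query_members` and `footer_for` are all hypothetical.

```python
# Hypothetical local pages and the broad tags each one carries.
PAGE_TAGS = {
    "Page 1": {"Spain", "Paralympic competitors", "2012 Summer Games"},
    "Page 2": {"Spain", "Paralympic competitors", "2012 Summer Games"},
    "Page 3": {"Spain", "Paralympic competitors"},
}

# Saved queries stand in for today's very specific intersection categories.
SAVED_QUERIES = {
    "Spanish Paralympic competitors": {"Spain", "Paralympic competitors"},
    "Spanish Paralympic competitors at the 2012 Summer Games": {
        "Spain",
        "Paralympic competitors",
        "2012 Summer Games",
    },
}


def query_members(required_tags):
    """Pages whose tags include every tag the saved query requires."""
    return {page for page, tags in PAGE_TAGS.items() if required_tags <= tags}


def footer_for(page, d):
    """Saved queries to list at the bottom of a page.

    A query is listed when the page is a member and the query matches
    more than d local pages.
    """
    shown = []
    for name, required in SAVED_QUERIES.items():
        members = query_members(required)
        if page in members and len(members) > d:
            shown.append(name)
    return sorted(shown)
```

With d = 2 the narrow 2012 query is hidden on "Page 1" because only two pages match it; lowering d to 1 makes it appear, so the threshold is what keeps the footer from being cluttered.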
-- John
The difference between categories and tags is semantic but those semantics determine how the feature is used.
I suppose that, from an abstract technical perspective, what is needed is different classes of category-like objects, based on the purpose each should serve, displayed separately and possibly differently.
On May 19, 2014 9:28 PM, "John Mark Vandenberg" jayvdb@gmail.com wrote:
I think intersection is the most significant cause of the current categorisation system.
On Mon, May 19, 2014 at 7:07 PM, Lane Rasberry lane@bluerasberry.com wrote:
The major problem is that labor is wasted because there is no easy way to search intersections of categories. Instead of having a category for 18th century French painters, it would be ideal to just have tags for "people in the 18th century" "French people" and "painters" and let the users remix those tags instead of being forced to look in only that branch.
The search engine (new, as well as old) supports category intersection. So actually, searching intersections of categories is very easy.
Don't believe me? Here's a totally random category intersection from Commons:
https://commons.wikimedia.org/w/index.php?title=Special%3ASearch&profile...
Finding new ways to expose this data *outside* of the search results page and api.php would be cool/interesting/nice.
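For readers who have not used it, a minimal sketch of how such a query could be built, both for the search results page and for api.php. It relies on the `incategory:` search keyword and the `list=search` API module; the category names are placeholders, the helper names are made up, and this is not a reconstruction of the truncated link above. As far as I know, `incategory:` matches direct members only and does not descend into subcategories.

```python
from urllib.parse import urlencode

COMMONS = "https://commons.wikimedia.org/w"


def intersection_query(*categories):
    """Build a search string with one incategory: term per category."""
    return " ".join(f'incategory:"{name}"' for name in categories)


def search_page_url(*categories):
    """URL for the Special:Search results page."""
    params = {
        "title": "Special:Search",
        "search": intersection_query(*categories),
    }
    return f"{COMMONS}/index.php?{urlencode(params)}"


def api_url(*categories):
    """URL for the same query through api.php (list=search)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": intersection_query(*categories),
        "srnamespace": "6",  # namespace 6 is File:
        "format": "json",
    }
    return f"{COMMONS}/api.php?{urlencode(params)}"


# Placeholder category names; substitute real ones before trying the URLs.
example = api_url("Painters", "French people")
```

Nothing is fetched here; the functions only assemble URLs, so the sketch says nothing about what results a given pair of categories would return.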
-Chad
2014-05-20 8:41 GMT+02:00 Chad Horohoe chorohoe@wikimedia.org:
The search engine (new, as well as old) supports category intersection. So actually, searching intersections of categories is very easy.
Our definitions of "very easy" are not intersecting :)
It is possible, yes, but to qualify as very easy I would suggest a GUI for modifying a search and HotCat-like functionality for selecting intersecting categories. Such an addition to Special:Search would be awesome.
/Jan Ainali
On Tue, May 20, 2014 at 3:06 AM, Jan Ainali jan.ainali@wikimedia.se wrote:
2014-05-20 8:41 GMT+02:00 Chad Horohoe chorohoe@wikimedia.org:
The search engine (new, as well as old) supports category intersection. So actually, searching intersections of categories is very easy.
Our definitions of "very easy" are not intersecting :)
It is possible, yes, but to qualify as very easy I would suggest a GUI for modifying a search and HotCat-like functionality for selecting intersecting categories. Such an addition to Special:Search would be awesome.
I think of most of the syntax that Special:Search supports as being for experts/power users. Pretty much everything beyond quoting phrases is non-intuitive. I'd describe it as useful but not discoverable.
I remember seeing on a draft backlog a mention of writing some kind of more discoverable interface for complex category queries. I don't remember which backlog (so no link, sorry) but I recall it being scheduled reasonably high on the list. I don't know what that means for when work starts, much less when a first copy is released. I don't even know how well it'd work with categories being "leaves" rather than tags or declarations of facts like I imagine you'd get with an ontology based solution. And I don't know how you'd get from the categories we have now to something more like tags or facts. I don't know lots of things....
It might be worth it to jump over category queries and implement it directly against Wikidata. I'll be sure to talk about this with the Wikidata team when I see them later this week.
One advantage that categories do have is that they are built in, so whatever more intuitive intersection mechanism we make would be useful to all MediaWiki installs willing to install the search backend. If it is hitched directly to Wikidata the installation burden goes up considerably. Not to mention it'd be easier for me to test locally with categories than with Wikidata. On the other hand, having some mechanism where facts in Wikidata are reflected into the local wiki sounds a bit jangly and breakable. On the other other hand, reflecting the facts into the local wiki would translate them into that wiki's language, which would delay the need for some kind of translation integration (probably with Wikidata as well). On the other other other hand, that doesn't help Commons be multilingual. Or do anything about toothbrush.
I'm going to stop rambling now and go work on something else and let my subconscious filter through this.
Nik
Hello,
Here is another perspective on this same issue and an actionable remedy for a lot of the problems we are discussing here. http://lists.wikimedia.org/pipermail/gendergap/2014-May/004287.html
That email describes a game on Wikidata in which people tag biographies with a gender.
MZMcBride identified the major problem in the old system - "The general rule is always place an image in the most specific categories, and not in the levels above those." Because of this, we ended up with an infrastructure that precluded finding most kinds of intersections. It did not have to be that way, but that is how we used categories.
Read the email linked above to see an example of how this new system will prevent problems, make things simpler, and be fairer to people by not defining them so discretely.
Also, play the Wikidata game.
yours,
On Tue, May 20, 2014 at 8:56 AM, Nikolas Everett neverett@wikimedia.org wrote:
I think of most of the syntax that Special:Search supports as being for experts/power users. Pretty much everything beyond quoting phrases is non-intuitive. I'd describe it as useful but not discoverable.
On 20 May 2014 02:16, MZMcBride z@mzmcbride.com wrote:
David Gerard wrote:
I'll be leaving Commons categorisation until it's tags rather than ridiculously specific subcategories.
Commons has tags right now: they're called categories. Or is there a distinction you're making? :-)
We've been talking about this for years ... picking the precise subsubsubcategory is a pain in the backside. Most or all would also fall out as Boolean queries on tags.
I've been here since 2004 and I still go to Commons and have trouble finding the subsubsubcategory I'm actually looking for in the search.
Expecting users to know our category tree is user-obnoxious. Tags are common.
As Gerard notes, they would also solve the problem that everything is categorised in English.
I vaguely recall a proof-of-concept was coded and working in Postgres several years ago, but the same thing had unusable performance in MySQL so the idea was shelved.
- d.
Hoi, OmegaWiki did a proof of concept for tagging images with multilingual tags years ago.
Categories are broken by design. It is not only that they are unilingual, it is also that they are always in the plural. Why look for horses when you look for a picture of a horse? Thanks, GerardM
On 20 May 2014 09:34, David Gerard dgerard@gmail.com wrote:
We've been talking about this for years ... picking the precise subsubsubcategory is a pain in the backside. Most or all would also fall out as Boolean queries on tags.