Chris,

I think that there is broad consensus that the category problem is one of the worst problems with the MediaWiki software and that all projects suffer major harm in many ways because of it. I do not think anyone defends the current system except to say that it is what we have now and that fixing it is a major undertaking which will take many developers a lot of time.

I do not think GLAM could really solve this problem, but it would be nice for GLAM to support the development of community infrastructure that gives community members more opportunity to express concerns to developers. Another problem is that, while many surveys show that the community supports WMF software development, there is a huge amount of bad blood between developers and anyone who participates in the community processes around development. Just recently, in IRC meetings and elsewhere, there have been complaints about Echo, for example. I am sure you all saw that you now have a notification box by your names, but when developers rolled this out they also removed the orange "You have new messages" box that used to appear. This caused problems, for example for IP users, who really need an orange alert to know that they are doing something wrong, and for other people who simply do not know to look at the new box. I am in favor of change, but developers can be heavy-handed, and to do their work they need community support.

Ideally, there would be a layman's discussion space for projects that explains what is proposed and where we are in the process of fixing things, with links to the developer discussions. Chris, in answer to your question about where to find more info, I think that such a place does not exist and that the archives are mostly a mess. The category problem is a big issue that lots of people notice, but no one knows where to put their energy. There are other big problems, but from my perspective the category issue is one of the biggest. I do not understand Wikidata, but it seems to me that the solution should come from there, because I also think that categories should transcend individual languages and by default be uniform across Wikipedias.

We did not talk about this much at the boot camp, but our category system just got attention because it inherently seems prejudiced and there is not a good way to make it clean.
<http://www.nybooks.com/blogs/nyrblog/2013/apr/29/wikipedia-women-problem/>
It would be great, and empowering for women, if any user could immediately get, for example, a list of women authors, or of any X demographic working in Y field. To some people it seems hateful to have a category for "women writers", but having separate tags for "women" and for "writers" would serve exactly the same purpose from the Wikipedians' perspective without seeming hateful.

If anyone wanted to organize an effort to schedule development then I think it would be well-received by the community, external protesting organizations, the WMF, the developers, all organizations like GLAM, and all other stakeholders.

yours,




On Sat, May 4, 2013 at 12:38 AM, Chris Maloney <voldrani@gmail.com> wrote:
This isn't specifically on the topic of GLAM, but is a follow-up to
one of the discussions that we had at last weekend's GLAM wiki
bootcamp, and something I'm very interested in, and maybe others will
be, too.  I don't know a lot about this topic -- I'm hoping that
others can provide me with some pointers to the communities and
discussions on the mediawikis, where I might learn more.

Jarek, during his presentation, talked about the problems with
categories on Wikimedia Commons, using the example, "Painted portraits
of men of France".  The main problem he described had to do with all
of the possible permutations of subcategories, and the difficulty of
consistently maintaining that subcategory tree.

A related problem also applies to a category we've been hearing a lot
about recently, "American women writers".  In this case, there are (at
least) two problems that have been discussed:  that this is an
asymmetric category (there is no "American men writers"); and that
people who are in this category are not also in "American writers",
although they should be.

Another problem with this method of categorization is that it doesn't
allow easy access to lists that might not correspond to a direct
category, but might be interesting, and could be derived from the
other categories.  For example, all women writers who are *not*
American.

IMO, this is a fundamentally flawed way of adding semantic information
to articles.  Jarek suggested that these should be tags, and I agree
that that would be simpler.  So, for example, instead of having a
category "American women writers", there would be separate tags for
"American", "female", and "writer".  Then, the original category would
be the intersection of these three (the articles carrying all three
tags).  That would also solve the problem of the asymmetry in the way
male writers and female writers are categorized, which is part of what
led to the recent uproar.
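
A minimal sketch of the tag idea, assuming each article simply carries
a set of independent tags. The article list, the tag names, and the
articles_with() helper are illustrative assumptions, not anything
MediaWiki provides. In set terms, a compound category is the set of
articles that carry all of its tags, and a list like "women writers
who are *not* American" falls out as a set difference:

```python
# Hypothetical data: each article carries a set of independent tags.
article_tags = {
    "Emily Brontë": {"British", "female", "writer"},
    "Emily Dickinson": {"American", "female", "writer"},
    "Mark Twain": {"American", "male", "writer"},
    "Claude Monet": {"French", "male", "painter"},
}


def articles_with(*tags):
    """Return the titles of articles that carry every one of the given tags."""
    wanted = set(tags)
    return {title for title, have in article_tags.items() if wanted <= have}


# The old compound category is just a query over three tags.
american_women_writers = articles_with("American", "female", "writer")

# Anyone matching three tags also matches any two of them, so membership
# in "American writers" comes for free.
american_writers = articles_with("American", "writer")

# A list with no corresponding category: women writers who are not American.
non_american_women_writers = (
    articles_with("female", "writer") - articles_with("American")
)
```

Nothing here needs a hand-maintained subcategory tree: every
combination of tags is available as a query, whether or not anyone
thought to create a category for it.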

But even better, IMO, would be true semantic tagging, in the sense of
the Semantic Web.  (I'm not an expert in this topic, but have read
just enough to be dangerous.)  Semantic linking lets you express
simple truths about things, such as, "this is a person", "this is
female", "this person is the author of Wuthering Heights", etc.
Furthermore, *ontologies* allow people to express relationships
among relationships.  For example, "anyone who is the author of x is a
writer", "all authors are also humans", etc.
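
As a rough illustration of what such rules buy you, here is a small
sketch that stores statements as (subject, predicate, object) triples
and applies two hard-coded rules until nothing new can be derived. The
predicate names and the infer() function are made up for this example;
real Semantic Web tools express rules in dedicated ontology languages
rather than in code like this.

```python
author = "Emily Brontë"

# Simple truths about things, stated explicitly.
facts = {
    (author, "author_of", "Wuthering Heights"),
    (author, "is", "female"),
}


def infer(stated):
    """Apply the two example rules repeatedly until no new facts appear."""
    known = set(stated)
    while True:
        derived = set()
        for subject, predicate, obj in known:
            # Rule 1: anyone who is the author of something is a writer.
            if predicate == "author_of":
                derived.add((subject, "is", "writer"))
            # Rule 2: every writer is also a human.
            if (predicate, obj) == ("is", "writer"):
                derived.add((subject, "is", "human"))
        if derived <= known:
            return known
        known |= derived


inferred = infer(facts)
```

Only two facts were stated, yet the result also records that the
author is a writer and a human, so a query for "writers" would find
her without anyone having tagged her as one.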

Having heard a little bit about the Semantic MediaWiki and Wikidata, I
just dug a little bit to see if there is work afoot to try to
re-implement Wikipedia categories in some kind of scheme like this.
The Semantic MediaWiki is a MediaWiki extension that's designed to
allow this kind of tagging, but it was never integrated with any of
the Wikipedias.  Instead, Wikidata is being developed and integrated.
There's some info on the relationships among Wikipedia, SMW, and
Wikidata on the Wikipedia entry for SMW [1] and on the SMW FAQ [2].

I know Wikidata has tackled inter-language links first, and next on
the agenda are infoboxes, but I'm not sure if anyone anywhere is
planning on reimplementing categories with this kind of scheme.  Of
course it would be an enormous undertaking, and wouldn't happen
anytime soon.  But, it does seem to me there's the potential of a much
more robust system.  If anyone has any pointers to where I might learn
more, or join in some of those discussions, I'd be very interested.  I
just posted a question to the wikidata-l mailing list.

----
[1] http://en.wikipedia.org/wiki/Semantic_MediaWiki#Semantic_MediaWiki_and_Wikidata
[2] http://www.semantic-mediawiki.org/wiki/FAQ#What_is_the_relationship_between_Semantic_MediaWiki_and_Wikidata.3F

----

Cheers!
Chris

_______________________________________________
GLAM-US mailing list
GLAM-US@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/glam-us



--
Lane Rasberry
206.801.0814
lane@bluerasberry.com