Hi Pine,
On Wed, May 23, 2018 at 9:46 PM, Pine W <wiki.pine@gmail.com> wrote:
Hi Research-l,
My impression is that volunteers on Commons and ENWP spend a lot of time on categorization. I have seen references to analyses of how categorization is done, but I can't recall seeing an analysis of how much use readers make of categories on Commons and ENWP. My guess is that readers often use categories on Commons for media searches, but that ENWP categories are rarely used by readers, although maybe WMF Discovery uses categories to inform search results. Is there data that shows how extensively readers on ENWP and Commons use categories?
I don't know of recent (or old) studies on this topic, but there are at least a few other things we know that can help you think about whether it's useful to work on the category network in different projects.
Categories are used by (at least) three different groups:
* Editors
* Readers
* Machines
We don't know all the use cases that categories have for these groups. It seems that, generally, editors use them to organize their work and make the article space more navigable, readers use them to explore content (in a more serendipitous way), and machines use them extensively for a variety of applications. [We do lack published work backing up what I just said, btw, and I really hope we or someone else will write more about it in the coming year or two. :)]
While we're trying to figure out what the exact answers for the first two groups are, it's helpful to think about the last group:
The Wikipedia category network, with its known caveats, has been used extensively by researchers to build new insights and technologies. A lot of research on aligning text across languages (which is in turn used in building dictionaries and automatic translation tools) takes advantage of this (for the most part) human-curated categorization of articles. It's an important by-product of building the encyclopedia (and other projects). I'll give you a few examples (not comprehensive); feel free to dig into the literature reviews of these papers for more:
* Using the Wikipedia category network to tell classes apart from instances: https://dl.acm.org/authorize.cfm?key=N655914 (a necessary step in knowledge base creation)
* In building YAGO: http://www2007.wwwconference.org/papers/paper391.pdf
* Using the Wikipedia category network to build section recommendation systems for Wikipedia: https://arxiv.org/pdf/1804.05995.pdf. See, for example, http://gapfinder.wmflabs.org/en.wikipedia.org/v1/section/article/Barack_Obam...
There is significant value in the Wikipedia category network, and I would not discourage editors from building it. I do hope they know what value this work brings to, at least, the research and scientific community.
Best, Leila
Thanks, Pine ( https://meta.wikimedia.org/wiki/User:Pine )