Hi Pine,
On Wed, May 23, 2018 at 9:46 PM, Pine W <wiki.pine(a)gmail.com> wrote:
> Hi Research-l,
> My impression is that volunteers on Commons and ENWP spend a lot of time on
> categorization. I have seen references to analyses of how categorization is done, but I
> can't recall seeing an analysis of how much use readers make of categories on Commons
> and ENWP. My guess is that readers often use categories on Commons for media searches, but
> that ENWP categories are rarely used by readers, although maybe WMF Discovery uses
> categories to inform search results. Is there data that shows how extensively readers on
> ENWP and Commons use categories?
I don't know of recent (or old) studies on this topic, but there are
at least a few other things we know that can help you think about
whether it's useful to work on the category network in different
projects.
Categories are used by (at least) three different groups:
* Editors
* Readers
* Machines
We don't know all the use-cases that categories have for these groups.
It seems that generally editors use them to organize their work and
make the article space more navigable, readers use them to explore
content (in a more serendipitous way), and machines use them
extensively for a variety of applications. [We lack published work
on these points, btw, and I really hope we or someone else writes
more about it in the coming year or two. :)]
While we're trying to figure out what the exact answers for the first
two groups are, it's helpful to think about the last group: the
Wikipedia category network, with its known caveats, has been used
extensively by researchers to build new insights and technologies. A
lot of research on alignment of text across languages (which is in
turn used in building dictionaries and automatic translation tools)
takes advantage of this (for the most part) human-curated
categorization of articles. It's an important by-product of building
the encyclopedia (and other projects). Here are a few examples
(not comprehensive); feel free to dig into the literature reviews
of these papers for more:
* Using the Wikipedia category network to tell classes apart from
instances (a necessary step in knowledge base creation):
https://dl.acm.org/authorize.cfm?key=N655914
* Building YAGO:
http://www2007.wwwconference.org/papers/paper391.pdf
* Using the Wikipedia category network to build section recommendation
systems for Wikipedia:
https://arxiv.org/pdf/1804.05995.pdf
See, for example,
http://gapfinder.wmflabs.org/en.wikipedia.org/v1/section/article/Barack_Oba…
There is significant value in the Wikipedia category network, and I
would not discourage editors from building it. I do hope they know what
value this work brings to, at least, the research and scientific community.
Best,
Leila
> Thanks,
> Pine
> ( https://meta.wikimedia.org/wiki/User:Pine )
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l