Hello,
A very interesting question. From my experience and talks with readers, I
have the impression that readers usually take no notice of the categories.
I could not find out why, since the category system may indeed be useful
for at least some use cases.
When it comes to Commons, I would be very interested to learn how many
readers (or recipients) are actually not Wikipedia editors.
Kind regards
Ziko
2018-05-24 19:09 GMT+02:00 Leila Zia <leila(a)wikimedia.org>:
Hi Pine,
On Wed, May 23, 2018 at 9:46 PM, Pine W <wiki.pine(a)gmail.com> wrote:
Hi Research-l,
My impression is that volunteers on Commons and ENWP spend a lot of time
on categorization. I have seen references to analyses of how categorization
is done, but I can't recall seeing an analysis of how much use readers
make of categories on Commons and ENWP. My guess is that readers often use
categories on Commons for media searches, but that ENWP categories are
rarely used by readers, although maybe WMF Discovery uses categories to
inform search results. Is there data that shows how extensively readers on
ENWP and Commons use categories?
I don't know of recent (or old) studies on this topic, but there are
at least a few other things we know that can help you think about
whether it's useful to work on the category network in different
projects.
Categories are used by (at least) three different groups:
* Editors
* Readers
* Machines
We don't know all the use-cases that categories have for these groups.
It seems that generally editors use them to organize their work and
make the article space more navigable, readers use them to explore
content (in a more serendipitous way), and machines use them
extensively for a variety of applications. [We do lack published work
about what I just said, btw, and I really hope we or someone else
will write more about it in the coming year or two. :)]
While we're trying to figure out what the exact answers for the first
two groups are, it's helpful to think about the last group: the
Wikipedia category network, with its known caveats, has been used
extensively by researchers to build new insights and technologies. A
lot of research on alignment of text across languages (which is in
turn used in building dictionaries and automatic translation tools)
takes advantage of this (for the most part) human-curated
categorization of articles. It's an important side-product of building
the encyclopedia (and other projects). I'll give you a couple of
examples (non-comprehensive). Feel free to dig into the literature
reviews of these papers for more:
* The usage of the Wikipedia category network for telling apart classes
from instances (a necessary step in knowledge base creation):
https://dl.acm.org/authorize.cfm?key=N655914
* In building YAGO:
http://www2007.wwwconference.org/papers/paper391.pdf
* Using the Wikipedia category network for building section
recommendation systems for Wikipedia:
https://arxiv.org/pdf/1804.05995.pdf
Check, for example,
http://gapfinder.wmflabs.org/en.wikipedia.org/v1/section/article/Barack_Obama
There is significant value in the Wikipedia category network, and I
would not discourage editors from building it. I do hope they know what
value this work brings to, at least, the research and scientific community.
Best,
Leila
Thanks,
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l