Since OpenSym was in San Francisco this year, we welcomed researchers
working in the Wikimedia space to join us for breakfast at the Wikimedia
Foundation on the morning after the conference. During this event, we
shook hands and made a few quick presentations about ongoing projects and
stuff that's right around the corner.
I took some notes on what was presented and figured that many on this
list might appreciate them as well.
*Wikimedia research* (Collaborate with us!)
- Research and data
Data science and experimental systems development.
- Design Research
Generative and evaluative research support for product development.
(With much more overlap than is implied by the distinction)
- IRC: #wikimedia-research on freenode.net
- This is "the office" for us. It's an excellent channel for asking
a quick question or discussing an idea.
- Mailing list: wiki-research-l(a)lists.wikimedia.org (signup
- WikiResearch Showcase
<https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase> -- Monthly
event covering WMF research results and invited speakers researching WMF
- Revision scoring as a service
-- State-of-the-art AI (vandalism & article quality prediction) as a web
service. See ORES
for current system capabilities and Wiki labels
<https://meta.wikimedia.org/wiki/Wiki_labels>, our crowdsourced data
- Scholarly article citations
-- An open-licensed dataset of scholarly identifiers in Wikipedia which
notes when, historically, an identifier was first added.
- Activity sessions
<https://meta.wikimedia.org/wiki/Research:Activity_session> -- (coming
soon) A dataset of sessionized editing activity. Useful for measuring
labor hours or studying work patterns.
- Measuring value-added
(coming soon) A dataset of robust measurements of editor productivity and
value-added. See also Content persistence
- Clickstream dataset
<http://ewulczyn.github.io/Wikipedia_Clickstream_Getting_Started/> -- An
open-licensed dataset containing page view pair counts (as inferred by the
- Increasing article coverage
-- This research aims to identify important content available in one
language edition but missing from another and recommend the work to editors
who would be most interested in translating.
- Improving link coverage
<https://meta.wikimedia.org/wiki/Research:Improving_link_coverage> -- An
approach for automatically finding useful hyperlinks to add to a website by
analyzing server access logs.
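
For anyone wondering what "sessionized" means in the activity sessions item
above, here is a rough sketch of the general idea. It is not the code behind
the dataset. The one-hour inactivity cutoff, the (user, timestamp) input
shape, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed inactivity cutoff in seconds. One hour is an illustrative choice,
# not a value taken from the dataset described above.
CUTOFF_SECONDS = 3600


def sessionize(edits, cutoff=CUTOFF_SECONDS):
    """Group (user, unix_timestamp) edit events into per-user sessions.

    A new session starts whenever the gap since the same user's previous
    edit is longer than `cutoff` seconds.
    """
    stamps_by_user = defaultdict(list)
    for user, timestamp in edits:
        stamps_by_user[user].append(timestamp)

    sessions = {}
    for user, stamps in stamps_by_user.items():
        stamps.sort()
        user_sessions = []
        current = [stamps[0]]
        for timestamp in stamps[1:]:
            if timestamp - current[-1] > cutoff:
                # Gap too long: close the current session, start a new one.
                user_sessions.append(current)
                current = [timestamp]
            else:
                current.append(timestamp)
        user_sessions.append(current)
        sessions[user] = user_sessions
    return sessions


def session_seconds(session):
    """Elapsed time between the first and last edit of one session."""
    return session[-1] - session[0]


# Hypothetical edit events: (user, seconds since some reference time).
example_edits = [("A", 0), ("A", 600), ("A", 10000), ("B", 50)]
example_sessions = sessionize(example_edits)
```

Summing session_seconds over a user's sessions gives a crude lower bound on
time spent editing. A session with a single edit counts as zero seconds here,
so a real labor-hours estimate would need some allowance for that.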
I'm sure I missed some stuff. I invite my colleagues to supplement my
notes in their replies. Thanks to all who joined us!