On Mon, Jan 23, 2017 at 10:50 PM, Kerry Raymond <kerry.raymond(a)gmail.com>
wrote:
Yes, but when you are one of many English-speaking nations and in a world
where English is widely spoken as a 2nd language, it’s hard to know if
outreach from your chapter has any impact on en.WP. WMF asks for success
metrics / KPIs or whatever you like to call them. Right now it’s hard to
gather any evidence.
What I say below is pretty much just process and pointers, not
research-related:
To the one specific point you raised above: I agree with you, and looking
around me I can say that many in WMF do agree with you (and we also know
that agreeing is not enough, and we hope to address that soon).
There is a task tracking the work to release the relevant data:
https://phabricator.wikimedia.org/T131280
Please subscribe to the task to monitor progress
if you're interested, and if you want, consider giving it a token if you
feel strongly about it. :) Regarding timelines (and keep in mind I'm not in
Analytics), Nuria from Analytics says in
https://lists.wikimedia.org/pipermail/analytics/2017-January/005654.html
that there is hope to have such data out by April 2017. I would keep an eye
on
https://www.mediawiki.org/wiki/Wikimedia_Engineering/2016-17_Q4_Goals#Analy…
to see if this task gets prioritized for Q4, which is the quarter starting
April 2017 and ending June 2017. Even if it doesn't get prioritized, it may
get done, but it's always more reassuring if it does.
Best,
Leila
Kerry
*From:* Gerard Meijssen [mailto:gerard.meijssen@gmail.com]
*Sent:* Tuesday, 24 January 2017 3:46 PM
*To:* Kerry Raymond <kerry.raymond(a)gmail.com>; Research into Wikimedia
content and communities <wiki-research-l(a)lists.wikimedia.org>
*Subject:* Re: [Wiki-research-l] regional KPIs
Hoi,
What Wikipedia? It is highly likely that articles written about any
subject are written by people who know the language involved. This means
that all articles about the United States are most likely written in
Indonesia when the language is Javanese or in the Netherlands when the
language is Dutch. We know from research done in the old days
that for some languages there is an émigré community that writes a lot; this
was true for Neapolitan.
While I understand the interest in the question, how will we benefit
from researching this? There is plenty of actionable research we
could do. Or, to put it more bluntly: when we seek parameters that may drive
more editing or quality edits, research will be of benefit. When we want to
ensure a more consistent point of view across all our Wikipedias, I would
understand the need for research (I have ideas on that one).
Thanks,
GerardM
On 24 January 2017 at 02:12, Kerry Raymond <kerry.raymond(a)gmail.com>
wrote:
As previously came up in discussion about chapters, it would be very
useful to have national data about Wikipedia activities, which can be
determined (generally) from IP addresses. Now I understand the privacy
argument in relation to logged-in users (not saying I agree with it though
in relation to aggregate data). However, can we find a proxy that does not
have the privacy considerations?
My hypothesis is that national content is predominantly written by users
resident in that nation. And that therefore activity on national content
can be used as a proxy for national user editing activity.
In the case of Australia, we could describe Australian national content in
either of two ways: articles within the closure of the
[[Category:Australia]] and/or those tagged as {{WikiProject Australia}}.
There are arguments for and against either (neither is perfect; in my
experience the category closure tends to have false positives and the
project tagging tends to have false negatives).
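For readers unfamiliar with the term, the "closure" of a category can be sketched as a breadth-first walk over subcategories. This is only an illustration: the in-memory dictionaries, the function name category_closure, and the max_depth cut-off are all assumptions made for the example, not anything the thread or MediaWiki prescribes. A real run would have to fetch category members from the wiki.

```python
from collections import deque


def category_closure(subcats, members, root, max_depth=None):
    """Return every article reachable from `root` through subcategories.

    subcats: dict mapping a category name to its direct subcategories.
    members: dict mapping a category name to its direct articles.
    max_depth: hypothetical guard; limiting depth is one way to cut down
               the false positives that deep category trees tend to produce.
    """
    seen = {root}                 # categories already queued (handles cycles)
    queue = deque([(root, 0)])
    articles = set()
    while queue:
        cat, depth = queue.popleft()
        articles.update(members.get(cat, ()))
        if max_depth is not None and depth >= max_depth:
            continue              # do not descend any further from here
        for child in subcats.get(cat, ()):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return articles


# Toy data, invented for the example (note the deliberate cycle).
toy_subcats = {
    "Australia": ["Queensland"],
    "Queensland": ["Brisbane", "Australia"],
}
toy_members = {
    "Australia": ["Canberra"],
    "Queensland": ["Great Barrier Reef"],
    "Brisbane": ["Story Bridge"],
}
print(sorted(category_closure(toy_subcats, toy_members, "Australia")))
```

The `seen` set matters in practice because category graphs are not guaranteed to be trees.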
I would like to know what correlation exists between national editor
activity (as determined from IP addresses mapped to location) and national
content edits and if/how it changes over time for various nations. This is
research that only WMF can do because WMF has the IP addresses and the rest
of us can’t have them for privacy reasons.
If we could establish that a strong-enough correlation existed between
them, we could use national content activity (for which there is no privacy
consideration) as a proxy for national editing activity. And we might even
be able to come up with a multiplier for each nation to provide comparable
data for national editing activity.
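A minimal sketch of the check proposed above, assuming two equal-length series per nation (for example monthly counts): edits to national content, and edits by editors geolocated to that nation. The function names, the choice of Pearson correlation, and the least-squares-through-origin multiplier are assumptions made for illustration; the numbers are invented, since only WMF holds the real geolocated counts.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def multiplier(content_edits, national_edits):
    """Slope k of the least-squares fit national ~= k * content (no intercept)."""
    denom = sum(c * c for c in content_edits)
    if denom == 0:
        raise ValueError("content series is all zeros")
    return sum(c * e for c, e in zip(content_edits, national_edits)) / denom


# Invented monthly counts for one nation, purely illustrative.
content_edits = [1200, 1350, 1100, 1500, 1420]    # edits to national content
national_edits = [2500, 2650, 2300, 3100, 2900]   # edits geolocated to nation

r = pearson(content_edits, national_edits)
k = multiplier(content_edits, national_edits)
print(f"correlation r = {r:.3f}, multiplier k = {k:.3f}")
```

If r turned out to be high and stable over time for a nation, k would be the per-nation multiplier for estimating national editing activity from the public content-edit counts.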
Now, it may be that we need to restrict the edits themselves in some way
to maximise the correlations between national content and same-nation
editor activity.
My second hypothesis is that “semantic” edits (e.g. edits that add large
amounts of content or citation) to national content will be more highly
correlated with same-nation editors than “syntactic” edits (e.g. fix
spelling, punctuation or Manual of Style issues) will be. I suspect most
bots and other automated/semi-automated edits are doing syntactic edits.
Now, some of you will probably be aware of
[https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2017-01-17/Recent_research
Female Wikipedians aren't more likely to edit women biographies]. So it may well
be that my patriotic-editing hypothesis is also untrue. But it would be
nice to know one way or the other.
Kerry
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l