http://www.rcasts.com/2012/12/software-engineers-guide-to-getting.html
For the developers on this list who are just getting started with data
analytics, a "Software engineer's guide to getting started with data
science" including online materials, recommended courses, ideas on how
to accelerate learning, and "what didn't work for me".
--
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation
I saw this on the UK list. That is a very nice & pretty tool, and
there are other goodies on that site that I hadn't seen before.
http://global-dev.wmflabs.org/
It would be great if we could obtain similar graphs for other
language-country pairs.
---------- Forwarded message ----------
From: Asaf Bartov <abartov(a)wikimedia.org>
Date: Thu, Jan 31, 2013 at 3:52 AM
Subject: Re: [Wikimediauk-l] Polish becomes England's second language
To: UK Wikimedia mailing list <wikimediauk-l(a)lists.wikimedia.org>
Another bit of data - active editors of PLWP outside Poland, in
several likely countries:
http://global-dev.wmflabs.org/graphs/plwp_active
(slow to load; hover over the lines to get the legend)
For comparison, PLWP enjoys ~1.3k active editors from Poland.
A.
On Wed, Jan 30, 2013 at 8:39 AM, Peter Coombe
<thewub.wiki(a)googlemail.com> wrote:
>
> The detailed breakdowns (quite large pages):
> http://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerCountryB…
> http://stats.wikimedia.org/wikimedia/squids/SquidReportPageEditsPerCountryB…
>
> About 0.6% of both views and edits in the United Kingdom are to the Polish Wikipedia, indeed putting it in second place. I think this has been the case for a while; I remember noticing it in mid-2011 when looking up language data for fundraising.
>
> A couple of other interesting things to note from these statistics:
> Welsh gets 0.4% of all UK edits, putting it in 5th place, but doesn't even register for number of views (i.e. is less than 0.1%). Maybe we need to do more to promote its visibility.
> None of the South Asian languages mentioned in the Guardian article appear in either the views or edits.
>
>
> Pete / the wub
>
>
> On 30 January 2013 15:12, Damokos Bence <damokos.bence(a)wikimedia.hu> wrote:
>>
>> To give a sense of scale until someone finds more accurate data, according to http://stats.wikimedia.org/wikimedia/squids/SquidReportPageEditsPerCountryB… and http://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerCountryB…
>> 94% of edits from the UK are to the English Wikipedia, and 6% to other languages; about the same is true for readers.
>>
>> Also, according to: http://stats.wikimedia.org/wikimedia/squids/SquidReportPageEditsPerLanguage…, http://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerLanguage…
>> 2% of edits to the Polish Wikipedia come from the UK, as do about 1.5% of its readers, if I read the page right.
>>
>> Best regards,
>> Bence
>>
>>
>> On Wed, Jan 30, 2013 at 4:04 PM, David Gerard <dgerard(a)gmail.com> wrote:
>>>
>>> http://www.guardian.co.uk/uk/2013/jan/30/polish-becomes-englands-second-lan…
>>>
>>> Anything we can do to get more UK-resident Poles editing?
>>>
>>> Do we have any idea of edits to pl:wp from the UK?
>>>
>>>
>>> - d.
>>>
>>> _______________________________________________
>>> Wikimedia UK mailing list
>>> wikimediauk-l(a)wikimedia.org
>>> http://mail.wikimedia.org/mailman/listinfo/wikimediauk-l
>>> WMUK: http://uk.wikimedia.org
>>
>>
>>
>>
>> --
>> Damokos Bence
>> Wikimédia Magyarország
>> http://wikimedia.hu
>>
>>
>
>
>
--
Asaf Bartov
Wikimedia Foundation
Imagine a world in which every single human being can freely share in
the sum of all knowledge. Help us make it a reality!
https://donate.wikimedia.org
--
John Vandenberg
I have a stupid cold that stopped me from getting much sleep last night. I'm going to take the day off to sleep.
--
David Schoonover
dsc(a)wikimedia.org
Thanks for making the page hit data available, it was fascinating to have
it and play around with it.
A follow-up question: is there any way to also figure out how many accesses
go to api.php, as this is our main entry point for bots?
Cheers,
Denny
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/681/51985.
Hi all,
There has been some recent discussion [1] on wikidata-l about making
sure wikidata.org page views are part of the stats that are being
collected at stats.grok.se. I did a quick check to see if page views
on www.wikidata.org were showing up, and they don't appear to be.
curl --silent http://dumps.wikimedia.org/other/pagecounts-raw/2013/2013-01/pagecounts-201…
| zcat - | egrep ' Q[0-9]+ '
de Q100 2 15502
de Q5 2 95286
en Q0 1 7815
en Q1 6 70841
en Q10 2 32918
en Q101 1 7709
en Q102 1 8128
en Q107 1 26148
en Q11 1 387
en Q19 2 14304
en Q2 7 44454
en Q3 1 8624
en Q35 1 22856
en Q4 3 40588
en Q4000 2 28866
en Q5 3 16698
en Q612 1 7152
en Q6600 2 29929
en Q7 2 16604
en Q9450 2 59452
fr Q4 2 24352
fr Q400 1 17339
hu Q10 3 105745
ja Q0 1 8685
ja Q10 36 781763
ja Q4 1 8873
ja Q400 1 25208
ko Q3 1 22350
nl Q65 2 24562
ru Q10 1 31355
ru Q4000 1 11140
sv Q10 1 365
zh Q1 1 10950
zh Q10 2 17738
Does anyone have any idea how to get stats.grok.se updated?
I think I've seen in previous conversations here that an alternate
source for stats is being created. Is there any information on how to
use it, if it's available yet?
//Ed
[1] http://lists.wikimedia.org/pipermail/wikidata-l/2013-January/001646.html
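
For anyone who would rather not depend on grep's regex dialect, here is a
minimal Python sketch of the same check. It assumes each pagecounts line has
four space-separated fields (project code, page title, request count, bytes),
as in the output above. The function name q_item_lines and the sample lines
are only illustrative, and one of the sample lines is made up.

```python
import re

# Titles that look like Wikidata item IDs: "Q" followed by digits only.
Q_ID = re.compile(r"Q[0-9]+")


def q_item_lines(lines):
    """Yield (project, title, count, size) for titles that are Q-ids.

    Assumes four space-separated fields per line, as in the
    pagecounts output quoted above.
    """
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        project, title, count, size = parts
        if Q_ID.fullmatch(title):
            yield project, title, int(count), int(size)


sample = [
    "de Q100 2 15502",
    "en Main_Page 5 12345",  # made-up line, not from the dump
    "ja Q10 36 781763",
]
hits = list(q_item_lines(sample))
```

To run this over a downloaded file, the lines could come from
gzip.open(path, "rt") instead of the sample list. Grouping the hits by the
first field would then show which project codes the Q-titled pages are being
counted under.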
Heya,
Last week I resurrected Wikihadoop, from the Summer of Research 2011, when
we wrote a Hadoop-based input parser for bzipped XML dumps of Wikipedia,
along with the ability to create diffs between revisions. This work was mainly done by
Yusuke Matsubara, Aaron Halfaker and Fabian Kaelin.
This is working again and if you have input / suggestions on how to expose
this data, then please let me know!
Best,
Diederik
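
For readers who have not worked with revision diffs, here is a small Python
sketch of what a diff between two revisions looks like, using the standard
library's difflib. This is not how Wikihadoop does it (that runs on Hadoop
over the XML dumps); the function name and the two revision texts are
invented for illustration.

```python
import difflib


def revision_diff(old_text, new_text):
    """Return the unified-diff lines between two revision texts."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="rev_old",
            tofile="rev_new",
            lineterm="",  # inputs have no trailing newlines
        )
    )


old = "Polish is a West Slavic language.\nIt is spoken in Poland."
new = "Polish is a West Slavic language.\nIt is spoken mainly in Poland."
diff = revision_diff(old, new)
```

Lines starting with "-" were removed and lines starting with "+" were added
between the two revisions.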
Heya,
Since yesterday afternoon we have been having intermittent packet loss
issues with Emery. Yesterday Ori and I wrote a simple patch to enable
sampling on two filters to reduce the workload. That seemed to work;
however, this morning the packet loss came back. Paravoid pointed out that
it was probably because Emery had been running for 208 days, and that older
versions of the kernel start doing weird things after such long uptimes.
Rebooting Emery seems to have resolved the issue, and we are planning to
upgrade to Precise soon.
Best,
Diederik
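
To illustrate why sampling reduces the load: a 1-in-N sampler hands only
every Nth log line to a filter. The Python sketch below shows the idea only;
it is not the actual patch, and the function name and the factor of 10 are
assumptions.

```python
from itertools import islice


def sample_one_in_n(lines, n):
    """Return an iterator over every n-th item, starting with the first."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return islice(lines, 0, None, n)


# Hypothetical request lines, just to exercise the sampler.
lines = [f"request {i}" for i in range(100)]
sampled = list(sample_one_in_n(lines, 10))
```

Any counts taken from the sampled stream have to be multiplied by n to
estimate the totals for the full stream.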
I wanted to follow up on and second Magnus' December request
<http://lists.wikimedia.org/pipermail/analytics/2012-December/000272.html>
for a more usable way to access page view stats. Mining these stats is
attracting an increasing amount of attention from researchers (
http://www.l3s.de/~kanhabua/papers/ECIR2013-WikiEvents.pdf,
http://arxiv.org/pdf/1212.5943v1.pdf, http://arxiv.org/pdf/1211.0970.pdf)
even as the current approaches for extracting them from stats.grok.se or
the dumps are slow and inhumane (respectively).
I'm also interested in looking at bursts of pageview activity on articles
and then examining the extent to which this pageview activity diffuses over
the local wiki-link network. I suspect this has strong implications for
understanding patterns of editing activity; namely, editing activity may be
non-trivially coupled with sudden attention to articles that are a few
degrees of separation away. I'd be happy to chat with folks inside or
outside of WMF about getting access to the relevant view stats and
beginning such an analysis.
Best,
Brian
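
One simple way to flag a burst, sketched in Python under stated assumptions:
treat a time step as bursty when its view count is well above the mean of
the preceding window. The window size, the threshold k, the floor on the
standard deviation, and the view counts below are all invented for
illustration; this is not a method taken from the papers linked above.

```python
from statistics import mean, pstdev


def find_bursts(counts, window=5, k=3.0):
    """Return indices i where counts[i] > mean + k * max(stdev, 1.0),
    computed over the previous `window` counts."""
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        # Floor sigma at 1.0 so a flat history does not flag tiny changes.
        threshold = mu + k * max(sigma, 1.0)
        if counts[i] > threshold:
            bursts.append(i)
    return bursts


# Hypothetical hourly view counts for one article.
views = [10, 12, 11, 9, 10, 11, 95, 90, 12, 10]
bursts = find_bursts(views)
```

With this definition only the onset of a spike is flagged, because the spike
itself inflates the mean and deviation of the following windows. Following
the diffusion over the link network would mean running the same check on the
articles linked from a bursting one.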
http://arxiv.org/pdf/1208.4171.pdf
This is a pretty interesting and accessible description of best practices and design decisions, driven by practical problems they had to solve at Twitter in the areas of client-side event logging, funnel analysis, and user modeling.
E3: check out section "3.2 Client Events" in particular, which is quite relevant to EventLogging.
Dario