While doing some analysis, I found a strange inconsistency.
The ServerSideAccountCreation
<https://meta.wikimedia.org/wiki/Schema:ServerSideAccountCreation> event
logs show a user with the ID *26048397* and the name *Dhava2nd* being
created on the English Wikipedia at 2015-08-20 01:38:57. (Those logs do
*not* record autocreations.)
However, that user ID doesn't exist in `enwiki.users`. The user Dhava2nd does
exist <https://en.wikipedia.org/wiki/Special:Contributions/Dhava2nd>, but
with the user ID *26048409* (twelve numbers later) and a registration date
of 2015-08-20 01:41:07 (two minutes later). Special:Log says
<https://en.wikipedia.org/wiki/Special:Log/Dhava2nd> that the enwiki
account was autocreated, but the centralauth database says the enwiki
account was the initial creation, and that it occurred at 2015-08-20 01:38:59.
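To make the size of the mismatch concrete, here is a minimal Python sketch that computes the gaps from the figures quoted above. The variable and function names are purely illustrative; only the IDs and timestamps come from the logs and tables mentioned in this message.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Values quoted in this message; the dictionary keys are illustrative names.
event_log = {"user_id": 26048397, "timestamp": "2015-08-20 01:38:57"}
enwiki_user = {"user_id": 26048409, "timestamp": "2015-08-20 01:41:07"}
centralauth_created = "2015-08-20 01:38:59"


def seconds_between(earlier: str, later: str) -> int:
    """Whole seconds elapsed from `earlier` to `later`."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return int(delta.total_seconds())


# Gap between the logged user ID and the one that actually exists.
id_gap = enwiki_user["user_id"] - event_log["user_id"]
# Delay between the logged creation and the recorded registration.
registration_lag = seconds_between(event_log["timestamp"], enwiki_user["timestamp"])
# Delay between the logged creation and the centralauth creation time.
centralauth_lag = seconds_between(event_log["timestamp"], centralauth_created)

print(id_gap, registration_lag, centralauth_lag)
```

So the centralauth timestamp trails the event log by only a couple of seconds, while the enwiki registration trails it by a little over two minutes.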
In my cohort, this was the first user out of about 80,000 to have a problem
like this.
Does anyone know what's going on? Is this issue documented anywhere?
--
Neil P. Quinn <https://meta.wikimedia.org/wiki/User:Neil_P._Quinn-WMF>,
product analyst
Wikimedia Foundation
Hello!
The final goals of the analytics team for next quarter are published:
https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q3_Goals#Analy…
TL;DR: We are going to mostly work on replacing reports on
stats.wikimedia.org plus do operational maintenance and upgrades on the
analytics infrastructure we maintain: Hadoop, EventLogging, the pageview API
and Wikimetrics. We expect to have time to come up with a suitable
sanitization strategy for pageview hourly data and have preliminary reports
of unique users (or rather devices) daily and monthly per Wikipedia
project. These last two items fall in our "research" bucket so we are still
working on possible implementations.
Thanks,
Nuria
Hi all,
Specifically what I am looking for is page view data for these pages, preferably for all months on http://dumps.wikimedia.org/other/pagecounts-raw/ (appeared as named 4 Dec):
Abacarus hystrix
Acarus siro
Aceria tosichella
Acyrthosiphon pisum
Ahasverus advena
Anthrenus flavipes
Aphis craccivora
Arhopalus
Balaustium medicagoense
Bemisia tabaci
Brevicoryne brassicae
Bruchus
Ceratitis capitata
Cicadulina
Cryptolestes
Daktulosphaira vitifoliae
Delia
Ephestia elutella
Ephestia kuehniella
Etiella behrii
Frankliniella occidentalis
Frankliniella
Henosepilachna vigintioctopunctata
Heteronychus arator
Lachesilla quercus
Lasioderma serricorne
Liposcelis bostrychophila
Macrosiphum euphorbiae
Marchalina hellenica
Myzus persicae
Naupactus
Nezara viridula
Oligonychus ununguis
Oryzaephilus surinamensis
Panonychus ulmi
Penthaleus
Pieris rapae
Piezodorus
Plodia interpunctella
Plutella xylostella
Rhopalosiphon
Rhopalosiphum maidis
Rhopalosiphum padi
Rhyzopertha dominica
Sirex noctilio
Sitophilus granarius
Sitophilus oryzae
Sitotroga cerealella
Sminthurus viridis
Spodoptera exempta
Stegobium paniceum
Tetranychus
Thrips palmi
Thrips
Tribolium castaneum
Tribolium confusum
Trogoderma granarium
Trogoderma
I then also want a total number of page views to standardise the individual page views.
I have looked at stats.grok.se and wikitrends and I have two issues:
1. The data is only month by month and I want as many years of data as possible.
2. Some pages have too few page views for wikitrends.
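A minimal Python sketch of the kind of extraction described above, under stated assumptions: each line of an hourly pagecounts-raw file is taken to be space-separated as "project page_title view_count bytes", titles use underscores and may be percent-encoded, and "en" is taken to be the project code for English Wikipedia. The function names and the sample rows are made up for illustration.

```python
import gzip
from collections import Counter
from urllib.parse import unquote


def count_views(lines, titles, project="en"):
    """Sum view counts for `titles` and for the whole project.

    Assumes each line looks like "<project> <page_title> <views> <bytes>".
    """
    wanted = {t.replace(" ", "_") for t in titles}
    per_title = Counter()
    project_total = 0
    for line in lines:
        parts = line.split(" ")
        if len(parts) != 4 or parts[0] != project:
            continue  # skip malformed rows and other projects
        try:
            views = int(parts[2])
        except ValueError:
            continue  # skip rows whose count is not a number
        project_total += views
        title = unquote(parts[1])
        if title in wanted:
            per_title[title] += views
    return per_title, project_total


def count_views_in_file(path, titles, project="en"):
    """Same as count_views, reading one gzip-compressed hourly file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_views(handle, titles, project)


# Made-up sample rows, only to show the shape of the input and output.
sample = [
    "en Acarus_siro 3 45000",
    "en Thrips 7 91000",
    "en Main_Page 990 12000000",
    "de Thrips 2 30000",
]
views, total = count_views(sample, ["Acarus siro", "Thrips"])
# Standardise each page by the project-wide total for the same period.
shares = {title: n / total for title, n in views.items()}
```

Running count_views_in_file over every hourly file in a month and adding up the results would give monthly counts and a monthly total to divide by.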
Thanks for your help!
-----Original Message-----
From: Analytics [mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of analytics-request(a)lists.wikimedia.org
Sent: Tuesday, 15 December 2015 4:11 AM
To: analytics(a)lists.wikimedia.org
Subject: Analytics Digest, Vol 46, Issue 23
Today's Topics:
1. Re: Readership metrics for the fortnight until December 6,
2015 (Federico Leva (Nemo))
2. Re: Data collection (Erik Zachte)
3. Re: Data collection (Federico Leva (Nemo))
4. Re: Python client for the new pageview API (Dan Andreescu)
5. Re: mobile and zero legacy tsvs on stat1002 (Oliver Keyes)
----------------------------------------------------------------------
Message: 1
Date: Mon, 14 Dec 2015 13:08:11 +0100
From: "Federico Leva (Nemo)" <nemowiki(a)gmail.com>
To: A mailing list for the Analytics Team at WMF and everybody who has
an interest in Wikipedia and "analytics."
<analytics(a)lists.wikimedia.org>
Subject: Re: [Analytics] Readership metrics for the fortnight until
December 6, 2015
Message-ID: <566EB12B.5080201(a)gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Interesting country breakdown!
Tilman Bayer, 14/12/2015 12:32:
>
> For the top three, I looked at how pageviews developed on a daily
> basis during the last three months including the week after this large
> change (until Dec 6):
>
>
> In Greece, the +21.6% rise was the result of an isolated spike from
> November 23-25. This can be traced to a single page on the Greek
> Wiktionary which on most days before and after only saw a single-digit
> number of pageviews, but on these three days received more than 2.8
> million: τάλε κουάλε
> <https://el.wiktionary.org/wiki/%CF%84%CE%AC%CE%BB%CE%B5_%CE%BA%CE%BF%CF%85%…>.
> It’s about an expression that apparently comes from Latin via Italian
> (“tale quale”) <https://en.wiktionary.org/wiki/tale_e_quale> and means
> something like “exactly the same” or “spitting image”. From the form
> of the spike, it was likely not the result of actual human interest,
> rather an undetected bot trying to learn exactly the same about exactly the same.
>
>
>
> In Ireland, the -20.6% drop marked the end of a plateau whose start
> had actually shown up in the report for the week until November 1
> <https://lists.wikimedia.org/pipermail/mobile-l/2015-November/009919.html>
> already, where the country was the top changer with a 40.2% rise.
>
>
> For South Africa, the -21.6% drop does not form part of a clear pattern.
>
------------------------------
Message: 2
Date: Mon, 14 Dec 2015 14:14:17 +0100
From: "Erik Zachte" <ezachte(a)wikimedia.org>
To: "'A mailing list for the Analytics Team at WMF and everybody who
has an interest in Wikipedia and analytics.'"
<analytics(a)lists.wikimedia.org>
Subject: Re: [Analytics] Data collection
Message-ID: <007901d13671$56855cc0$03901640$(a)wikimedia.org>
Content-Type: text/plain; charset="utf-8"
Hi Caitlin,
Here is a breakdown of categories within Phytopathology on English wikipedia: http://ow.ly/VQNVL
and the articles within those categories ranked by page view for Oct 2015 : http://ow.ly/VQNCv
I can run similar reports for earlier months.
Cheers,
Erik
From: Analytics [mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Alex Druk
Sent: Monday, December 14, 2015 10:44
To: A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics.
Subject: Re: [Analytics] Data collection
Hi Caitlin,
If you have a list of relevant articles and an understanding of what time period you would like to research, contact me off the list and I can probably help you.
Also, my advice: have a look at wikipediatrends.com or stats.grok.se and try some of your queries to get a better understanding of possible results.
Best wishes,
On Mon, Dec 14, 2015 at 12:04 AM, <Caitlin.Gardner(a)csiro.au> wrote:
Hi All,
I am a summer research intern with the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in Australia. I am studying a statistics degree and so I don’t really have skills in the type of data collection required to access the Wiki data for my research. I was wondering if someone might be able to give me a hand (by pointing me in the right direction)?
I have a list of pest species that I wish to find the total number of page views via stats.grok.se or https://dumps.wikimedia.org/other/pagecounts-raw/ . There must be a good method to go through and pick out page views by name rather than by hand (which obviously isn’t feasible)? I’d also need to be able to find the total number of page views for each period in order to standardize the response to account for the increase in traffic over the years.
We are in the process of gathering similar data through a Plant Pest database but due to privacy concerns, the organisation is arranging to reconcile the data on our behalf and so I do not have a part in that.
Any help would be really appreciated!
Kind regards,
Caitlin Gardner
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Thank you.
Alex Druk, PhD
wikipediatrends.com
alex.druk(a)gmail.com
(775) 237-8550 Google voice
------------------------------
Message: 3
Date: Mon, 14 Dec 2015 15:25:03 +0100
From: "Federico Leva (Nemo)" <nemowiki(a)gmail.com>
To: A mailing list for the Analytics Team at WMF and everybody who has
an interest in Wikipedia and "analytics."
<analytics(a)lists.wikimedia.org>
Subject: Re: [Analytics] Data collection
Message-ID: <566ED13F.20502(a)gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Erik Zachte, 14/12/2015 14:14:
> I can run similar reports for earlier months.
Thanks for publishing that code too!
https://github.com/wikimedia/analytics-wikistats/tree/master/dammit.lt/bash
Nemo
------------------------------
Message: 4
Date: Mon, 14 Dec 2015 09:32:24 -0500
From: Dan Andreescu <dandreescu(a)wikimedia.org>
To: Analytics List <analytics(a)lists.wikimedia.org>, Research into
Wikimedia content and communities
<wiki-research-l(a)lists.wikimedia.org>
Subject: Re: [Analytics] Python client for the new pageview API
Message-ID:
<CA+aepCS4n-Z4Qd-WZW7V_J5Aipb01NcwZrLUhTBiWwy_OFcoOA(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
I wasn't aware of some conventions that came before me, so I moved the project from milimetric/wmf to mediawiki-utilities/python-mwviews. I promise it'll stay there, sorry for the inconvenience. Updated links:
PyPI: https://pypi.python.org/pypi/mwviews/0.0.2
code: https://github.com/mediawiki-utilities/python-mwviews (PRs still welcome, thanks for the 2 you already helped with!)
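For anyone who would rather call the API directly, here is a rough Python sketch that only builds a per-article request URL. The endpoint layout, the default access and agent values, and the function name are assumptions that are not quoted anywhere in this thread, and this is not how python-mwviews itself is necessarily written.

```python
from urllib.parse import quote

# Assumed public endpoint of the pageview API; not quoted in this thread.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, article, start, end,
                    access="all-access", agent="user", granularity="daily"):
    """Build a per-article request URL; dates are YYYYMMDD strings."""
    # Titles use underscores for spaces and are percent-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    return "/".join([BASE, project, access, agent, title, granularity, start, end])


url = per_article_url("en.wikipedia", "Bemisia tabaci", "20151101", "20151130")
greek = per_article_url("el.wiktionary", "τάλε κουάλε", "20151123", "20151125")
print(url)
```

Fetching the URL and decoding the JSON response is left out so that the sketch stays runnable offline.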
On Fri, Dec 11, 2015 at 10:36 PM, Dan Andreescu <dandreescu(a)wikimedia.org>
wrote:
> Along the same lines as Oliver's great R client [1], I just started
> work on a python version:
>
> PyPI: https://pypi.python.org/pypi/wmf/0.1
> code: https://github.com/milimetric/wmf (PRs welcome)
>
> And if you're trying to skip past all the setup repository cruft, the
> meat:
> https://github.com/milimetric/wmf/blob/master/wmf/analytics/api/pagevi
> ews.py
>
>
> [1] https://github.com/Ironholds/pageviews
>
------------------------------
Message: 5
Date: Mon, 14 Dec 2015 12:10:50 -0500
From: Oliver Keyes <okeyes(a)wikimedia.org>
To: "A mailing list for the Analytics Team at WMF and everybody who
has an interest in Wikipedia and analytics."
<analytics(a)lists.wikimedia.org>
Subject: Re: [Analytics] mobile and zero legacy tsvs on stat1002
Message-ID:
<CAAUQgdADvcJdt_6+PgELg0P6nxM3C=6Uydv3PbOxjCPFFC3frg(a)mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Gotcha! Long as it's set for every request, perfect :)
On 14 December 2015 at 04:50, Joseph Allemandou <jallemandou(a)wikimedia.org> wrote:
> @Oliver: I think the closest we'll have is the access-method field,
> that can take values desktop, mobile-web, mobile-app.
>
> On Sun, Dec 13, 2015 at 8:37 PM, Oliver Keyes <okeyes(a)wikimedia.org> wrote:
>>
>> Not an answer to the question, but a question of my own; will the
>> nature of the content being served still be present as /some/ field?
>> FWIW I've found it very helpful to be able to use webrequest_source
>> to trivially distinguish mobile and desktop requests.
>>
>> On 11 December 2015 at 12:40, Andrew Otto <aotto(a)wikimedia.org> wrote:
>> > Hi all,
>> >
>> > Soon, we will be merging the mobile web cache requests with the
>> > text cache requests. text caches will now serve requests for
>> > mobile web[1].
>> >
>> > This means that the webrequest_source=‘mobile’ partition in the
>> > webrequest table in Hive will soon be empty, and all data that was
>> > previously in it will be found in the webrequest_source=‘text’
>> > partition.
>> >
>> > There are only 3 datasets that currently only use the
>> > webrequest_source=‘mobile’ partition:
>> >
>> > - /a/log/webrequest/archive/mobile
>> > - /a/log/webrequest/archive/5xx-mobile
>> > - /a/log/webrequest/archive/zero
>> >
>> > (These are paths on stat1002, but they also exist in HDFS.)
>> >
>> > These datasets originally came from udp2log, but since early last
>> > year they have been generated from Hadoop. With the upcoming cache
>> > merge, these jobs will have to parse through all text requests,
>> > which will make Hadoop busier.
>> >
>> > Do we know if these are being used? Would anyone be upset if we no
>> > longer generated these datasets?
>> >
>> > Thanks!
>> > -Andrew
>> >
>> > [1] https://phabricator.wikimedia.org/T109286
>> >
>> >
>> >
>>
>>
>>
>> --
>> Oliver Keyes
>> Count Logula
>> Wikimedia Foundation
>>
>
>
>
>
> --
> Joseph Allemandou
> Data Engineer @ Wikimedia Foundation
> IRC: joal
>
>
--
Oliver Keyes
Count Logula
Wikimedia Foundation
------------------------------
End of Analytics Digest, Vol 46, Issue 23
*****************************************
Hi Analytics,
Next Tuesday, Dec 15, between 10:00 and 12:00 UTC
EventLogging's database m4-master will be down for maintenance.
Impact of the outage:
- Events generated during the mentioned time window will wait until the
outage is over to be inserted in the m4-master database. These events and
the ones generated immediately after the outage can take a couple hours to
catch up and be inserted.
Everything else should work as normal during the outage:
- Events can be sent normally to EventLogging server, they'll be
buffered in Kafka.
- Events will be written to the logs, and forwarded to other systems
normally. The only process that will stop is the mysql-consumer.
- Queries to analytics-store can be executed normally. The outage is
local to m4-master and does not affect replicas.
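As a rough illustration of the behaviour listed above, here is a toy Python sketch of a consumer that keeps accepting events into a buffer and only inserts them while the database is reachable. It is not EventLogging's actual code: the class, its attributes, and the event strings are invented, with a deque standing in for Kafka and a list standing in for m4-master.

```python
from collections import deque


class BufferedConsumer:
    """Toy stand-in for a consumer that reads from a durable buffer."""

    def __init__(self):
        self.buffer = deque()   # plays the role of the Kafka topic
        self.database = []      # plays the role of m4-master
        self.db_available = True

    def send(self, event):
        """Producers always succeed: events land in the buffer first."""
        self.buffer.append(event)
        self.drain()

    def drain(self):
        """Insert buffered events in order, but only while the DB is up."""
        while self.db_available and self.buffer:
            self.database.append(self.buffer.popleft())


consumer = BufferedConsumer()
consumer.send("before outage")
consumer.db_available = False          # maintenance window starts
consumer.send("during outage 1")
consumer.send("during outage 2")
backlog = len(consumer.buffer)         # events waiting to be inserted
consumer.db_available = True           # maintenance window ends
consumer.drain()                       # catch-up phase
```

Nothing is lost in this model; events sent during the window are simply inserted late, in their original order.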
Feel free to ask any questions you have on this.
Cheers!
--
*Marcel Ruiz Forns*
Analytics Developer
Wikimedia Foundation
Hey folks,
I'm emailing to let you know that we won't be holding a Wikimedia Research
Showcase for this month due to the holidays. We'll kick it back in gear
for January 2016 though.
Watch this page for updates:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
Happy holidays!
-Aaron
Hi All,
I am a summer research intern with the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in Australia. I am studying a statistics degree and so I don't really have skills in the type of data collection required to access the Wiki data for my research. I was wondering if someone might be able to give me a hand (by pointing me in the right direction)?
I have a list of pest species that I wish to find the total number of page views via stats.grok.se or https://dumps.wikimedia.org/other/pagecounts-raw/ . There must be a good method to go through and pick out page views by name rather than by hand (which obviously isn't feasible)? I'd also need to be able to find the total number of page views for each period in order to standardize the response to account for the increase in traffic over the years.
We are in the process of gathering similar data through a Plant Pest database but due to privacy concerns, the organisation is arranging to reconcile the data on our behalf and so I do not have a part in that.
Any help would be really appreciated!
Kind regards,
Caitlin Gardner
Hi all,
here is the usual look at our most important readership metrics. This time
examining, among other things, pageview changes in various countries, such
as the effect of a brief block in China and of the sudden popularity of a
Greek expression derived from Latin. The Android app has been seeing a
prolonged decrease in downloads, which fortunately was offset recently by
a more positive development.
With this issue we are changing the scope of this report from a timespan of
one week to two weeks (a fortnight
<https://en.wikipedia.org/wiki/FFF_system>) or four weeks, see further
remarks below. Most of the metrics here are now being updated weekly on the new
Product page <https://www.mediawiki.org/wiki/Wikimedia_Product#Reading>.
Before we go to the usual data, a note (for those who haven’t seen it yet)
that Wikistats <https://stats.wikimedia.org/> has now been upgraded to the
new pageview definition, see the announcement
<http://infodisiac.com/blog/2015/12/wikistats-upgraded-to-new-page-view-defi…>
by Erik Zachte.
(All numbers below are averages for November 23-December 6, 2015 unless
otherwise noted.)
Pageviews
Total: 527 million/day (-2.4% from the previous fortnight)
Context (April 2015-December 2015):
(see also the Vital Signs dashboard
<https://vital-signs.wmflabs.org/#projects=all/metrics=Pageviews>)
A notable weekly drop of -2.2% in the week until November 29, followed by
another -0.2% in the week until December 6.
Desktop: 56.6% (week until Nov 22: 57.2%)
Mobile web: 42.2% (week until Nov 22: 41.6%)
Apps: 1.2% (week until Nov 22: 1.2%)
Context (April 2015-December 2015):
<https://commons.wikimedia.org/wiki/File:Wikimedia_daily_pageviews,_mobile_p…>
Global North ratio: 77.6% of total pageviews (week until Nov 22: 77.3%)
Context (April 2015-December 2015):
<https://commons.wikimedia.org/wiki/File:Percentage_of_Wikimedia_pageviews_f…>
Out of curiosity about the -2.2% drop, I ran a query for the week until
November 29 to find the countries with the largest changes from the
previous week (as in some
<https://lists.wikimedia.org/pipermail/mobile-l/2015-November/009919.html>
previous reports; restricted to those with >1 million views/day in the
week until Nov 29):
Country        Change    Views
Greece         +21.6%    2.3 m/day
South Africa   -21.6%    1.2 m/day
Ireland        -20.6%    2.3 m/day
Colombia       -15.7%    3.3 m/day
Venezuela      -10.9%    2.3 m/day
Argentina       -8.3%    4.9 m/day
Philippines     -8.3%    3.5 m/day
New Zealand     -7.3%    1.3 m/day
Vietnam         -7.1%    2.1 m/day
Taiwan          -6.9%    6.2 m/day
For the top three, I looked at how pageviews developed on a daily basis
during the last three months including the week after this large change
(until Dec 6):
In Greece, the +21.6% rise was the result of an isolated spike from
November 23-25. This can be traced to a single page on the Greek Wiktionary
which on most days before and after only saw a single-digit number of
pageviews, but on these three days received more than 2.8 million: τάλε
κουάλε
<https://el.wiktionary.org/wiki/%CF%84%CE%AC%CE%BB%CE%B5_%CE%BA%CE%BF%CF%85%…>.
It’s about an expression that apparently comes from Latin via Italian
(“tale quale”) <https://en.wiktionary.org/wiki/tale_e_quale> and means
something like “exactly the same” or “spitting image”. From the form of the
spike, it was likely not the result of actual human interest, rather an
undetected bot trying to learn exactly the same about exactly the same.
In Ireland, the -20.6% drop marked the end of a plateau whose start had
actually shown up in the report for the week until November 1
<https://lists.wikimedia.org/pipermail/mobile-l/2015-November/009919.html>
already, where the country was the top changer with a 40.2% rise.
For South Africa, the -21.6% drop does not form part of a clear pattern.
Turning to the week until December 6, there were user reports
<https://zh.wikipedia.org/wiki/Wikipedia:%E4%BA%92%E5%8A%A9%E5%AE%A2%E6%A0%8…>
starting around December 4 that Wikimedia projects were being blocked in
China on desktop more widely than before (the Chinese Wikipedia has been
blocked since May 2015). However, as can be seen in the chart below, the
block appears to have been short-lived. It has also been mentioned in
various media reports (English-language examples: China Digital Times
<http://chinadigitaltimes.net/2015/12/minitrue-weekend-block-of-wikipedia/>,
TIME
<http://time.com/4142305/xi-jinping-china-censorship-world-internet-conferen…>).
New app installations
Android: 30.6k/day (-28.9% from the previous fortnight)
Daily installs per device, from Google Play
Context (last two months):
Instead of recovering to the level from before the App Store feature from
Nov 5-12, installation numbers dropped further until early December. It now
appears that this is connected to a sharp decrease in Google search
referrals to the app’s store page. We’re looking further into it. The good
news though is that the app is currently featured among the Best Apps of
2015
<https://lists.wikimedia.org/pipermail/mobile-l/2015-December/009995.html>
in various countries, with a clear effect on download numbers from December
3 on. (We’ll do a full assessment of the impact of this promotion in early
January after it ends.)
iOS: 4.44k/day (-4.6% from the previous fortnight)
Download numbers from App Annie
Context (last three months):
[image:
Wikipedia_iOS_app_daily_downloads%2C_Sep_8%2C_2015_~_Dec_6%2C_2015_(App_Annie).png]
<https://commons.wikimedia.org/wiki/File:Day-7_retention_of_Wikipedia_iOS_ap…>
There was a 5.4% drop between the week until November 22 and the week until
November 29. But this is still higher than in the first week of November,
say.
App user retention
Android: 14.8% (previous fortnight: 15.0%)
(Ratio of app installs opened again 7 days after installation, average of
daily percentages for installs from the previous week. 1:100 sample)
Context (last three months):
(NB: In order to have the chart show any potential causal connections more
clearly, I’ve changed the x-axis to the installation date instead of the
date on which the survival is measured, and added annotations.)
iOS: N/A
(Ratio of app installs opened again 7 days after installation, average of
daily ratios for installs from the previous week. From iTunes Connect,
opt-in only = ca. 20-30% of all users)
Unfortunately I encountered data quality issues with this metric (again
<https://lists.wikimedia.org/pipermail/mobile-l/2015-November/009942.html>).
We are in contact with Apple about this. I will leave the metric out of
these reports until this is resolved or we have a suitable substitute (we
could conceivably track this ourselves via EventLogging, like on
Android).
Unique app users
Android: 1.158 million / day (-3.8% from the previous week)
Context (last two months):
By now it looks like the App Store feature from Nov 5-12 indeed raised
active user numbers a bit, although they have since been steadily
decreasing, possibly for other reasons (see above).
iOS: 284k / day (+1.0% from the previous fortnight)
Context (last two months):
The week until November 29 saw an increase of 3.5% compared to the previous
week. And in the chart, a little bump can be discerned from November 26 on,
which could be related to Thanksgiving in the US (as we know, mobile use
tends to be higher on weekends and presumably also holidays).
Schedule of this report
As mentioned last time
<https://lists.wikimedia.org/pipermail/mobile-l/2015-November/009942.html>,
we have been rethinking the schedule of this report. New issues will now
cover either two weeks or four weeks each (keeping it to a multiple of
seven days, considering that most of these metrics show a strong weekly
seasonality and comparisons are hence best taking that period into
account). The main purpose remains the same as laid out earlier
<https://lists.wikimedia.org/pipermail/mobile-l/2015-September/009773.html>:
to raise awareness about how these metrics are developing, call out the
impact of any unusual events in the preceding week, and facilitate thinking
about core metrics in general. Having iterated the selection and
presentation of these metrics on a weekly basis since launching this report
in September, they have now stabilized a bit. Also, it now appears that the
newsworthy items uncovered by examining the development of these metrics
can also be covered on a less frequent than weekly schedule. On the other
hand, most of these metrics are now also updated weekly as part of the
Reading team’s section on the new Wikimedia Product page on mediawiki.org
<https://www.mediawiki.org/wiki/Wikimedia_Product>. Feedback and discussion
continue to be welcome.
----
For reference, the queries and source links used are listed below (access
is needed for each). Unless otherwise noted, all content of this report is
authored on behalf of the Wikimedia Foundation and released under the CC
BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0/> license. Most
of the above charts are available on Commons
<https://commons.wikimedia.org/w/index.php?title=Special:ListFiles&offset=20…>,
where I’m also uploading PDF versions of previous issues of this report.
hive (wmf)> SELECT SUM(view_count)/14000000 AS avg_daily_views_millions
FROM wmf.projectview_hourly WHERE agent_type = 'user' AND
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) BETWEEN "2015-11-23"
AND "2015-12-06";
hive (wmf)> SELECT year, month, day,
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) as date,
sum(IF(access_method <> 'desktop', view_count, null)) AS mobileviews,
SUM(view_count) AS allviews FROM wmf.projectview_hourly WHERE year=2015 AND
agent_type = 'user' GROUP BY year, month, day ORDER BY year, month, day
LIMIT 1000;
hive (wmf)> SELECT access_method, SUM(view_count)/14 FROM
wmf.projectview_hourly WHERE agent_type = 'user' AND
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) BETWEEN "2015-11-23"
AND "2015-12-06" GROUP BY access_method;
hive (wmf)> SELECT SUM(IF (FIND_IN_SET(country_code,
'AD,AL,AT,AX,BA,BE,BG,CH,CY,CZ,DE,DK,EE,ES,FI,FO,FR,FX,GB,GG,GI,GL,GR,HR,HU,IE,IL,IM,IS,IT,JE,LI,LU,LV,MC,MD,ME,MK,MT,NL,NO,PL,PT,RO,RS,RU,SE,SI,SJ,SK,SM,TR,VA,AU,CA,HK,MO,NZ,JP,SG,KR,TW,US')
> 0, view_count, 0))/SUM(view_count) FROM wmf.projectview_hourly WHERE
agent_type = 'user' AND
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) BETWEEN "2015-11-23"
AND "2015-12-06";
hive (wmf)> SELECT year, month, day,
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")), SUM(view_count) AS
all, SUM(IF (FIND_IN_SET(country_code,
'AD,AL,AT,AX,BA,BE,BG,CH,CY,CZ,DE,DK,EE,ES,FI,FO,FR,FX,GB,GG,GI,GL,GR,HR,HU,IE,IL,IM,IS,IT,JE,LI,LU,LV,MC,MD,ME,MK,MT,NL,NO,PL,PT,RO,RS,RU,SE,SI,SJ,SK,SM,TR,VA,AU,CA,HK,MO,NZ,JP,SG,KR,TW,US')
> 0, view_count, 0)) AS Global_North_views FROM wmf.projectview_hourly
WHERE year = 2015 AND agent_type='user' GROUP BY year, month, day ORDER BY
year, month, day LIMIT 1000;
SELECT country_code, changeratio, ROUND(milliondailyviewsthisweek,1) AS
milliondailyviewsthisweek FROM
(SELECT country_code, ROUND(100*SUM(IF(day>22 AND day<30, view_count,
null))/SUM(IF(day>15 AND day < 23, view_count, null))-100,1) AS
changeratio, SUM(IF(day>22 AND day<30, view_count, null))/7000000 AS
milliondailyviewsthisweek
FROM wmf.projectview_hourly
WHERE
year = 2015
AND month = 11
AND agent_type = "user"
GROUP BY country_code)
AS countrylist
WHERE milliondailyviewsthisweek > 1 GROUP BY country_code, changeratio,
milliondailyviewsthisweek ORDER BY ABS(changeratio) DESC LIMIT 10;
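For readers without Hive access, the logic of the country query above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name, the row layout and the country codes "AA", "BB" and "CC" are made up, and unlike the Hive version it skips countries with no views in the earlier week instead of returning a null ratio.

```python
from collections import defaultdict


def top_changers(rows, this_week, prev_week, min_daily_millions=1.0, limit=10):
    """Rank countries by absolute week-over-week change in pageviews.

    rows: iterable of (country_code, day_of_month, view_count).
    this_week / prev_week: ranges of day numbers, seven days each.
    """
    current = defaultdict(int)
    previous = defaultdict(int)
    for country, day, views in rows:
        if day in this_week:
            current[country] += views
        elif day in prev_week:
            previous[country] += views

    result = []
    for country, cur in current.items():
        daily_millions = cur / 7_000_000
        # Keep only countries above the traffic threshold with a baseline week.
        if daily_millions <= min_daily_millions or not previous[country]:
            continue
        change = round(100 * cur / previous[country] - 100, 1)
        result.append((country, change, round(daily_millions, 1)))
    result.sort(key=lambda item: abs(item[1]), reverse=True)
    return result[:limit]


# Made-up numbers, chosen only to exercise the function.
rows = [("AA", d, 2_000_000) for d in range(16, 23)]
rows += [("AA", d, 2_500_000) for d in range(23, 30)]
rows += [("BB", d, 3_000_000) for d in range(16, 23)]
rows += [("BB", d, 2_700_000) for d in range(23, 30)]
rows += [("CC", d, 500_000) for d in range(16, 30)]
ranking = top_changers(rows, range(23, 30), range(16, 23))
```

The two ranges mirror the day>22 AND day<30 and day>15 AND day<23 conditions in the query.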
https://console.developers.google.com/storage/browser/pubsite_prod_rev_0281…
(“overview”)
https://www.appannie.com/dashboard/252257/item/324715238/downloads/?breakdo…
(select “Total”)
SELECT LEFT(timestamp, 8) AS date, SUM(IF(event_appInstallAgeDays = 0, 1,
0)) AS day0_active, SUM(IF(event_appInstallAgeDays = 7, 1, 0)) AS
day7_active FROM log.MobileWikiAppDailyStats_12637385 WHERE userAgent LIKE
'%-r-%' AND userAgent NOT LIKE '%Googlebot%' GROUP BY date ORDER BY DATE;
(with the retention rate calculated as day7_active seven days later divided
by day0_active from the install date)
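The date alignment in that calculation is easy to get wrong, so here is a minimal Python sketch of it. The function name and the counts are invented for illustration; only the rule (day7_active seven days later divided by day0_active on the install date) comes from the note above.

```python
from datetime import date, timedelta


def day7_retention(daily):
    """daily maps a date to (day0_active, day7_active) for that date.

    Returns {install_date: percentage}, pairing day0_active on the install
    date with day7_active reported seven days later.
    """
    rates = {}
    for install_date, (day0, _) in daily.items():
        later = daily.get(install_date + timedelta(days=7))
        if later is None or day0 == 0:
            continue  # no follow-up data yet, or nothing to divide by
        rates[install_date] = 100 * later[1] / day0
    return rates


# Made-up counts for illustration.
daily = {
    date(2015, 11, 23): (200, 31),
    date(2015, 11, 30): (180, 30),
    date(2015, 12, 7): (190, 27),
}
rates = day7_retention(daily)
```

Keying the result by install date matches the chart's x-axis choice described earlier in this report.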
https://analytics.itunes.apple.com/#/retention?app=324715238
hive (wmf)> SELECT SUM(IF(platform = 'Android',unique_count,0))/14 AS
avg_Android_DAU_last_fortnight, SUM(IF(platform = 'iOS',unique_count,0))/14 AS
avg_iOS_DAU_last_fortnight FROM wmf.mobile_apps_uniques_daily WHERE
CONCAT(year,LPAD(month,2,"0"),LPAD(day,2,"0")) BETWEEN 20151123 AND
20151206;
hive (wmf)> SELECT CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0"))
as date, unique_count AS Android_DAU FROM wmf.mobile_apps_uniques_daily
WHERE platform = 'Android';
hive (wmf)> SELECT CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0"))
as date, unique_count AS iOS_DAU FROM wmf.mobile_apps_uniques_daily WHERE
platform = 'iOS';
--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB