Hi,
we are currently serving a few hundred graphs, dashboards, ... at
http://gp.wmflabs.org/, but running the various scripts that generate
them is a bit shaky and their maintenance is eating up a considerable
amount of time.
So in order to better use resources, and limit maintenance work, we're
curious about which parts, URLs, dashboards, graphs, datasources of
the site are actually in use by people in one way or the other.
If you rely on any parts, URLs, dashboards, graphs, or datasources of
http://gp.wmflabs.org/, please let us know by August 30.
Best regards,
Christian
P.S.: We may remove unused parts, or stop trying to update them
altogether. So if you are using some parts, please do let us know :-)
P.P.S.: We have already reached out to the users that we know of. So do
not feel pressed to reply again if you have already replied to the
private email about this issue.
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Gruendbergstrasze 65a Email: christian(a)quelltextlich.at
4040 Linz, Austria Phone: +43 732 / 26 95 63
Fax: +43 732 / 26 95 63
Homepage: http://quelltextlich.at/
---------------------------------------------------------------
+Analytics
On Thu, Sep 19, 2013 at 1:57 PM, Adam Baso <abaso(a)wikimedia.org> wrote:
> A run on yesterday's valid Wikipedia Zero hits showed that the share of user
> agents NOT supporting HTML (i.e., only supporting WAP) is only 0.098 - 0.108 *percent*.
>
> Assuming a bunch of complaints don't come in (e.g., "I'm getting tag
> soup!", as Max might say), I think we could make a reasonable case to stop
> supporting WAP through the formal channels (blog, mailing list(s), etc.).
>
> -Adam
>
>
> On Tue, Sep 17, 2013 at 1:11 PM, Arthur Richards <arichards(a)wikimedia.org> wrote:
>
>> That's awesome - thanks Max and Adam; it's great to see the last vestiges
>> of X-Device finally disappear!
>>
>>
>> On Tue, Sep 17, 2013 at 1:07 PM, Max Semenik <maxsem.wiki(a)gmail.com> wrote:
>>
>>> After looking at the Varnish VCL with Adam, we discovered a bug in a regex
>>> that resulted in many phones being detected as WAP when they shouldn't be.
>>> Since the older change[1] simplifying detection had also fixed this bug,
>>> Brandon Black deployed it, and as of today the usage share of WAP should
>>> drop seriously. We will be monitoring the situation and will revisit the
>>> issue of WAP popularity once we have enough data.
>>>
>>> [1] https://gerrit.wikimedia.org/r/83919
>>>
>>> On Tue, Sep 10, 2013 at 4:39 PM, Adam Baso <abaso(a)wikimedia.org> wrote:
>>>
>>>> Thanks. 7-9% of responses on Wikipedia Zero being WAP is pretty
>>>> substantial.
>>>>
>>>>
>>>> On Tue, Sep 10, 2013 at 2:01 PM, Andrew Otto <otto(a)wikimedia.org> wrote:
>>>>
>>>>> > These zero.tsv.log* files to which I refer seem to be, basically,
>>>>> > Varnish log lines that correspond to Wikipedia Zero-targeted traffic.
>>>>> Yup! Correct. zero.tsv.log* files are captured unsampled and based
>>>>> on the presence of a "zero=" tag in the X-Analytics header:
>>>>>
>>>>>
>>>>> http://git.wikimedia.org/blob/operations%2Fpuppet.git/37ffb0ccc1cd7d3f5612d…
>>>>>
>>>>> > Do I understand correctly that field as Content-Type?
>>>>> Yup again! The varnishncsa format string that is currently being
>>>>> beamed at udp2log is here:
>>>>>
>>>>>
>>>>> http://git.wikimedia.org/blob/operations%2Fpuppet.git/37ffb0ccc1cd7d3f5612d…
>>>>>
>>>>
>>>
>>> --
>>> Best regards,
>>> Max Semenik ([[User:MaxSem]])
>>>
>>>
>>
>>
>> --
>> Arthur Richards
>> Software Engineer, Mobile
>> [[User:Awjrichards]]
>> IRC: awjr
>> +1-415-839-6885 x6687
>>
>
>
Aaron has been a major contributor to our understanding of why we have a
problem retaining new editors, during the 2011 Summer of Research and
beyond. Many of our key findings
<https://meta.wikimedia.org/wiki/Research:Wikimedia_Summer_of_Research_2011/…>
are from his research. Check out
https://en.wikipedia.org/wiki/User:EpochFail for more.
Welcome, Aaron!
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation
---------- Forwarded message ----------
From: Toby Negrin <tnegrin(a)wikimedia.org>
Date: Tue, Sep 17, 2013 at 2:03 PM
To: WMF staff
Hi Everybody,
It's been a long time coming, but I'd like to welcome Aaron Halfaker to the
Foundation! He is starting today as a Research Analyst on the Analytics
team, working out of Minneapolis.
Aaron is a newly minted PhD out of the (well-known for Wikipedia studies)
GroupLens Research lab. His thesis work is basically all about Wikipedia.
In his research, he uses a mixture of behavioral modeling, participatory
design, psychology and social science to understand, affect, and extend
complex socio-technical systems like Wikipedia.
He's actually been working as a contractor for the Wikimedia Foundation
since 2011. Most notably, he's worked with Dario to design experiments for
Article Feedback, Notifications (aka Echo) and VisualEditor.
It's a pretty good fit and we're really happy that Aaron is joining us full
time! Please extend him a warm welcome!
-Toby
On 09/18/2013 08:30 PM, Fabrice Florin wrote:
> It's a lack of automated tools.
>
> Right now, Dario has to create each of them manually and it's not
> practical for him to support hundreds of sites, given his workload.
Yeah, it certainly doesn't make sense to do them all manually. But I
think it would be great to be able to script this.
> Someday, when we have more resources, our analytics team may be able to
> automate this process, so we can support more sites.
Agreed, I'm CCing Analytics on this. For feature requests like this, is
it best to file an enhancement in Bugzilla, email the Analytics list, or
something else?
Matt Flaschen
Hi!
We recorded our Sprint Showcase for the first time: you can watch it at
http://www.youtube.com/watch?v=vRAMBk0EBkg
The slide deck is available at
https://docs.google.com/a/wikimedia.org/presentation/d/1k-ookW_C8wUeA3ut0aR…
## Defects & Features completed (Ready for Showcase/Shipping/Done) during Sprint ending 2013-09-18 ##
(Name, Project Name, Customer, Estimate)
1130 - Migrate Kraken to openJDK 7 - Kraken - Diederik van Liere - Analytics - 1
734 - Reinstall frontend node (hue, oozie, hive, etc.) - Kraken - Operations - 5
736 - Reinstall Kafka Broker nodes - Kraken - Operations -
1114 - Hadoop: Run and retain results of Hadoop benchmarks - Kraken - Diederik van Liere - Analytics -
1097 - [Bug 53238] Inconsistencies in "Pageviews for All Wikimedia Projects" report card - Limn Dashboards - Tilman Bayer - Communications - 1
1079 - Monitoring geowiki dashboards - Limn Dashboards - Jessie Wild - Learning & Evaluation - 8
1138 - varnishkafka puppet module - Logging Infrastructure - Operations - 1
777 - Varnishkafka output format - Logging Infrastructure - Operations - 8
1135 - [Bug 53804] Tests fail when running against MySQL - Wikimetrics - Diederik van Liere - Analytics - 1
1140 - [Bug 53892] metrics.wmflabs.org got an invalid certificate - Wikimetrics - Diederik van Liere - Analytics - 1
1122 - Metrics start/end date should be a timestamp - Wikimetrics - Jaime Anstee - Grantmaking - 2
1136 - [Bug 53805] Mediawiki Timestamps are not mapped satisfactorily - Wikimetrics - Diederik van Liere - Analytics - 2
1154 - Enable SSL by default - Wikimetrics - Diederik van Liere - Analytics - 2
1120 - Time-series support - Wikimetrics - Dario Taraborelli - Analytics - 5
1068 - [Bug 52349] Editor counts for Italian Wikipedia all zero - Wikistats - Community -
1134 - Investigate Wikipedia Zero bump around 2013-09-03 - Logging Infrastructure - Dan Foy - Wikipedia Zero - N/E
## Current Sprint (ending 2013-10-02) ##
(Name, Customer, Estimate, Project Name)
1133 - Hive Partitioning w Camus and JSON SerDe - Diederik van Liere - Analytics - N/E - Kraken
1091 - [Bug 53177] Create graph feature not working - Jessie Wild - Learning & Evaluation - 2 - Limn
1088 - [Bug 53155] Search datasource is not working - Jessie Wild - Learning & Evaluation - 1 - Limn
1067 - [Bug 51566] Limn: expose tab names in URL - Dario Taraborelli - Analytics - 3 - Limn
1155 - Total Active Editors - Erik Moeller - Executive Office - 1 - Limn Dashboards
1152 - Kafka Monitoring - Diederik van Liere - Analytics - 5 - Logging Infrastructure
1124 - Kafka multiple DC setup - Diederik van Liere - Analytics - N/E - Logging Infrastructure
1121 - Setup Dev Env Canary Monitoring - Diederik van Liere - Analytics - 5 - Logging Infrastructure
1109 - [Bug 53417] Deduplication of usernames should happen on combination of username,project - Diederik van Liere - Analytics - 1 - Wikimetrics
701 - Measure survival of an editor - Dario Taraborelli - Analytics - 5 - Wikimetrics
699 - Measure milestones achieved by an editor - Dario Taraborelli - Analytics - 5 - Wikimetrics
Any Mingle card can be accessed at
https://mingle.corp.wikimedia.org/projects/analytics/cards/XYZ, where XYZ is
the Mingle card ID.
If you have any questions, comments or feedback: please let us know!
Apologies for cross-posting; ideally you should receive this on the
Analytics mailing list so we can have one focal point for conversation. If
you are not on the Analytics list, please subscribe at
https://lists.wikimedia.org/mailman/listinfo/analytics
Best,
Diederik
Heya,
Today we enabled HTTPS by default for Wikimetrics, which means that all
communication between your browser and our server is encrypted. This means
more privacy and security!
Please let us know if it caused any issues and we'll fix them :)
Best,
D
Hi,
while we obviously do not see too much IPv6 traffic, I am getting more
and more concerned about the fact that we do not see updates of the
GeoIPv6 databases on at least stat1 and stat1002.
Those databases still agree with the file Diederik gave me back in
July.
How often do we expect to see updates of the GeoIPv6 database?
Best regards,
Christian
Hi there analytics,
Wanted to get your input on something. Our grantee partner Jake Orlowitz
(cc'd) is planning out the pilot evaluation for The Wikipedia Adventure.
We haven't hammered out the details yet, but it looks like he will be
comparing editing behavior between at least 2 cohorts: editors who were
invited to play TWA and who completed at least the first mission, and a
control group of editors who met the same basic criteria (joined around the
same time, met a minimum edit threshold). The TWA pilot will last at least
a month, and new editors will be invited on a rolling basis throughout that
month. We'd like to examine the editing behavior of each editor AFTER the
date they were invited to TWA (or would have been, in the case of the
control group).
Currently, it looks like Wikimetrics only lets you specify a date range at
the level of the cohort; that won't work for this analysis, since we want
to exclude edits made before a given date, which will vary user-by-user.
Could Wikimetrics be updated to allow researchers to set user-level date
ranges? I'm thinking potentially this could be an optional field in the
upload CSV.
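To make the request concrete, here is a minimal sketch of what a per-user
start date could look like. It is not how Wikimetrics works today: the
"username,project[,start_date]" row layout, the ISO date format, and the
function names load_cohort and edits_after_start are all assumptions made
for illustration, and "after" is read here as "on or after".

```python
import csv
import io
from datetime import date


def load_cohort(csv_text):
    """Parse rows of 'username,project[,start_date]' (hypothetical layout).

    Returns a dict mapping (username, project) to a date, or to None when
    the optional third column is missing or empty.
    """
    cohort = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        username, project = row[0].strip(), row[1].strip()
        has_start = len(row) > 2 and row[2].strip()
        start = date.fromisoformat(row[2].strip()) if has_start else None
        cohort[(username, project)] = start
    return cohort


def edits_after_start(cohort, edits, default_start=None):
    """Keep edits by cohort members made on or after their own start date.

    `edits` is an iterable of (username, project, edit_date) tuples.
    Users without a start date fall back to `default_start`, i.e. the
    cohort-level date; if that is None too, all of their edits are kept.
    """
    kept = []
    for username, project, edit_date in edits:
        key = (username, project)
        if key not in cohort:
            continue  # not part of this cohort
        start = cohort[key] or default_start
        if start is None or edit_date >= start:
            kept.append((username, project, edit_date))
    return kept
```

Because the third column is optional, existing uploads without dates would
keep behaving as they do now, with the cohort-level date range as the
fallback.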
I think this feature would be useful beyond TWA. The current setup works
well for offline events, where everyone in a cohort is receiving the same
"treatment" at exactly the same time. But for many online initiatives--such
as volunteer-driven email and social media campaigns & editor engagement
experiments like TWA, Teahouse, Image and translation drives, etc.--the
cohorts won't necessarily fit neatly into single-date buckets.
We have a little time to talk this through: the TWA pilot hasn't started
yet, and we won't be analyzing data for *at least* a month and a half, but
I wanted to get the conversation started.
Cheers,
Jonathan
--
Jonathan T. Morgan
Learning Strategist
Wikimedia Foundation
Hi all,
Pine just pinged me and asked if we were willing to do a researcher IRC
office hour this month. It sounds like a good idea to me. Maybe
week-after-next?
Who's on board?
- Jonathan
Hi, I have a few questions regarding mobile stats.
I need to determine a real percentage of WAP browsers. At first glance,
[1] looks interesting: the ratio of text/vnd.wap.wml to text/html
is 92M / 3987M = 2.3% on m.wikipedia.org. However, this contradicts
the stats at [2], which have different numbers and a different ratio.
I did my own research: because WAPness is detected during browser
detection in Varnish mostly by looking at the Accept header, and because
our current analytics infrastructure doesn't log it, I quickly whipped up
some code that recorded the User-Agent and Accept headers of every
10,000th request for mobile page views hitting the Apaches.
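The sampling idea can be sketched roughly as follows. This is not the code
that was actually deployed; the function name sample_every_nth and the
representation of a request as a dict of headers are assumptions made for
illustration.

```python
def sample_every_nth(requests, n=10_000):
    """Yield (user_agent, accept) for every n-th request.

    `requests` is any iterable of header dicts (hypothetical shape);
    missing headers are recorded as empty strings.
    """
    for index, headers in enumerate(requests, start=1):
        if index % n == 0:
            yield headers.get("User-Agent", ""), headers.get("Accept", "")
```

A counter-based sample like this is cheap and deterministic; it assumes
that request order is not correlated with device type.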
According to several days' worth of data, out of 14917 logged requests,
1445 contained vnd.wap.wml in Accept: headers in any form. That's more
than what is logged for frontend responses; however, this is expected, as
WAP should have a worse cache hit rate and thus should hit the Apaches
more often.
Next, our WAP detection code is very simple: the user agent is
checked against a few major browser IDs (all of them are HTML-capable,
so this check is not actually needed anymore and will go away soon),
and if it is still not known, we consider every device that sends
"vnd.wap.wml" (but not "application/vnd.wap.xhtml+xml") in its Accept:
header to be WAP-only. If we apply these rules, we get only 68 entries
that qualify as WAP, which is 0.05% of all mobile requests.
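For reference, the rules above can be sketched like this. It is an
illustration only, not the actual Varnish VCL: the function name
is_wap_only is made up, and the entries in HTML_CAPABLE_UA_MARKERS are
placeholders, since the real list of browser IDs is not given here.

```python
# Placeholder markers; the browser IDs actually checked are not listed
# in this thread, so these values are assumptions.
HTML_CAPABLE_UA_MARKERS = ("iphone", "android", "opera mini")


def is_wap_only(user_agent, accept):
    """Apply the two-step rule described above to a single request."""
    ua = user_agent.lower()
    # Step 1: known HTML-capable browsers are never treated as WAP.
    if any(marker in ua for marker in HTML_CAPABLE_UA_MARKERS):
        return False
    # Step 2: WAP-only means the Accept header advertises WML
    # but not XHTML Mobile Profile.
    acc = accept.lower()
    return "vnd.wap.wml" in acc and "application/vnd.wap.xhtml+xml" not in acc
```

Running the sampled (User-Agent, Accept) pairs through a function like
this and counting the True results is the kind of tally behind the number
of qualifying entries.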
The question is, what's wrong: my research or stats.wikimedia.org?
And if it's indeed just 0.05%, we should probably^W definitely kill
WAP support on our mobile site as it's virtually unmaintained.
-----
[1] http://stats.wikimedia.org/wikimedia/squids/SquidReportRequests.htm
[2] http://stats.wikimedia.org/wikimedia/squids/SquidReportClients.htm
--
Best regards,
Max Semenik ([[User:MaxSem]])