+analytics
On Wed, Nov 13, 2013 at 5:09 PM, Jon Robson <jrobson(a)wikimedia.org> wrote:
Thanks so much Juliusz for exploring this and great work fixing the
schema (apologies for not predicting that it might be an issue) and
sorry for all the pain this must have caused you.
We can't be the only teams using Limn in the Foundation. It might be
worth pulling everyone together. Am I right in thinking that Limn is a
child of the analytics team? Maybe we should at least spend some time with
them getting our use case resolved. I guess this is why we have an
analytics department? I can raise this issue in the next Scrum of
Scrums if it is not resolved by then.
On Wed, Nov 13, 2013 at 3:54 PM, Juliusz Gonera <jgonera(a)wikimedia.org>
wrote:
For the past few days (or more) graphs at
http://mobile-reportcard.wmflabs.org/ stopped updating. The dashboard
consists of two parts: Limn, which displays the data, and backend scripts
that generate the graph data based on Event Logging data. The issue was
caused by two independent problems in the second component:
1. A change of MobileWebEditing schema was incorrectly addressed in the
scripts' config and caused the script to throw an exception.
2. Backend scripts are stupid and not optimized at all.
The first thing is fixed. To work around the second thing I had to disable
updates of "Editors registered on mobile who made 5+ edits on enwiki
(mobile+desktop)" graph [1] for now (the query was timing out and causing an
exception too) and removed the performance graph, since we'll be using
ganglia (and soon graphite) for that [2]. Graphs should get updated soon.
So why are those backend scripts stupid? Because they run every hour and
recalculate _all_ the values for every single graph. For example, even
though total unique editors for June 2013 will never change, they are still
recalculated every hour. This was a quick and easy solution for generating
graphs, but as Event Logging tables keep growing, as we add more graphs, and
as those graphs show more and more data, it no longer performs well enough.
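
To illustrate the alternative, here is a minimal sketch of incremental
aggregation: values for months that have already ended are kept, and only the
month still in progress is recomputed on each run. This is not the existing
script. `update_monthly_counts`, `compute_month` and the `cache` dict are
hypothetical names; `compute_month` stands in for whatever query a graph runs
against the Event Logging tables, and the cache would have to be persisted
between hourly runs.

```python
from datetime import date


def update_monthly_counts(cache, compute_month, months, today):
    """Recompute only the months whose values can still change.

    cache         -- dict mapping (year, month) to a stored value
                     (hypothetical; assumed to persist between runs)
    compute_month -- callable running the expensive query for one month
                     (hypothetical stand-in for the real SQL)
    months        -- iterable of (year, month) tuples shown on the graph
    today         -- date of the current run
    """
    current = (today.year, today.month)
    for month in months:
        # A month that has ended and is already cached will never change.
        if month in cache and month < current:
            continue
        cache[month] = compute_month(month)
    return cache
```

With this shape an hourly run costs one query per graph for the current
month, plus one query for any past month that is missing from the cache.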
I discussed this briefly with Ori and I think we agree on the general
direction. We should definitely schedule some time for working on this. We
could start with a spike investigating if there is a framework for
aggregating the sums that we could use and asking what other teams in the
foundation use for generating their graph data. The results of this spike
and possible following work could be useful not only for the mobile team.
[1] https://gerrit.wikimedia.org/r/#/c/95298/
[2] http://ganglia.wikimedia.org/latest/?r=month&cs=&ce=&tab=v&…
--
Juliusz
--
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687