[Steven] avenue to explore for internal dashboard
needs.
So true. It sure causes me pain and suffering seeing every js library known
to mankind being used there. :)
We will definitely take a look at the tool on the dashboard research we are
doing. By all means send us everything that catches your eye.
One of the major problems with Limn is "dashboard discovery". That is not a
visualization problem, but rather an information architecture one. Tessera
does not aim to solve that either. Our other "hard" problem, which Tessera
does not currently solve, is retrieval of data: we serve data from many
datasources.
[Steven] To back up... As a consumer of numerous dashboards and someone
[Steven] who has to decide when/how to request creation of them, I care
[Steven] about getting a readable new dashboard set up and maintained to
[Steven] run indefinitely with as little developer or researcher time as
[Steven] possible.
Understood. Anything we come up with we will run with other developers to
make sure it is easy to use by them. We have enlisted Yuvi as our Guinea
Pig in residence.
Now, to set expectations right, our first dashboard is only going to
solve the issues regarding editor metrics while laying groundwork
others can benefit from to roll out new visualizations. That's it. We are
not solving the whole dashboard problem quite yet.
Please take a look at the prototype of the editor vital signs dashboard, as
it shows what we are doing in the near term:
[Dario] the data and metadata into any arbitrary dashboard/visualization
[Dario] frontend – whether custom-built, off-the-shelf or even hosted
[Dario] evaluating visualization solutions against these priorities.
We are doing #2 now, but only for editor vital signs metrics using public
data. Data will be available for anyone to graph, served via Wikimetrics as
JSON files. Temporary data is available now in staging:
As for EventLogging, it is still private data; that might change in the
medium term but not in the short term.
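As a rough illustration of what graphing such a file might involve, here is a
minimal sketch. The JSON layout (a "metric" name plus a "result" mapping of
dates to values) and the numbers in it are assumptions made up for
illustration, not the actual Wikimetrics output format.

```python
import json
from datetime import date
from typing import List, Tuple

# Hypothetical payload: the real Wikimetrics JSON layout may differ,
# and these values are invented sample data.
SAMPLE = """
{
  "metric": "newly_registered",
  "result": {
    "2014-07-02": 97,
    "2014-07-01": 120,
    "2014-07-03": 110
  }
}
"""


def load_series(raw: str) -> List[Tuple[date, float]]:
    """Parse the assumed layout into (date, value) points sorted by date."""
    doc = json.loads(raw)
    points = [
        (date.fromisoformat(day), float(value))
        for day, value in doc["result"].items()
    ]
    return sorted(points)


series = load_series(SAMPLE)
for day, value in series:
    print(day.isoformat(), value)
```

Once the data is a sorted list of points, any charting frontend can consume
it without knowing where the file came from.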
On Thu, Jul 10, 2014 at 3:56 PM, Dario Taraborelli <
dtaraborelli(a)wikimedia.org> wrote:
Much as I love the idea of adding charting capability
in MediaWiki
(especially if it were to be integrated with a data namespace
<https://meta.wikimedia.org/wiki/DataNamespace> and version controlled
JSON annotations) – I agree with Steven that this seems to solve a
different problem.
The biggest pain points of using Limn to me (on top of the usability
issues mentioned in this thread [1]) are its poor information architecture
and its limited support for data documentation/metadata. We know that it’s
hard at the moment for people to find the data they are looking for or to
navigate a large set of dashboards intuitively. For
example: the first metric we modeled for the vital signs project (newly
registered users), when combined with a single breakdown by platform
(desktop site, mobile site, apps), would result in ~2.5K data series. I
can’t quite figure out what these series would look like or how they would
be discoverable on Limn.
I think the best investment of our time would be to:
(1) give Wikimetrics and EventLogging a standard interface to plug the
data and metadata into any arbitrary dashboard/visualization frontend –
whether custom-built, off-the-shelf or even hosted
(2) start solving the visualization problem incrementally, moving from the
most urgent customer needs and evaluating visualization solutions against
these priorities.
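One way to picture point (1) is a small contract that every data source
implements, so that frontends depend only on the contract. The sketch below
is an assumption about what such an interface could look like; the names
Series, DataSource, StaticSource and render_text are hypothetical and do not
correspond to any existing Wikimetrics or EventLogging API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol, Tuple


@dataclass
class Series:
    """A data series bundled with its documentation/metadata."""
    name: str
    points: List[Tuple[str, float]]
    metadata: Dict[str, str] = field(default_factory=dict)


class DataSource(Protocol):
    """Hypothetical contract a backend adapter would implement."""
    def fetch(self, metric: str) -> Series: ...


class StaticSource:
    """Stand-in adapter backed by an in-memory dict (illustration only)."""
    def __init__(self, series: Dict[str, Series]) -> None:
        self._series = series

    def fetch(self, metric: str) -> Series:
        return self._series[metric]


def render_text(source: DataSource, metric: str) -> str:
    """A trivial 'frontend': any object with fetch() can feed it."""
    s = source.fetch(metric)
    header = f"{s.name} ({s.metadata.get('description', 'no description')})"
    rows = [f"{day}: {value:g}" for day, value in s.points]
    return "\n".join([header, *rows])


# Invented sample values, not real metrics.
source = StaticSource({
    "newly_registered": Series(
        "newly_registered",
        [("2014-07-01", 120.0), ("2014-07-02", 97.0)],
        {"description": "hypothetical sample values"},
    )
})
report = render_text(source, "newly_registered")
print(report)
```

Swapping the first frontend for a more sophisticated one later would then
only mean replacing render_text, not the adapters.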
That would give us ample time to bring data (and immediate value) to the
users, while testing the best approach for visualizing it and supporting
more sophisticated requirements for presenting and rendering the data (we
could abandon the first frontend when it stops serving our needs and
migrate to something more sophisticated).
I like the look and feel of Tessera and the fact that it can easily
consume Graphite data, but I share Dan’s concerns about storage.
Dan, I think it would be valuable to put your thoughts on a wiki page, if
you have bandwidth to do so.
Dario
[1] I also want to add that whatever solution we settle on, it needs to be
mobile friendly.
On Jul 9, 2014, at 11:55 PM, Dan Andreescu <dandreescu(a)wikimedia.org>
wrote:
On Wed, Jul 9, 2014 at 4:23 PM, Steven Walling <swalling(a)wikimedia.org>
wrote:
On Wed, Jul 9, 2014 at 1:01 PM, Dan Andreescu <dandreescu(a)wikimedia.org>
wrote:
By the way, if this at all sounds like I'm
proposing a "new" monster
codebase, that is not at all the case. Most of the hard problems will be
out-sourced to promising projects. For example, Vega is the front-runner to
handle the visualizations themselves, and the dashboarding around it will be
very simplistic but solve problems we've encountered with Limn. But again,
very early days.
Yeah to be honest I'm pretty skeptical of such a plan.
To back up... As a consumer of numerous dashboards and someone who has to
decide when/how to request creation of them, I care about getting a
readable new dashboard set up and maintained to run indefinitely with as
little developer or researcher time as possible.
Agreed, Limn fails at this pretty miserably, and it's definitely one of
our top problems to solve.
The main problem with Limn is that to set up a suite of dashboards takes a
very large initial investment.
There are many other problems, a few relevant examples: discovery of
dashboards, documentation of visualization capabilities, lack of
annotations, ease of contributing to the code base
I'm not really sure how shoehorning a
dashboard service on top of
MediaWiki really solves this problem better than just setting up one of the
many existing solutions out there. I don't care about transparent
versioning and authentication, which seem to be the two things that
MediaWiki is really good at in this context.
I'm not sure this is true. You may not care about it, but storage needs
to happen, and I'd rather outsource that problem. Limn's idea of using
file-backed storage made it very inefficient and clumsy to work with. A
custom database, like Tessera is using, is much better but also requires
someone to maintain it and manage access, etc. So more ops burden but less
up-front development. And the definitions would be "further" from our
community. Meaning, for example, if someone defaces a graph, we'd have to
build a "watch this page" mechanism to help us deal with it. I started
where you're starting with Tessera and as I thought of these problems I
slowly migrated to MediaWiki. But I'll try to explain below why I don't
think this is a big undertaking at all. MediaWiki is really easy to use as
a service.
Building a custom tool from scratch is also part
of what got us in this
mess with Limn to begin with.
I see that I have caused a bit of a misunderstanding. So, Limn is well
over 10,000 lines of Coco. This is a dense language that transpiles to
roughly 20,000 lines of JavaScript. The tool I'm proposing here is
basically ignoring 90% of the problems that Limn dealt with. Visualization
is the main problem, and that is solved by Vega JS [1]. Of the remaining
problems, we're ignoring about half of them by making this server-less. So
let's examine the points I made above:
* Getting a Dashboard up Quickly. EventLogging is well liked, so I
figured if we did something simpler than that, we'd be starting off on the
right foot. A dashboard could be a simple JSON document on MediaWiki,
rendered the way EventLogging schemas are. A page called Dashboard:Growth
could have something like { graphs: [ {name: 'example', data-url:
'http://.../'} ] }. This would be viewable at
http://dashiki.wmflabs.org/dashboards/growth. The JSON could be created
from the dashboarding tool itself, but we can start bare-bones. If this
seems weird or not as fast as you'd like, please describe an ideal scenario
and let's talk about it.
* Dashboard discovery. Pau is designing the beginning of a solution to
this. Tessera says they have not dealt with this yet, and I don't see how
this is a generic problem (but I'd be glad to be wrong). But this is not
some giant project that we'd have to implement. It would organize the data
available in a friendly intuitive way, 'cause Pau is amazing as we all
know. If there's a generic solution out there for this, I haven't found it.
* Documentation of visualization capabilities. This is very well
documented on the Vega wiki, so it should take almost no effort if we
integrate it well. Most other tools we've tried to use are too limiting.
Timeseries only, maps only, etc. I think we would end up stitching a few
out-of-the-box solutions together and that seems more headache than it's
worth.
* Lack of annotations. I think it makes sense to store annotations in
MediaWiki, in a JSON document that's tied to a datafile. This way everyone
using that datafile can share the annotations, and anyone interested in the
history of the document can take advantage of MediaWiki's revision history.
In most other solutions I've seen, annotation is not a social activity,
but something that researchers do to explain their data. In our case, I've
heard many many people ask for something much richer when they talk about
annotations.
* Ease of contributing to the code base. I don't like big code bases. If
a simple dashboard layout built around getting metadata from MediaWiki via
JSON and rendering Vega graphs gets anywhere *near* Limn's size and
incomprehensibility, you can burn me at the stake. I'll light the fire.
The other thing about Coco is that only about 100 people in the world can
read it, so this would be purely JavaScript. And in gerrit.
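To make the first bullet concrete, here is a minimal sketch of how a tool
might read such a dashboard page. It is written in Python only for brevity
(the proposal itself is for JavaScript). The "graphs", "name" and "data-url"
keys and the dashiki URL come from the example above; the function names,
the validation rules and the lowercasing of the page name are assumptions.

```python
import json
from typing import Dict, List

# Base URL taken from the example in this thread.
BASE_URL = "http://dashiki.wmflabs.org/dashboards/"

# Keys assumed to be required for each graph entry.
REQUIRED_KEYS = {"name", "data-url"}


def dashboard_url(page_title: str) -> str:
    """Map a wiki page title like 'Dashboard:Growth' to its viewing URL."""
    namespace, _, name = page_title.partition(":")
    if namespace != "Dashboard" or not name:
        raise ValueError(f"not a dashboard page: {page_title!r}")
    return BASE_URL + name.lower()


def parse_dashboard(raw: str) -> List[Dict[str, str]]:
    """Parse a dashboard definition and check each graph entry."""
    doc = json.loads(raw)
    graphs = doc["graphs"]
    for graph in graphs:
        missing = REQUIRED_KEYS - set(graph)
        if missing:
            raise ValueError(f"graph is missing keys: {sorted(missing)}")
    return graphs


# The thread's example, rewritten as strict JSON (quoted keys).
DEFINITION = '{"graphs": [{"name": "example", "data-url": "http://.../"}]}'

graphs = parse_dashboard(DEFINITION)
print(dashboard_url("Dashboard:Growth"), [g["name"] for g in graphs])
```

Note that the wiki page would need strict JSON, with quoted keys, for a
standard parser to accept it.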
[1] trifacta.github.io/vega/
p.s. Seriously, I'm not trying to reinvent the wheel here, please let me
know if you think out-of-the-box solves more problems than I think. The
whole idea behind this project is to offload as much as possible, I have a
miles-long backlog I can busy myself with, so I don't need to invent work.
And all of what I said above I'm making up on the spot, so I'm thinking
out loud much more than trying to sell a solution. Would a wiki page
explaining the pros and cons of this decision be a good use of our time?
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics