I was a bit irritated yesterday to learn that we can automate the creation of Limn graphs and speed up the process.
I had become so tired of manually copying and pasting existing graphs and editing them by hand to work for a new graph that I knocked up a script to do this for me. The script simply takes an SQL query and the config file and generates all the necessary JSON files so that the graph shows up on the Limn dashboard.
With this script I was able to generate 5 graphs in the time it used to take me to generate 1.
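To give a feel for what such a generator does, here is a minimal sketch. It is not the uploaded script [1], and the JSON field names and directory layout are placeholders I made up for illustration, not Limn's actual schema. It writes one datasource file and one graph file per graph id.

```python
import json
import tempfile
from pathlib import Path


def write_graph_metadata(graph_id, title, out_dir):
    """Write one datasource file and one graph file for graph_id.

    The field names and the datasources/ and graphs/ directories are
    hypothetical placeholders, not Limn's real metadata format.
    """
    out = Path(out_dir)
    datasource = {
        "id": graph_id,
        "name": title,
        "url": f"datafiles/{graph_id}.csv",  # assumed datafile location
    }
    graph = {
        "id": graph_id,
        "name": title,
        "datasource": graph_id,  # each graph visualizes one datasource
    }
    paths = []
    for kind, payload in (("datasources", datasource), ("graphs", graph)):
        target = out / kind / f"{graph_id}.json"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(payload, indent=2) + "\n")
        paths.append(target)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_graph_metadata("menu-daily", "Menu Daily", tmp):
            print(path.relative_to(tmp))
```

Generating both files from one id and one title is what removes the copy-and-paste step; the real field names would have to come from an existing graph.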
However, since uploading the script [1] I have learnt that other scripts like this exist. Please can we standardise on a way to generate these graphs (either locally or on the server) and detail it in the README, to make the whole process of graph generation nicer for everyone involved?
I've added some graphs [2] (which should update soon) that show activity in the left navigation menu, on the watchlist page and on the diff page. We had this data, so it seemed silly not to display it somewhere. When the data becomes available you'll notice that, interestingly, the 'Home' link in the main menu is our most widely used feature. It will be great to see how that changes when search becomes available on special pages. Likewise, random is a very widely used feature; we should continue experimenting with it and try to use it to engage new editors.
[1] https://gerrit.wikimedia.org/r/#/c/110271/2/generate-graph.py
[2] http://mobile-reportcard.wmflabs.org/#other-graphs-tab
Jon, I think I added confusion to this because I forgot the history. Basically Yuvi originally wrote this with limnpy. When we deployed to stat1 we had to get rid of limnpy because it depends on pandas. I agree with you, we should standardize how people should interact with limn, and I have an idea that I think might work. Let's talk about it offline, but the gist is:
Instead of adding this to the dashboard:
"graph_ids": [ "thanks-daily", "menu-daily", "watchlist-activity", "diff-activity" ]
you could add this:
"graph_ids": [
    {"title": "Thanks Daily", "datafile_url": "http://stat1001.wikimedia.org/limn-public-data/mobile/datafiles/thanks-daily..."},
    {"title": "Menu Daily", "datafile_url": "http://stat1001.wikimedia.org/limn-public-data/mobile/datafiles/menu-daily.c..."},
    {"title": "Watchlist Activity", "datafile_url": "http://stat1001.wikimedia.org/limn-public-data/mobile/datafiles/watchlist-ac..."},
    {"title": "Diff Activity", "datafile_url": "http://stat1001.wikimedia.org/limn-public-data/mobile/datafiles/diff-activit..."}
]
And limn could do the rest for you using the logic on this page: http://mobile-reportcard.wmflabs.org/datasources
What do you think? Would this be useful? It seems like a simple change and I'd love to get it done for you, if it'll make your life easier. If it's not simple I'll abandon it and we'll all dislike limn just a little bit more :)
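To make the proposal concrete, here is a rough sketch of how the old list of ids could be turned into the new title / datafile_url entries. The helper names, the example.org base URL and the ".csv" extension are assumptions for illustration only; the real datafile URLs are the ones on stat1001.

```python
import json


def title_from_id(graph_id):
    """Turn an id such as "thanks-daily" into a title such as "Thanks Daily"."""
    return " ".join(word.capitalize() for word in graph_id.split("-"))


def expand_graph_ids(graph_ids, base_url, extension):
    """Build one {"title", "datafile_url"} entry per graph id.

    base_url and extension are supplied by the caller; nothing here
    knows where the real datafiles live.
    """
    base = base_url.rstrip("/")
    return [
        {
            "title": title_from_id(graph_id),
            "datafile_url": f"{base}/{graph_id}{extension}",
        }
        for graph_id in graph_ids
    ]


if __name__ == "__main__":
    old_ids = ["thanks-daily", "menu-daily", "watchlist-activity", "diff-activity"]
    # Hypothetical base URL and extension, for illustration only.
    entries = expand_graph_ids(old_ids, "http://example.org/datafiles/", ".csv")
    print(json.dumps({"graph_ids": entries}, indent=2))
```

Deriving the title from the id is only a convenience; if the title is written by hand in the dashboard file, the second function is all that is needed.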
Some more context for people unfamiliar:
Right now, to get a graph:
1. write some sql and maybe python to extract data from EventLogging and MediaWiki dbs into a simple csv datafile
2. change the yaml file to define how often the datafile is generated
3. deploy the sql / python / yaml changes to stat1
4. create a datasource that is metadata for the datafile
5. create a graph that is visualizing one datasource
6. edit the dashboard to reference the new graph
7. deploy the metadata files to limn0
We are working on making wikimetrics replace steps 1, 2, and 3. This will not happen right away, but card 1376 in mingle is a big step towards it. The idea above makes steps 4 and 5 go away.
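For anyone who has not done step 1 before, it boils down to running a query and dumping the rows into a csv file with a header row. The sketch below uses an in-memory SQLite database as a stand-in for the EventLogging / MediaWiki dbs, and the table and column names are invented for the example.

```python
import csv
import io
import sqlite3


def query_to_csv(connection, sql, out):
    """Run sql and write the result to out as csv, header row first."""
    cursor = connection.execute(sql)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


if __name__ == "__main__":
    # Stand-in database; the "events" table is hypothetical.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE events (day TEXT, clicks INTEGER)")
    db.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [("2014-01-29", 1), ("2014-01-29", 2), ("2014-01-30", 5)],
    )
    buffer = io.StringIO()
    query_to_csv(
        db,
        "SELECT day, SUM(clicks) AS clicks FROM events GROUP BY day ORDER BY day",
        buffer,
    )
    print(buffer.getvalue(), end="")
```

Writing to any file-like object keeps the function easy to try out; in practice the output would be the datafile that the yaml schedule in step 2 regenerates.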
On Thu, Jan 30, 2014 at 1:25 PM, Jon Robson jrobson@wikimedia.org wrote:
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
This looks like a great improvement. Technically it should be possible to set up a graph from just a single config file and an SQL query. Everything else should just work magically :)
On Thu, Jan 30, 2014 at 12:43 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
ok, cool. I've deployed the change to limn0 so it's ready for you to try it. It's not quite as magical as you say, but now the steps are:
1. write some sql and maybe python to extract data from EventLogging and MediaWiki dbs into a simple csv datafile
2. change the yaml file to define how often the datafile is generated
3. deploy the sql / python / yaml changes to stat1
[magic]
6. edit the dashboard to reference the datafile generated above
7. deploy the metadata files to limn0