Howdy!
Just wanted to let y’all know about some updates to Discovery Dashboards[1]:
- All dashboards have been updated to use a new collection of datasets[2]
- The Wikipedia.org Portal dashboard[4] now tracks:
  - Clicks on sister project links (T152617)
  - Clicks on mobile app links (T154634)
- The WDQS dashboard[3] now tracks LDF endpoint usage (T153936)
So that’s the front-end news. On the back-end:
- We’ve refactored the entire metric-calculation codebase[5] to use Analytics’ Reportupdater infrastructure[6] (T150915)
- That codebase is now thoroughly documented[7]
- We’ve introduced additional rules for detecting automata to several scripts that calculate traffic and usage metrics
- We’ve erased and backfilled all the metrics from 2017-01-01 using all the new definitions and updated UDFs
  - So if you see drastic changes to a metric starting on 1st January 2017, that’s why :)
  - Accordingly, all the graphs have been annotated with this info
- We are no longer undercounting full-text and geo search API usage (see [8], hence the jump on 2017-01-01)
- Adding new metrics and backfilling data is now way easier
- We’re already working on adding the ability to look at several search metrics by language and project (T150410)
- We can even add predictive models of metrics AS metrics for the codebase to calculate (T112170, [9])
A huge thanks to Deb for her support and patience through this project, and huge thanks to Chelsy for her thorough code review of and help with [10].
Cheers,
Mikhail
on behalf of Discovery’s Analysis team
[1]: https://discovery.wmflabs.org/
[2]: https://datasets.wikimedia.org/aggregate-datasets/discovery/
[3]: https://discovery.wmflabs.org/wdqs/
[4]: https://discovery.wmflabs.org/portal/
[5]: https://phabricator.wikimedia.org/diffusion/WDGO/
[6]: https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater
[7]: https://github.com/wikimedia/wikimedia-discovery-golden/blob/master/README.md
[8]: https://gerrit.wikimedia.org/r/#/c/315503/
[9]: https://github.com/wikimedia/wikimedia-discovery-golden/blob/master/README.md#adding-new-forecasting-modules
[10]: https://gerrit.wikimedia.org/r/#/c/325870/
--
Mikhail Popov // Count Logula
Discovery, Wikimedia Foundation

PGP Public Key: https://people.wikimedia.org/~bearloga/public.asc
Fingerprint: B362 76D4 FEBE F715 5F40 0C9C 4BC8 A265 E573 3216
Thanks so much, Mikhail, for taking on this huge multi-legged and multi-armed task of getting the Discovery dashboards migrated to the new ReportUpdater golden codebase, which also required an update of the metric calculations. This effort made the dashboards look quite shiny and new (as well as more accurate!) in the process: check them out if you have a chance.
Chelsy was also a fantastic helper in this effort - completing numerous code reviews along with providing lots of virtual moral support!
Woohoo, Discovery Analysis Team! :)
--
deb tankersley
irc: debt
Product Manager, Discovery
Wikimedia Foundation
On Thu, Mar 9, 2017 at 1:15 PM, Mikhail Popov mpopov@wikimedia.org wrote:
> [snip]
Wow! This is great work. Refactoring backend infrastructure usually isn't fun* and often is under-appreciated but it usually makes everything better—so thanks to all involved for making it happen!
> We’re already working on adding the ability to look at several search metrics by language and project (T150410)
You know that's my favorite! Woo hoo!
—Trey
* I think maybe Erik likes it, though. :p
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation