Thanks Tomasz; great feedback! In order:
* yeah, top percentiles were a heavily-requested thing so I built it in from the get-go. Similarly, mean/median so we have some ability to avoid distorting the results when the distribution changes.
* The 3 days data thing is a known - https://phabricator.wikimedia.org/T100056 - and is next on my to-do list for bugfixes :).
* Glad you like the interface! It's actually functional on mobile, too :D.
* Sample rate is crucial, yep. I'm reaching out to the authors of the relevant EL schemas to find out how each was handled.
* Sessions < results opened makes sense in the event that users didn't find what they wanted and went back to try again, but I'm not sure how "session" is calculated; this is again something we lack transparency around :(. Dan? You're the apps wizard.
In supporting this: probably nothing at the moment, although it would be good for Nik/Kevin to chip in on the relevant Phabricator ticket (https://phabricator.wikimedia.org/T99762) to validate how much of a PITA the idea of a unified schema and the associated implementations would be.
I'm sort of shocked to hear "we're supposed to be presenting this data at the next metrics meeting": in the future, if there are instances where data is going to be up for public scrutiny, would it be possible to explicitly allocate time for that? My goal is to get us to the point where our data is reliable all, or at least most, of the time, and for a fragment of one person's time over two weeks, I think progress on that is pretty fantastic. But prepping data for that kind of event does change the priorities and what tasks should be worked on.
If we want to present data, generally speaking, let's discuss what we can show off. If we want to present the dashboards I'll put my all into making the data at least something where we know the deficiencies, if not something where we consider the deficiencies tolerable.
On 26 May 2015 at 19:24, Tomasz Finc tfinc@wikimedia.org wrote:
Thanks Oliver
Early observations
- Really happy to see top percentiles in load graphs
- Mobile Web has only three days data
- Interface is simple and easy to use
- We need to know the sample rate
- Apps have fewer sessions than results page opened
Speaking over IRC it's clear that we don't have confidence in this data. We need to fix this and fix it quickly so that we can accurately plan our work. We're supposed to be presenting this data at the next metrics meeting and we're not at a point where I feel comfortable sharing our data, let alone next steps.
Oliver & Dan, what can the team do to support you guys on this? I want you guys to own this and know that we're here to support you.
Should I be adding new feature requests and bugs to https://phabricator.wikimedia.org/tag/search-data-analytics/ ?
--tomasz
On Tue, May 26, 2015 at 11:04 AM, James Douglas jdouglas@wikimedia.org wrote:
This is a very exciting preview of things to come.
Where are the data coming from? Am I just confused, or does "6 search sessions per day" seem low?
On Fri, May 22, 2015 at 2:35 PM, Oliver Keyes okeyes@wikimedia.org wrote:
http://searchdata.wmflabs.org/ - boop! This was my Friday. Previously we were playing around with them and testing what we needed with a static snapshot; these dashboards will now update once a day with new information.
It has turned up some bugs ("is the mobile schema just not running?") and there are more metrics to add. But for the time being, is progress :)
-- Oliver Keyes Research Analyst Wikimedia Foundation
Wikimedia-search-private mailing list Wikimedia-search-private@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikimedia-search-private