As many of you are aware, Discovery wants to run a QuickSurvey
<https://www.mediawiki.org/wiki/Extension:QuickSurveys> in Q3 to ask users
if they're satisfied with search results. A requirement of this is that we
can tie the survey responses to our search schema and satisfaction metric,
so that we can correlate responses with the data to figure out how
effective our metric actually is at measuring search satisfaction.
Adam, Julien, and I had a brief chat today. We agreed that our goal is to
be able to tie the data together by whatever means necessary, i.e. not
necessarily by changing QuickSurveys if it's easier to do it another way. Adam
mentioned that QuickSurveys records mw.user.sessionId, which may be
suitable and persistent enough that we could tie our data together if we
added that to our search logging. Obviously, there are other stakeholders
to talk to (Erik, Oliver) and questions to resolve; Julien wants to have a
meeting with Erik and Oliver next week.
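To make the joining idea concrete, here's a rough sketch in Python/pandas.
The table and column names are hypothetical (not the actual EventLogging
schemas); the point is just that once both the survey and our search logging
record the same session token, the correlation is a straightforward join:

    import pandas as pd

    # Hypothetical extracts from EventLogging; column names are illustrative.
    surveys = pd.DataFrame({
        "session_id": ["abc123", "def456", "ghi789"],
        "survey_response": ["satisfied", "unsatisfied", "satisfied"],
    })
    searches = pd.DataFrame({
        "session_id": ["abc123", "def456", "xyz000"],
        "metric_satisfied": [True, False, True],  # per our dwell-time metric
    })

    # Join on the shared token (mw.user.sessionId) and see how often the
    # metric agrees with what users actually told us.
    joined = surveys.merge(searches, on="session_id", how="inner")
    agreement = ((joined["survey_response"] == "satisfied")
                 == joined["metric_satisfied"]).mean()
    print(f"Metric agrees with the survey on {agreement:.0%} of sessions")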
Overall, this is good news. :-)
Thanks!
Dan
--
Dan Garry
Lead Product Manager, Discovery
Wikimedia Foundation
Hey all,
Just writing to let people know that we expect the dashboards to be
backlogged for a few days. This is due to ongoing EventLogging
maintenance at the Analytics Engineering end of things, which should
be neatly tidied up by Tuesday-ish. I'll let y'all know if I get more
information or a more refined idea of when it'll be done :)
--
Oliver Keyes
Count Logula
Wikimedia Foundation
Yesterday in the quarterly review Dan mentioned that our current user
satisfaction metric uses the somewhat arbitrary 10s dwell time cutoff for a
successful search, and that we want to use a survey to correlate
qualitative and quantitative values to pin down a better cutoff for our
users. I don't remember whether Dan mentioned it, or I was just rehashing
the notion on my own, but it may be difficult to pin down a specific cutoff.
A wild thought appears! Why do we have to pin down a specific cutoff? Why
can't we have a probabilistic user satisfaction metric? (Other than
complexity and computational speed, which may be relevant.)
We have the ability to gather so much data that we could easily compute
something like this: 20% of users are satisfied when dwell time is <5s, 35%
for 5-10s, 75% for 10-60s, 98% for 1m-5m, 85% for 5m-20m, and 80% for >20m.
Determining the cutoffs might be tricky, and computation is more complex
than counting, but not ridiculously complicated, and potentially much more
accurate for large samples. Presenting the results is still easy: "54.7% of
our users are happy with their search results based on our dwell-time
model".
I tried to do a quick search for papers on this topic, but I didn't find
anything. I'm not familiar with the literature, so that may not mean much.
Okay, back to the TextCat mines....
—Trey
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
What a long, strange trip it's been. Full write up here:
https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/How_Wrong_Would_Usin…
Summary:
- We can't reliably catch day-by-day outliers by using the page view
information that comes along with edits because not enough edits happen.
- Weekly averages (rather than day-by-day counts) don't usually move
much (i.e., rarely by more than a factor of 2). If we can capture daily or
weekly page view stats, that should keep us reasonably up to date overall,
esp. if these moderate swings don't affect scoring much.
- We could gather daily statistics from the page view API and store the
high mark over the last 3-7 days for the top 1K to 50K most-viewed articles.
The ranking algorithm could use either the rolling daily average or the high
mark (whichever is higher); see the sketch after this summary.
- For "Trending" topics, looking at the top 1K page views every hour
(unfortunately not currently available through the PageviewAPI) would be
the best way to catch suddenly trending topics if we want to be more
responsive, but it isn't clear that it's worth it.
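Here's a rough sketch of that third point. The per-article daily endpoint of
the Pageview API is real; combining the rolling average and the recent high
mark with max() is just my reading of the proposal, not a settled design:

    import requests

    # Pageview API, daily per-article counts; dates are YYYYMMDD.
    URL = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
           "en.wikipedia/all-access/all-agents/{article}/daily/{start}/{end}")

    def popularity(article, start, end):
        """max(rolling daily average, 7-day high mark) for one article."""
        resp = requests.get(URL.format(article=article, start=start, end=end),
                            headers={"User-Agent": "discovery-sketch"})
        views = [item["views"] for item in resp.json()["items"]]
        return max(sum(views) / len(views), max(views[-7:]))

    print(popularity("Earth", "20160101", "20160131"))

For the top 1K-50K articles we'd loop (or batch) over titles and store the
result somewhere the ranking function can reach it.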
—Trey
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
Hello!
In the standup, I mentioned that we should collect the remaining blockers
for the completion suggester in one place. Well, it turns out we already have
a task <https://phabricator.wikimedia.org/T121616> for it! I slightly
repurposed and tweaked it, and now it's good to go.
Please add your technical blockers to that task, so that we have a unified
place to track the remaining issues that need to be resolved before a fuller
rollout.
Thanks!
Dan
--
Dan Garry
Lead Product Manager, Discovery
Wikimedia Foundation
Hey all,
You might notice some weird spikes in the dashboard data. This is
caused by some duplicated days' worth of data, due to the stop-and-start
of services we depend on. We expect to have it resolved by the
end of the day.
Thanks!
--
Oliver Keyes
Count Logula
Wikimedia Foundation
Just a heads-up that we've renamed the UX sprint board in phabricator
to the Portal sprint board, since, well, that's what we're using it
for ;p. It can now be found at
https://phabricator.wikimedia.org/tag/discovery-portal-sprint/
Thanks!
--
Oliver Keyes
Count Logula
Wikimedia Foundation
Heyo,
I had a 1:1 with Dan today and asked what I could do to take some work
off his plate/push us forward. The response was that I should take the
ideas I had in the portal standup and work them into phabricator
tickets.
This does not equate to 'we will work on these things'; it means only
'this is an idea that has been logged'. Deciding we should work on a
thing is for Deborah :). So if you see cards flying past, comment away
to refine the ideas, but don't worry that there's a process fail going
on.
Thanks,
--
Oliver Keyes
Count Logula
Wikimedia Foundation
For some reason today I wanted to look up Mikhail Baryshnikov. It's been a
while, so I'd forgotten how to spell his last name. I didn't try very hard,
and I got no enwiki result. Google, of course, found the correct spelling,
which I then used on enwiki.
Since I used to do name searching and matching, this gave me an idea, which
generalizes beyond just names.
For every article title (and maybe each redirect—we could look into that)
we could generate a phonetic index[1] and store those in a special
Elasticsearch index. (We could look at storing multiple phonetic indexes
for better recall, possibly generated by multiple algorithms; some, like
Double Metaphone, generate multiple indexes by themselves.)
Then, under certain circumstances (say, zero results and no suggestion from
any other source, or no result with a score above a certain cutoff, or too
few results, etc.), we could make a suggestion and/or show results based on
matching phonetic index plus some score (say, a mix of page views and page
rank, or whatever scoring we've got going on).
So, when some doofus (hey, that's me!) comes along and searches for
"borishnakoff" (worse than what I actually searched for), we could correct
to *baryshnikov* (there's a page with that title) or give *Mikhail
Baryshnikov* as a result (likely the top-scoring item with the same
phonetic index in the title), or something similar.
Other algorithms exist (and can be devised) for languages other than
English, so the maximally fleshed-out version of this would offer a choice
of phonetic indexing algorithms, but I'm getting ahead of myself.
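For what it's worth, Elasticsearch already ships an analysis-phonetic
plugin with several encoders, including Double Metaphone, so a minimal
sketch of the special index could look like the Python below. All index and
field names here are made up, and it assumes the plugin is installed:

    from elasticsearch import Elasticsearch

    es = Elasticsearch()

    # Titles get a phonetic subfield; requires the analysis-phonetic plugin.
    es.indices.create(index="phonetic-titles", body={
        "settings": {"analysis": {
            "filter": {"dmetaphone": {
                "type": "phonetic",
                "encoder": "double_metaphone",
                "replace": False,  # keep the original tokens too
            }},
            "analyzer": {"phonetic_title": {
                "tokenizer": "standard",
                "filter": ["lowercase", "dmetaphone"],
            }},
        }},
        "mappings": {"properties": {"title": {
            "type": "text",
            "fields": {"phonetic": {"type": "text",
                                    "analyzer": "phonetic_title"}},
        }}},
    })

    # "borishnakoff" and "Baryshnikov" should encode to the same code, so a
    # phonetic match query can find the article despite the misspelling:
    hits = es.search(index="phonetic-titles",
                     body={"query": {"match": {"title.phonetic": "borishnakoff"}}})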
*Has anyone looked into this kind of phonetic indexing for enwiki,
Wikipedia in general, or other wikimedia projects before?*
I have some additional thoughts on how to test the effectiveness of
phonetic indexing on zero results for enwiki without having to fully
implement everything if the index sounds like something we could afford to
build.
Thoughts?
—Trey
[1] https://en.wikipedia.org/wiki/Phonetic_algorithm — Briefly, as an
example, you drop non-initial vowels and duplicate letters, and collapse
letters that tend to sound alike, while taking into account orthographic
conventions like sh, ch, th, initial kn- or pt-, etc. So both *baryshnikov*
and *borishnakoff* are likely to come out something like BRXNGV.
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
Hi!
I was asked about getting access to query logs for Wikidata Query
Service, for research purposes. So I'd like to start the discussion on
it, specifically:
1. Can we do it at all - technically, legally, privacy-wise? (Note that
we're talking about SPARQL query text only; no other information would be
provided.)
2. Are there any reasons why we might want *not* to do it even if we
could?
3. How hard would it be to produce such an export, and do we have any
existing infrastructure that could be used for it? (For the mechanics, see
the strawman sketch below.)
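For (3), purely as a strawman: assuming the query text arrives URL-encoded
in a webrequest-style log line (which may not match our actual log format),
the extraction itself could be as small as this Python:

    import re
    from urllib.parse import unquote_plus

    # Keep only the SPARQL text; drop IPs, user agents, timestamps, etc.
    QUERY_RE = re.compile(r'/sparql\?query=([^ &"]+)')

    def extract_queries(log_lines):
        for line in log_lines:
            m = QUERY_RE.search(line)
            if m:
                yield unquote_plus(m.group(1))

The hard parts are presumably everything around that: where the logs live,
retention, and reviewing what the query text alone can reveal.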
All ideas/comments about providing (or not providing :) access to this
data are welcome.
--
Stas Malyshev
smalyshev@wikimedia.org