Roger that! I think we could squeeze it in -- the change would be pretty straightforward. We'll be able to release a Beta with this A/B test in short order, but it will probably be a couple weeks until our next production release. I hope that's all right.
On Sat, Jan 30, 2016 at 1:02 PM, Gabriel Wicke gwicke@wikimedia.org wrote:
We are also happy to add cached entry points for high-traffic endpoints in the REST API. I commented to that effect at https://phabricator.wikimedia.org/T124216#1984206. Let us know if you think this would be useful for this use case.
On Sat, Jan 30, 2016 at 8:11 AM, Adam Baso abaso@wikimedia.org wrote:
Okay. As per https://phabricator.wikimedia.org/T124225#1984080 I think if we're doing near-term experimentation with a controlled A/B test, the Android app is the only logical place to start. Dmitry, can that work for you? It's not required, but I think it would be neat to see if we can move the needle even more. Of course your quarterly goals take top priority...but what do you think?
On Sat, Jan 23, 2016 at 5:58 AM, Adam Baso abaso@wikimedia.org wrote:
Hey all, am planning to look at Phabricator tasks and provide a reply during the upcoming weekdays. Just wanted to acknowledge I saw your replies!
On Friday, January 22, 2016, Erik Bernhardson <ebernhardson@wikimedia.org> wrote:
On Thu, Jan 21, 2016 at 1:29 AM, Joaquin Oltra Hernandez jhernandez@wikimedia.org wrote:
Regarding the caching, we would need to agree between apps and web about the url and the smaxage parameter, as Adam noted, so that the urls are exactly the same and we reuse the same cached objects across platforms instead of bloating varnish. It is an extremely ad hoc and brittle solution, but it seems like it would be the greatest win.
20% of the traffic from searches, while this is only in Android and web beta, seems like a lot to me, and we should work on reducing it; otherwise, when it hits web stable we're going to crush the servers. So caching seems the highest priority.
To clarify, it's 20% of the load, as opposed to 20% of the traffic. But same difference :)
Let's chime in at https://phabricator.wikimedia.org/T124216 and continue the cache discussion there.
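As a concrete illustration of the kind of shared, cacheable request being discussed: the example below is only a sketch -- the parameter set and the 24-hour smaxage/maxage values are placeholders, not anything that has been agreed on across platforms.

# Hypothetical shared morelike request; parameters and cache lifetime are placeholders.
curl -s 'https://en.wikipedia.org/w/api.php?action=query&format=json&list=search&srlimit=3&srsearch=morelike:Chess&smaxage=86400&maxage=86400'

If every client builds exactly the same URL (same parameters, same order, same values), Varnish can serve a single cached object to all of them for the smaxage window.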
Regarding the validity of results with opening text only, how should we proceed? Adam?
I've put together https://phabricator.wikimedia.org/T124258 to track setting up an A/B test that measures the difference in click-through rates between the two approaches.
On Wed, Jan 20, 2016 at 9:34 PM, David Causse dcausse@wikimedia.org wrote:
Hi,
Yes, we can combine many factors: templates (quality, but also disambiguation/stubs), size, and others. Today cirrus mostly uses the number of incoming links, which (imho) is not very good for morelike. On enwiki, results will also be scored according to the weights defined in https://en.wikipedia.org/wiki/MediaWiki:Cirrussearch-boost-templates.
I wrote a small bash script to compare results: https://gist.github.com/nomoa/93c5097e3c3cb3b6ebad

Here are some random results from the list (sometimes better, sometimes worse):
$ sh morelike.sh Revolution_Muslim
Defaults
"title": "Chess",
"title": "Suicide attack",
"title": "Zachary Adam Chesser",
======= Opening text no boost links
"title": "Hungarian Revolution of 1956",
"title": "Muslims for America",
"title": "Salafist Front",

$ sh morelike.sh Chesser
Defaults
"title": "Chess",
"title": "Edinburgh",
"title": "Edinburgh Corn Exchange",
======= Opening text no boost links
"title": "Dreghorn Barracks",
"title": "Edinburgh Chess Club",
"title": "Threipmuir Reservoir",

$ sh morelike.sh Time_%28disambiguation%29
Defaults
"title": "Atlantis: The Lost Empire",
"title": "Stargate",
"title": "Stargate SG-1",
======= Opening text no boost links
"title": "Father Time (disambiguation)",
"title": "The Last Time",
"title": "Time After Time",
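The actual script is the one in the gist above; purely as a sketch of the kind of comparison it does, a minimal version could look like the following. The endpoint, result count, and jq post-processing are assumptions, and the "no boost links" part of the second query is not reproduced here.

#!/bin/sh
# Rough sketch of a morelike comparison script -- illustrative only, not the gist's actual code.
# Usage: sh morelike-sketch.sh Revolution_Muslim   (requires curl and jq)
PAGE="$1"
API="https://en.wikipedia.org/w/api.php"
BASE="action=query&format=json&list=search&srlimit=3&srsearch=morelike:${PAGE}"

echo "Defaults"
curl -s "${API}?${BASE}" | jq -r '.query.search[].title'

echo "======= Opening text"
# Restrict the "more like" scoring to the opening_text field (parameters from Erik's mail).
curl -s "${API}?${BASE}&cirrusMltUseFields=yes&cirrusMltFields=opening_text" | jq -r '.query.search[].title'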
On 20/01/2016 19:34, Jon Robson wrote:

I'm actually interested to see whether this yields better results in certain examples where the algorithm is lacking [1]. If it's done as an A/B test we could even measure things such as click-throughs in the related article feature (whether they go up or not).

Out of interest, is it also possible to take article size and type into account and not return any morelike results for things like disambiguation pages and stubs?

[1] https://www.mediawiki.org/wiki/Topic:Swsjajvdll3pf8ya

On Wed, Jan 20, 2016 at 9:47 AM, Adam Baso abaso@wikimedia.org wrote:

One thing we could do regarding the quality of the output is check results against a random sample of popular articles (example approach to find some articles) on mdot Wikipedia. Presuming that improves the quality of the recommendations, or at least does not degrade them, we should consider adding the enhancement task to a future sprint, with further instrumentation and A/B testing / a timeboxed beta test, etc.

Joaquin, smaxage (e.g., 24-hour cached responses) does seem a good fix for now for further reduction of client-perceived wait, at least for non-cold-cache requests, even if we stop beating up the backend. Does anyone know of a compelling reason not to do that for the time being? The main thing that comes to mind, as always, is growing the Varnish cache object pool - probably not a huge deal while the thing is only in beta, but on the stable channel maybe noteworthy because it would run on probably most pages (but that's what edge caches are for, after all).

Erik, from your perspective, does use of smaxage relieve the backend sufficiently?

If we do smaxage, then Web, Android, and iOS should standardize their URLs so we get more cache hits at the edge across all clients. Here's the URL I see being used on the web today from mobile web beta:
https://en.m.wikipedia.org/w/api.php?action=query&format=json&format...
-Adam

On Wed, Jan 20, 2016 at 7:45 AM, Joaquin Oltra Hernandez jhernandez@wikimedia.org wrote:

I'd be up for it if we manage to cram it into a following sprint and it is worth it.

We could run a controlled test against production with a long batch of articles, check median/percentile response times with repeated runs, and highlight the differing results for human inspection regarding quality.

It's been noted previously that the results are far from ideal (which they are, because it is just morelike), and I think it would be a great idea to change the endpoint to a specific one that is smarter and has some cache (we could do much more to get relevant results besides text similarity: take into account links, or "see also" links if there are any, etc...).

As a note, in mobile web the related articles extension allows editors to specify articles to show in the section, which would avoid queries to cirrussearch if it were more used (once rolled into stable, I guess).

I remember that the performance-related task was closed as resolved (https://phabricator.wikimedia.org/T121254#1907192); should we reopen it or create a new one?

I'm not sure if we ended up adding the smaxage parameter (I think we didn't); should we? To me it seems a no-brainer that we should be caching these results in varnish, since they don't need to be completely up to date for this use case.

On Tue, Jan 19, 2016 at 11:54 PM, Erik Bernhardson ebernhardson@wikimedia.org wrote:

Both the mobile apps and web are using CirrusSearch's morelike: feature, which is showing some performance issues on our end. We would like to make a performance optimization to it, but first we would prefer to run an A/B test to see if the results are still "about as good" as they are currently.

The optimization is basically: currently "more like this" takes the entire article into account; we would like to change this to take only the opening text of an article into account. This should reduce the amount of work we have to do on the backend, saving both server load and the latency the user sees running the query.

This can be triggered by adding these two query parameters to the search api request that is being performed:

cirrusMltUseFields=yes&cirrusMltFields=opening_text

The API will give a warning that these parameters do not exist, but they are safe to ignore. Would any of you be willing to run this test? We would basically want to look at user-perceived latency along with click-through rates for the current default setup and for the restricted setup using only opening_text.

Erik B.
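For example, the parameters can be appended to an existing morelike search request; the base request below is only an assumed illustration, and only the two cirrus* parameters come from Erik's mail:

# Illustrative request with the two test parameters appended.
curl -s 'https://en.wikipedia.org/w/api.php?action=query&format=json&list=search&srlimit=5&srsearch=morelike:Chess&cirrusMltUseFields=yes&cirrusMltFields=opening_text'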
--
Gabriel Wicke
Principal Engineer, Wikimedia Foundation
Mobile-l mailing list
Mobile-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mobile-l