Q3 goals


Erik Bernhardson

5 Nov 2015

10:56 p.m.

I happened to be looking at the proposed Q3 goals yesterday, currently they say:

- Make www.wikipedia.org a portal for exploring open content on Wikimedia sites.
- Bring more consistency to the user experience for search across desktop, mobile web, and mobile apps.
- Enhance search results and expose users to other interesting content by improving interwiki search integration.

My concern is that our current user satisfaction metric suggests 15% of users are happy with the results they are getting. This is really bad. I would prefer to see us focus on search relevance and improving the scoring of what we already have before spending more focus on interwiki search. I don't have anything to point to yet, but I don't think search satisfaction is so low because we aren't surfacing content from other wikis. I think it's because when you search for something we surface all kinds of results that aren't nearly as relevant as other data we have in the corpus.

I really want to see us focus on fixing what we already have and validating the features we already support before we go whole hog on incorporating all kinds of new data.

Erik B.


billinghurst

6 Nov 2015

12:50 a.m.

I know that the Wikisource (WS) community has some long-awaited Phabricator requests for search improvements.

One example concerns how search, subpages and typeahead interact, which affects our compilation works.

As background, the Wikisources reproduce public-domain (PD) works in which the components of a work are subpages; a component can be a biographical entry or an article in a journal.

Search and typeahead are currently designed (packaged?) around the lead name of the work. That works well for 'Encyclopaedia Britannica', but is a lot less useful for 'shakespeare', which matches a biographical subpage within each encyclopaedia.

Typeahead is especially limited where content pages are in multiple namespaces and/or where subpages are utilised.

Prior to improving interwiki, fixing up intrawiki would be a good first step.

Billinghurst


Federico Leva (Nemo)

8:28 a.m.

Erik Bernhardson, 05/11/2015 22:56:

...

My concern is that our current user satisfaction metric suggests 15% of users are happy with the results they are getting. This is really bad. I would prefer to see us focus on search relevance and improving the scoring of what we already have before spending more focus on interwiki search.

What expense? The code for interwiki search is ready, only https://phabricator.wikimedia.org/T96881 needs fixing AFAIK. I agree that inlining the interwiki results is a bit harder, maybe that can be done later.

Sure, 15 % is low satisfaction (but is that en.wiki? en.wiki has lots of garbage, of course). On the bright side, that means it's easy to improve. To make up numbers: if even just 3 % of users search for dictionary definitions on Wikipedia, interwiki search could increase the pool of happy users by 20 %. ;-)
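The made-up figure above can be checked with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and it assumes that the 3 % of users are all currently unsatisfied and would all become satisfied, which is the most optimistic reading of the numbers.

```python
def relative_increase(baseline_share: float, added_share: float) -> float:
    """Relative growth of the satisfied pool when `added_share` of all
    users move from unsatisfied to satisfied (optimistic assumption)."""
    return added_share / baseline_share


baseline = 0.15  # satisfaction figure quoted in the thread
added = 0.03     # made-up share of users searching for dictionary definitions

# Growth of the pool of happy users, relative to its current size.
print(f"relative increase: {relative_increase(baseline, added):.0%}")
# Overall satisfaction after the change, as a share of all users.
print(f"new satisfaction:  {baseline + added:.0%}")
```

Under those assumptions the pool of happy users grows by a fifth, while overall satisfaction moves by only three percentage points.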

Nemo

Erik Bernhardson

9:07 a.m.

The satisfaction metric is for all searches across all wikis. This includes searches on Wiktionaries.

I don't think putting new information, such as Portuguese results on Spanish Wikipedia, is going to have nearly the same level of impact as improving the existing search results. We have an incredible amount of information within the individual wikis that we fail to surface. The existing scoring algorithms are naive and don't even begin to incorporate the things the world has learned in the last 20 years. For example, one of the factors we use as an indicator of relevance is the number of incoming links, but almost 20 years ago two PhD students from Stanford described a significantly better way, called PageRank. I think we would be much better off integrating the lessons of the last 20 years than naively just throwing more data into the pile and hoping something better comes out.
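To illustrate the contrast between counting incoming links and PageRank, here is a minimal sketch in Python. It is not code from the search stack: the toy link graph and page names are invented, and the PageRank shown is the textbook power-iteration form with a damping factor of 0.85, chosen here as an assumption.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # page -> pages it links to


def incoming_link_counts(links: Graph) -> Dict[str, int]:
    """Count how many pages link to each page."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts


def pagerank(links: Graph, damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Simplified PageRank computed by power iteration."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            # A page with no outgoing links spreads its rank evenly.
            receivers = targets if targets else pages
            share = damping * rank[page] / len(receivers)
            for target in receivers:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: "A" has two links from obscure pages,
# "B" has a single link from a well-connected hub.
toy_links: Graph = {
    "hub": ["B"],
    "B": ["hub"],
    "A": ["hub"],
    "x1": ["A"],
    "x2": ["A"],
}

counts = incoming_link_counts(toy_links)
ranks = pagerank(toy_links)
for page in ("A", "B"):
    print(page, counts[page], round(ranks[page], 3))
```

In this toy graph "A" wins on raw incoming-link count, while "B" scores higher under PageRank because its single link comes from a page that is itself important.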

Erik B

David Causse

9:41 a.m.

On 05/11/2015 22:56, Erik Bernhardson wrote:

...

I really want to see us focus on fixing what we already have and validating the features we already support before we go whole hog on incorperating all kinds of new data.

Hi,

I totally agree, there are some existing features that need to be reviewed, tuned or rewritten. Some queries give better results when features are disabled:

- kennedy[1] with default features enabled does not bring JFK onto the first page
- kennedy[2] with some features disabled (all fields, boost links) brings JFK into the top 3

Working without a relevancy lab will always lead to discrepancies like that: the developer will focus on a limited set of 4 or 5 queries to develop the feature, with a high risk of breaking previous features. I'd really like to use the relevancy lab to review existing features.

[1] https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=defa...
[2] https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=defa...

Oliver Keyes

4:45 p.m.

+1 on reviewing existing features. That it is standard does not mean that it works, and it's nice to be able to pass results back upstream.


-- Oliver Keyes Count Logula Wikimedia Foundation

Trey Jones

13 Nov 2015

9:51 p.m.

I'm only a week late to the party—and it's Friday the 13th so anything goes, right?

Erik wrote:

...

I would prefer to see us focus on search relevance and improving the scoring of what we already have before spending more focus on interwiki search.

David wrote:

...

Working without a relevancy lab will always lead to discrepancies like that, the developer will focus on a limited set of 4/5 queries to develop the feature with a high risk to break previous features. I'd really like to use the relevancy lab to review existing features.

David has uncovered a number of weird results with the standard search config (as have others). While I love to say "the plural of anecdote is not data", that is exactly David's point: we need to assess performance overall, not just on the motivating examples.

The relevance lab will let us test a lot of options quickly and relatively cheaply. I've been thinking about it in my 10% time and I've got a line on how to handle annotations (including "required result") that even in the absence of a proper gold standard corpus makes it feasible to collect examples like David's and use them as "search quality unit tests" to make sure we don't break things.
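A "required result" annotation of the kind described above could be sketched as follows. This is a hypothetical illustration in Python, not the relevance lab's actual design: the class and function names are invented, and the canned result lists are made up to stand in for two search configurations, not real search output.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

SearchFn = Callable[[str], List[str]]  # query -> ranked result titles


@dataclass(frozen=True)
class RequiredResult:
    """Annotation: `title` must appear within the top `max_rank` results."""
    query: str
    title: str
    max_rank: int  # 1-based


def failing_annotations(search: SearchFn,
                        annotations: List[RequiredResult]) -> List[RequiredResult]:
    """Return the annotations that the given search function violates."""
    failures = []
    for ann in annotations:
        top = search(ann.query)[:ann.max_rank]
        if ann.title not in top:
            failures.append(ann)
    return failures


def make_search(canned: Dict[str, List[str]]) -> SearchFn:
    """Build a fake search backend from canned (invented) result lists."""
    return lambda query: canned.get(query, [])


ANNOTATIONS = [RequiredResult(query="kennedy", title="John F. Kennedy", max_rank=3)]

# Invented rankings for two hypothetical configurations.
DEFAULT_CONFIG = {"kennedy": ["Kennedy (surname)", "Kennedy Space Center",
                              "Ted Kennedy", "John F. Kennedy"]}
TUNED_CONFIG = {"kennedy": ["John F. Kennedy", "Kennedy (surname)",
                            "Kennedy Space Center", "Ted Kennedy"]}

for name, canned in (("default", DEFAULT_CONFIG), ("tuned", TUNED_CONFIG)):
    failures = failing_annotations(make_search(canned), ANNOTATIONS)
    print(name, "failures:", [ann.query for ann in failures])
```

Each collected example becomes one annotation, so a configuration change that pushes a required result out of the top ranks shows up as a failure even without a full gold-standard corpus.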

The cross-language cross-wiki task is endlessly fascinating (to a language nerd like me), but I worry that the maximum potential impact is low, and that success is very hard to measure, because the plausible use cases are so complex (esp. right now—I want inline surveys, dagnabbit!). I think the same may be true of other cross-wiki searching.

I think this also fits with the overall Discovery vision of first looking inward and making sure our fundamentals are sound.

Can we talk more about the theory and practice of updating our Q3 goals in our next weekly meeting?

—Trey

Trey Jones Software Engineer, Discovery Wikimedia Foundation


Federico Leva (Nemo)

10:04 p.m.

Trey Jones, 13/11/2015 21:51:

...

The cross-language cross-wiki task is endlessly fascinating (to a language nerd like me), but I worry that the maximum potential impact is low, and that success is very hard to measure

On the other hand success becomes very easy to measure if you define success as driving traffic to smaller wikis (e.g. counting clicks).

Nemo

Erik Bernhardson

10:24 p.m.


But is that success? If you can drive an extra 10k clicks to smaller wikis, or an extra 500k clicks to primary wikis (totally pulling numbers out of a hat), was the effort to drive 10k clicks worth it?

Pine W

10:29 p.m.

Is it possible to work on both the large wiki and small wiki opportunities concurrently?

I agree with the general sentiment that internal search results could use improvement. It's kind of amazing how many clicks I need to make sometimes to find things. (And by the way, I would think that we would want to *decrease* clicks in search pages and *increase* pageviews of non-search pages.)

