Hello all!
We had an interesting discussion yesterday with David about the way we
do sharding of our indices on elasticsearch. Here are a few notes for
whoever finds the subject interesting and wants to jump in the
discussion:
Context:
We recently activated row-aware shard allocation on our elasticsearch
search clusters. This means that we now have one additional constraint
on shard allocation: spread copies of shards across multiple
datacenter rows, so that if we lose a full row, we still have a copy
of all the data. During an upgrade of elasticsearch, another
constraint comes into play: a shard can move from a node with an older
version of elasticsearch to a node with a newer version, but not the
other way around. This led to elasticsearch struggling to allocate
all shards during the recent codfw upgrade to elasticsearch 2.3.5.
While it is not the end of the world (we can still serve traffic if
some indices don't have all shards allocated), this is something we
need to improve.
Number of shards / number of replicas:
An elasticsearch index is split at creation into a number of shards. A
number of replicas per shard is configured [1]. The total number of
shards for an index is "number_of_shards * (number_of_replicas + 1)".
Increasing the number of shards per index allows read operations to
execute in parallel over the different shards, with the results
aggregated at the end, improving response time. Increasing the number
of replicas allows the read load to be distributed over more nodes (and
provides some redundancy in case we lose one server). As term
frequency [2] is calculated over a shard and not over the full index,
there is some black magic involved in how we shard our indices, but
most of it is documented [3].
The enwiki_content example:
The enwiki_content index is configured to have 6 shards and 3 replicas,
for a total of 24 shards. It also has the additional constraint
that there is at most 1 enwiki_content shard per node. This ensures a
maximum spread of enwiki_content shards over the cluster. Since
enwiki_content is one of the indices with the most traffic, this ensures
that the load is well distributed over the cluster.
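For anyone who wants to play with the numbers, here is a minimal Python
sketch of the shard count formula quoted earlier. The function name is
made up for illustration; the values are the enwiki_content ones given
above.

```python
def total_shards(number_of_shards: int, number_of_replicas: int) -> int:
    """Total shard copies for one index: each primary plus its replicas."""
    return number_of_shards * (number_of_replicas + 1)


# enwiki_content: 6 shards, 3 replicas
enwiki_total = total_shards(number_of_shards=6, number_of_replicas=3)
print(enwiki_total)  # 24
```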
Now the bad news: for codfw, which is a 24 node cluster, it means that
reaching this perfect equilibrium of 1 shard per node is a serious
challenge if you take into account the other constraints in place. Even
after relaxing the constraint to 2 enwiki_content shards per node, we
have seen unassigned shards during the elasticsearch upgrade.
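For readers who have not seen how such constraints are expressed, here
is a rough sketch of the settings involved, built as Python dicts. The
two setting names are standard elasticsearch settings, but this email
does not say that the clusters use exactly these, so treat that as an
assumption. The awareness attribute name "row" and the helper name
per_node_limit are hypothetical.

```python
import json

# Cluster-level allocation awareness. The attribute name "row" is an
# assumption; each node would also need to declare its own row value.
cluster_settings = {
    "persistent": {
        "cluster.routing.allocation.awareness.attributes": "row"
    }
}


def per_node_limit(max_shards_per_node: int) -> dict:
    """Index-level settings body capping shards of one index per node."""
    return {
        "index.routing.allocation.total_shards_per_node": max_shards_per_node
    }


# Strict constraint (1 per node) and the relaxed one (2 per node).
print(json.dumps(cluster_settings, indent=2))
print(json.dumps(per_node_limit(1)))
print(json.dumps(per_node_limit(2)))
```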
Potential improvements:
While ensuring that a large index has a number of shards close to the
number of nodes in the cluster allows for optimally spreading load
over the cluster, it degrades fast if all the stars are not aligned
perfectly. There are 2 opposite solutions:
1) decrease the number of shards to leave some room to move them around
2) increase the number of shards and allow multiple shards of the same
index to be allocated on the same node
1) is probably impractical on our large indices: enwiki_content shards
are already ~30GB, which makes them slow to move around during
relocation and recovery, and fewer shards would mean even larger ones.
2) is probably our best bet. More, smaller shards mean that a single
query's load will be spread over more nodes, potentially improving
response time. Increasing the number of shards for enwiki_content from 6
to 20 (total shards = 80) means we have 80 / 24 = 3.3 shards per node.
Removing the 1 shard per node constraint and letting elasticsearch
spread the shards as best it can means that in case 1 node is
missing, or during an upgrade, we still have the ability to move
shards around. Increasing this number even more might help keep the
load evenly spread across the cluster (the difference between 8 and 9
shards per node is smaller than the difference between 3 and 4 shards
per node).
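To make the "8 or 9 versus 3 or 4" argument concrete, here is a small
Python sketch. It assumes elasticsearch manages a perfectly even
spread, so every node holds either the floor or the ceiling of the
average. The function name is made up, and the 50-shard case is only a
hypothetical value chosen to land on 8 or 9 shards per node; it is not
a proposal from the discussion.

```python
import math
from typing import Tuple


def balanced_spread(total_shards: int, node_count: int) -> Tuple[int, int, float]:
    """Return (fewest, most, most / fewest) shards per node for an even spread."""
    if total_shards < node_count:
        raise ValueError("fewer shards than nodes: some nodes would hold none")
    fewest = total_shards // node_count
    most = math.ceil(total_shards / node_count)
    return fewest, most, most / fewest


nodes = 24  # codfw

# Proposal from above: 20 shards, 3 replicas -> 80 shards in total.
print(80 / nodes)                  # about 3.3 shards per node on average
print(balanced_spread(80, nodes))  # nodes hold 3 or 4 shards

# Hypothetical: 50 shards, 3 replicas -> 200 shards in total.
print(balanced_spread(200, nodes))  # nodes hold 8 or 9 shards
```

The last value in each tuple is how much more the busiest node holds
compared to the least busy one: about 1.33 for 3 or 4 shards, and 1.125
for 8 or 9.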
David is going to do some tests to validate that those smaller shards
don't impact the scoring (smaller shards mean worse frequency
analysis).
I probably forgot a few points, but this email is more than long
enough already...
Thanks to all of you who kept reading until the end!
MrG
[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_conc…
[2] https://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.…
[3] https://wikitech.wikimedia.org/wiki/Search#Estimating_the_number_of_shards_…
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
UTC+2 / CEST
The Search Team in Discovery needs your help! Discernatron [1] is a search
relevance tool developed by the Discovery department. Its goal is to help
improve search relevance - showing articles that are most relevant to
search queries - with human assistance. We need your help grading search
results!
Join us for lunch at 12pm (SF time) in the 5th floor lounge on Tuesday 13th
September! In the Discernatron lunch, we'll give a brief overview of what
the Discernatron is, then ask people to start rating queries, so bring your
laptops! We're hoping a limited amount of food will be provided for the
event, but you can only eat it if you agree to rate queries for us. ;-)
We'll also be set up for remote participation on Hangouts and IRC, and the
session will be recorded.
Hangout: https://hangouts.google.com/hangouts/_/7pcv3gtfcbczzhaxbezyqhedfee
YouTube stream: https://www.youtube.com/watch?v=q4W9t6IcjWk
Thanks! If there are any questions, let me know!
Dan
[1]: https://www.mediawiki.org/wiki/Discernatron
--
Dan Garry
Lead Product Manager, Discovery
Wikimedia Foundation
Hi Discovery,
I'm wondering if there are any significant improvements coming in the next
+/- 18 months for multimedia search on Commons. Finding images can be a
very time-consuming job for Wikimedians and other users of the site.
For example, it would be nice to be able to do a "join" search with
categories, so that only media files that appear in 2+ selected categories
are shown in search results.
As an example, suppose that I want an image of a blue tile roof in China.
The search "blue tile roof China" doesn't show any results that interest me
on the first page. However,
https://commons.wikimedia.org/wiki/File:Wuhan_University_-_roof_tiles.JPG
is an image that would interest me. That file is several layers deep in
subcategories under the China category. It would be nice to be able to find
that image by searching a join of the China category and its subcategories,
with the "blue roof" category.
Thanks,
Pine
Thanks, Takashi! :)
--
Deb Tankersley
Product Manager, Discovery
IRC: debt
Wikimedia Foundation
On Fri, Aug 26, 2016 at 8:29 PM, Takashi OTA <
supertakot+translators(a)gmail.com> wrote:
> Hi Deborah,
>
> Done for Japanese messages.
> Thanks.
>
> --Takashi [[U:Takot]]
>
> 2016年8月26日(金) 13:23 Takashi OTA <supertakot+translators(a)gmail.com>:
>
>> Hi Deborah,
>>
>> I'll take on JA this weekend.
>> Please wait for a while.
>>
>> Cheers,
>>
>> --Takot
>>
>> On Tue, Aug 23, 2016 at 05:57 Deborah Tankersley <
>> dtankersley(a)wikimedia.org> wrote:
>>
>>> Hi Stryn,
>>>
>>> We really only need these languages to be translated for:
>>>
>>> Portuguese: PT, EN, RU, HE, AR, ZH, KO, EL
>>> Japanese: JA, EN, RU, KO, AR, HE
>>>
>>>
>>> As those are the only languages that we're looking for on the PT and JA
>>> wiki sites in order to determine if the user typed in one of those
>>> languages. We didn't enable the feature for ALL languages.
>>>
>>> Hope that helps!
>>>
>>> Cheers,
>>>
>>> Deb
>>>
>>>
>>> --
>>> Deb Tankersley
>>> Product Manager, Discovery
>>> IRC: debt
>>> Wikimedia Foundation
>>>
>>> On Mon, Aug 22, 2016 at 8:23 AM, Stryn <strynwiki(a)gmail.com> wrote:
>>>
>>>> Hmm, is there really no easier way to translate those?
>>>>
>>>> Showing results from [[:als:Main_Page|Alemannic Wikipedia]].
>>>> Showing results from [[:cbk-zam:Main_Page|Chavacano Wikipedia]]
>>>>
>>>> There are many languages where I don't know the Finnish translation, so
>>>> I would need to Google them first.
>>>>
>>>> Do we have system messages for language names? Like "Chavacano"?
>>>> If we do, we would only need to translate the "showing results from"
>>>> part and the rest could come from the system message. Or is it too complex?
>>>>
>>>> ... OK, I just found it: we have messages like
>>>> "MediaWiki:Project-localized-name-akwiki/fi". Can't we merge those
>>>> messages in some way?
>>>> Now we have messages for "Wikimedia Project Names" and "Wikimedia
>>>> Interwiki Search Results". It just makes so much work for
>>>> translators to translate both.
>>>> We can't directly use the messages from the "Wikimedia Project Names"
>>>> group for interwiki search texts because of conjugates, but something
>>>> like "Showing results from: {{MediaWiki:Project-localized-name-akwiki/fi}}"
>>>> would be nice.
>>>>
>>>> *Stryn*
>>>>
>>>> *Suomenkielisen Wikipedian ylläpitäjä & osoitepaljastaja / Admin and
>>>> checkuser on the Finnish Wikipedia*
>>>>
>>>> *Wikidatan ylläpitäjä / Admin on Wikidata*
>>>>
>>>> *Meta-Wikin ylläpitäjä / Admin on Meta-Wiki*
>>>> *Ylivalvoja / Steward*
>>>>
>>>> 2016-08-18 22:45 GMT+03:00 Deborah Tankersley <
>>>> dtankersley(a)wikimedia.org>:
>>>>
>>>>>
>>>>> The Discovery team has been making good progress in enabling cross
>>>>> language search results on several wikis and now we need help in
>>>>> translating a phrase: "*showing results from*".
>>>>>
>>>>>
>>>> _______________________________________________
>>>> Translators-l mailing list
>>>> Translators-l(a)lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/translators-l
>>>>
>>>>
The phab folks so far have resisted really fixing autocomplete. And they
won't accept patches that don't align with their own product vision. Thus,
at least so far, the answer has been "you can't get there from here."
Kevin Smith
Agile Coach, Wikimedia Foundation
On Fri, Aug 26, 2016 at 9:43 AM, Jonathan Morgan <jmorgan(a)wikimedia.org>
wrote:
> Grrr.
>
> https://phabricator.wikimedia.org/T99739
>
> Hey who can we Barnstar to get this fixed?
>
> J
>
>