The Search Platform Team
<https://www.mediawiki.org/wiki/Wikimedia_Search_Platform> holds office
hours the first Wednesday of each month. Come ask us anything about
Wikimedia search!
We’re particularly interested in:
* Opportunities for collaboration—internally or externally to the Wikimedia
Foundation
* Challenges you have with on-wiki search, in any of the languages we
support
But we're happy to talk about anything search-related. Feel free to add
your items to the Etherpad Agenda for the next meeting.
Details for our next meeting:
Date: Wednesday, December 5th, 2018
Time: 16:00-17:00 GMT / 08:00-09:00 PST / 11:00-12:00 EST / 17:00-18:00 CET
Etherpad: https://etherpad.wikimedia.org/p/Search_Platform_Office_Hours
Google Meet link: https://meet.google.com/vyc-jvgq-dww
*N.B.:* Google Meet System Requirements
<https://support.google.com/meet/answer/7317473>
—Trey
Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation
// Sorry for cross-posting
Hi everyone,
The Advanced Search interface is now available as a default feature on all
wikis. That means you, logged-in or not, can carry out advanced searches
even if you don’t know any search syntax.
The new feature provides some existing advanced search options in a visual
interface. This can help you find pages that contain a particular template,
search in page titles or for a specific sequence of characters, and much
more. Plus, the way namespaces can be selected has been redesigned. Among
other things, you can now select several namespaces with one click, e.g. to
search in all talk namespaces. More detailed information is on the project
page. [1]
The feature was developed by Wikimedia Deutschland’s Technical Wishes Team.
[2] The idea for it was born in 2016 in workshops with editors, followed by
prototypes, several feedback rounds on dewiki and Meta, and finally the
beta function, which 43,000 people across all wikis used. During the beta
phase, bugs were fixed, the namespace selection was revised, and more
search options were added.
Many thanks to everyone who took the time to give feedback (on-wiki, in
discussions, at the dev summit and more), to test or to translate. A big
thank you also goes to the Discovery team at the WMF for their support, by
making backend adjustments and implementing a new search parameter for deep
category searches.
The development team hopes that the new feature will help you find what
you’re looking for more easily. People who prefer to keep the previous
search interface can deactivate the new feature in their user preferences.
[3]
As always, feedback is welcome on the central talk page. [4]
Johanna for the Technical Wishes team
--
Johanna Strodt
Project Manager Community Communications Technical Wishlist
Wikimedia Deutschland
[1] https://meta.wikimedia.org/wiki/WMDE_Technical_Wishes/AdvancedSearch
[2] https://meta.wikimedia.org/wiki/WMDE_Technical_Wishes
[3] https://meta.wikimedia.org/wiki/Special:Preferences#mw-prefsection-searchop…
[4] https://www.mediawiki.org/wiki/Help_talk:Extension:AdvancedSearch
Just a notice that the RFP for Haystack, https://haystackconf.com/ , has gone
out. Proposals are due January 9th; the conference is April 24th and 25th.
I'll probably submit the recent autocomplete work as a talk, something like
the following. I don't like any of these titles yet, and the abstract surely
needs help too, but I'll ponder it.
Titles?
Running millions of hits per second on a laptop
Elasticsearch and TensorFlow, an offline evaluation story
Exporting explains
Abstract:
What if we could export the scoring equation along with all the inputs and
variables, and run this equation outside of Elasticsearch? Simple queries
can then be run on a laptop at a rate of several million hits per second.
This opens up a variety of optimization techniques such as coordinate
ascent or hyperopt to tune the query variables including boosts, tie
breakers, similarity parameters, etc.
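
To make the idea concrete, here is a minimal Python sketch of tuning boosts by
coordinate ascent once scoring runs outside the search engine. Everything in
it is an assumption made for illustration: the field names, the weighted-sum
scoring equation, the toy hits, and the pairwise-accuracy objective are
hypothetical stand-ins for whatever an exported explain would really contain.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]

# Hypothetical exported hits: per-field match scores plus a relevance label.
HITS: List[Dict[str, float]] = [
    {"title": 1.0, "text": 0.1, "relevant": 1},
    {"title": 0.0, "text": 0.9, "relevant": 0},
    {"title": 0.8, "text": 0.3, "relevant": 1},
    {"title": 0.1, "text": 0.8, "relevant": 0},
]


def score(hit: Dict[str, float], params: Params) -> float:
    """Stand-in for the exported scoring equation: a weighted sum of fields."""
    return params["title_boost"] * hit["title"] + params["text_boost"] * hit["text"]


def pairwise_accuracy(params: Params) -> float:
    """Fraction of (relevant, irrelevant) pairs that the scores order correctly."""
    good = [score(h, params) for h in HITS if h["relevant"]]
    bad = [score(h, params) for h in HITS if not h["relevant"]]
    pairs = list(product(good, bad))
    return sum(g > b for g, b in pairs) / len(pairs)


def coordinate_ascent(
    objective: Callable[[Params], float],
    start: Params,
    step: float = 0.5,
    rounds: int = 10,
) -> Tuple[Params, float]:
    """Nudge one parameter at a time, keeping only changes that improve the objective."""
    params, best = dict(start), objective(start)
    for _ in range(rounds):
        improved = False
        for name in list(params):
            for delta in (step, -step):
                candidate = dict(params)
                candidate[name] += delta
                value = objective(candidate)
                if value > best:
                    params, best, improved = candidate, value, True
        if not improved:
            step /= 2  # nothing helped at this step size, so search more finely
    return params, best


if __name__ == "__main__":
    tuned, best = coordinate_ascent(
        pairwise_accuracy, {"title_boost": 0.5, "text_boost": 1.0}
    )
    print(tuned, best)
```

Because each evaluation is plain arithmetic over already-exported numbers, the
objective can be called many times cheaply, which is what makes this kind of
search practical. A rank-based objective like this one has flat regions, so a
real run would probably want several starting points.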
Thanks a lot for forwarding! There'll also be a bigger, more extensive
announcement of the feature deployment on this list soon. So stay
tuned. :)
Best,
Johanna
--
Johanna Strodt
Project Manager Community Communications Technical Wishlist
Forwarding to the Discovery list and more project email lists so that
people know that this feature change is coming.
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ---------
From: Johanna Strodt <johanna.strodt(a)wikimedia.de>
Date: Wed, Nov 21, 2018 at 12:14 PM
Subject: [Translators-l] Fwd: AdvancedSearch announcement: looking for
translation support
To: <translators-l(a)lists.wikimedia.org>
Dear all,
The Advanced Search interface will soon become a default feature on
all wikis. Its deployment is planned for November 28.
The feature adds an advanced parameter form to the Special:Search page
in order to make already existing advanced search options such as
"intitle" or "subpageof" more visible and accessible for everyone. It
also changes the way namespaces can be selected.
We want to announce the upcoming deployment on village pumps with this
short message: https://meta.wikimedia.org/wiki/User:Johanna_Strodt_(WMDE)/Advanced_Search_…
The feature has already been deployed to deWP, arWP, faWP and huWP, so
we're now looking for translations in other languages.
We're planning to publish the message on Monday, Nov 26, around 11 am
UTC. Therefore it would be great to have the translations ready by
Monday November 26, 7 am UTC.
Thanks a lot. Any support is much appreciated!
Johanna
--
Johanna Strodt
Project Manager Community Communications Technical Wishlist
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
https://wikimedia.de
Imagine a world, in which every single human being can freely share in
the sum of all knowledge. That’s our commitment.
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.
V. Eingetragen im Vereinsregister des Amtsgerichts
Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig
anerkannt durch das Finanzamt für Körperschaften I Berlin,
Steuernummer 27/029/42207.
_______________________________________________
Translators-l mailing list
Translators-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/translators-l
Good day,
This is the weekly update from the Search Platform team for the week
starting 2018-11-12.
Programming note: Given the upcoming US holiday, the next update will
be for the week starting 2018-11-26.
As always, feedback and questions welcome.
== Discussions ==
=== Search ===
* David and Trey have resolved the problems with 32-bit Chinese
characters (like 𨨏—[0]), which were returning irrelevant results, and
showing lots of Unicode replacement characters (�) in the results. The
highlighter fix was deployed [1] first so there aren't any more �
characters in the results. The re-indexing [2] to improve the
relevance of results is now also done for Chinese-language wikis.
== Did you know? ==
* Letters are encoded internally by computers as numbers—for example,
“A” is 65 and “a” is 97.[3] Years ago, programs and even websites
would use different encodings[4] to represent text, often leading to
unreadable gibberish on screen. Unicode[5] was intended to be a single
encoding for most of the world’s writing systems. The most-used parts
of it fit into a 16-bit representation,[6] which can handle about 65
thousand characters. But that's not enough for the very large number
of rare and historical Chinese, Japanese, and Korean (CJK) characters,
which are represented in 16-bit Unicode using “surrogate pairs”.[7]
1,024 Unicode characters are set aside to be “high surrogates”—the
first half of a 32-bit character—and 1,024 characters are set aside to
be “low surrogates”—the second half. By themselves, the surrogates
aren’t valid and don’t represent anything, but in pairs they can
represent over a million additional characters. Since these characters
are usually rare, software can sometimes treat them incorrectly and split
them up, which can result in you seeing the Unicode replacement
character �,[8] which is used when something has gone wrong processing
Unicode text. (When the character is fine, but you don’t have a font
to show it, you sometimes get little squares instead. Since the most
common source of these squares for English speakers is unrepresented
CJK characters, a slang term for the squares is “tofu”.[9])
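
As a rough illustration of the surrogate-pair arithmetic described above, here
is a minimal Python sketch. The function names are made up for this example;
the constants (0xD800, 0xDC00, 0x10000) come from the UTF-16 encoding scheme.
The last function imitates, in a simplified way, what happens when software
keeps only the first half of a pair.

```python
from typing import Tuple

HIGH_START = 0xD800  # first of the 1,024 high surrogates
LOW_START = 0xDC00   # first of the 1,024 low surrogates


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogate pairs")
    offset = code_point - 0x10000        # a 20-bit number
    high = HIGH_START + (offset >> 10)   # top 10 bits
    low = LOW_START + (offset & 0x3FF)   # bottom 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a high and a low surrogate into the original code point."""
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


def split_naively(text: str) -> str:
    """Keep only the first 16-bit unit of the text, as buggy software might."""
    first_unit = text.encode("utf-16-be")[:2]
    # A lone high surrogate is not valid, so the decoder substitutes U+FFFD.
    return first_unit.decode("utf-16-be", errors="replace")


if __name__ == "__main__":
    char = "𨨏"
    high, low = to_surrogate_pair(ord(char))
    print(f"U+{ord(char):X} -> {high:04X} {low:04X}")
    print(repr(split_naively(char)))
```

1,024 high surrogates times 1,024 low surrogates gives 1,048,576 combinations,
which is where "over a million additional characters" comes from.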
[0] https://phabricator.wikimedia.org/T168427
[1] https://phabricator.wikimedia.org/T209293
[2] https://phabricator.wikimedia.org/T209156
[3] https://en.wikipedia.org/wiki/ASCII#Printable_characters
[4] https://en.wikipedia.org/wiki/Character_encoding#Common_character_encodings
[5] https://en.wikipedia.org/wiki/Unicode
[6] https://en.wikipedia.org/wiki/UTF-16
[7] https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Surrogates
[8] https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character
[9] https://en.wiktionary.org/wiki/tofu#Noun
----
Subscribe to receive on-wiki (or opt-in email) notifications of the
Discovery weekly update.
https://www.mediawiki.org/wiki/Newsletter:Discovery_Weekly
The archive of all past updates can be found on MediaWiki.org:
https://www.mediawiki.org/wiki/Discovery/Status_updates
Interested in getting involved? See tasks marked as "Easy" or
"Volunteer needed" in Phabricator.
[1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R
[2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R
Many thanks,
Chris Koerner
Community Relations Specialist
Wikimedia Foundation
The Search Platform Team
<https://www.mediawiki.org/wiki/Wikimedia_Search_Platform> will be holding
office hours the first Wednesday of each month, starting next week. Come
ask us anything about Wikimedia search!
We’re particularly interested in:
* Opportunities for collaboration—internally or externally to the Wikimedia
Foundation
* Challenges you have with on-wiki search, in any of the languages we
support
But we're happy to talk about anything search-related.
Details for our next meeting:
Date: Wednesday, November 7th, 2018
Time: 16:00 GMT / 08:00 PST / 11:00 EST / 17:00 CET
Google Meet link: https://meet.google.com/vyc-jvgq-dww
*N.B.:* Google Meet System Requirements
<https://support.google.com/meet/answer/7317473>
Hope to see you there!
—Trey
Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation