This might explain the unexpectedly low results from the explore similar
test. I'm not saying the fix would ensure usage, but it seems we lost quite a
few events. This problem also affects the autocomplete data we dashboard,
although I don't think we've used it for tests recently. Full text search
analytics should be unaffected as they don't use the mw.track functionality.
---------- Forwarded message ----------
From: "Sam Smith" <samsmith(a)wikimedia.org>
Date: Oct 12, 2017 3:41 AM
Subject: [Analytics] Heads up: mw.track client-side EventLogging mechanism
"ignored" certain events
To: "Analytics List" <analytics(a)lists.wikimedia.org>
Cc:
o/
Prior to Thursday, 28th September, if your client-side EventLogging
instrumentation logged events via mw.track, then only events tracked
during the first pageview of a user's session were logged.
Now, technically, the events weren't ignored or dropped. Instead, the
subscriber for the "event" topic was never subscribed when the module
was loaded from the ResourceLoader's cache and so events published on
that topic simply weren't received and logged.
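To illustrate the failure mode described above, here is a minimal toy
model of a topic-based tracker, written in Python rather than JavaScript.
It is a sketch under stated assumptions, not MediaWiki's code: TrackBus,
simulate_pageview and the event payloads are hypothetical names invented
for this example. It assumes, as with mw.trackSubscribe, that a handler
registered late still receives events that were tracked earlier.

```python
from collections import defaultdict


class TrackBus:
    """Toy publish/subscribe bus (hypothetical, for illustration only)."""

    def __init__(self):
        self.queue = []                    # every event ever tracked
        self.handlers = defaultdict(list)  # topic -> list of handlers

    def track(self, topic, data):
        # Publishing never fails: the event is queued even if nobody listens.
        self.queue.append((topic, data))
        for handler in self.handlers[topic]:
            handler(topic, data)

    def subscribe(self, topic, handler):
        # Replay past events on this topic, then receive future ones.
        for past_topic, data in self.queue:
            if past_topic == topic:
                handler(past_topic, data)
        self.handlers[topic].append(handler)


def simulate_pageview(subscriber_registered):
    """Return (logged events, number of events published) for one pageview."""
    bus = TrackBus()
    logged = []
    bus.track("event", {"action": "click"})
    if subscriber_registered:
        # Assumption: this is the step that was skipped when the module
        # came from the cache.
        bus.subscribe("event", lambda topic, data: logged.append(data))
    bus.track("event", {"action": "hover"})
    return logged, len(bus.queue)
```

In this model, simulate_pageview(False) publishes both events but logs
none of them, which matches the "not dropped, just never received"
behaviour described above.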
This bug was discovered while testing some instrumentation maintained
by Readers Web [0] and independently by Timo Tijhof, who submitted the
ideal fix [1] promptly.
-Sam
[0] https://phabricator.wikimedia.org/T175918
[1] https://gerrit.wikimedia.org/r/#/c/378804/
---
Engineering Manager
Readers
Timezone: BST (UTC+1)
IRC (Freenode): phuedx
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
Hello!
Related to my previous incident report [1], we also had an issue with
logstash [2].
Logstash stops collecting logs while elasticsearch / cirrus is down.
This is most probably related to the API Feature logs, which are sent
by logstash to the cirrus cluster. Sadly, there is no obvious fix at
this point. It might be possible to tune the elasticsearch output
plugin to fail fast, but that is not obvious from the documentation.
[1] https://wikitech.wikimedia.org/wiki/Incident_documentation/20170920-Elastic…
[2] https://wikitech.wikimedia.org/wiki/Incident_documentation/20170920-Logstash
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
UTC+2 / CEST
Hello!
TL;DR: Our recent elasticsearch cluster restart did not go as planned.
Most important lesson learned: we did not understand the recovery
settings correctly.
Yesterday, we did a cold restart of the elasticsearch / cirrus eqiad
cluster. This restart did not go as planned. It did not generate any
user-facing impact, since we moved all the traffic to codfw before the
restart. It did impact logstash (more on that in a different report).
Incident documentation:
https://wikitech.wikimedia.org/wiki/Incident_documentation/20170920-Elastic…
Have fun!
Guillaume
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
UTC+2 / CEST
We recently got a suggestion via Phabricator[1] to automatically map
between hiragana and katakana when searching on English Wikipedia and other
wiki projects. As an always-on feature, this isn't difficult to implement,
but major commercial search engines (Google.jp, Bing, Yahoo Japan,
DuckDuckGo, Goo) don't do that. They give different results when searching
for hiragana/katakana forms (for example, オオカミ/おおかみ "wolf"). They also give
different *numbers* of results, seeming to indicate that it's not just
re-ordering the same results (say, so that results in the same script are
ranked higher).[2] I want to know what they know that I don't!
Does anyone have any thoughts on whether this would be useful (seems that
it would) and whether it would cause any problems (it must, or otherwise
all the other search engines would do it, right?).
Any idea why it might be different between a Japanese-language wiki and a
non-Japanese-language wiki? We often are more aggressive in matching
between characters that are not native to a given language--for example,
accents on Latin characters are generally ignored on English-language
wikis. So it might make sense to merge hiragana and katakana on
English-language wikis but not Japanese-language wikis.
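For concreteness, here is a minimal Python sketch of what such a mapping
could look like. It is an illustration only, not how the search stack
does or would do it. It relies on the fact that katakana U+30A1..U+30F6
sit exactly 0x60 code points above their hiragana counterparts. The
function names and the per-language rule in normalize_query are
assumptions made up for this example.

```python
# Katakana U+30A1..U+30F6 map onto hiragana U+3041..U+3096 (offset 0x60).
KATAKANA_TO_HIRAGANA = {cp: cp - 0x60 for cp in range(0x30A1, 0x30F7)}


def fold_kana(text: str) -> str:
    """Fold katakana to hiragana; all other characters are left as-is."""
    return text.translate(KATAKANA_TO_HIRAGANA)


def normalize_query(query: str, wiki_language: str) -> str:
    """Hypothetical policy: fold kana everywhere except on Japanese wikis."""
    if wiki_language == "ja":
        return query
    return fold_kana(query)
```

With this sketch, オオカミ and おおかみ normalize to the same string on a
non-Japanese wiki and stay distinct on a Japanese one. Characters outside
the mapped range, such as the prolonged sound mark ー, are not touched.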
Thanks very much for any suggestions or information!
—Trey
[1] https://phabricator.wikimedia.org/T176197
[2] Details of my tests at https://phabricator.wikimedia.org/T173650#3580309
Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation