Hurray!
Nik
---------- Forwarded message ----------
From: Clinton Gormley <noreply(a)discuss.elastic.co>
Date: Tue, Jun 9, 2015 at 12:36 PM
Subject: [Announcements] Elasticsearch 1.6.0 released
To: nik9000(a)gmail.com
Clinton_Gormley <http://discuss.elastic.co/users/clinton_gormley>
June 9
Today, we are pleased to announce the release of Elasticsearch 1.6.0, based
on Lucene 4.10.4. This is the latest stable version of Elasticsearch and is
packed with awesome new features:
- Faster restarts with synced flushing
- Shard allocation no longer blocks pending tasks
- JSON response body filtering
- Security fix for shared file-system repositories
- Upgrade API for ancient indices
- More lenient highlighting for Kibana users
- mlockall for Windows users
- Fine-grained script settings
You can read more in the blog post:
https://www.elastic.co/blog/elasticsearch-1-6-0-released
------------------------------
To respond, reply to this email or visit
http://discuss.elastic.co/t/elasticsearch-1-6-0-released/2235/1 in your
browser.
To unsubscribe from these emails, visit your user preferences
<http://discuss.elastic.co/my/preferences>.
Hey all, looks like the newly rev'd Schema:Search logging for desktop is in
effect. Here are a couple of queries suggesting that, at least on English
Wikipedia for stable versions of Chrome, most of the time people either
click on an OpenSearch result from the set of dropdown results or press
ENTER / click on the magnifying glass icon.
> SELECT count(DISTINCT event_userSessionToken) FROM Search_12057910 WHERE
timestamp > '20150614' AND timestamp < '20150615' AND (event_action =
'click-result' OR event_action = 'submit-form') AND wiki = 'enwiki' and
userAgent LIKE '%Chrome/43%';
+----------------------------------------+
| count(DISTINCT event_userSessionToken) |
+----------------------------------------+
| 1049 |
+----------------------------------------+
1 row in set (0.20 sec)
> SELECT count(DISTINCT event_userSessionToken) FROM Search_12057910 WHERE
timestamp > '20150614' AND timestamp < '20150615' AND wiki = 'enwiki' and
userAgent LIKE '%Chrome/43%';
+----------------------------------------+
| count(DISTINCT event_userSessionToken) |
+----------------------------------------+
| 1100 |
+----------------------------------------+
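Taken together, the two counts above suggest click/submit coverage is high; a minimal sketch of the arithmetic, plugging in the result values from the two queries:

```python
# Fraction of distinct Chrome/43 enwiki sessions (from the queries above)
# that either clicked a result or submitted the search form.
clicked_or_submitted = 1049  # first query: click-result OR submit-form
total_sessions = 1100        # second query: all sessions

fraction = clicked_or_submitted / total_sessions
print(f"{fraction:.1%}")  # prints 95.4%
```

So roughly 95% of sampled sessions end in one of the two tracked actions, which is what "most of the time" above amounts to.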
In a recent meeting, Oliver expressed concerns about us having services
running in labs which are treated sort of as if they were in production.
Examples include WDQS (already) and maps (potentially).
We agreed to have this discussion on the mailing list, so this is an
invitation to do so. I am not familiar enough with the various technical
issues to explain them properly, so hopefully someone else will step in and
do so. I believe one big area of concern is analytics.
Kevin Smith
Agile Coach
Wikimedia Foundation
*Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment. Help us make it a reality.*
Hello all,
We will now be tracking our maps-related work in the Discovery-Maps-Sprint
project[1], rather than the Maps project[2].
Maps will continue to exist, as a feature-centric list of tasks. However,
it will not reflect the status of tasks (e.g. "In Progress" or "Needs
Review"). That will be handled by Discovery-Maps-Sprint, along with the
Maps column of the Discovery project[3]. In Maps, I have moved all the
tasks into a single column, and have hidden the other columns, to avoid
them accidentally being used.
I have copied all the Maps tasks over to the Sprint board, into the
appropriate columns. Note that the columns in Sprint are a bit different
than they were in Maps, to align with other Sprint boards within the
Discovery department. For details on how tasks are handled, see the
Discovery department process page[4].
I will be requesting Phabricator Herald rules to ensure that Maps tasks
automatically get synced between these three related projects.
Please let me know if you have any questions or concerns.
[1] https://phabricator.wikimedia.org/tag/discovery-maps-sprint/
[2] https://phabricator.wikimedia.org/tag/maps/
[3] https://phabricator.wikimedia.org/tag/discovery/
[4]
https://www.mediawiki.org/wiki/Search_and_Discovery/Process#Workflow_and_Ph…
Kevin Smith
Agile Coach
Wikimedia Foundation
*Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment. Help us make it a reality.*
It was nice to chat with many of you in Lyon! Wes mentioned that this
mailing list will be used intensely, but I don't remember how I got to
know about it.
Starting with simple things, it would be good if
* it's added to https://meta.wikimedia.org/wiki/Mailing_lists/Overview
with a description of the usage you expect to make of it;
* short notifications about it are sent to all relevant mailing lists
(which you can find in the wiki list above once you know who you want to
reach).
Nemo
Does anyone know where I can find info about the current load on
www.wikipedia.org? I'd like to set some reasonable sampling rates for the
EventLogging instrumentation that's going to be in place soon.
If the Portal gets a zillion hits per second, then 1/1000 of that is
probably too high. On the other hand, if it gets five hits per day, then
1/1000 of that is rather too low.
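The back-of-the-envelope calculation here is just hits × sampling rate; a minimal sketch, using hypothetical traffic numbers (the actual Portal load is exactly the open question above):

```python
def sampled_events_per_day(hits_per_day: float, sampling_rate: float) -> float:
    """Expected number of EventLogging events per day at a given sampling rate."""
    return hits_per_day * sampling_rate

# Hypothetical load figures -- not actual Portal traffic.
print(sampled_events_per_day(100_000_000, 1 / 1000))  # 100000.0 -- likely too many
print(sampled_events_per_day(5, 1 / 1000))            # well under one event/day
```

Once the real hit rate is known, you can invert the same formula to pick a rate that yields a target event volume.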
Based on input from the developers, Wes, and Dan, and others, here is a new
proposal for a new slate of Search & Discovery vertical recurring
meetings[1].
These changes really can't take full effect until the first week of June,
due to the Hackathon and related travel. But hopefully we can agree on them
now, and start to get them scheduled. After trying the new scheme out for a
couple weeks, we'll have a retrospective so we can adjust as needed.
The days-of-week proposed here are not set in stone, but they seemed
logical. They came out of an attempt to stack meetings (as requested by Nik
and others), combined with a sense that Monday and Friday are not ideal
days for certain types of meetings, plus a recognition that finding huge
blocks of time on our schedules may be challenging.
Basically, if you are a developer, you'll have twice-weekly 15-minute
sub-team standups, plus the weekly 25-minute full-vertical check-in. Every
2-4 weeks, you might be involved in a retrospective and/or showcase. That's
a total of about 1 hour of meetings per week.
Dan and I crave meetings[2], so we'll enjoy about 6.5 hours per week. The
"leads" among you will have meeting loads somewhere in between those
extremes.
Any concerns? Violent outbursts? Purrs of contentment?
[1]
https://www.mediawiki.org/wiki/Search_and_Discovery/Process#Recurring_Meeti…
[2] Not really, but we'll pretend we do
Kevin Smith
Agile Coach
Wikimedia Foundation
*Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment. Help us make it a reality.*
Hi!
Just in case anybody finds it interesting/useful, I've made a quick
performance analysis on the update service we're running on wdqs-beta,
in order to catch up with the wikidata updates:
https://www.mediawiki.org/wiki/Wikibase/Indexing/Updater_performance_analys…
It may be useful for evaluating which hardware we would need if we want
updates to coexist with production loads.
--
Stas Malyshev
smalyshev(a)wikimedia.org
Summary:
* CirrusSearch has "morelike:*PageName*", who knew?
* I sense a developer article brewing, "Finding related content"
AIUI, the Wikipedia mobile apps' "Read more" section just performs a
full-text search (API [1]) for the current page title (Android source [2]).
Joaquin's nifty demo http://chimeces.com/webkipedia/ 's "Related pages"
section calls the GettingStarted extension's gettingstartedgetpages API
module [3] with gsgptaskname=morelike . This is implemented by
GettingStarted/MoreLikePageSuggester.php... and it seems this just makes a
search query for srsearch=morelike:Australia . Who knew Cirrus search had
a "morelike:" keyword? It's not in the enwiki search help, but it is in
the Cirrus search help [4].
I'm not sure if there's any reason to interpose gettingstartedgetpages
instead of querying search directly for morelike:*pagetitle*; it might
cache stuff in Redis. The mobile apps might get better "Read more"
suggestions using one of these.
There's also srwhat=suggestion; I don't know whether that helps with
getting related pages.
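For the direct-query route, here's a minimal sketch of building the search API request (the helper name is mine; the parameters are the standard action API list=search parameters, with the morelike: keyword described above):

```python
from urllib.parse import urlencode

def morelike_search_url(title: str, limit: int = 5) -> str:
    """Build an action API full-text search URL using CirrusSearch's
    morelike: keyword, similar to what MoreLikePageSuggester does."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"morelike:{title}",
        "srlimit": limit,
        "format": "json",
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)

print(morelike_search_url("Australia"))
```

The response is the usual list=search result set, so existing search-client code should work unchanged.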
I'll be updating https://www.mediawiki.org/wiki/API:Search_and_discovery
with this, and it seems article-worthy.
Cheers, hope this helps someone.
[1] https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bsearch
[2]
https://github.com/wikimedia/apps-android-wikipedia/blob/de0b8b579f5030f684…
[3]
https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bgettingstart…
[4] https://www.mediawiki.org/wiki/Help:CirrusSearch#Special_prefixes
On Fri, May 29, 2015 at 2:18 AM, Joaquin Oltra Hernandez <
jhernandez(a)wikimedia.org> wrote:
> Sorry forgot to link it: https://github.com/joakin/webkipedia
>
> Matt Flaschen told me about the gettingstarted 'morelike' mode for other
> purposes, but it fitted perfectly my purposes for this reading app. They
> developed it on the Growth team about a year ago, but the experiment wasn't
> successful so the API has been there dormant and unused for a lot of time
> (works pretty well!).
>
> About the content I'm fetching for the articles, I'm using the extracts
> with the exintro option, and embedding the html (
> https://github.com/joakin/webkipedia/blob/master/lib/api/article.js). The
> idea would be to have a 'Read more' that would show the full article I
> guess.
>
> The president's chest is pretty good too :D
> http://chimeces.com/webkipedia/#/wiki/Barack_Obama
>
> On Fri, May 29, 2015 at 4:35 AM, S Page <spage(a)wikimedia.org> wrote:
>
>> (Cc'ing James Douglas, who's also developing API playground code.)
>>
>> On Thu, May 28, 2015 at 2:52 AM, Joaquin Oltra Hernandez <
>> jhernandez(a)wikimedia.org> wrote:
>>
>>> S, for getting started quickly, I set up a JS web app completely
>>> standalone with some basic infrastructure (libraries for calling the api,
>>> rendering pipeline of JS views, url routing) so that interested people
>>> could just get quickly to render a view within the app and do interesting
>>> stuff querying the API. We were also open to just doing a plain html file
>>> with some JS and CSS, or a codepen/jsbin style would have worked too.
>>>
>>> Here's the demo of the lite wikipedia webapp I worked on:
>>> http://chimeces.com/webkipedia/
>>>
>>
>> That's lovely! It's what API developers develop when they develop.
>> * Where's the source?
>> * I had no idea the gettingstartedgetpages would give you related pages,
>> so obscure!
>> * I guess RESTBase has no mode that strips the citations and such, or
>> gives you just the opening section (prop=extracts & exintro=)
>> * Nice closeup of the great man's chest :-),
>> http://chimeces.com/webkipedia/#/wiki/Albert_Einstein
>>
>> --
>> =S Page WMF Tech writer
>>
>
>
--
=S Page WMF Tech writer