Discovery June 2015

discovery@lists.wikimedia.org

20 participants
17 discussions

[Wikimedia-search] Trouble with vagrant role: analytics

by James Douglas

I'm having trouble enabling the analytics role on vagrant. Does this mean anything to anyone?

> ==> default: Error: Puppet::Parser::AST::Resource failed with error
> ArgumentError: Could not find declared class ::cdh::hadoop at
> /vagrant/puppet/modules/role/manifests/hadoop.pp:45
> on node mediawiki-vagrant.dev

I even tried vagrant destroying, and starting from scratch. It seems like maybe I need to apt-get install something Hadoop related, but my Google-fu isn't helping.

8 years, 9 months

[Wikimedia-search] HTTP request logging

by James Douglas

Let's say, hypothetically, that I wanted to measure information about HTTP requests coming into the Wikipedia Portal (www.wikipedia.org).

* Do we record this information?
* If so, is it accessible via analytical tools?
* If so, how do I get my mitts on it?
* If not, is it accessible from a database or similar?

Context: https://phabricator.wikimedia.org/T100673

8 years, 10 months

[Wikimedia-search] Upgrade Beta to Elasticsearch 1.6

by Nikolas Everett

In our neverending march towards progress I've created a phabricator task <https://phabricator.wikimedia.org/T103598> to upgrade beta to Elasticsearch 1.6.0. That requires a few things:

* Release our plugins to archiva
* Propose a patch to upgrade to those new versions
* Manually land the patch in beta and sync those versions of the plugins
* On every Elasticsearch node (deployment-elastic0[5678]) download the elasticsearch 1.6 package, install it, and restart elasticsearch.

It's not a ton of work, but in our effort to get non-Nik people used to doing elasticsearch maintenance I'd love for someone else to grab it. In our effort to upgrade to 1.6 soon, it'd be cool if someone could grab it in the next few days. We need at least a week of beta testing 1.6.0 before we upgrade production, just to be sure.

So anyone want to do it? I don't expect you need special permissions that are hard to get, because it's beta. We can grant you whatever permissions you lack in just a few minutes.

Nik

8 years, 10 months

[Wikimedia-search] Phabricator process has been tweaked

by Kevin Smith

We have refined our use of our "sprint" projects, explicitly orienting them around people, not projects.

The first step was that we renamed our Data-And-Research board to Analytics, reflecting that it is really all about Oliver's work. Then we moved data-related engineering work out of the Analytics-Sprint board, and into the Cirrus-Sprint board. The people doing this work are part of the Cirrus sub-team, so all of their work (Cirrus or not) now appears on a single board. At some point, we might rename the Cirrus-Sprint project to reflect this new nature, but we're leaving it for now.

Short version: Every individual contributor on the team should be able to focus on just one workboard. If you are an engineer, you should no longer have to look for work on the Data or Analysis workboards.

I have updated our process page[1] accordingly.

[1] https://www.mediawiki.org/wiki/Discovery/Process

Kevin Smith
Agile Coach
Wikimedia Foundation

*Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment. Help us make it a reality.*

8 years, 10 months

[Wikimedia-search] Fwd: [Announcements] Elasticsearch 1.6.0 released

by Nikolas Everett

Hurray!

Nik

---------- Forwarded message ----------
From: Clinton Gormley <noreply(a)discuss.elastic.co>
Date: Tue, Jun 9, 2015 at 12:36 PM
Subject: [Announcements] Elasticsearch 1.6.0 released
To: nik9000(a)gmail.com

Clinton_Gormley <http://discuss.elastic.co/users/clinton_gormley>
June 9

Today, we are pleased to announce the release of Elasticsearch 1.6.0, based on Lucene 4.10.4. This is the latest stable version of Elasticsearch and is packed with awesome new features:

- Faster restarts with synced flushing
- Shard allocation no longer blocks pending tasks
- JSON response body filtering
- Security fix for shared file-system repositories
- Upgrade API for ancient indices
- More lenient highlighting for Kibana users
- mlockall for Windows users
- Fine-grained script settings

You can read more in the blog post: https://www.elastic.co/blog/elasticsearch-1-6-0-released

------------------------------

To respond, reply to this email or visit http://discuss.elastic.co/t/elasticsearch-1-6-0-released/2235/1 in your browser.

To unsubscribe from these emails, visit your user preferences <http://discuss.elastic.co/my/preferences>.

8 years, 10 months

[Wikimedia-search] Schema:Search

by Adam Baso

Hey all, looks like the new rev'd Schema:Search logging for desktop is in effect. Here are a couple of queries suggesting that, at least on English Wikipedia for stable versions of Chrome, most of the time people either click on an OpenSearch result from the set of dropdown results or press ENTER / click on the magnifying glass icon.

> SELECT count(DISTINCT event_userSessionToken) FROM Search_12057910 WHERE timestamp > '20150614' AND timestamp < '20150615' AND (event_action = 'click-result' OR event_action = 'submit-form') AND wiki = 'enwiki' and userAgent LIKE '%Chrome/43%';
+----------------------------------------+
| count(DISTINCT event_userSessionToken) |
+----------------------------------------+
|                                   1049 |
+----------------------------------------+
1 row in set (0.20 sec)

> SELECT count(DISTINCT event_userSessionToken) FROM Search_12057910 WHERE timestamp > '20150614' AND timestamp < '20150615' AND wiki = 'enwiki' and userAgent LIKE '%Chrome/43%';
+----------------------------------------+
| count(DISTINCT event_userSessionToken) |
+----------------------------------------+
|                                   1100 |
+----------------------------------------+
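To put a number on "most of the time", the two counts reported above can be turned into a share of sessions. The following is only a minimal sketch in Python: the helper name `engaged_share` is hypothetical, and the inputs are simply the two query results quoted in the message (one day, enwiki, Chrome 43 user agents).

```python
def engaged_share(engaged_sessions: int, total_sessions: int) -> float:
    """Return the fraction of sessions that had at least one engagement event."""
    if total_sessions <= 0:
        raise ValueError("total_sessions must be positive")
    return engaged_sessions / total_sessions


# Counts copied from the two query results quoted in the message.
ENGAGED = 1049  # sessions with a 'click-result' or 'submit-form' event
TOTAL = 1100    # all sessions matching the same wiki, date and user agent filter

share = engaged_share(ENGAGED, TOTAL)
print(f"{share:.1%} of sessions clicked a result or submitted the form")
```

On these two counts the share comes out at roughly 95%. It covers a single day and one browser version only, so it is a sanity check rather than a general estimate.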

8 years, 10 months

[Wikimedia-search] Concerns about production-like projects in labs

by Kevin Smith

In a recent meeting, Oliver expressed concerns about us having services running in labs which are treated sort of like they were in production. Examples include WDQS (already) and maps (potentially). We agreed to have this discussion on the mailing list, so this is an invitation to do so. I am not familiar enough with the various technical issues to explain them properly, so hopefully someone else will step in and do so. I believe one big area of concern is analytics. Kevin Smith Agile Coach Wikimedia Foundation *Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment. Help us make it a reality.*

8 years, 10 months

[Wikimedia-search] Maps tasks and phabricator

by Kevin Smith

Hello all,

We will now be tracking our maps-related work in the Discovery-Maps-Sprint project[1], rather than the Maps project[2].

Maps will continue to exist, as a feature-centric list of tasks. However, it will not reflect the status of tasks (e.g. "In Progress" or "Needs Review"). That will be handled by Discovery-Maps-Sprint, along with the Maps column of the Discovery project[3].

In Maps, I have moved all the tasks into a single column, and have hidden the other columns, to avoid them accidentally being used. I have copied all the Maps tasks over to the Sprint board, into the appropriate columns. Note that the columns in Sprint are a bit different than they were in Maps, to align with other Sprint boards within the Discovery department. For details on how tasks are handled, see the Discovery department process page[4].

I will be requesting phab herald rules to ensure that Maps tasks automatically get sync'd between these three related projects.

Please let me know if you have any questions or concerns.

[1] https://phabricator.wikimedia.org/tag/discovery-maps-sprint/
[2] https://phabricator.wikimedia.org/tag/maps/
[3] https://phabricator.wikimedia.org/tag/discovery/
[4] https://www.mediawiki.org/wiki/Search_and_Discovery/Process#Workflow_and_Ph…

Kevin Smith
Agile Coach
Wikimedia Foundation

*Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment. Help us make it a reality.*

8 years, 10 months

[Wikimedia-search] Elasticsearch 1.6 upgrade

by Nikolas Everett

I think Elasticsearch 1.6 is coming out soon. I can feel it in my bones. <https://github.com/elastic/elasticsearch/issues?q=is%3Aopen+is%3Aissue+labe…> I'm excited for it because it'll give a way to do Elasticsearch restarts more quickly (1 hour instead of 36). I filed https://phabricator.wikimedia.org/T101530 to enumerate the tasks for upgrading to it. I'm so excited by it that I sent this email. Nik

8 years, 10 months

[Wikimedia-search] Getting the word out

by Federico Leva (Nemo)

It was nice to chat with many of you in Lyon! Wes mentioned that this mailing list will be used intensely, but I don't remember how I got to know about it. Starting from simple things, it will be good if

* it's added to https://meta.wikimedia.org/wiki/Mailing_lists/Overview with a description of the usage you expect to make of it;
* short notifications about it are sent to all relevant mailing lists (which you can find in the wiki list above once you know who you want to reach).

Nemo

8 years, 10 months
