Discovery March 2016

discovery@lists.wikimedia.org

27 participants
58 discussions

Data access guidelines released
by Oliver Keyes 08 Mar '16


Hey y'all,

As my final hurrah, I've released the data access guidelines used by the Discovery team in research and analysis onto Meta. They can be found at https://meta.wikimedia.org/wiki/Discovery/Data_access_guidelines

The intent here is to ensure that we are transparent about what information we have, what we do with it, and what expectations we have on how employees will guarantee the safety and security of the information and the people behind it.

To my knowledge the Discovery team is the first team in Engineering to have released this kind of information so prominently. I am pretty happy we could lead the way, and look forward to other groups with research interests hopefully doing the same!

Best,

--
Oliver Keyes
Count Logula
Wikimedia Foundation


Proposed changes to Discovery workboard structure
by Dan Garry 08 Mar '16


Hello all!

I’m writing to solicit feedback on a plan to slightly rework the Discovery workboards on Phabricator.

Back when we were assembling the team and putting the process together, I expressed a strong desire to have a centralised Discovery workboard which would contain every task related to Discovery’s work. In the end, I’d say that hasn’t really worked out, because there are simply too many tasks in the workboard and too much in flight. It’s been hard to sensibly break things down into different categories of task on a per-project basis.

To help alleviate this problem, here’s my proposal:

- The Discovery workboard is disabled.
- The Discovery tag will continue to exist as a parent-tag for Discovery’s work, but there will be no associated workboard.
- Individual backlog workboards can be created for individual projects at the request of that team, for example:
  - Discovery-Search-Backlog
  - Discovery-Analysis-Backlog
- Sprint boards will remain exactly as they are presently.

How will this affect you? Firstly, this won’t affect you at all unless you spend a significant amount of time in the Discovery backlog, which should mostly be me! Otherwise, nothing will really change for you. The workflow of engineers working on Maps and Wikidata Query Service, for example, will likely be completely unaffected by this change.

As I’m the primary consumer of the Discovery board, and it’s not working very well for me at the minute, I’d like to move forwards with this proposal unless there are any strong objections. Kevin and I will handle all the logistics.

Thoughts?

Thanks,
Dan

--
Dan Garry
Lead Product Manager, Discovery
Wikimedia Foundation


Re: [discovery] [WikimediaMobile] Results from similar articles A/B test
by Justin Ormont 07 Mar '16


[resending due to mailing list byte limits]

A/B testing: Slightly off-topic - You may try separating latency effects in an A/B test by running a counterfactual test. Run your feature, but don't display any changes: an A/B/CF test. The experiment scorecard would then compare your feature and control to the counterfactual test, highlighting indirect changes like added latency.

--justin
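The counterfactual arm described above can be sketched as a three-way comparison. This is only an illustration, assuming each arm yields a list of per-user metric values; the function name `abcf_scorecard` and the choice of a simple difference of means are assumptions for the example, not something from the thread.

```python
from statistics import mean


def abcf_scorecard(control, counterfactual, treatment):
    """Split the total effect of a feature into indirect and direct parts.

    control        -- metric values for users who never ran the feature
    counterfactual -- metric values for users who ran the feature's code
                      but were shown no visible change
    treatment      -- metric values for users who ran and saw the feature
    """
    c = mean(control)
    cf = mean(counterfactual)
    t = mean(treatment)
    return {
        # Cost of merely running the feature (for example, added latency)
        "indirect_effect": cf - c,
        # Effect of actually displaying the feature
        "direct_effect": t - cf,
        # What a plain A/B test would report
        "total_effect": t - c,
    }


if __name__ == "__main__":
    # Made-up numbers, purely to show the shape of the output
    print(abcf_scorecard([1.0, 1.0], [0.5, 0.5], [2.0, 2.0]))
```

A real scorecard would also need confidence intervals or significance tests for each difference; the sketch only shows how the counterfactual arm separates the two kinds of effect.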


Help Rename Relevanc* Lab
by Trey Jones 07 Mar '16


Greetings Discovery Friends, Foes, and Other Enthusiasts! For various reasons, we're looking to rename the Relevanc(e|y) Lab so that it doesn't have "lab" in its name. If you have any clever or amusing ideas, jot them down over in Phabricator: https://phabricator.wikimedia.org/T128765 If you aren't Phab-enabled, you can read the discussion so far to get calibrated, and reply here and I'll post them back to Phab. Thanks, —Trey


Fwd: [Wikitech-l] Using Wikipedia/Wikidata in a nonprofit search engine
by Pine W 06 Mar '16


Forwarding to the Discovery mailing list, although it sounds like the OP hopes to have any possible discussion on Wikitech-l.

I wonder if there would be ways for WMF Discovery to leverage the work that's being done already on Commoncrawl and Commonsearch for use in Wikimedia internal search.

Pine

---------- Forwarded message ----------
From: Sylvain Zimmer <sylvain(a)sylvainzimmer.com>
Date: Sun, Mar 6, 2016 at 11:46 AM
Subject: [Wikitech-l] Using Wikipedia/Wikidata in a nonprofit search engine
To: wikitech-l(a)lists.wikimedia.org

Hi,

Some of you may be familiar with http://commoncrawl.org ; they are doing an excellent job of making large crawls of the web accessible to everyone.

I've been working on an open search engine based on these crawls for a while, and I would love to have your feedback on the project: https://about.commonsearch.org/

Specifically, I would be curious to know what you would consider to be the best possible integration of Wikipedia & Wikidata in a general search engine?

As a first step, we have just started using the "official website" property from Wikidata and we are considering importing the Wikipedia abstracts next (https://github.com/commonsearch/cosr-back/issues/11).

I'm looking forward to your feedback... or contributions! :-)

Thanks in advance,

PS: A few wikimedians recommended that I post on wikitech-l to keep the focus on the technical aspects of the project and hopefully avoid linking this project in any way to the KE stuff, which it actually predates by far (https://news.ycombinator.com/item?id=6209088).

--
Sylvain Zimmer
http://sylvinus.org

_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
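For readers unfamiliar with the "official website" property mentioned in the forwarded message: on Wikidata it has the ID P856. The following is a rough sketch of pulling its values out of an entity that has already been loaded as a Python dict in Wikidata's standard entity JSON layout. The function name and the sample entity are made up for illustration, and this is not a description of how Common Search actually does it.

```python
def official_websites(entity):
    """Return the 'official website' (P856) URLs from a Wikidata entity dict."""
    urls = []
    for claim in entity.get("claims", {}).get("P856", []):
        snak = claim.get("mainsnak", {})
        # Snaks of type "novalue" or "somevalue" carry no datavalue, so skip them
        if snak.get("snaktype") == "value":
            urls.append(snak["datavalue"]["value"])
    return urls


if __name__ == "__main__":
    # Hand-written sample in the shape of Wikidata entity JSON (not real data)
    sample = {
        "id": "Q0",
        "claims": {
            "P856": [
                {
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P856",
                        "datavalue": {"value": "https://example.org/", "type": "string"},
                    }
                },
                {"mainsnak": {"snaktype": "novalue", "property": "P856"}},
            ]
        },
    }
    print(official_websites(sample))
```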


Re: [discovery] [Ops] First two weeks as part of the team
by Chad Horohoe 04 Mar '16


You may also find these diagrams useful:

https://commons.wikimedia.org/wiki/File:Wikipedia_webrequest_flow_2015-10.p…
https://wikitech.wikimedia.org/wiki/File:Infrastructure_overview.png

-Chad

On Feb 24, 2016 6:58 PM, "Mukunda Modell" <mmodell(a)wikimedia.org> wrote:
>
> >> On Feb 17, 2016 1:50 AM, "Guillaume Lederrey" <glederrey(a)wikimedia.org>
> >> wrote:
> >>> * I still have not found a global architecture schema (something like
> >>> a high level component or deployment diagram). But I have never seen
> >>> any company having those...
>
> I made a diagram of the scap (mediawiki) deployment architecture a while
> back: https://commons.wikimedia.org/wiki/File:Scap-diagram.png ..
>
> That does not exactly apply to the new scap3 architecture but it's not too
> far off.
>
> ....
>
> On Thu, Feb 18, 2016 at 10:37 AM, Giuseppe Lavagetto
> <glavagetto(a)wikimedia.org> wrote:
> >
> > About cherry-picks in beta: the problem is not cherry-picking (I think
> > it's a reasonable way to test things) but persistent cherry-picking to
> > monkey patch problems is. I think if we follow the flow of:
> >
> > - writing a patch
> > - testing it on beta with a cherry-pick
> > - get it merged on ops/puppet and production
>
> There are a lot of patches on beta these days and there have been a lot of
> different people cherry-picking without much coordination. This has led
> to breakage quite often. Patches also get lost regularly. I assume this
> usually happens because someone has rebased the HEAD and accidentally
> dropped a patch.
>
> It can be really difficult to get a patch merged in ops/puppet within a
> week (or even a month). I've seen a lot of patches sit around for weeks and
> even now with the Puppet SWAT windows, it's still sometimes unrealistic to
> expect patches to get merged into production that quickly.
> (+CC Tyler)
>
> Without a system to manage things, and with very little coordination
> between everyone who is working on beta, I don't expect the situation to
> improve too much.
>
> I intend to propose a solution for beta & puppet patch cherry-picks very
> soon, however, I haven't fully formulated my proposal yet. I will write to
> the ops list when I have something written in a clear and presentable way.
>
> _______________________________________________
> Ops mailing list
> Ops(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/ops


Wikivoyage mapframe
by Yuri Astrakhan 03 Mar '16


Yesterday, I posted a note on all Wikivoyages' traveler pubs, except EN and RU, describing our maps efforts and asking to switch their mapframe template to the Wikimedia maps service. Two languages are having a discussion, some have switched, some have not replied.

*== Discussion ==*
de Deutsch https://de.wikivoyage.org/wiki/Wikivoyage:Lounge#Maps_for_Wikivoyage
nl Nederlands https://nl.wikivoyage.org/wiki/Wikivoyage:Reizigerscaf%C3%A9#Maps_for_Wikiv…

*== Waiting ==*
fa فارسی
he עברית
it Italiano
pl Polski
pt Português
sv Svenska
uk Українська
vi Tiếng Việt

*== Done ==*
el Ελληνικά
en English
es Español
fr Français
ro Română (no embedded maps, fixed myself)
ru Русский
zh 中文


DevopsDay Amsterdam - June 29 - July 1
by Guillaume Lederrey 03 Mar '16


Hello teams!

I'm thinking of going to DevopsDays Amsterdam (June 29th, 30th and July 1st 2016) [1]. It might be a good occasion to meet some of you in real life if you are so inclined.

For those who do not know what DevopsDays are, it is a fun conference format, with a fairly small audience (200-400 people) centered around Devops themes (no surprise). The really fun part of this conference is the Open Spaces [2]. It is a good occasion to have great conversations with smart people, exchange ideas, spread our knowledge, ...

Let me know...

MrG

[1] http://www.devopsdays.org/events/2016-amsterdam/
[2] http://www.devopsdays.org/pages/open-space-format/


Re: [discovery] [Ops] Orchestration of deployments
by Guillaume Lederrey 03 Mar '16


I have been wanting to get a deeper look at Salt for a long long time... Seems that the time has come!

On Tue, Mar 1, 2016 at 10:48 PM, Andrew Otto <otto(a)wikimedia.org> wrote:
> Could also put it in puppet in elasticsearch module or role.
>
> We often use salt for things like this. It works (sometimes?)!
>
> On Tue, Mar 1, 2016 at 2:47 PM, Daniel Zahn <dzahn(a)wikimedia.org> wrote:
>>
>> On Tue, Mar 1, 2016 at 10:02 AM, Guillaume Lederrey
>> <glederrey(a)wikimedia.org> wrote:
>>>
>>> Do we have a central place for those kind of scripts? I'd like to
>>> version it in a more obvious place than my personal Github repo.
>>
>> +1 for not using personal Github. If in doubt you can always use this:
>>
>> operations/software Random software tools for ops tasks (svn2git,
>> udpprofile, etc)
>>
>> and then there are a lot of repos under operations/software/foo
>>
>> --
>> Daniel Zahn <dzahn(a)wikimedia.org>
>> Operations Engineer
>>
>> _______________________________________________
>> Ops mailing list
>> Ops(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/ops


Tech talk on the topic of Search at Airbnb
by Moiz Syed 02 Mar '16


Hey all,

Just found out that there is a tech talk happening at Airbnb next Wednesday on the topic of Search. We can all go check it out. We can also say hi to Juliusz while we're there.

https://www.airbnb.com/meetups/b8zxexa2q-search-airbnb

