Hello team!
I've been with you for 2 weeks. As Tomasz suggested, it might be the
right moment to list what I have discovered so far. I'm trying to
point out the differences I see from what I'm used to, not to judge
(yet); I don't have enough context for that... Still, I am biased by
my previous experience and that will show below...
I need to unlearn a few things. It seems that last week, each and
every decision I took was challenged by someone (usually for good
reasons). Things that I take for granted as "the right way (tm)" are
actually done in a fairly different way here. Examples:
== Distributed team ==
** my belief **
It is hard to be part of a distributed team.
** what happens at WMF **
You've all been extremely welcoming! Having the chance to see you in
person in SF in January definitely helps (it is a bit harder for me to
find my way around the Ops team, probably in part because I have not
had the chance to meet them in person). The fact that we also have
informal conversations (IRC, unmeetings, ...) helps me feel that I
belong. It seems to me that you've made all the necessary efforts to
include me.
** comments **
I still have to make an effort to get in touch when I need to. I'm so
used to coffee breaks, where you take time to discuss whatever needs
discussing. IRC feels more intrusive, as it is a continuous flow
rather than a set time where you go have coffee and do the needful...
I also need to learn to ignore IRC interruptions. It still feels like
I might miss something important, and that there is just too much
information sometimes...
== Versioning ==
** my belief **
anything deployed must have a version number
** what happens at WMF **
* deployments on labs are pretty much free-form: cherry-pick whatever
you want on the puppetmaster
* deployments on prod seem to have version numbers, at least for
MediaWiki code; puppet code is deployed directly from the production
branch
** comments **
Having clear version numbers implies a conscious decision to create a
version, potentially with appropriate checks of that version's
content and additional testing. It allows a clear separation between
creating a version and promoting it to production.
Not having versions everywhere allows for more flexibility and puts
the responsibility for making the right choices on the people rather
than on the process. That's probably a good thing if you have smart
enough people (and WMF seems to have a pretty smart crowd).
Having a shared git repository on deployment-puppetmaster scares the
hell out of me! I'm so used to preparing everything I want to push
locally and then just applying a specific tag / version...
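For what it's worth, the tag-then-promote workflow I'm used to can be
sketched in a throwaway repo like this (all names here, v1.0.0 and
config.pp included, are invented for the example; this is not WMF
tooling):

```shell
# Toy repo illustrating "cutting a version is an explicit decision,
# and deploying means applying exactly that version".
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.org
git config user.name demo

echo "stable content" > config.pp
git add config.pp
git commit -qm "reviewed change"

# Creating the version: a deliberate step, with its own checks/tests.
git tag -a v1.0.0 -m "release 1.0.0: reviewed and tested"

# The branch keeps moving...
echo "work in progress" > config.pp
git commit -qam "wip"

# ...but promotion to production checks out the tag, not the branch
# tip, so later work on the branch cannot leak into the deployment.
git checkout -q v1.0.0
cat config.pp
```

The point is the separation: whatever lands on the branch after the
tag is created has no effect on what `git checkout v1.0.0` deploys.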
== Cherry-picking ==
** my belief **
cherry-picking is bad and should be used only as a last resort solution
** what happens at WMF **
* cherry-picking is the norm
* this seems to be influenced by Gerrit, which promotes cherry-picking
and single-commit patches
** comments **
I'm all for rewriting git history to make it more readable, to help
tell the story of what is happening to the code. I think that branches
and merges are a good tool for that. Cherry-picking fixes from one
branch to the next leaves a lot of opportunities to forget one.
Merging helps tell the story of "those are all the fixes done on
branch X; I've applied them on branch X+1". Also, I'm not a huge fan
of Gerrit's idea of changes being single commits. Having a coherent
change split into multiple phases makes sense to me (for example: 1)
preliminary refactoring, 2) my actual work, 3) some cleanup I did
along the way). I need to dig deeper into topic branches and how they
integrate with Gerrit (yes, I am brand new to Gerrit).
It also seems that all this cherry-picking creates much more
flexibility (I can take any commit and apply it anywhere). Again,
giving control back to the human and not to the tool.
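To make the merge-vs-cherry-pick point concrete, here is a
throwaway-repo sketch (branch and file names are invented for the
example) of how cherry-picking copies exactly one commit, while a
merge carries every fix from the source branch:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.org
git config user.name demo
dev=$(git symbolic-ref --short HEAD)  # "master" or "main", depending on git version

touch base.txt
git add base.txt
git commit -qm "base"
git branch release                    # pretend this is branch X+1

echo one > fix1.txt && git add fix1.txt && git commit -qm "fix 1"
echo two > fix2.txt && git add fix2.txt && git commit -qm "fix 2"

# Cherry-picking the branch tip copies only "fix 2";
# "fix 1" is silently left behind on the dev branch.
git checkout -q release
git cherry-pick "$dev"
ls                                    # base.txt fix2.txt

# Merging instead tells the story "all fixes from dev are now here":
# fix1.txt arrives too, nothing is forgotten.
git merge -q "$dev" -m "apply all fixes from dev branch"
ls                                    # base.txt fix1.txt fix2.txt
```

This is what I mean by merges being harder to get wrong: the merge
has no "which commits did I remember to pick?" failure mode.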
== Stupid code is good code ==
I need to write a blog post about this one, but as Kernighan said,
"Everyone knows that debugging is twice as hard as writing a program
in the first place. So if you're as clever as you can be when you
write it, how will you ever debug it?" [1]. Looking at our code
(mainly puppet at the moment), I think there are quite a few places
where we do the smart thing when stupid would be sufficient. That's
probably the cost of hiring smart people...
== A few random points ==
* We have an incredible amount of documentation. It is easy to read
(I've been drawn into it and lost much time). It is also outdated in
some places (documentation always is).
* so many different ways to deploy (puppet, trebuchet, salt, manual stuff, ...)
* I still have not found a global architecture schema (something like
a high-level component or deployment diagram). But then, I have never
seen any company that has one...
[1]: https://en.wikiquote.org/wiki/Brian_Kernighan
Luca: I love what you have done with this place! Yes, I need to do my
part of that documentation work and I have not done it yet... I'll see
what I can do...
On Fri, Feb 19, 2016 at 9:50 AM, Luca Toscano <ltoscano(a)wikimedia.org> wrote:
> Hello!
>
> On Thu, Feb 18, 2016 at 10:37 AM, Giuseppe Lavagetto
> <glavagetto(a)wikimedia.org> wrote:
>>
>> [X-posting to ops as this discussion is relevant there too]
>>
>> On Wed, Feb 17, 2016 at 5:53 PM, Erik Bernhardson
>> <ebernhardson(a)wikimedia.org> wrote:
>> > On Feb 17, 2016 1:50 AM, "Guillaume Lederrey" <glederrey(a)wikimedia.org>
>> > wrote:
>>
>> >> * I still have not found a global architecture schema (something like
>> >> a high level component or deplyoment diagram). But I have never seen
>> >> any company having those...
>> >
>> > Pretty sure one doesn't exist :(
>>
>> Luca (the new analytics opsen) has started to work on
>> https://wikitech.wikimedia.org/wiki/File:Infrastructure_overview.png
>>
>> I asked him to share the sources for it so that everyone can improve it.
>>
>> Also, if you need some oral history, just ask opsens and we'll be
>> happy to give you an overview of how things work :)
>
>
>
> I will try to update the schema with more up to date information and I'll
> also share the source for draw.io with it (probably next week). There is a
> lot of useful docs related to architecture, each one focusing on different
> aspects (and points in time!):
>
> - https://wikitech.wikimedia.org/wiki/Clusters, General overview
> - https://wikitech.wikimedia.org/wiki/LVS_and_Varnish (welcome to
> Wonderland)
> - https://wikitech.wikimedia.org/wiki/Network_design (welcome to Wonderland
> part two)
> -
> https://wikitech.wikimedia.org/wiki/LVS_and_Varnish#/media/File:Wikipedia_w…
> - puppet site.pp!
> - ...
>
> In my opinion new opsens should fill the gaps and/or update the outdated
> pages, it is a good way to meet new people :)
>
> Luca
A short question about how we organize our work...
I'm mainly using
https://phabricator.wikimedia.org/tag/discovery-search-sprint/ and
https://phabricator.wikimedia.org/project/board/1227/ to track what
I'm doing. I have a few things I do not know where to put:
T109101 is actually done, but is waiting for the elasticsearch
upgrade so both can be pushed to prod at the same time (to do only
one restart). This is still WIP, so I don't want to close it, but it
is not really in need of review either.
Any idea?
MrG
I signed up to receive updates regarding the process of launching the
new Search on Wikimedia, but find that the listserv content is not
what I assumed it would be. So please unsubscribe me. I assumed I had
made that change via a prior alteration of my subscription at the
site, but today and yesterday I am still receiving further mailings.
Thank you.
Gergo, good to know, thanks. The Graph extension itself does not know
how long the data is valid; it simply gets a URL from which to fetch
the pageviews (or any other) data. At this point, only the person who
writes the graph template knows how long it's valid for.
We could add an extra attribute to the graph, e.g. <graph refresh="60">
(number of minutes), to let the Graph extension update the cache
expiry.
On Thu, Feb 18, 2016 at 11:04 PM, Gergo Tisza <gtisza(a)wikimedia.org> wrote:
> On Thu, Feb 18, 2016 at 9:02 AM, Yuri Astrakhan <yastrakhan(a)wikimedia.org>
> wrote:
>
> > It will be updated whenever the page containing the template is
> > re-generated (e.g. the page is changed, or someone does a null-save). I
> > heard that every page is forcefully regenerated if its older than 30
> days,
> >
>
> Yes, and extension tags embedded in the page can reduce that, so if the
> graph has a way of knowing how long the data will be valid, it can tell
> that to the parser via ParserOutput::updateCacheExpiry.
> As a hacky manual workaround, you can put <div
> style="display:none">{{CURRENTHOUR}}</div> into the page to force hourly
> refresh.
Hello all,
The Discovery team recently updated the Wikipedia.org
<http://wikipedia.org/> portal page by moving all inline JavaScript
into a separate file, in order to measure the amount of incoming
traffic that uses JavaScript-friendly browsers. This information is
very important to our team as we endeavor to make the portal page
more interesting and user-friendly for all our visitors.
The results were very encouraging. Here's the executive summary from
the analysis document
<https://commons.wikimedia.org/wiki/File:Analysis_of_Wikipedia_Portal_Traffi…>,
which can also be accessed from the Wikimedia Discovery page
<https://www.mediawiki.org/wiki/Wikimedia_Discovery#Wikipedia.org_Portal_Page>:
*On 5 February 2016 we deployed a patch to the Wikipedia Portal moving the
inline JavaScript into a separate file, which enabled us to finally measure
the proportion of traffic with JS support separately from the overall
traffic to the Portal. This report covers logs of HTTP requests from 5 Feb
to 10 Feb, 2016.*
*Overall, 93% of the requests made to the Wikipedia Portal have JS support.
However, a large component (45%) of this overall percentage is accounted
for by traffic from the United States, which has an overall proportion of
96%. The remaining 55% of the traffic, from 234 other countries, shows a
lot of variation in JS support, with 86.5% on average.*
*We also performed an analysis of browser usage and learned that approx.
75% of the traffic comes from users with relatively modern browsers, with a
few exceptions such as Internet Explorer 8 (3.2% of total traffic). Of
those 17 browsers, 14 had populations with more than 93% JS support. That
is, less than 7% of those browsers’ users had turned off JavaScript for
privacy/bandwidth/other reasons. Interestingly, only 80% of Opera Mini 7
traffic and 60% of Android 4 / Chrome Mobile 30 traffic had JS support.*
Please let us know if there are any concerns or questions!
Cheers,
Deb
--
Deb Tankersley
Product Manager, Discovery
Wikimedia Foundation
Heyo!
In an effort to be more transparent about what I, at least, am working
on[0], I've resolved to send the sort of notes I'd usually send to the
standup (on what I am working on, have worked on, and will be working
on) to the public list as well. That way they can be referred back to,
and the community can ask questions.
So!
I've been out for a week and a bit, so the "worked on" is a bit
sparse. But this morning I:
* Completed code review for the data collection code to get the zero
results rate by project (https://phabricator.wikimedia.org/T126244),
dashboarding for the same (https://phabricator.wikimedia.org/T110590),
and displaying browser usage on the portal dashboards
(https://phabricator.wikimedia.org/T124827). These will all hopefully
be deployed today (Mikhail will send out a distinct email when that
happens)
* Poked Legal again as part of our ongoing efforts to make public our
guidelines around data collection and storage
(https://phabricator.wikimedia.org/T123673)
I will be:
* Working on collecting data on the number of pageviews we get on the
portal, and visualising the same
(https://phabricator.wikimedia.org/T125737)
* Getting back into the loop and finding out what's happening with our
new A/B tests.
Thanks!
-O
[0] Phabricator exists but is fragmented
--
Oliver Keyes
Count Logula
Wikimedia Foundation