Re: [Wikitech-l] Git, Gerrit and the coming migration

10 Mar 2012

I am only an occasional developer. I do it in my spare time, and only
when I am particularly annoyed about something not working in MediaWiki
(mostly as feedback from the plwiki community or, recently, checkusers).
You can see my pitiful track record here:
https://www.mediawiki.org/wiki/Special:Code/MediaWiki/author/saper
Most of those things are urgent one-liners that had to be pushed
to deployment very quickly, and I never had any problem getting
them through the process. But I'm a small guy. So please do not
treat my remarks below as something that would make life more difficult
for people pushing hundreds of lines per week to MediaWiki.
I just don't like (as has happened to me with git recently) having to
learn the tool from scratch once again just because I haven't used
it for the last 4 weeks or so.
...
...
Antoine Musso hashar+wmf@free.fr wrote:
Diederik van Liere wrote:
...
We are making three fundamental changes in one-shot:
They are not that much of a change. It is as if you changed from using a
paper map and an old car to a nice SUV with a GPS. It is still a lot of
metal on 4 wheels with one purpose: move some fresh meat from A to B.
The model is the same. Only the tool changes.
(you can quote me on this when we finally take the decision to migrate
to JavaScript or Python)
...

Migrating from a centralized source control system to a decentralized system (SVN -> Git)

Decentralization itself is just a buzz word for the twitter guys. In the
end, it does not change that much since most people have a reference
repository.  I guess most developers will use the WMF repository as a
reference, or at the very least, all patches will eventually end up in
the WMF repository.
We could imagine having the WMF feature team use their own repository,
then submit a nice giant patch once in a while.
I am using mercurial and git, and starting to learn fossil. There is one
change which is partly related to the tools and partly to the distributed
nature of git.
The fundamental change is something else to me: you lose the feeling
of linearity. I like hg because it still tries to give me cosy, nice
local version numbers (great when switching from SVN; you can even have
your old SVN commit numbers stay after migration). But when
I look at the gerrit interface (not gerrit's fault) I have
no idea what was done before, what was done after, what the history is.
Here is what happened to me on my very first day trying gerrit:
I got an email asking me to merge a change (since somebody had pushed
something conflicting in between), so I duly issued some magic git commands
and got it pushed. However, when I came back to gerrit
I ended up with an "empty" commit:
https://gerrit.wikimedia.org/r/#change,2916
I thought I did merge, though! What happened? It looks like an empty commit
got filed...
I tried poking around in gerrit to find out what happened, but I
had no clue; only running "git log" locally revealed that
my change had indeed been merged by somebody else in the background.
I had a feeling I had 4 or more revisions ("commits") flying around
and I could not relate them to each other. Only "git log"
locally helped me to get out of trouble.
Looking at this screen:
https://gerrit.wikimedia.org/r/#dashboard,103
Those two commits are related, but it's totally non-obvious that
something follows up on something else. I have clicked on them
and yes, I can find that 87f491132487313144e531354578ea2fbd3b42b4
is common to both of them. Oh, cool!
In comparison to this, the current Special:Code follow up revision
system is easy, readable and very useful.
Oh, and by the way, those I3577f029 and Ifb002160 are some
identifiers totally unrelated to commits (I need to learn more about
Change-Id vs. a commit... I promise I will; I already got burned
by a missing pre-commit hook in the repo). And there is only
a date, not a timestamp, to get some sense of linearity again.
I am really afraid I will be lost when my dashboard has
many more patches and merges.
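
A note on those I... identifiers: as far as I understand Gerrit, a Change-Id is the letter "I" followed by 40 hex digits, stored as a "Change-Id:" footer line in the commit message (normally added by a commit-msg hook). It stays the same across patch sets, while the commit hash changes each time. A minimal Python sketch of pulling it out of a commit message; the function name and the sample message are made up for illustration, only the Change-Id value itself comes from this mail:

```python
import re

# "Change-Id:" footer: the letter I followed by 40 lowercase hex digits.
CHANGE_ID_RE = re.compile(r"^Change-Id:\s*(I[0-9a-f]{40})\s*$", re.MULTILINE)


def extract_change_id(commit_message):
    """Return the last Change-Id footer found in a commit message, or None."""
    matches = CHANGE_ID_RE.findall(commit_message)
    return matches[-1] if matches else None


# Hypothetical commit message, used only to exercise the function.
message = (
    "Fix something small\n"
    "\n"
    "Change-Id: Ifb002160485030496c7d3f2abc4991484b533648\n"
)
print(extract_change_id(message))
```

Feeding it the output of "git log -1 --format=%B" should give the identifier that gerrit shows for that commit, assuming the hook was installed when the commit was made.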
Looking at this, or at any more complex git development tree, makes me
cry for linear revision numbers. At least I can find out what
came before and what came after. Sure, it comes at the cost of potentially
more difficult merging and branching, but let's be serious:
how many "edit conflicts" do we have in the tree?
I don't think that our development timeline is more complex than this:
http://fossil-scm.org/index.html/timeline?n=200
and this is so much more readable (and yes, I know gitk).
I read in this thread that there could be a tree-like priority system
to sort out more impactful changes from more specific ones.
Building such a tree can be very challenging and as far as I understand
we don't have a tool yet. We end up with a bunch of loose commits,
somehow connected to each other, not linearized.
And from experience, trivial and small patches get through
to deployment very fast. It's larger things that have to wait
longer...
I presume this is less of a problem with the current use in
operations, where changes are of a different nature.
...
...

Introducing a new dedicated code-review tool (Gerrit)

That one is a habit change. It is a bit disturbing for the first week,
just like any new web interface. We will eventually get used to it.  I
am sure people will easily adapt to the GUI and we will be there to assist.
As described previously in this thread, I would describe myself as
a simple git fetch/push/commit guy. No rebase, no cherry-pick yet.
Maybe the world is still two-dimensional to me.
To solve problems with the above (a simple fetch / push model,
except for WAY TOO FANCY things like HEAD:refs/for/<some-branch>),
I am told to refer to this:
https://www.mediawiki.org/wiki/Gerrit/resolve_conflict
That is six commands, including "git-review", which I still don't have running.
I am afraid to ask what the non-git-review version of the
commands in the guide is :)
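
For the submit step alone, the non-git-review form appears to be a plain push to the refs/for/ namespace. A minimal Python sketch that only builds that command (it does not run git, and it does not cover the other steps of the guide); the function name and the default remote name "origin" are assumptions made for illustration:

```python
def gerrit_push_command(branch, remote="origin"):
    """Build the plain git command that submits HEAD for review on `branch`."""
    if not branch or branch.startswith("/") or " " in branch:
        raise ValueError("unexpected branch name: %r" % (branch,))
    # Pushing to refs/for/<branch> asks Gerrit to create or update a change
    # targeting <branch>, instead of updating the branch itself.
    return ["git", "push", remote, "HEAD:refs/for/" + branch]


print(" ".join(gerrit_push_command("master")))
# prints: git push origin HEAD:refs/for/master
```

Returning a list rather than a string means the result could be handed to subprocess.run() without shell quoting worries, if one wanted to script it.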
Looking at this again:
https://gerrit.wikimedia.org/r/#change,2916
I have 28f176ca9ca3767bfa9f0ec219f5fa0c299c5761 and
87f491132487313144e531354578ea2fbd3b42b4 here (those are commits,
fine) and Ifb002160485030496c7d3f2abc4991484b533648
Additionally there is this c64fd4488d2ea24e120acb15db413377494dd3b3
("Patch Set 1") referring me to (gitweb), which calls it a "commit".
Ah, and there is 1101a1b3fe7f4d1c29321157fc1ef9b9f3fb6ff0 as well.
Ouch, and there is this "refs/changes/16/2916/1" <-- the good thing is
I can actually click on it in gitweb!
All this makes "MFT r111795, r111881, r111920, r112573, r112995, r113169"
look pale in comparison. And I can actually click a link in [[Special:Code]],
and go back and forth between followups, neat!
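
Of the identifiers listed above, "refs/changes/16/2916/1" is the one with a readable structure: as far as I can tell from Gerrit's conventions, it is refs/changes/<last two digits of the change number>/<change number>/<patch set number>. A small Python sketch of taking such a ref apart, under that assumption; the function name is hypothetical:

```python
def parse_change_ref(ref):
    """Split a Gerrit change ref into (change_number, patch_set)."""
    parts = ref.split("/")
    if len(parts) != 5 or parts[0] != "refs" or parts[1] != "changes":
        raise ValueError("not a change ref: %r" % (ref,))
    shard, change, patch_set = parts[2], int(parts[3]), int(parts[4])
    # The middle component repeats the last two digits of the change number.
    if shard != "%02d" % (change % 100):
        raise ValueError("shard does not match change number: %r" % (ref,))
    return change, patch_set


print(parse_change_ref("refs/changes/16/2916/1"))
# prints: (2916, 1)
```

So the ref at least says "change 2916, patch set 1", which matches the number in the gerrit URL.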
The only real confusion for me in the current system is the role
of the "1.19" and "1.19wmf1" tags. It took me a while to figure out
that they are something like github's pull requests OR a way of bumping
a revision's priority for review, with "1.19wmf1" meaning "urgent!"
...
...

Introducing a gated-trunk model

We have been using a gated-trunk model for as long as I can remember.
Here is how it goes with subversion/CodeReview:
=======================[ SVN PROCESS ]================================

someone submits their patch proposal to subversion trunk
patch is reviewed, then either it:
 -> gets rejected        : revision is reverted and marked as such
 -> is accepted          : revision marked 'ok'
 -> needs enhancement    : marked 'fixme', repeat :-)

From time to time, all patches marked 'ok' are allowed to pass the gate
and land in a wmf branch. Then we deploy them.
======================================================================
We will use the exact same model with git/gerrit:
=======================[ GIT PROCESS ]================================

someone submits their patch proposal to Gerrit
patch is reviewed, then either it:
 -> gets rejected        : marked abandoned in Gerrit
 -> is accepted          : patch is merged into the WMF reference repository by Gerrit
 -> needs enhancement    : comment asking the submitter to enhance it

From time to time, all patches merged into the master branch are allowed
to pass the gate and land in a wmf branch. Then we deploy them.
======================================================================
As a summary:
 commit to trunk      --> submit to Gerrit
 revision marked 'ok' --> change merged
 trunk to WMF branch  --> master into WMF
I have a few points of confusion here:
1) Is it true that under the current process all unreverted stuff
from SVN will eventually make it to the live site whenever a new "wmf"
branch is created? I think lots of new stuff gets in this
way, not necessarily by an explicit merge into whatever the wmf branch
du jour is. Yes, it creates a nice thrill in the community whenever
we do this.
2) My understanding until now was that WMF is going to run master
in production. Why create another WMF branch then? So we will have
one step to push things through review (review branch->master) and then
another (master->WMF) to deploy? This would actually be a double-gated
deployment branch. Ouch.
3) If (1) above is true, at least today stuff that is unreviewed and left
alone (i.e. not obviously dangerous) will make it into deployment
(worst case with the new 1.X version). This creates a kind of pressure
in the system (good or bad), but it does not let changes lie around forever.
This won't be the case in the new model, I guess.
---
Maybe I'm an SVN backpedaller (although hg is my primary vcs now),
but I don't think the benefits of easy cherry-picking from what is
today's trunk to wmf branches are so cool that we need to use
those new, imperfect tools. Is branching/merging and conflict
resolution between developers such a problem in MediaWiki?
I really think that tighter integration with the bugtracker (so that
bug attachments end up in the vcs review queue and commit comments
can be seen as quasi-bugs) would be much more beneficial to
users. I will try to see how it would have worked with systems
like fossil, for example, and report back.
And, having seen fossil, bzr and other systems, I just don't
subscribe to the idea that a DVCS *must* have the user interface
from hell. And I am not going to use git often enough to have
all the commands in my head (or even neatly scripted, as some people
suggest). I love the change that DVCSs bring, but not at the
cost of needing to say "git push origin HEAD:refs/for/master"
or something like that. (I had to go back to the docs to write
this command again.)
//Saper
