Hi all,
Some disclaimers before I start my thread:
1) I am a big believer in Git and DVCS and I think this is the right decision
2) I am a big believer in Gerrit and code review and I think this is the right decision
3) I might be wholly unaware of, or inaccurate about, certain things; apologies in advance.
4) A BIIGG thank you to all the folks involved in preparing this migration (evaluation, migration and training): in particular Chad, Sumanah and Roan (but I am sure more people are involved and I am just blissfully unaware).
My main worry is that we are not spending enough time on getting all engineers (both internal and in the community) up to speed with the coming migration to Git and Gerrit and that we are going to blame the tools (Gerrit and/or Git) instead of the complex interaction between three changes. We are making three fundamental changes in one shot:
1) Migrating from a centralized source control system to a decentralized system (SVN -> Git)
2) Introducing a new dedicated code-review tool (Gerrit)
3) Introducing a gated-trunk model
My concern is not about the UI of Gerrit. I know it's popular within WMF to say that its UI sucks, but I don't think that's the case, and even if it were an issue it's only minor. People have already suggested that we might consider other code-review systems; I did a quick Google search and we are the only community considering migrating from Gerrit to Phabricator. I think this is beside the point: the real challenge is moving to a gated-trunk model, regardless of the chosen code-review tool. I cannot imagine that other code-review tools that are also based on a gated-trunk model and work with Git are much easier than Gerrit. The complexity comes from the gated-trunk model, not from the tool.
The gated-trunk model means that, when you clone or pull from master, it might be the case that files relevant to you have been changed but that those new changes are waiting to be merged (the pull request backlog, AKA the code-review backlog). In the always-commit world with no gatekeeping between developers and master, this never happens; your local copy can always be fully synchronized with trunk ("master"). Even if a commit is reverted, then your local working copy will still have it, and any changes that you might have based on this reverted commit, you can still commit. Obviously people get annoyed when you keep checking in reverted code, but it won't break anything.
In an ideal world, our code-review backlog would be zero commits at any time of the day, if that's the case then 'master' is always up-to-date and you have the same situation as with the 'always-commit' model. However, we know that the code-review backlog is a fact and it's the intersection of Git, Gerrit and the backlog that is going to be painful.
Suppose I clone master, but there are 10 commits waiting to be reviewed with files that are relevant to me. I am happily coding in my own local branch and after a while ready to commit. Meanwhile, those 10 commits have been reviewed and merged and now when I want to merge my branch back to master I get merge conflicts. Either I discover these merge conflicts when my branch is merged back to master or if I pull mid-way to update my local branch.
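The scenario above can be reproduced in a throwaway repository. This is only a minimal sketch: the file name, branch name and commit messages are made up, and a second local commit on master stands in for "changes merged after review". Against a real remote one would presumably run git fetch origin and then git rebase origin/master instead.

```shell
#!/bin/sh
# Sketch: "master moved while I was coding", in a temporary repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name "master"
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.org"

echo "version 1" > shared.txt
git add shared.txt
git commit -q -m "Initial commit"

# I start my own local branch and change the shared file.
git checkout -q -b my-feature
echo "my change" > shared.txt
git commit -q -a -m "My feature"

# Meanwhile a reviewed change touching the same file lands on master.
git checkout -q master
echo "reviewed change" > shared.txt
git commit -q -a -m "Change merged after review"

# Rebasing my branch onto the new master surfaces the conflict locally,
# before anything is pushed for review.
git checkout -q my-feature
if git rebase master >/dev/null 2>&1; then
    rebase_result="clean"
else
    rebase_result="conflict"
    git rebase --abort
fi
echo "rebase result: $rebase_result"
```

Rebasing regularly does not remove the conflict, but it moves the discovery to a moment when the author still has the change fresh in mind.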
To be a productive engineer after the migration it will *not* be sufficient if you have only mastered git clone, git pull, git push, git add and git commit commands. These are the basic git commands.
Two overall recommendations:
1) The Git / Gerrit combination means that you will have to understand git rebase, git commit --amend, git bisect and git cherry-pick. This is advanced Git usage and it will make the learning curve steeper. I think we need to spend more time on training. I have been looking for good tutorials about Git & Gerrit in practice and I haven't been able to find any, but maybe other people have better Google Fu skills (I think we are looking for advanced tutorials, not just cloning and pulling, but also merging, bisect and cherry-pick).
2) We need to come up with a smarter way of determining how to approach the code-review backlog. Three overall strategies come to mind:
a) random, just pick a commit
b) time-based picking (either the oldest or the youngest commit)
c) 'impact' of commit
a) and b) do not require anything but are less suited for a gated-trunk model. Option c) could be something where we construct a graph of the codebase, determine the most central files (hubs) and sort commits by centrality in this graph. The graph only needs to be reconstructed after major refactoring or every month or so. Obviously, this requires a bit of coding and I don't have formal proof that this actually will reduce the pain, but I am hopeful. If constructing a graph is too cumbersome then we can sort by the number of affected files in a commit as a proxy. If we cannot come up with a c) strategy then the only real option is to make sure that the queue is as short as possible.
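The "number of affected files" proxy could be sketched with plain git, without any graph. Everything here is an assumption for illustration: pending changes are modelled as local branches with made-up names, and listing the largest change first is just one possible ordering.

```shell
#!/bin/sh
# Sketch: rank hypothetical pending changes by how many files they touch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name "master"
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.org"

echo base > a.php
echo base > b.php
echo base > c.php
git add .
git commit -q -m "Initial commit"

# Two pending changes, modelled as local branches off master.
git checkout -q -b change-small master
echo tweak >> a.php
git commit -q -a -m "Small change"

git checkout -q -b change-large master
echo tweak >> a.php
echo tweak >> b.php
echo tweak >> c.php
git commit -q -a -m "Large change"

# Print "<file count> <branch>" per change, largest first.
ranking=$(
    for branch in change-small change-large; do
        count=$(git diff --name-only master..."$branch" | wc -l | tr -d ' ')
        echo "$count $branch"
    done | sort -rn
)
echo "$ranking"
```

A real version would need the list of open changes from the review system rather than a hard-coded list of branches.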
Best, Diederik
On 06/03/12 20:20, Diederik van Liere wrote:
My concern is not about the UI of Gerrit, I know it's popular within WMF to say that it's UI sucks but I don't think that's the case and even if it was an issue it's only minor. People have already suggested that we might consider other code-review systems, I did a quick Google search and we are the only community considering migrating from Gerrit to Phabricator. I think this is besides the point: the real challenge is moving to a gated-trunk model, regardless of the chosen code-review tool. I cannot imagine other code-review tools that are also based on a gated-trunk model and work with Git are much easier than Gerrit. The complexity comes from the gated-trunk model, not from the tool.
Hmm. No. I understand that the UI when you have a change proposal, with 4 old alternatives, depending on several commits and other ones depending on it will need to be more complex than subversion. But when you look at a single diff it shouldn't be harder than our well-known Code Review.
The gated-trunk model means that, when you clone or pull from master, it might be the case that files relevant to you have been changed but that those new changes are waiting to be merged (the pull request backlog, AKA the code-review backlog).
Maybe there should be a way to fetch all commits pending in gerrit and show them to you inside git interface.
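One possible way to do this relies on Gerrit publishing each patchset as a ref under refs/changes/ on the server. The sketch below imitates that with a plain bare repository standing in for Gerrit; the change number 1234, the local "gerrit/" namespace and all names are made up. Note that on a real server refs/changes/ holds every patchset, merged and abandoned ones included, so this fetches more than just the open backlog.

```shell
#!/bin/sh
# Sketch: fetch refs/changes/* into a local namespace and list them.
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the Gerrit server: a plain bare repository.
git init -q --bare server.git
git clone -q server.git dev 2>/dev/null
cd dev
git symbolic-ref HEAD refs/heads/master   # force the branch name "master"
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.org"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"
git push -q origin master

# Simulate a change that is still waiting for review.
echo pending >> file.txt
git commit -q -a -m "Pending change"
git push -q origin HEAD:refs/changes/34/1234/1
git reset -q --hard origin/master

# Fetch every patchset ref and show what arrived.
git fetch -q origin '+refs/changes/*:refs/remotes/gerrit/*'
pending=$(git for-each-ref --format='%(refname)' refs/remotes/gerrit/)
echo "$pending"
```

Once fetched, the pending commits can be inspected with the usual tools, for example git log or git diff against master.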
Suppose I clone master, but there are 10 commits waiting to be reviewed with files that are relevant to me. I am happily coding in my own local branch and after a while ready to commit. Meanwhile, those 10 commits have been reviewed and merged and now when I want to merge my branch back to master I get merge conflicts. Either I discover these merge conflicts when my branch is merged back to master or if I pull mid-way to update my local branch.
I'm more afraid that you commit and then we have 11 conflicting commits waiting for review, with no one even noticing until two months later. We will probably be in the situation where the same trivial fix is independently submitted by several developers, though.
I wonder, does gerrit allow you to rebase a commit to an older version from its interface? (I assume rebasing to a newer one is perfectly supported, as it'd be a cherry-pick)
- We need to come up with a smarter way determining how to approach
the code-review backlog. Three overall strategies come to mind: a) random, just pick a commit b) time-based picking (either the oldest or the youngest commit) c) 'impact' of commit
I don't think that's different than how we do it now. It's just that reviewing becomes more important.
On Wed, Mar 7, 2012 at 5:20 AM, Diederik van Liere dvanliere@gmail.com wrote:
My concern is not about the UI of Gerrit, I know it's popular within WMF to say that it's UI sucks but I don't think that's the case and even if it was an issue it's only minor.
We are changing from a CR system where everything (or at least almost everything) was displayed on a single (and non-confusing) page to a system that, to put it plainly, is a clusterfuck of confusing (and I'm known for choosing the more time-consuming/evil/painful methods of doing things).
One of the major sticking points (which, from my understanding, is what the WMF people are hating on) is the diff view in Gerrit compared to CR. For example, we used to have all diffs displayed on one page, whereas with Gerrit you need to open each one (each changed file) in a separate tab. I don't really call that a minor issue.
On Tue, Mar 6, 2012 at 3:42 PM, K. Peachey p858snake@gmail.com wrote:
One of the major sticking points (that from my understanding is what the WMF people are hateing on) is the diff views in comparison of CR to Gerrit, For example we used to have them all display on one page where as with Gerrit you need to open each one (each changed file) in a separate tab, I don't really call that a minor issue.
Something that RobLa discovered recently is that there are arrows on the sides of each diff view, and you can use those to go to the prev/next file. This is totally non-obvious of course, but it makes the interface look a little less braindead.
Roan
On Tue, Mar 6, 2012 at 6:57 PM, Roan Kattouw roan.kattouw@gmail.com wrote:
On Tue, Mar 6, 2012 at 3:42 PM, K. Peachey p858snake@gmail.com wrote:
One of the major sticking points (that from my understanding is what the WMF people are hateing on) is the diff views in comparison of CR to Gerrit, For example we used to have them all display on one page where as with Gerrit you need to open each one (each changed file) in a separate tab, I don't really call that a minor issue.
Something that RobLa discovered recently is that there are arrows on the sides of each diff view, and you can use those to go to the prev/next file. This is totally non-obvious of course, but it makes the interface look a little less braindead.
I'd figured that out a little while ago. I guess it seemed a little more obvious to me. Perhaps we should start a "How do I...?" or FAQ page for Gerrit, so we can start collecting these common questions in a single place.
-Chad
On Tue, Mar 6, 2012 at 6:42 PM, K. Peachey p858snake@gmail.com wrote:
On Wed, Mar 7, 2012 at 5:20 AM, Diederik van Liere dvanliere@gmail.com wrote:
My concern is not about the UI of Gerrit, I know it's popular within WMF to say that it's UI sucks but I don't think that's the case and even if it was an issue it's only minor.
We are changing from a CR system where everything (or atleast almost everything) was displayed on a single (and non confusing) page to a system that to plainly put it, is a clusterfuck of confusing (And i'm known for choosing the more time consuming/evil/painful methods of doing things),
I'll be the first to admit that Gerrit may not be the single most intuitive piece of software. But to call it a clusterfuck is doing it a disservice. The learning curve is steeper than our home-grown CodeReview tool, but I honestly don't think it's insurmountable.
Learning a new tool and new workflow is jarring, but I think it's a feeling that will pass slightly as time goes on. That's part of the reason we pushed the decision about Phabricator out by a few months-- it allows us to get some hands-on experience with Gerrit. By that point I hope we'll be moving past the initial Git learning curve, and I think it'll allow us to make a better decision about code review tools without being clouded by git-isms (some of which will remain regardless of the tool in question).
Sadly you only get one first impression, and Gerrit doesn't do a good job at that :(
-Chad
On 7 March 2012 02:13, Chad innocentkiller@gmail.com wrote:
On Tue, Mar 6, 2012 at 6:42 PM, K. Peachey p858snake@gmail.com wrote:
On Wed, Mar 7, 2012 at 5:20 AM, Diederik van Liere dvanliere@gmail.com wrote:
My concern is not about the UI of Gerrit, I know it's popular within WMF to say that it's UI sucks but I don't think that's the case and even if it was an issue it's only minor.
We are changing from a CR system where everything (or atleast almost everything) was displayed on a single (and non confusing) page to a system that to plainly put it, is a clusterfuck of confusing (And i'm known for choosing the more time consuming/evil/painful methods of doing things),
I'll be the first to admit that Gerrit may not be the single most intuitive piece of software. But to call it a clusterfuck is doing it a disservice. The learning curve is steeper than our home-grown CodeReview tool, but I honestly don't think it's insurmountable.
TL;DR: I've said to many people that I have concerns about this move, so I'm trying to explain what concerns me and asking what are the benefits for me.
Usually when something is replaced with more complicated solution, there are some needs it addressed or new benefits it brings. But honestly, I'm quite happy with the current tools with all their shortcomings. We actually get stuff done, even if breaking lots of things while doing that and only noticing so late that the only option is to fix it as well as possible.
So what are the benefits of Git+Gerrit+Gated trunk? I can think of many tasks it makes harder (for me), but I can hardly think of any benefits (for me). The ops/lead engineers might be happy when they get a stable master. But if that is because development slows down due to increased hurdles getting stuff into the master, is it really a benefit at all? The "wait until you can't revert" model above is certainly not going to work with gated trunk, and I doubt that anybody will suddenly start reviewing those big refactorings or commits to the areas of the codebase nobody really knows when we switch over.
Our code base is very old and complex, to the extent that many areas of it are understood by only a few developers, if anyone. Yet those few developers do not have the time to review patches or improve those areas of the code to the extent that other developers would be able to learn them and work on them. We need to keep modernizing our core code, making it less complex and designing good interfaces. I think the current state of core is already limiting innovation in extensions.
Rather than solving our review problem, I think GiGeGa might even make it worse by slowing the type of development work mentioned above.
So once again, what are the benefits for me? The advantages listed at http://www.mediawiki.org/wiki/Git/Conversion#Rationale don't apply to me:
* I rarely work off-line
* I spend about 30 minutes a week merging code (this is supposed to be easier)
* I spend around 10 hours a week reviewing code (this is going to be much more difficult)
* I do many commits per day (not easier, don't know yet how much more difficult)
From this I gather the people who benefit most are people who just
want to do some small fix, fire it and forget it and people who spend lots of time merging or traveling.
One benefit I would like to see is that WMF would feel the pain needed to get the code of its various projects reviewed. It has many projects with a single developer or only a few, but nobody is assigned to do the code review on them from the very beginning. I'm afraid it will just find some way around the requirement.
-Niklas
On Wed, Mar 7, 2012 at 10:43 AM, Niklas Laxström niklas.laxstrom@gmail.com wrote:
Rather than solving our review problem, I think GiGeGa might even make it worse by slowing the type development work mentioned above.
But that was the point right, to introduce this whole system? By using a
gated-trunk model, the development capacity is limited by the reviewing capacity, which in our case means that we have to slow down development.
Bryan
On 07/03/12 10:43, Niklas Laxström wrote:
- I spend around 10 hours a week reviewing code (this is going to be
much more difficult)
Here is how one would start a code review day:
Load the Gerrit main interface. In the search box at the top right, enter your favorite project. For example: mediawiki/core The change list is now filtered.
Blame Gerrit for showing the sha1 instead of change number.
Then:
# Fetch the change you are interested in:
$ git review -d 1234

# Diff against the gated trunk master branch:
$ git diff origin/master
From there you can edit the patchset, even if it was submitted by
someone else. Then reuse his commit message and edit it!!
$ git commit -a --amend
<append something like: patchset2: fixing typo by someone>
$ git review -f    # submit and then delete (-f) local branch
Repeat.
Sometimes you get interrupted while doing a review. You could then git commit your current review progress, then fetch another change to review / merge. You will then be able to come back to where you were in the review.
Anyway, that is a slightly different workflow, but eventually we will all get used to it. It is not harder than subversion and it is even a bit quicker :)
On 07/03/12 01:13, Chad wrote:
Learning a new tool and new workflow is jarring, but I think it's a feeling that will pass slightly as time goes on. That's part of the reason we pushed the decision about Phabricator out by a few months-- it allows us to get some hands-on experience with Gerrit. By that point I hope we'll be moving past the initial Git learning curve, and I think it'll allow us to make a better decision about code review tools without being clouded by git-isms (some of which will remain regardless of the tool in question).
Sadly you only get one first impression, and Gerrit doesn't do a good job at that :(
I don't see the point of not starting with the best tool from the beginning. Migrations are painful, and with permanent consequences, so the less the better. We may err on deciding which one is best, or not know about a better alternative until after migrating, but refusing to consider them? Imagine you were going to learn horseback riding, and were given a lame horse.
- Hey! You have given me an injured horse.
- First learn to gallop with it, then we can consider if it's worth changing it.
On Thu, Mar 8, 2012 at 3:08 PM, Platonides Platonides@gmail.com wrote:
On 07/03/12 01:13, Chad wrote:
Learning a new tool and new workflow is jarring, but I think it's a feeling that will pass slightly as time goes on. That's part of the reason we pushed the decision about Phabricator out by a few months-- it allows us to get some hands-on experience with Gerrit. By that point I hope we'll be moving past the initial Git learning curve, and I think it'll allow us to make a better decision about code review tools without being clouded by git-isms (some of which will remain regardless of the tool in question).
Sadly you only get one first impression, and Gerrit doesn't do a good job at that :(
I don't see the point of not starting with the best tool from the beginning. Migrations are painful, and with permanent consequences, so the less the better. We may err on deciding which one is best, or not know about a better alternative until after migrating, but refusing to consider them? Imagine you were going to learn horseback riding, and were given a lame horse.
- Hey! You have given me an injured horse.
- First learn to gallop with it, then we can consider if it's worth
changing it.
I'd hardly call Gerrit a lame horse, more like a horse with funny spots on it and an extra tail.
Also: what's this mythical "best tool?" I've not seen it suggested before.
-Chad
I'd hardly call Gerrit a lame horse, more like a horse with funny spots on it and an extra tail.
Also: what's this mythical "best tool?" I've not seen it suggested before.
+1
There are alternative solutions, but none of them are viable without development work. Gerrit is viable right now, in its current state. Its downside is that its interface is slightly painful.
Every tool we use is going to have something we dislike about it interface-wise. Let's work with the OpenStack team and improve it.
- Ryan
K. Peachey wrote:
One of the major sticking points (that from my understanding is what the WMF people are hateing on) is the diff views in comparison of CR to Gerrit, For example we used to have them all display on one page where as with Gerrit you need to open each one (each changed file) in a separate tab, I don't really call that a minor issue.
I like the Gerrit diff system. Yeah, I know that sounds like trolling, but read below before discarding this mail.
Whenever you click [Diff All Unified] in Gerrit, a whole bunch of windows open. Here is how to use that:
- pick a window, review the code, comment on it.
- have you finished that window?
  -> yes: close it
  -> no: cycle to the next window
- Repeat.
Since comments can be saved individually, you can safely close a window without losing them.
Another advantage is that you can put the files side by side, which sometimes helps.
Previously, we had to scroll all the way back to the bottom to add a comment, then go up again. The same applied when reviewing changes in several files. Up and down constantly. Sometimes I ended up doing a svn diff -c 1234 on each file I was interested in.
When you have patchsets accumulating, you can review the whole bunch of them at once, since the diff of the latest patchset already includes all previous patchsets. That is a huge advantage over reviewing a bunch of follow-ups in our svn world.
If you really want to check what patchset 32 changed, just ask Gerrit to do the diff based on patchset 31. There is a selector for that.
We can ignore whitespace from the web interface! One of the aliases I constantly use is:
svn diff -c <revision> -x -wbu
Which does a diff of revision <revision> ignoring white space.
So yes. Gerrit diff is a bit strange but I am sure people will get used to it.
I like Gerrit diff system. Yeah I know that sounds like trolling read below though before discarding this mail.
Sure, inline comments are cool but I think they lack discoverability. You need to browse all the diffs just in case there's a lonely inline comment there.
Suppose hashar commits c1234. Ahar Voultoiz reviews a couple of files and adds an inline comment 'This feature is PHP 5.4 only' Then Antoine Musso does git review -d 1234, finds it fine and approves it.
I think inline comments should also produce entries at the comments section.
On Thu, Mar 8, 2012 at 4:03 PM, Platonides Platonides@gmail.com wrote:
I like Gerrit diff system. Yeah I know that sounds like trolling read below though before discarding this mail.
Sure, inline comments are cool but I think they lack discoverability. You need to browse all the diffs just in case there's a lonely inline comment there.
Suppose hashar commits c1234. Ahar Voultoiz reviews a couple of files and adds an inline comment 'This feature is PHP 5.4 only' Then Antoine Musso does git review -d 1234, finds it fine and approves it.
I think inline comments should also produce entries at the comments section.
The change's initial view tells you which files have comments, and how many comments are listed inside a file.
- Ryan
On Sat, Mar 10, 2012 at 1:58 PM, Platonides Platonides@gmail.com wrote:
On 09/03/12 01:07, Ryan Lane wrote:
The change's initial view tells you which files have comments, and how many comments are listed inside a file.
- Ryan
Do you have an easy-to-remember revision with inline comments?
https://gerrit.wikimedia.org/r/#change,2157
See patchsets 1 and 2.
- Ryan
On 11/03/12 00:29, Ryan Lane wrote:
On Sat, Mar 10, 2012 at 1:58 PM, Platonides Platonides@gmail.com wrote:
On 09/03/12 01:07, Ryan Lane wrote:
The change's initial view tells you which files have comments, and how many comments are listed inside a file.
- Ryan
Do you have an easy-to-remember revision with inline comments?
https://gerrit.wikimedia.org/r/#change,2157
See patchsets 1 and 2.
- Ryan
I had expected a more prominent marker, but that seems good enough to be acceptable.
Why does gerrit insist so much in loading everything with javascript? You end up viewing patchsets which expand to nothing because it's (apparently) fetching them from the net.
On Wed, Mar 7, 2012 at 6:20 AM, Diederik van Liere dvanliere@gmail.com wrote:
- The Git / Gerrit combination means that you will have to understand
git rebase, git commit --amend, git bisect and git cherry-pick. This is advanced Git usage and that will make the learning curve steeper. I think we need to spend more time on training, I have been looking for good tutorials about Git&Gerrit in practise and I haven't been able to find it but maybe other people have better Google Fu skills (I think we are looking for advanced tutorials, not just cloning and pulling, but also merging, bisect and cherrypick).
I can second that. I recently had to pick up Git (and what I guess is the gated-trunk model) for a couple of projects, and I found it quite hard to adapt to. The most jarring thing, coming from subversion, was getting into the habit of considering each piece of work as entirely independent of any other, so, instead of:
checkout code
add a feature
update - nothing new
commit
add a feature
update - merge
commit
...
You do:
checkout code
create branch (or however you will do it here)
add a feature
commit/push
checkout master again...
In other words, you don't think of adding new layers of code to a single monolithic code base, you think of sending individual, independent packets of code to be combined in some order.
And if you mess up the branching, it can be incredibly confusing with Git's crappy command line interface to know how to recover. (Hint: you can achieve a lot with cherrypick and reflog)
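As a rough illustration of that hint, here is a minimal sketch in a temporary repository, with made-up file and branch names. A commit is made on master by mistake, "lost" with a hard reset, then found again through the reflog and re-applied on a proper branch with cherry-pick.

```shell
#!/bin/sh
# Sketch: recover a "lost" commit using the reflog and cherry-pick.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name "master"
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.org"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Work is committed directly on master by mistake...
echo feature > feature.txt
git add feature.txt
git commit -q -m "Add feature"
lost=$(git rev-parse HEAD)

# ...and then thrown away by resetting master to the previous commit.
git reset -q --hard HEAD~1

# The reflog still records where HEAD pointed before the reset.
previous=$(git rev-parse 'HEAD@{1}')

# Re-apply that commit on a properly named branch.
git checkout -q -b my-feature
git cherry-pick "$previous" >/dev/null
recovered=$(git log -1 --format=%s)
echo "recovered: $recovered"
```

Running git reflog by hand shows the same history in readable form, which is usually the first step when a branch seems to have vanished.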
I had such a bad time of it, I wrote a big anti-Git rant: http://steveko.wordpress.com/2012/02/24/10-things-i-hate-about-git/
At the very bottom is a conceptual model of Git compared to Subversion that may actually be helpful.
Steve
On 07/03/12 04:03, Steve Bennett wrote:
checkout code
add a feature
update - nothing new
commit
add a feature
update - merge
commit
...
This is not the way you work with subversion. You usually do:
checkout code
add a feature
commit
add a feature
commit
Sure, occasionally the commit will fail due to a failed file, so you svn up, verify it's been correctly merged and recommit. But it's an uncommon case.
You do:
checkout code
create branch (or however you will do it here)
add a feature
commit/push
checkout master again...
In other words, you don't think of adding new layers of code to a single monolithic code base, you think of sending individual, independent packets of code to be combined in some order.
I don't think about changes as "adding to a monolithic code base". In a linear development there is a clear way in which they are combined, but the author probably knows how it could be. In git, you have the same thing, but with more ambiguity, since you need to specify from which revision you develop it. Too young, and there is the risk you're depending on a parent which can get rejected. Too old, and you miss new changes and it may not apply.
Hi Diederik,
Let me first thank you for taking what must've been a long time to clearly articulate your thoughts on Git/Gerrit. Having been a guinea pig for several weeks now, I think your input is highly valuable. Replies inline.
On Tue, Mar 6, 2012 at 2:20 PM, Diederik van Liere dvanliere@gmail.com wrote:
Hi all,
Some disclaimers before I start my thread:
- I am a big believer in Git and dvcs and I think this is the right decision
- I am a big believer in Gerrit and code-review and I think this is
the right decision
We have a convert \o/
- I might be wholly unaware / inaccurate of certain things, apologies
in advance.
Totally ok and totally understandable. Part of my job with this migration is to educate. I didn't understand a whole lot about git before this process started (beyond a simple clone/fetch/commit/push), but I've learned lots and I'm willing to share what I've learned.
- A BIIGG thankyou to all the folks involved in preparing this
migration (evaluation, migration and training): in particular Chad, Sumanah and Roan (but I am sure more people are involved and I am just blissfully unaware).
Let me pile on the thanks. Sumana's been a tremendous help with keeping me on top of documentation, communication, and generally not hiding in a bunker and doing this alone. Roan's been an awesome help and guinea pig, going above and beyond like he always does. Also a huge thanks to Ryan, who's been superb in getting this infrastructure up and helping me to understand it (and fix it when I break things).
My main worry is that we are not spending enough time on getting all engineers (both internal and in the community) up to speed with the coming migration to Git and Gerrit and that we are going to blame the tools (Gerrit and/or Git) instead of the complex interaction between three changes. We are making three fundamental changes in one-shot:
1) Migrating from a centralized source control system to a decentralized system (SVN -> Git)
2) Introducing a new dedicated code-review tool (Gerrit)
3) Introducing a gated-trunk model
These are big changes. They're drastic changes. They require a rethinking of a great many things that we do from both technical and non-technical perspectives. Unfortunately, I don't see how we could've done #1 without #2. CodeReview is not designed (and was never designed) to work with a DVCS. The workflow's just not there, and it would've basically required rewriting huge parts of it. Rather than reinvent the wheel (again), we went with Gerrit.
Arguably, we could've gone a straight push and skipped item #3. But given the continual code review backlog, and the desire to keep trunk stable (and hopefully deploy much more often), the decision to gate trunk was made pretty early on in the discussions.
My concern is not about the UI of Gerrit, I know it's popular within WMF to say that it's UI sucks but I don't think that's the case and even if it was an issue it's only minor. People have already suggested that we might consider other code-review systems, I did a quick Google search and we are the only community considering migrating from Gerrit to Phabricator. I think this is besides the point: the real challenge is moving to a gated-trunk model, regardless of the chosen code-review tool. I cannot imagine other code-review tools that are also based on a gated-trunk model and work with Git are much easier than Gerrit. The complexity comes from the gated-trunk model, not from the tool.
Agreed. Gerrit has a learning curve (see other e-mail), but I do think a lot of the current confusion comes back to the gated trunk model. It is a completely different workflow from what we've been doing for the past ~10 years, so it's bound to be confusing regardless of the tools used.
In an ideal world, our code-review backlog would be zero commits at any time of the day, if that's the case then 'master' is always up-to-date and you have the same situation as with the 'always-commit' model. However, we know that the code-review backlog is a fact and it's the intersection of Git, Gerrit and the backlog that is going to be painful.
I don't think we can make any assumptions about how the code review backlog is going to look in Gerrit. The list is long because we've got no real rush to review: the code's in trunk and the onus is on the reviewer to eventually review the code sometime before deployment (hopefully).
I think there will be some lag initially as we're getting our feet wet, but I believe it will improve with time.
Suppose I clone master, but there are 10 commits waiting to be reviewed with files that are relevant to me. I am happily coding in my own local branch and after a while ready to commit. Meanwhile, those 10 commits have been reviewed and merged and now when I want to merge my branch back to master I get merge conflicts. Either I discover these merge conflicts when my branch is merged back to master or if I pull mid-way to update my local branch.
I agree with what Platonides said regarding this.
To be a productive engineer after the migration it will *not* be sufficient if you have only mastered git clone, git pull, git push, git add and git commit commands. These are the basic git commands.
Two overall recommendations:
- The Git / Gerrit combination means that you will have to understand git rebase, git commit --amend, git bisect and git cherry-pick. This is advanced Git usage and that will make the learning curve steeper. I think we need to spend more time on training. I have been looking for good tutorials about Git & Gerrit in practice and I haven't been able to find any, but maybe other people have better Google-fu (I think we are looking for advanced tutorials, not just cloning and pulling, but also merging, bisect and cherry-pick).
git-review helps lower some of these barriers since it automatically rebases against origin/* for you so you get a clean merge on push. Cherry picking's not that hard, and gerrit actually gives you the command from the UI to pull the specific patchset.
I've yet to need git bisect for pushing patchsets--what's the use case there?
- We need to come up with a smarter way of determining how to approach the code-review backlog. Three overall strategies come to mind: a) random, just pick a commit b) time-based picking (either the oldest or the youngest commit) c) 'impact' of commit
A combination of b) and c)--just as we do it now. I prefer that older commits get reviewed so they can be fixed before it becomes difficult. Generally speaking, this will remain very similar in git. However, if some commits are more or less important, they can be reviewed accordingly.
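To make the three strategies concrete, here is a minimal sketch in Python. It is only an illustration: the Change record, its fields, and the use of "files touched" as a stand-in for impact are assumptions made for the example, not anything Gerrit provides in this form. The last function shows one possible reading of "a combination of time-based and impact".

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    number: int         # hypothetical change number
    age_days: int       # how long the change has waited for review
    files_touched: int  # crude stand-in for 'impact' (an assumption)


def pick_random(backlog, rng=random):
    """Strategy a): just pick any change."""
    return rng.choice(backlog)


def pick_oldest(backlog):
    """Strategy b): time-based, oldest change first."""
    return max(backlog, key=lambda c: c.age_days)


def pick_by_impact(backlog):
    """Strategy c): highest 'impact' first, ties broken by age."""
    return max(backlog, key=lambda c: (c.files_touched, c.age_days))


def review_order(backlog):
    """Combination of b) and c): most impactful first, then oldest."""
    return sorted(backlog, key=lambda c: (-c.files_touched, -c.age_days))


backlog = [Change(101, 12, 3), Change(102, 30, 1), Change(103, 2, 9)]
print(pick_oldest(backlog).number)                # 102
print(pick_by_impact(backlog).number)             # 103
print([c.number for c in review_order(backlog)])  # [103, 101, 102]
```

Whatever measure of impact is chosen, the point of the sketch is that the ordering is a sort key, so it can be changed without touching the rest of the review process.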
One of the most important habits I can encourage people to get into is using separate local branches for separate features/fixes/etc. If two commits aren't related--they should not be dependent on one another. It makes the review process more difficult when you've got unrelated dependencies since you have to review all of them to submit. This raises the barrier to getting things merged to master.
-Chad
On Wed, Mar 7, 2012 at 6:01 AM, Chad innocentkiller@gmail.com wrote:
git-review helps lower some of these barriers since it automatically rebases against origin/* for you so you get a clean merge on push. Cherry picking's not that hard, and gerrit actually gives you the command from the UI to pull the specific patchset.
Pulling a specific patchset is also made easier by git-review: if you're looking at https://gerrit.wikimedia.org/r/1234 , you can pull that patchset in using git review -d 1234 .
One of the most important habits I can encourage people to get into is using separate local branches for separate features/fixes/etc. If two commits aren't related--they should not be dependent on one another. It makes the review process more difficult when you've got unrelated dependencies since you have to review all of them to submit. This raises the barrier to getting things merged to master.
Yes, this has been a hobby horse of mine too. The git-fu for disentangling unrelated commits so they're no longer based on each other isn't too difficult, but it's much better to get into a habit that avoids the problem in the first place.
Roan
On 2012-03-07, at 6:01 AM, Chad wrote:
My main worry is that we are not spending enough time on getting all engineers (both internal and in the community) up to speed with the coming migration to Git and Gerrit and that we are going to blame the tools (Gerrit and/or Git) instead of the complex interaction between three changes. We are making three fundamental changes in one-shot:
1) Migrating from a centralized source control system to a decentralized system (SVN -> Git)
2) Introducing a new dedicated code-review tool (Gerrit)
3) Introducing a gated-trunk model
These are big changes. They're drastic changes. They require a rethinking of a great many things that we do from both technical and non-technical perspectives. Unfortunately, I don't see how we could've done #1 without #2. CodeReview is not designed (and was never designed) to work with a DVCS. The workflow's just not there, and it would've basically required rewriting huge parts of it. Rather than reinvent the wheel (again), we went with Gerrit.
Arguably, we could've gone a straight push and skipped item #3. But given the continual code review backlog, and the desire to keep trunk stable (and hopefully deploy much more often), the decision to gate trunk was made pretty early on in the discussions.
I understand that we want to do all 3 of those changes; my point was merely to make it very explicit what we are changing and that the biggest change, IMHO, is the introduction of 3). It seems that most of the discussion is focusing on the tools (that's also how this thread started) while I think the discussion should focus on mastering the new workflow and what we can do to make sure that we have the right tutorials & training available to make this migration as gentle as possible. I am confident that we will master the new tools, but a new workflow requires new habits and that might take more time to develop.
On 03/08/2012 05:45 AM, Diederik van Liere wrote:
I understand that we want to do all 3 of those changes; my point was merely to make it very explicit what we are changing and that the biggest change, IMHO, is the introduction of 3). It seems that most of the discussion is focusing on the tools (that's also how this thread started) while I think the discussion should focus on mastering the new workflow and what we can do to make sure that we have the right tutorials & training available to make this migration as gentle as possible. I am confident that we will master the new tools, but a new workflow requires new habits and that might take more time to develop.
I agree 100%.
Antoine, you mentioned that you're happy to answer questions in #mediawiki, and you pointed to some learning resources. That's a good start.
Who is willing to compose and teach a few interactive tutorials, between now and March 21st, on how to use the new tools (including the points Diederik brought up about more advanced git topics like squash, bisect, cherry-pick, and rebase)? Antoine, are you volunteering?
Chad wrote, of the arrows to next/previous diff:
I'd figured that out a little while ago. I guess it seemed a little more obvious to me. Perhaps we should start a "How do I...?" or FAQ page for Gerrit, so we can start collecting these common questions in a single place.
Who is willing to do this? https://www.mediawiki.org/wiki/Git would be a reasonable home for it.
If we don't do things like this, then the migration will be more painful and frustrating for lots of developers. Let's set ourselves up for success.
Le 08/03/12 17:35, Sumana Harihareswara a écrit :
I'd figured that out a little while ago. I guess it seemed a little more obvious to me. Perhaps we should start a "How do I...?" or FAQ page for Gerrit, so we can start collecting these common questions in a single place.
Who is willing to do this? https://www.mediawiki.org/wiki/Git would be a reasonable home for it.
I have already added one question in [[Gerrit]] about rebasing a change.
https://www.mediawiki.org/wiki/Gerrit
Probably a lot more to add there :-]
Le 06/03/12 20:20, Diederik van Liere a écrit :
My main worry is that we are not spending enough time on getting all engineers (both internal and in the community) up to speed with the coming migration to Git and Gerrit
There are plenty of guides around that should cover most beginner questions. I will be happy to answer questions in #mediawiki.
--> git magic:
An introduction to git, available in several languages. http://www-cs-students.stanford.edu/~blynn/gitmagic/
Follow Scott Chacon git evangelist:
--> Pro git
Everyone should read that free book. It comes with visual explanations which make the theory very easy to understand.
If you had only one chapter to read, read the Git Branching one.
I had the opportunity to read his "Git Internals" book. At $12 it is well worth it: http://peepcode.com/products/git-internals-pdf
--> GitHub:
Create a public repository there and play with it. It is a great exercise to have fun with a remote. Try forking a project such as the Wikipedia mobile application. GitHub has a lot of genuinely helpful and well-written help.
--> CHEAT CODES!!!!!!!
Search for git cheat sheets. Print several of them and stick them near your screen. Use them as references.
Finally, the day of someone doing integration, still by Scott Chacon:
http://schacon.github.com/git/everyday.html#Integrator
Le 08/03/12 14:52, Antoine Musso a écrit :
There are plenty of guides around that should cover most beginner questions.
And I forgot git ready, which has a lot of tips from beginner to pro :-]
Diederik van Liere wrote:
We are making three fundamental changes in one-shot:
They are not that much of a change. It is as if you changed from using a paper map and an old car to a nice SUV with a GPS. It is still a lot of metal on 4 wheels with one purpose: move some fresh meat from A to B.
The model is the same. Only the tool changes. (you can quote me on this when we finally take the decision to migrate to JavaScript or Python)
- Migrating from a centralized source control system to a decentralized system (SVN -> Git)
Decentralization itself is just a buzz word for the twitter guys. In the end, it does not change that much since most people have a reference repository. I guess most developers will use the WMF repository as a reference, or at the very least, all patches will eventually end up in the WMF repository.
We could imagine having the WMF feature team to use their own repository then submit a nice giant patch once in a while.
- Introducing a new dedicated code-review tool (Gerrit)
That one is a habit change. It is a bit disturbing for the first week, just like any new web interface. We will eventually get used to it. I am sure people will easily adapt to the GUI and we will be there to assist.
- Introducing a gated-trunk model
We have been using a gated-trunk model for as long as I can remember. Here is how it goes with subversion/CodeReview:
=======================[ SVN PROCESS ]================================
- someone submits their patch proposal in subversion trunk
- patch is reviewed, then either it:
  -> gets rejected : revision is reverted and marked as such
  -> is accepted : revision marked 'ok'
  -> needs enhancement : marked 'fixme'
repeat :-)
From time to time, all patches marked 'ok' are allowed to pass the gate and land in a wmf branch. Then we deploy them.
======================================================================
We will use the exact same model with git/gerrit:
=======================[ GIT PROCESS ]================================
- someone submits their patch proposal in Gerrit
- patch is reviewed, then either it:
  -> gets rejected : marked abandoned in Gerrit
  -> is accepted : patch is merged in WMF reference repository by Gerrit
  -> needs enhancement : comment asking submitter to enhance it
From time to time, all patches merged in the master branch are allowed to pass the gate and land in a wmf branch. Then we deploy them.
As a summary:
commit to trunk --> submit to Gerrit
revision marked 'ok' --> change merged
trunk to WMF branch --> master into WMF
Note: it works the same with Bugzilla: people send their patches as attachments to a bug report. They are reviewed there and eventually the patch is applied by a gatekeeper.
I am a seasoned developer. I only do it in my spare time and only when I am particularly annoyed about something not working in MediaWiki (mostly as a feedback from plwiki community or recently checkusers).
You can see my pitiful track record here:
https://www.mediawiki.org/wiki/Special:Code/MediaWiki/author/saper
Most of those things are urgent one-liners that got to be pushed to deployment very quickly and I never had any problem with getting it through the process. But I'm a small guy. So please do not treat my remarks below as something that would make life of people pushing hundreds of lines per week to MediaWiki more difficult.
I just don't like (as happened to me with git recently) having to learn the tool once again from scratch just because I haven't used it for the last 4 weeks or so.
Antoine Musso hashar+wmf@free.fr wrote:
Diederik van Liere wrote:
We are making three fundamental changes in one-shot:
They are not that much of a change. It is as if you changed from using a paper map and an old car to a nice SUV with a GPS. It is still a lot of metal on 4 wheels with one purpose: move some fresh meat from A to B.
The model is the same. Only the tool changes. (you can quote me on this when we finally take the decision to migrate to JavaScript or Python)
- Migrating from a centralized source control system to a decentralized system (SVN -> Git)
Decentralization itself is just a buzz word for the twitter guys. In the end, it does not change that much since most people have a reference repository. I guess most developers will use the WMF repository as a reference, or at the very least, all patches will eventually end up in the WMF repository.
We could imagine having the WMF feature team to use their own repository then submit a nice giant patch once in a while.
I am using mercurial, git and starting to learn fossil. There is one change which is partially related to tools, partially to the distributed nature of git.
The fundamental change is something else to me: you lose the feeling of linearity. I like hg because it still tries to give me cosy, nice local version numbers (great when switching from SVN, you can even have your old SVN commit numbers stay after migration). But when I look at the gerrit interface (not gerrit's fault) I have no idea what was done before, what was done after, what's the history.
What happened to me on my very first day trying gerrit:
I got an email to merge a change (since somebody pushed something conflicting in between), so I duly issued some magic git commands and I got it pushed. However, when I came back to gerrit I ended up with an "empty" commit:
https://gerrit.wikimedia.org/r/#change,2916
I thought I did merge though! What happened? Looks like an empty commit got filed...
I tried poking around in gerrit to find out what happened, but I had no clue; only running "git log" locally revealed that my change was indeed merged by somebody else in the background.
I had a feeling I have 4 or more revisions flying around ("commits") and I could not relate them to each other. Only "git log" locally helped me to get out of the trouble.
Looking at this screen:
https://gerrit.wikimedia.org/r/#dashboard,103
Those two commits are related, but it's totally non-obvious that something follows up on something else. I have clicked on them and yes, I can find that 87f491132487313144e531354578ea2fbd3b42b4 is common to both of them. Oh, cool!
In comparison to this, the current Special:Code follow up revision system is easy, readable and very useful.
Oh, and by the way those I3577f029 and Ifb002160 are some identifiers totally unrelated to commits (I need to learn more about Change-Id vs. a commit... I promise I will - already got burned by missing pre-commit hook in the repo). And there is only date, not a timestamp to get some sense of linearity again.
I am really afraid I will be lost when my dashboard will have many more patches and merges.
Looking at this or any more complex git development tree makes me cry for linear revision numbers. At least I can find out what was before, what is after - sure it comes at a cost of potentially more difficult merging and branching, but let's be serious, how many "edit conflicts" do we have in the tree?
I don't think that our development timeline is more complex than this:
http://fossil-scm.org/index.html/timeline?n=200
and this is so much more readable (and yes, I know gitk).
I read in this thread that there could be a tree-like priority system to sort out more impactful changes from more specific ones. Building such a tree can be very challenging and as far as I understand we don't have a tool yet. We end up with a bunch of loose commits, somehow connected to each other, not linearized.
And from experience, trivial and small patches get through to deployment very fast. It's larger things that have to wait longer...
I presume this is less of a problem with the current use in operations where changes are of different nature.
- Introducing a new dedicated code-review tool (Gerrit)
That one is a habit change. It is a bit disturbing for the first week, just like any new web interface. We will eventually get used to it. I am sure people will easily adapt to the GUI and we will be there to assist.
As described previously in this thread I would describe myself as a simple git fetch/push/commit guy. No rebase, no cherry-pick yet. Maybe the world is still two-dimensional to me.
To solve problems with the above (simple fetch / push model, except for WAY TOO FANCY things like HEAD:refs/for/<some-branch>) I am told to refer to this:
https://www.mediawiki.org/wiki/Gerrit/resolve_conflict
six commands, including "git-review" I still don't have running. I am afraid to ask what's the non-git-review version of the commands in the guide :)
Looking at this again:
https://gerrit.wikimedia.org/r/#change,2916
I have 28f176ca9ca3767bfa9f0ec219f5fa0c299c5761 and 87f491132487313144e531354578ea2fbd3b42b4 here (those are commits, fine) and Ifb002160485030496c7d3f2abc4991484b533648
Additionally there is this c64fd4488d2ea24e120acb15db413377494dd3b3 ("Patch Set 1") referring me to (gitweb), which calls it a "commit". Ah, and there is 1101a1b3fe7f4d1c29321157fc1ef9b9f3fb6ff0 as well.
Ouch, and there is this "refs/changes/16/2916/1" <-- the good thing is I can actually click on it in gitweb!
All this makes "MFT r111795, r111881, r111920, r112573, r112995, r113169" look pale in comparison. And I can actually click a link in [[Special:Code]], and go back and forth on followups, neat!
The only real confusion to me in the current system is the role of the "1.19" and "1.19wmf1" tags. It took me a while to figure out they are something like github's pull request OR bumping a revision's priority for review, with "1.19wmf1" meaning "urgent!"
- Introducing a gated-trunk model
We have been using a gated-trunk model for as long as I can remember. Here is how it goes with subversion/CodeReview:
=======================[ SVN PROCESS ]================================
- someone submits their patch proposal in subversion trunk
- patch is reviewed, then either it:
  -> gets rejected : revision is reverted and marked as such
  -> is accepted : revision marked 'ok'
  -> needs enhancement : marked 'fixme'
repeat :-)
From time to time, all patches marked 'ok' are allowed to pass the gate and land in a wmf branch. Then we deploy them.
======================================================================
We will use the exact same model with git/gerrit:
=======================[ GIT PROCESS ]================================
- someone submits their patch proposal in Gerrit
- patch is reviewed, then either it:
  -> gets rejected : marked abandoned in Gerrit
  -> is accepted : patch is merged in WMF reference repository by Gerrit
  -> needs enhancement : comment asking submitter to enhance it
From time to time, all patches merged in the master branch are allowed to pass the gate and land in a wmf branch. Then we deploy them.
As a summary:
commit to trunk --> submit to Gerrit
revision marked 'ok' --> change merged
trunk to WMF branch --> master into WMF
I have a small confusion here:
1) Is it true that under the current process all unreverted stuff from SVN will eventually make it to the live site whenever a new "wmf" branch is created? I think lots of new stuff gets in this way, not necessarily by explicit merge into whatever the wmf branch du jour is. Yes, it creates a nice thrill in the community whenever we do this.
2) My understanding until now was that WMF is going to run master in production. Why create another WMF branch then? So we will have a one step to push things to (review branch->master) and then (master->WMF) to deploy? This would be actually a double-gated deployment branch. Ouch.
3) If (1) above is true, at least now stuff that is unreviewed and left alone (i.e. not obviously dangerous) will make it into the deployment (worst case with the new 1.X version). This creates a kind of pressure in the system (good or bad), but will not let changes lie forever. This won't be the case in the new model, I guess.
---
Maybe I'm an SVN backpedaller (although hg is my primary vcs now), but I don't think benefits of easy cherry-picking from what is today's trunk to wmf branches are so cool that we need to use those new, imperfect tools. Is branching/merging and conflict resolution between developers such a problem in MediaWiki?
I really think that tighter integration with bugtracker (so bug attachments end up in vcs review queue and commit comments can be seen as quasi-bugs) would be much more beneficial to users. I will try to see how it would have worked with systems like fossil for example and report back.
And, having seen fossil, bzr and other systems, I just don't subscribe to the idea that a DVCS *must* have the user interface from hell. And I am not going to use git often enough to have all commands in my head (or even neatly scripted as some people suggest). I love the change that DVCSs bring, but not at the cost of needing to say "git push origin HEAD:refs/for/master" or something like that. (I had to go back to the docs to write this command again).
//Saper
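For readers who keep having to look up the push-for-review command quoted above, here is a minimal sketch of a helper that assembles it. The function name and its defaults are hypothetical; only the "git push origin HEAD:refs/for/<branch>" form itself comes from this thread. The sketch builds the argument list and leaves running it to the caller.

```python
def review_push_command(branch="master", remote="origin"):
    """Build the argv for pushing HEAD to Gerrit for review.

    Mirrors the command quoted in the thread:
    git push origin HEAD:refs/for/master
    """
    return ["git", "push", remote, f"HEAD:refs/for/{branch}"]


print(" ".join(review_push_command()))
# git push origin HEAD:refs/for/master
```

The returned list can be passed to subprocess.run from inside a working copy; git-review, mentioned earlier in the thread, is intended to hide this command entirely.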
On 10/03/12 22:43, Marcin Cieslak wrote:
I am a seasoned developer. I only do it in my spare time and only when I am particularly annoyed about something not working in MediaWiki (mostly as a feedback from plwiki community or recently checkusers).
You can see my pitiful track record here:
https://www.mediawiki.org/wiki/Special:Code/MediaWiki/author/saper
Most of those things are urgent one-liners that got to be pushed to deployment very quickly and I never had any problem with getting it through the process. But I'm a small guy. So please do not treat my remarks below as something that would make life of people pushing hundreds of lines per week to MediaWiki more difficult.
I just don't like (as happened to me with git recently) having to learn the tool once again from scratch just because I haven't used it for the last 4 weeks or so.
I am using mercurial, git and starting to learn fossil. There is one change which is partially related to tools, partially to the distributed nature of git.
The fundamental change is something else to me: you lose the feeling of linearity. I like hg because it still tries to give me cosy, nice local version numbers (great when switching from SVN, you can even have your old SVN commit numbers stay after migration). But when I look at the gerrit interface (not gerrit's fault) I have no idea what was done before, what was done after, what's the history.
Yes, git is so powerful that it becomes fragile in itself. I end up with several clones and no idea about where they differ. Isn't there a way to compare them?
What happened to me on my very first day trying gerrit:
I got an email to merge a change (since somebody pushed something conflicting in between), so I duly issued some magic git commands and I got it pushed. However, when I came back to gerrit I ended up with an "empty" commit:
https://gerrit.wikimedia.org/r/#change,2916
I thought I did merge though! What happened? Looks like an empty commit got filed...
I tried poking around in gerrit to find out what happened, but I had no clue; only running "git log" locally revealed that my change was indeed merged by somebody else in the background.
I had a feeling I have 4 or more revisions flying around ("commits") and I could not relate them to each other. Only "git log" locally helped me to get out of the trouble.
Looking at this screen:
Interesting. I didn't know about it. Of course, it being #dashboard,103 instead of #dashboard,saper makes it impossible to edit the URL to find the changes by a given user. It isn't even possible to script downloading them all in a loop to retrieve the relationship between user and id, as gerrit is unusable without javascript.
Those two commits are related, but it's totally non-obvious that something follows up on something else. I have clicked on them and yes, I can find that 87f491132487313144e531354578ea2fbd3b42b4 is common to both of them. Oh, cool!
In comparison to this, the current Special:Code follow up revision system is easy, readable and very useful.
Oh, and by the way those I3577f029 and Ifb002160 are some identifiers totally unrelated to commits (I need to learn more about Change-Id vs. a commit... I promise I will - already got burned by missing pre-commit hook in the repo). And there is only date, not a timestamp to get some sense of linearity again.
I am really afraid I will be lost when my dashboard will have many more patches and merges.
It doesn't seem to list all the changes, only the recent ones. So it wouldn't be a replacement for http://www.mediawiki.org/wiki/Special:Code/MediaWiki/author/saper I guess it has to be done with git log --author=saper from now on.
Looking at this or any more complex git development tree makes me cry for linear revision numbers. At least I can find out what was before, what is after - sure it comes at a cost of potentially more difficult merging and branching, but let's be serious, how many "edit conflicts" do we have in the tree?
Only a few currently. But a non-linear merging based on review time will increase them. Still, revision numbers arbitrarily assigned by date of submission would be almost as good as our current system. But I don't think that's supported by gerrit.
- Introducing a new dedicated code-review tool (Gerrit)
That one is a habit change. It is a bit disturbing for the first week, just like any new web interface. We will eventually get used to it. I am sure people will easily adapt to the GUI and we will be there to assist.
As described previously in this thread I would describe myself as a simple git fetch/push/commit guy. No rebase, no cherry-pick yet. Maybe the world is still two-dimensional to me.
To solve problems with the above (simple fetch / push model, except for WAY TOO FANCY things like HEAD:refs/for/<some-branch>) I am told to refer to this:
https://www.mediawiki.org/wiki/Gerrit/resolve_conflict
six commands, including "git-review" I still don't have running. I am afraid to ask what's the non-git-review version of the commands in the guide :)
I agree, there are too many magic commands in that page.
Looking at this again:
https://gerrit.wikimedia.org/r/#change,2916
I have 28f176ca9ca3767bfa9f0ec219f5fa0c299c5761 and 87f491132487313144e531354578ea2fbd3b42b4 here (those are commits, fine) and Ifb002160485030496c7d3f2abc4991484b533648
Additionally there is this c64fd4488d2ea24e120acb15db413377494dd3b3 ("Patch Set 1") referring me to (gitweb), which calls it a "commit". Ah, and there is 1101a1b3fe7f4d1c29321157fc1ef9b9f3fb6ff0 as well.
Ouch, and there is this "refs/changes/16/2916/1" <-- the good thing is I can actually click on it in gitweb!
? That's not a link.
All this makes "MFT r111795, r111881, r111920, r112573, r112995, r113169" look pale in comparison. And I can actually click a link in [[Special:Code]], and go back and forth on followups, neat!
I proposed several times to bump git change numbers, so new ones don't "conflict" with svn ones. Ie. the number alone would allow you to point to gerrit or CodeReview, keeping a bit of consistency between models. Even the urls kind of match https://gerrit.wikimedia.org/r/123 for r123
Who cares about that? That r stands there for being a review system. I was told that zero effort was going to be made for that (they were unsure about the consequences of bumping the auto_increment, although there's little I can do about that) and to just talk about "change 1234" not "r1234".
It's obvious that c3000 comes after r115000, isn't it? ;)
The only real confusion to me in the current system is the role of the "1.19" and "1.19wmf1" tags. It took me a while to figure out they are something like github's pull request OR bumping a revision's priority for review, with "1.19wmf1" meaning "urgent!"
Well, their meaning is 'this should be merged to the 1.19/1.19wmf1 branch', but of course that means there's a greater urgency to review them.
- Introducing a gated-trunk model
I have a small confusion here:
1) Is it true that under the current process all unreverted stuff from SVN will eventually make it to the live site whenever a new "wmf" branch is created? I think lots of new stuff gets in this way, not necessarily by explicit merge into whatever the wmf branch du jour is. Yes, it creates a nice thrill in the community whenever we do this.
Yes. New wmf branches get branched from trunk, so they get everything in trunk at that point (there have been some small exceptions, with some unstable changes reverted in the branch, but not in trunk).
git shouldn't change that. You can branch from master as well.
2) My understanding until now was that WMF is going to run master in production. Why create another WMF branch then? So we will have a one step to push things to (review branch->master) and then (master->WMF) to deploy? This would be actually a double-gated deployment branch. Ouch.
Good point, if we are going to have that quick reviewing, wmf branches should either get all master commits, or have frequent merges from master.
I really think that tighter integration with bugtracker (so bug attachments end up in vcs review queue and commit comments can be seen as quasi-bugs) would be much more beneficial to users. I will try to see how it would have worked with systems like fossil for example and report back.
Indeed, that'd be an important feature.
And, having seen fossil, bzr and other systems, I just don't subscribe to the idea that a DVCS *must* have the user interface from hell. And I am not going to use git often enough to have all commands in my head (or even neatly scripted as some people suggest). I love the change that DVCSs bring, but not at the cost of needing to say "git push origin HEAD:refs/for/master" or something like that. (I had to go back to the docs to write this command again).
//Saper
On Sun, 11 Mar 2012 12:06:10 -0800, Platonides Platonides@gmail.com wrote:
On 10/03/12 22:43, Marcin Cieslak wrote:
I am using mercurial, git and starting to learn fossil. There is one change which is partially related to tools, partially to the distributed nature of git.
The fundamental change is something else to me: you lose the feeling of linearity. I like hg because it still tries to give me cosy, nice local version numbers (great when switching from SVN, you can even have your old SVN commit numbers stay after migration). But when I look at the gerrit interface (not gerrit's fault) I have no idea what was done before, what was done after, what's the history.
Yes, git is so powerful that it becomes fragile in itself. I end up with several clones and no idea about where they differ. Isn't there a way to compare them?
//Saper
You could designate a single one of them as your primary repo or create a new one just for the purpose of monitoring. On that repo add each of your clones as a remote. And then git fetch from them all. You should end up with foo/master, bar/master, etc... tracking branches and a gui that displays all the branches should be able to give you an idea of what commits are not in which repos.
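The approach described above can be sketched as a short script. This is only an illustration under stated assumptions: the function name is hypothetical, each clone's directory name is assumed to be unique and usable as a remote name, and every clone is assumed to have a master branch. The sketch only builds the git argument lists; they would be run inside the repository chosen for monitoring.

```python
from pathlib import Path


def compare_clones_commands(clones, branch="master"):
    """Return git argv lists that add each clone as a remote, fetch
    them all, and list the commits one clone has that another lacks."""
    names = [Path(p).name for p in clones]  # remote name = directory name
    cmds = [["git", "remote", "add", name, str(path)]
            for name, path in zip(names, clones)]
    cmds.append(["git", "fetch", "--all"])
    for a in names:
        for b in names:
            if a != b:
                # "A..B" selects commits reachable from B but not from A
                cmds.append(["git", "log", "--oneline",
                             f"{a}/{branch}..{b}/{branch}"])
    return cmds


for cmd in compare_clones_commands(["/tmp/foo", "/tmp/bar"]):
    print(" ".join(cmd))
```

Each git log line in the output answers one direction of the question, for example which commits bar/master has that foo/master does not.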
On 11/03/12 22:32, Daniel Friesen wrote:
You could designate a single one of them as your primary repo or create a new one just for the purpose of monitoring. On that repo add each of your clones as a remote. And then git fetch from them all. You should end up with foo/master, bar/master, etc... tracking branches and a gui that displays all the branches should be able to give you an idea of what commits are not in which repos.
Hmm, no. They are just different tries of starting from scratch for committing using git, with different metadata success, etc.
Platonides Platonides@gmail.com wrote:
The fundamental change is something else to me: you lose the feeling of linearity. I like hg because it still tries to give me cosy, nice local version numbers (great when switching from SVN, you can even have your old SVN commit numbers stay after migration). But when I look at the gerrit interface (not gerrit's fault) I have no idea what was done before, what was done after, what's the history.
Yes, git is so powerful that it becomes fragile in itself. I end up with several clones and no idea about where they differ. Isn't there a way to compare them?
Yeah, I have the same problem very often. Many times I got confused enough to save diffs and "rm -rf" the working directory.
Git is a mess to integrate with - it's almost impossible to have some API. And you can't have "hg rollback" - my favorite feature of Mercurial (no need to commit --amend or something).
Looking at this again:
https://gerrit.wikimedia.org/r/#change,2916
I have 28f176ca9ca3767bfa9f0ec219f5fa0c299c5761 and 87f491132487313144e531354578ea2fbd3b42b4 here (those are commits, fine) and Ifb002160485030496c7d3f2abc4991484b533648
Additionally there is this c64fd4488d2ea24e120acb15db413377494dd3b3 ("Patch Set 1") referring me to (gitweb), which calls it a "commit". Ah, and there is 1101a1b3fe7f4d1c29321157fc1ef9b9f3fb6ff0 as well.
Ouch, and there is this "refs/changes/16/2916/1" <-- the good thing is that I can actually click on it in gitweb!
? That's not a link.
Go to the gitweb screen
https://gerrit.wikimedia.org/r/gitweb?p=test/mediawiki/extensions/examples.g...
and you will see little pink patches (tags) like
https://gerrit.wikimedia.org/r/gitweb?p=test/mediawiki/extensions/examples.g...
those refs/changes/14/2914/1 are gerrit pointers to commits. (http://book.git-scm.com/7_git_references.html calls them "Git references")
I see also some people want to get rid of them at times: http://www.mailinglistarchive.com/html/repo-discuss@googlegroups.com/2010-05...
Unfortunately, I don't see them in my local repository. Why? How can you clone them?
All this makes "MFT r111795, r111881, r111920, r112573, r112995, r113169" look pale in comparison. And I can actually click a link in [[Special:Code]], and go back and forth on followups, neat!
I proposed several times to bump git change numbers, so new ones don't "conflict" with svn ones. Ie. the number alone would allow you to point to gerrit or CodeReview, keeping a bit of consistency between models. Even the urls kind of match https://gerrit.wikimedia.org/r/123 for r123
Who cares about that? That r stands there for being a review system. I was told that zero effort was going to be made for that (they were unsure about the consequences of bumping the auto_increment, although there's little I can do about that) and to just talk about "change 1234" not "r1234".
It's obvious that c3000 comes after r115000, isn't it? ;)
It looks like Gerrit's predecessor, Rietveld, works with SVN (yes!) and has a much nicer interface:
http://codereview.appspot.com/5727045/#ps1
It uses "Issues" as the basis for development (I think this is gerrit's "change"). Maybe we should use that instead (yes, I know, google app engine and stuff...)
I really think that tighter integration with the bugtracker (so bug attachments end up in the vcs review queue and commit comments can be seen as quasi-bugs) would be much more beneficial to users. I will try to see how it would have worked with systems like fossil for example and report back.
Indeed, that'd be an important feature.
And I think you have now enlightened me.
Maybe our workflow should be completely 'change' based and not 'commit' based.
Our [[Git/Workflow]] page does not say much about changesets.
Actually working on a changeset mini-branch (git fetch origin refs/changes/16/2916/1 && git checkout FETCH_HEAD) and then
git add ... && git commit -m "..." && git push origin HEAD:refs/changes/2916
would be a nice workflow; we probably wouldn't (or shouldn't) do rebasing then since theoretically many people can work on a changeset at the same time.
(I can't check how it works since my "git push" fails again on [remote rejected] HEAD -> refs/for/master (prohibited by Gerrit))
If "refs/changes/2916" also worked for fetch (right now you need "refs/changes/xx/2916/n"), it could be very interesting.
If only I could somehow fetch all the "refs/changes/*...." refs to my local git repository - does anybody know how to do this?
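One possible way, sketched under assumptions: a plain clone is configured with only the "+refs/heads/*:refs/remotes/origin/*" fetch refspec, which is why refs/changes/* never shows up locally; adding a second refspec makes "git fetch" bring those refs down too. The sketch below does not talk to gerrit.wikimedia.org. It uses a local bare repository ("server.git", a made-up stand-in) with one hand-made change ref reusing the numbers from this thread, and the local namespace "refs/remotes/origin/changes/" is just my choice.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the Gerrit server: a bare repo carrying one change ref.
git init -q --bare server.git
git -C server.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed -c user.name=Example -c user.email=example@example.org \
    commit -q --allow-empty -m "base"
git -C seed push -q ../server.git master
git -C seed -c user.name=Example -c user.email=example@example.org \
    commit -q --allow-empty -m "patch set 1"
git -C seed push -q ../server.git HEAD:refs/changes/16/2916/1

# A plain clone only maps refs/heads/*, so the change ref is not fetched.
git clone -q server.git local
cd local
before=$(git for-each-ref --format='%(refname)' refs/remotes/origin/changes)

# Add a second fetch refspec that also maps refs/changes/* locally.
git config --add remote.origin.fetch '+refs/changes/*:refs/remotes/origin/changes/*'
git fetch -q origin

# List the change refs that are now available locally.
change_refs=$(git for-each-ref --format='%(refname)' refs/remotes/origin/changes)
echo "$change_refs"
git log --oneline -1 origin/changes/16/2916/1
```

On a large Gerrit project this pulls every patch set of every change, so the first fetch may be slow; fetching a single ref as in the FETCH_HEAD example above stays cheaper.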
It also seems impossible to do code reviews offline, since we need to
ssh -p 29418 gerrit.wikimedia.org gerrit review <options>
It seems that we can use revision numbers sometimes:
ssh -p 29418 gerrit.wikimedia.org gerrit query 2714
but to review specific patchsets we need to be more specific:
ssh gerrit.wikimedia.org gerrit review eff577ea766db8dcb3952baf99ba8053fac0da8c --restore
So I think 'having stuff offline' is not an argument for gerrit at all.
I can't even see changesets/patchsets in my local git repository - it looks like I am living in a slightly different world with gerrit. Similar to what we have with bugzilla now.
//Saper
On 03/08/2012 06:21 AM, Antoine Musso wrote:
Diederik van Liere wrote:
- Migrating from a centralized source control system to a
decentralized system (SVN -> Git)
Decentralization itself is just a buzz word for the twitter guys. In the end, it does not change that much since most people have a reference repository. I guess most developers will use the WMF repository as a reference, or at the very least, all patches will eventually end up in the WMF repository.
We could imagine the WMF feature team using their own repository and then submitting a nice giant patch once in a while.
Even though most people will have a reference repository, decentralizing work into a zillion branches will take some getting used to. People will need to take time and learn to change their workflows.
- Introducing a gated-trunk model
We have been using a gated-trunk model for as long as I can remember.
No -- there's a big difference. In our current model (SVN and no gate between a committer and trunk), it's probable that revision n+70 depends on stuff in revision n, which hasn't been reviewed yet, which means that if revision n is a FIXME, then we have ugly ripple effects everywhere.
This happens way too often, and once we're used to not having it happen anymore, we will be so happy.
For this reason, in the short term we aim to eliminate the ability to push directly and bypass review. Everything will go through Gerrit, even if in some cases (as with individual translations) an automatic approver bot will auto-approve mergers to specific branches.
wikitech-l@lists.wikimedia.org