On Mon, Mar 26, 2012 at 9:50 PM, Tim Starling tstarling@wikimedia.org wrote:
For commits with lots of files, Gerrit's diff interface is too broken to be useful. It does not provide a compact overview of the change which is essential for effective review.
Luckily, there are alternatives, specifically local git clients and gitweb. However, these don't work when git's change model is broken by the use of git commit --amend.
They do; it just wasn't obvious to you how to do it, but that doesn't mean it can't be done.
$ git fetch https://gerrit.wikimedia.org/r/p/analytics/udp-filters refs/changes/22/3222/3 && git branch foo FETCH_HEAD $ git fetch https://gerrit.wikimedia.org/r/p/analytics/udp-filters refs/changes/22/3222/4 && git branch bar FETCH_HEAD $ git diff foo..bar
The two 'git fetch' commands (or at least the part before the &&) can be taken from the change page in Gerrit.
For commits with a small number of files, such changes are reviewable by the use of the "patch history" table in the diff views. But when there are a large number of files, it becomes difficult to find the files which have changed, and if there are a lot of changed files, to produce a compact combined diff.
So if there are no objections, I'm going to change [[Git/Workflow]] to restrict the recommended applications of "git commit --amend", and to recommend plain "git commit" as an alternative. A plain commit seems to work just fine. It gives you a separate commit to analyse with Gerrit, gitweb and client-side tools, and it provides a link to the original change in the "dependencies" section of the change page.
I have mixed feelings towards this. One time at the office, over lunch, I was bitching about how Gerrit used amends instead of real branches, and after discussing this for 15 minutes we felt like we'd just reverse-engineered the Gerrit developers' decision process (RobLa: "I think we just figured out why the Gerrit people did it this way.")
There are several advantages to using Gerrit the way it's meant to be used (with amends rather than followup commits): * Review comments of one logical change are centralized, rather than being scattered across multiple changes (this is what I said I'd do differently if I was writing a code review system now) * Conflicts must be resolved by rebasing the conflicted topic branch onto the HEAD of master, so there aren't any merge commits containing conflict resolutions unless you deliberately create them (and someone else approves them). I imagine I'd be quite annoyed/confused if I was using git bisect to track down a bug and it blamed a complex merge commit that had conflicts * Every commit that ends up being merged into master is "clean", in that it's been approved and passes the tests. This is the entire point of continuous integration / pre-commit review / gated trunk / etc., and it's a major advantage because: ** you can rewind master to any point in history and it'll be in a sane state ** you can start a branch (including a deployment branch) off any point in master and be confident that that's a reasonably sane branch point ** if you find subtle breakage later, you can use git bisect on master to find out where it broke, and there will not be any botched intermediate states confusing bisect (e.g. if there's a commit somewhere that breaks half the code and you merge it in together with a followup commit that fixes it, bisect might find that commit and wrongly blame it for the breakage you're looking for; this probability increases as merged-in feature branches get longer)
Of course you can't blindly place absolute trust in any random commit on master, but at least you know that it 1) has been reviewed as a unit and approved, and once we have better Jenkins integration you'll know that 2) it passed the lint checks and 3) it passed the tests as they existed at the time. Approving commits that are broken because you know they were fixed later in some follow-up commit somewhere violates this entire model, and it exposes you to the danger of merging the broken commit but not the follow-up, if the follow-up is held up for some reason. Fortunately, once we have proper Jenkins integration, this will not be possible because Jenkins will mark the broken commit as having failed the tests, and you will not be able to approve and merge it without manually overriding Jenkins's V:-1 review.
An unrelated disadvantage of "stacked changes" in Gerrit (i.e. changes that depend on other unmerged changes) is that if B.1 (change B patchset 1) depends on A.1, and someone amends A.1 later to produce A.2, B.1 will still depend on A.1 and will need to be rebased to depend on A.2 instead. I've done this before with a stack of four commits (amended B and had to rebase C and D), and I can tell you it's not fun. I think I've figured out a trick for it now (use an interactive rebase from the top of the stack), but it's not intuitive and I've never tried it.
That said, if you have multiple commits that are related/dependent but both represent valid, sane, non-broken states (e.g. you introduce a function, then use it somewhere), then by all means chain them together. I'll +1 Trevor there, amending is probably not a good idea if you're adding lots of changes. And if all else fails, you can just abandon the intermediate changes, squash them together into one omnibus change, and submit that for review instead.
As for Amir's question about git-review: git-review warns you against stacked changes, because it's considered an anti-pattern in Gerrit. However, the warning ends with a question like "Are you sure this is what you meant to do?", and if you type "yes" it'll continue. You can also suppress this warning with the -y flag.
Roan