On Tue, Mar 22, 2011 at 7:46 PM, Tim Starling <tstarling@wikimedia.org> wrote:
On 23/03/11 04:24, Rob Lanphier wrote:
The most convincing general Subversion->DVCS argument I've read is here: http://hginit.com/00.html
This argument refers to Mercurial, but the same benefits apply to Git.
The article seems quite biased.
That doesn't mean it's wrong, and no one implied it was objective.
The tone is quite different to one of the first things I read about Mercurial:
"Oops! Mercurial cut off your arm!"
"Don't randomly try stuff to see if it'll magically fix it. Remember what you stand to lose, and set down the chainsaw while you still have one good arm."
Those quotes apply to any version control system, or for that matter, any system (randomly trying stuff and praying for magic rarely seems like good advice). The main guidance that relates to SVN vs Mercurial was the fact that Mercurial doesn't leave conflict markers. Git leaves conflict markers just like Subversion. There are also a couple of bits of guidance about Mercurial Queues, which is a very popular add-on to Mercurial that behaves in ways that are pretty specific to Mercurial (and is a chainsaw and a handgun all in one). The Git equivalent is Quilt, but most Git users don't use Quilt, because Git has some core functionality (rebase, ability to delete branches) that makes such an add-on less interesting.
Git rebase is a beast of its own, and there are many arguments pro and con about its use:
Pro: http://darwinweb.net/articles/the-case-for-git-rebase
Con: http://changelog.complete.org/archives/586-rebase-considered-harmful
Fervent Mercurial advocates are also a fine source of more "con" material for the git rebase option.
We will probably need to adopt some guidelines about the use of rebase assuming we move to Git.
The main argument is that merging is easy so you can branch without the slightest worry. I think this is an exaggeration. Interfaces change, and when they change, developers change all the references to those interfaces in the code which they can see in their working copy. The greater the time difference in the branch points, the more likely it is that your new code will stop working. As the branch point gap grows, merging becomes more a task of understanding the interface changes and rewriting the code, than just repeating the edits and copying in the new code.
Yes, merging is hard. Subversion is particularly bad at it. Git and Mercurial are both much, much better. That doesn't mean they are flawless, but they tend to work pretty well.
I'm not talking about the interfaces between core and extensions, which are reasonably stable. I'm mainly talking about the interfaces which operate within and between core modules. These change all the time. The problem of changing interfaces is most severe when developers are working on different features within the same region of core code.
I get that there are many unsupported, volatile interfaces between and within components. However, if we're changing things around so much that it really negates the benefit that Git or some other DVCS brings us, then that's another conversation.
Doing regular reintegration merges from trunk to development branches doesn't help; it just means that you get the interface changes one at a time, instead of in batches.
Having a short path to trunk means that the maximum amount of code is visible to the developers who are doing the interface changes, so it avoids the duplication of effort that occurs when branch maintainers have to understand and account for every interface change that comes through.
We can still have the rough equivalent of the big happy trunk where everything goes in prior to review.
If we split up the extensions directory, each extension having its own repository, then this will discourage developers from updating the extensions in bulk. This affects both interface changes and general code maintenance. I'm sure translatewiki.net can set up a script to do the necessary 400 commits per day, but I'm not sure if every developer who wants to fix unused variables or change a core/extension interface will want to do the same.
On the flip side, I'm not sure if every developer who wants to make a clone of a single extension on Github or Gitorious wants to use up their quota getting the source of every single extension. Being able to have a public clone for pushing/pulling on a hosting service is a large benefit of DVCS for our workflow; it means that no one even has to ask us for permission before effectively operating as a normal contributor. It would be unfortunate if we set up our repository in such a way as to deter people from doing this.
We don't need to switch to the one extension per repo model right away, though. We could throw all of the extensions into a single repository at first, and then split it later if we run into this or other similar problems.
I don't know enough about Git to know if these things are really an argument against it. This is just a comment on the ideas in this thread.
I'm not so hellbent on deploying Git that I would push a hasty deployment over reasonable, unmitigated objections. It's fair to wait until some time after we're done with 1.17, and wait until we've figured out how this will work with Translatewiki.
Rob