On Tue, Mar 22, 2011 at 7:46 PM, Tim Starling <tstarling(a)wikimedia.org> wrote:
On 23/03/11 04:24, Rob Lanphier wrote:
The most convincing general Subversion->DVCS argument I've read is here:
http://hginit.com/00.html
This argument refers to Mercurial, but the same benefits apply to Git.
The article seems quite biased.
That doesn't mean it's wrong, and no one implied it was objective.
The tone is quite different to one of the first things I read about
Mercurial:
"Oops! Mercurial cut off your arm!"
"Don't randomly try stuff to see if it'll magically fix it. Remember
what you stand to lose, and set down the chainsaw while you still have
one good arm."
https://developer.mozilla.org/en/Mercurial_basics
Those quotes apply to any version control system, or for that matter,
any system (randomly trying stuff and praying for magic rarely seems
like good advice). The main guidance that relates to SVN vs Mercurial
was the fact that Mercurial doesn't leave conflict markers. Git
leaves conflict markers just like Subversion. There are also a couple
of bits of guidance about Mercurial Queues, which is a very popular
add-on to Mercurial that behaves in ways that are pretty specific to
Mercurial (and is a chainsaw and a handgun all in one). The Git
equivalent is Quilt, but most Git users don't use Quilt, because Git
has some core functionality (rebase, ability to delete branches) that
makes such an add-on less interesting.
Git rebase is a beast of its own, and there are many arguments pro and
con about its use:
Pro:
http://darwinweb.net/articles/the-case-for-git-rebase
Con:
http://changelog.complete.org/archives/586-rebase-considered-harmful
Fervent Mercurial advocates are also a fine source of more "con"
material for the git rebase option.
We will probably need to adopt some guidelines about the use of rebase
assuming we move to Git.
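
As a rough illustration of why rebase tends to need guidelines, here is a
minimal Python sketch. It is a toy model, not Git itself: the `commit` and
`build` helpers and the change names are made up for illustration. The only
assumption borrowed from Git is that a commit's id depends on its parent, so
replaying the same changes on a new base produces new ids.

```python
import hashlib


def commit(parent, change):
    """Toy commit id: a short hash of the parent id and the change text."""
    return hashlib.sha1(f"{parent}:{change}".encode()).hexdigest()[:8]


def build(base, changes):
    """Apply changes in order on top of base; return the new commit ids."""
    ids = []
    for change in changes:
        base = commit(base, change)
        ids.append(base)
    return ids


root = commit("", "initial import")
trunk = build(root, ["trunk fix 1", "trunk fix 2"])
feature_changes = ["feature step 1", "feature step 2"]
feature = build(root, feature_changes)

# Merge: one new commit with two parents; the feature ids are kept as-is.
merge_id = commit(f"{trunk[-1]}+{feature[-1]}", "merge feature into trunk")
history_after_merge = trunk + feature + [merge_id]

# Rebase: the same changes replayed on the trunk tip, so every id changes.
rebased = build(trunk[-1], feature_changes)

print("feature ids kept by merge:", set(feature) <= set(history_after_merge))
print("feature ids kept by rebase:", bool(set(feature) & set(rebased)))
```

In this model the merge keeps the original feature commits, while the rebase
replaces them with new ones. Anyone who had already pulled the original
feature commits would then hold history that no longer matches, which is the
usual reason given for restricting rebase to unpublished work.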
The main argument is that merging is easy so you can branch without
the slightest worry. I think this is an exaggeration. Interfaces
change, and when they change, developers change all the references to
those interfaces in the code which they can see in their working copy.
The greater the time difference in the branch points, the more likely
it is that your new code will stop working. As the branch point gap
grows, merging becomes more a task of understanding the interface
changes and rewriting the code, than just repeating the edits and
copying in the new code.
Yes, merging is hard. Subversion is particularly bad at it. Git and
Mercurial are both much, much better. That doesn't mean they are
flawless, but they tend to work pretty well.
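
To make the interface-change concern concrete, here is a minimal Python
sketch. It assumes a simplified file-level three-way merge rather than the
line-level algorithms Git or Mercurial actually use, and the file names and
the `save` function are hypothetical. It shows a merge that completes with no
conflicts while the branch's new code still calls the old interface.

```python
def three_way_merge(base, ours, theirs):
    """Merge file contents per path; return (merged, conflicting_paths)."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree (or both removed the file)
        elif o == b:
            result = t          # only their side changed this file
        elif t == b:
            result = o          # only our side changed this file
        else:
            conflicts.append(path)
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


base = {"core.py": "def save(page): ...", "caller.py": "save(page)"}

# Trunk changes the interface and updates every caller it can see.
trunk = {"core.py": "def save(page, user): ...",
         "caller.py": "save(page, user)"}

# The branch adds a new caller written against the old interface.
branch = {**base, "feature.py": "save(page)"}

merged, conflicts = three_way_merge(base, branch, trunk)

# Files that still call save() with the old one-argument signature.
stale = [path for path, src in merged.items() if src == "save(page)"]

print("conflicts:", conflicts)
print("stale callers:", stale)
```

The merge tool reports nothing to resolve, yet `feature.py` is broken after
the merge. A better merge algorithm does not help with this case; only
someone who understands the interface change can fix the caller.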
I'm not talking about the interfaces between core and extensions,
which are reasonably stable. I'm mainly talking about the
interfaces which operate within and between core modules. These change
all the time. The problem of changing interfaces is most severe when
developers are working on different features within the same region of
core code.
I get that there are many unsupported, volatile interfaces between and
within components. However, if we're changing things around so much
that it really negates the benefit that Git or some other DVCS brings
us, then that's another conversation.
Doing regular reintegration merges from trunk to development branches
doesn't help; it just means that you get the interface changes one at
a time, instead of in batches.
Having a short path to trunk means that the maximum amount of code is
visible to the developers who are doing the interface changes, so it
avoids the duplication of effort that occurs when branch maintainers
have to understand and account for every interface change that comes
through.
We can still have the rough equivalent of the big happy trunk where
everything goes in prior to review.
If we split up the extensions directory, each extension having its own
repository, then this will discourage developers from updating the
extensions in bulk. This affects both interface changes and general
code maintenance. I'm sure translatewiki.net can set up a script to do
the necessary 400 commits per day, but I'm not sure if every developer
who wants to fix unused variables or change a core/extension interface
will want to do the same.
On the flip side, I'm not sure if every developer who wants to make a
clone of a single extension on Github or Gitorious wants to use up
their quota getting the source of every single extension. Being able
to have a public clone for pushing/pulling on a hosting service is a
large benefit of DVCS for our workflow; it means that no one even has
to ask us for permission before effectively operating as a normal
contributor. It would be unfortunate if we set up our repository in
such a way as to deter people from doing this.
We don't need to switch to the one extension per repo model right
away, though. We could throw all of the extensions into a single
repository at first, and then split it later if we run into this or
other similar problems.
I don't know enough about Git to know if these things are really an
argument against it. This is just a comment on the ideas in this thread.
I'm not so hellbent on deploying Git that I would push a hasty
deployment over reasonable, unmitigated objections. It's fair to wait
until some time after we're done with 1.17, and wait until we've
figured out how this will work with Translatewiki.
Rob