On 11-03-23 12:41 PM, Rob Lanphier wrote:
On Tue, Mar 22, 2011 at 7:46 PM, Tim Starling
<tstarling(a)wikimedia.org> wrote:
I'm not talking about the interfaces between
core and extensions,
which are reasonably stable. I'm mainly talking about the
interfaces which operate within and between core modules. These change
all the time. The problem of changing interfaces is most severe when
developers are working on different features within the same region of
core code.
I get that there are many unsupported, volatile interfaces between and
within components. However, if we're changing things around so much
that it really negates the benefit that Git or some other DVCS brings
us, then that's another conversation.
Branching also isn't the only advantage of a DVCS. If branching causes
too many problems, we can simply not use it.
Personally I don't use branches all that much and work with a dirty
working copy; git's staging area and `git gui` repeatedly save me a LOT
of trouble.
I develop on multiple servers, sometimes the prototype for a live site,
sometimes a dev instance of trunk. But I always commit from my home
computer; I don't trust any server with the private key that can commit
to Wikimedia's svn. This means I need to transfer my changes from the
server to my personal computer before committing.
((And before anyone says anything, there's no way I'm putting private
keys on servers operated by a third party, or doing development
on my local machine -- reconfiguring Apache and trying to get packaged
Apache and MySQL to NOT start up on my laptop except when wanted is a pain))
With svn, this workflow goes like so:
server$ # Edit some code
server$ svn up # make sure things are up to date
desktop$ svn up # make sure things are up to date
desktop$ svn diff | pager # look at the changes that will be pulled
desktop$ ssh server "cd /path/to/code; svn diff" | patch -p0 # abuse
ssh, svn diff, and patch to pipe changes to the local copy
desktop$ scp server:/path/to/code/newfile . # if I added a new file, I
need to randomly copy it
desktop$ svn diff | pager # take a look at the changes to double check
everything is in place
desktop$ svn revert [...] # I often work with a dirty working copy with
multiple projects mixed together, so I have to revert some unrelated
changes or manually list files in svn commit
desktop$ # sometimes those changes have two different projects in one
file and I have to commit only part of it
desktop$ svn diff somefile.php > tmp.patch # I found the easiest fix is
putting the changes into a patch
desktop$ svn revert somefile.php # erasing all local changes in the file
desktop$ nano tmp.patch # editing out the unrelated code changes from
the patchfile
desktop$ patch -p0 < tmp.patch # and re-applying the changes
desktop$ rm tmp.patch
desktop$ svn commit [...] # And finally I can commit
server$ svn up # update the server, things get /fun/ if you try to pull
patches a second time
;) Guess what: this is half of the reason why I commit small changes to
svn without testing them.
Now, here's my git workflow (I developed monaco-port in git so I have a
matching workflow):
server$ # Edit some code
server$ git gui # ssh -X, so git's gui works remotely too; it's a
lifesaver when partially committing changes
desktop$ git pull theserver master # pull the new commit to my local machine
desktop$ git push origin master # ;) and now push the change to the
public repo
Oh, and for a bonus, I have a script that pulls changes from all the
places I have monaco-port, pushes the changes to the public repo, then
has all those monaco-ports pull from the public repo to sync changes
everywhere.
If we split up
the extensions directory, each extension having its own
repository, then this will discourage developers from updating the
extensions in bulk. This affects both interface changes and general
code maintenance. I'm sure
translatewiki.net can set up a script to do
the necessary 400 commits per day, but I'm not sure if every developer
who wants to fix unused variables or change a core/extension interface
will want to do the same.
On the flip side, I'm not sure if every developer
who wants to make a
clone of a single extension on Github or Gitorious wants to use up
their quota getting the source of every single extension. Being able
to have public clone for pushing/pulling on a hosting service is a
large benefit of DVCS for our workflow; it means that no one even has
to ask us for permission before effectively operating as a normal
contributor. It would be unfortunate if we set up our repository in
such a way as to deter people from doing this.
Here's another good case for
extension-as-repository.
We branch extensions on release currently. I don't expect this to change
much after a switch to git, we'll probably type in a few quick commands
to create a new rel1_XX branch in each extension repo and push (however
I do contend that the higher visibility of branches in git might
convince a few more extension authors to check extension compatibility
with old versions and backport still-working changes more often).
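As a rough sketch of those "few quick commands", assuming one repo per
directory under extensions/ (the branch name rel1_17 and the layout are
hypothetical):

```shell
# Hypothetical: cut a release branch in every per-extension repo and
# push it to each repo's origin.
for repo in extensions/*/; do
  ( cd "$repo" &&
    git branch rel1_17 master &&   # create the release branch
    git push origin rel1_17 )      # publish it
done
```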
Extensions are largely used as-is in both trunk and stable. In other
words, we use and develop the same version of an extension whether the
phase3 around it is trunk or something like REL1_16. However we do not
always develop every extension at once. If every extension is in the
same repo we are forced to check out every extension (I'll skip how some
of us like to avoid this in the first place because of potential
security issues) and all of those extensions MUST be of the same branch.
This means that if you have a REL1_15 phase3 and you want to use some
extensions that try to be backwards compatible alongside others whose
trunk breaks on it, you are forced to use branched extensions for EVERY
extension, and you forgo the features of the extensions that DO try to
be compatible (I DO use both branched and trunk extensions on stable
non-development sites). This is even worse when developing. If you are
trying to develop extension compatibility for an old release, everything
MUST be trunk, since you can't really develop in the release branch. As
a result, any other extension you are using that doesn't have REL1_15
compatibility will end up causing unrelated fatal errors, hampering your
compatibility testing.
I'll also contend that backporting commits to release branches for
specific extensions, where some of those commits include en-masse code
style changes and TWN commits, will be easier in a model where each
extension has its own repo. In a shared repo I have a feeling that the
shared nature of multiple extensions being modified in a single commit
could cause some conflicts when you try to apply a series of commits to
a branch but ONLY want the portions of those commits specific to one
extension (in this case one file path) to be committed.
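For what it's worth, one way to work around that in a shared repo is to
apply only the relevant path of each commit by hand; a hedged sketch,
where the branch name, extension path, and $SHA are all placeholders:

```shell
# Hypothetical: take only the extensions/MyExtension/ part of commit
# $SHA and commit it onto the release branch.
git checkout rel1_16
git diff "$SHA^" "$SHA" -- extensions/MyExtension/ | git apply --index
git commit -m "Backport MyExtension portion of $SHA"
```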
We don't need to switch to the one extension per
repo model right
away, though. We could throw all of the extensions into a single
repository at first, and then split it later if we run into this or
other similar problems.
Perhaps this is possible, though I think we might want to double-check
that splitting IS possible. Someone else might have better luck, but I
don't remember splitting a git repo into multiple repos, changing the
paths of all the files in the process, being an easy thing.
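For the record, git can do this kind of split, though it means rewriting
history; a sketch using `git filter-branch`, where the repo path and
extension name are hypothetical:

```shell
# Clone the shared repo, then rewrite it so only one extension's history
# remains, with that directory promoted to the repository root.
git clone /path/to/extensions.git MyExtension
cd MyExtension
git filter-branch --prune-empty --subdirectory-filter MyExtension -- --all
```

Each rewritten commit keeps only the changes under MyExtension/, so
per-file history survives even though every path changes.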
Though I contend that, rather than either of those options, the idea of
starting out with just phase3 is best. After that, when we want to do
extensions, we can set up the infrastructure that would handle extensions
in git and try it out on a few brand-new extensions rather than throwing
500+ extensions into the fray. We can also try moving just a few
extensions for more experimentation. Experimenting with a few actively
developed extensions would be better than throwing in hundreds of
extensions without many commits right away.
I don't
know enough about Git to know if these things are really an
argument against it. This is just a comment on the ideas in this thread.
I'm
not so hellbent on deploying Git that I would push a hasty
deployment over reasonable, unmitigated objections. It's fair to wait
until some time after we're done with 1.17, and wait until we've
figured out how this will work with Translatewiki.
Me neither, though it would help resolve the issues with my own workflow.
However, I do believe it would be good to discuss how it would be
implemented, do a proof-of-concept implementation, and get things
working as a test; after we've tried out the workflow we can scrap the
test data and do a real deployment.
There are some things to do, like figuring out how to handle the items
on the [[Git conversion]] page, and deciding how to set up the server.
Do we keep the model where we need to send request e-mails for commit
access? Or do we try using prior art in git farming, i.e. setting up a
copy of Gitorious for ourselves? Some of these things get in the way of
others. It's hard to build a tool for TWN when we don't even have
extension repos to build it on.
We can build a functional prototype before we even decide to deploy.
Rob
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]