On 11-03-23 12:41 PM, Rob Lanphier wrote:
On Tue, Mar 22, 2011 at 7:46 PM, Tim Starlingtstarling@wikimedia.org wrote:
I'm not talking about the interfaces between core and extensions, which are reasonably stable. I'm mainly talking about the interfaces which operate within and between core modules. These change all the time. The problem of changing interfaces is most severe when developers are working on different features within the same region of core code.
I get that there are many unsupported, volatile interfaces between and within components. However, if we're changing things around so much that it really negates the benefit that Git or some other DVCS brings us, then that's another conversation.
Branching also isn't the only advantage of a DVCS. If branching causes too many problems, we can just not use it. Personally I don't use branches all that much and work with a dirty working copy; git's staging area and `git gui` repeatedly save me a LOT of trouble.
I develop on multiple servers, sometimes the prototype for a live site, sometimes a dev instance of trunk. But I always commit from my home computer; I don't trust any server with the private key that can commit to Wikimedia's svn. This means I need to transfer my changes from the server to my personal computer before committing. ((And before anyone says anything, there's no way I'm putting private keys on servers operated by a 3rd party, or doing development on my local machine -- reconfiguring apache and trying to get packaged apache and mysql to NOT start up on my laptop except when wanted is a pain.))

In svn this is a workflow like so:

  server$  # Edit some code
  server$  svn up  # make sure things are up to date
  desktop$ svn up  # make sure things are up to date
  desktop$ svn diff | pager  # look at the changes that will be pulled
  desktop$ ssh server "cd /path/to/code; svn diff" | patch -p0  # abuse ssh, svn diff, and patch to pipe changes to the local copy
  desktop$ scp server:/path/to/code/newfile .  # if I added a new file, I need to copy it over manually
  desktop$ svn diff | pager  # take a look at the changes to double check everything is in place
  desktop$ svn revert [...]  # I often work with a dirty working copy with multiple projects mixed together, so I have to revert some unrelated changes or manually list files in svn commit
  desktop$ # sometimes those changes have two different projects in one file and I have to commit only part of it
  desktop$ svn diff somefile.php > tmp.patch  # I found the easiest fix is putting the changes into a patch
  desktop$ svn revert somefile.php  # erasing all local changes in the file
  desktop$ nano tmp.patch  # editing out the unrelated code changes from the patchfile
  desktop$ patch -p0 < tmp.patch  # and re-applying the changes
  desktop$ rm tmp.patch
  desktop$ svn commit [...]  # And finally I can commit
  server$  svn up  # update the server; things get /fun/ if you try to pull patches a second time
;) Guess what: this is half the reason why I commit small changes to svn without testing them.
Now, here's my git workflow (I developed monaco-port in git, so I have a matching workflow):

  server$  # Edit some code
  server$  git gui  # ssh -X, so git's gui works remotely too; it's a lifesaver when partially committing changes
  desktop$ git pull theserver master  # pull the new commit to my local machine
  desktop$ git push origin master  # ;) and now push the change to the public repo
Oh, and for a bonus, I have a script that pulls changes from all the places I have monaco-port, pushes the changes to the public repo, then has all those monaco-ports pull from the public repo to sync changes everywhere.
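Roughly along these lines; the hostnames and paths here are placeholders for illustration, not the actual script:

  #!/bin/sh
  # Hypothetical version of my sync script; hosts and paths are made up.
  HOSTS="host1 host2 host3"

  # Pull the latest commits from every deployment into this clone...
  for host in $HOSTS; do
      git pull "ssh://$host/path/to/monaco-port" master
  done

  # ...push the combined result to the public repo...
  git push origin master

  # ...then have every deployment sync from the public repo.
  for host in $HOSTS; do
      ssh "$host" "cd /path/to/monaco-port && git pull origin master"
  done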
If we split up the extensions directory, each extension having its own repository, then this will discourage developers from updating the extensions in bulk. This affects both interface changes and general code maintenance. I'm sure translatewiki.net can set up a script to do the necessary 400 commits per day, but I'm not sure if every developer who wants to fix unused variables or change a core/extension interface will want to do the same.
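(For what it's worth, that bulk-commit step is scriptable; a rough sketch, assuming one checkout per extension repo under extensions/ -- this is not an existing tool:)

  for repo in extensions/*/; do
      (cd "$repo" &&
       git add -A &&
       git commit -m "Localisation updates from translatewiki.net" &&
       git push origin master)
  done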
On the flip side, I'm not sure if every developer who wants to make a clone of a single extension on Github or Gitorious wants to use up their quota getting the source of every single extension. Being able to have a public clone for pushing/pulling on a hosting service is a large benefit of DVCS for our workflow; it means that no one even has to ask us for permission before effectively operating as a normal contributor. It would be unfortunate if we set up our repository in such a way as to deter people from doing this.
Here's another good case for extension-as-repository.
We branch extensions on release currently. I don't expect this to change much after a switch to git; we'll probably type in a few quick commands to create a new rel1_XX branch in each extension repo and push (see the sketch below; though I do contend that the higher visibility of branches in git might convince a few more extension authors to check extension compatibility with old versions and backport working changes more often).

Extensions are used as-is in both trunk and stable. In other words, we use and develop the same version of an extension whether the phase3 around it is trunk or something like REL1_16. However, we do not always develop every extension at once. If every extension is in the same repo, we are forced to check out every extension (I'll skip how some of us like to avoid this in the first place because of potential security issues), and all of those extensions MUST be on the same branch. This means that if you have a REL1_15 phase3, and you want to use some extensions that try to be backwards compatible and others whose trunk breaks, you are forced to use branched extensions for EVERY extension, and you forgo the features of the extensions that DO try to be compatible (I DO use both branched and trunk extensions on stable, non-development sites).

This is even worse when developing. If you are trying to develop extension compatibility for an old release, everything MUST be trunk, since you can't really develop in the release branch. As a result, any other extension you are using that doesn't have 1.15 compatibility will end up causing unrelated fatal errors, hampering your compatibility testing.
I'll also contend that backporting commits to release branches for specific extensions, where some of those commits include en-masse code style changes and twn commits, will be easier in a model where each extension has its own repo. In a shared repo, I have a feeling that the shared nature of multiple extensions being modified in a single commit could cause conflicts when you try to apply a series of commits to a branch but ONLY want the portions of those commits specific to one extension (in this case one file path) to be committed.
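To illustrate: in a per-extension repo the backport is a plain git cherry-pick, while in a shared repo you'd have to filter each commit down to one path yourself; something like this, with <commit> and the extension path standing in as placeholders:

  git checkout REL1_16
  # Take only the part of <commit> that touches extensions/Foo/,
  # stage it, and commit reusing the original commit message.
  git diff <commit>^ <commit> -- extensions/Foo/ | git apply --index
  git commit -C <commit>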
We don't need to switch to the one extension per repo model right away, though. We could throw all of the extensions into a single repository at first, and then split it later if we run into this or other similar problems.
Perhaps this is possible, though I think we might want to double check that splitting IS possible. Someone else might have better luck, but in my experience splitting a git repo into multiple repos, rewriting the paths of all the files in the process, is not an easy thing.
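If memory serves, git filter-branch does have machinery for this kind of split; a minimal sketch, with the repo path and extension name assumed, and untested at the scale of 500+ extensions:

  # Clone the shared repo, then rewrite history so that
  # extensions/Foo/ becomes the root of the new repo.
  git clone /path/to/extensions-repo Foo-split
  cd Foo-split
  git filter-branch --subdirectory-filter extensions/Foo -- --all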
Though I contend that, rather than either of those options, starting out with just phase3 is best. After that, when we want to do extensions, we can set up the infrastructure that would handle extensions in git and try it out on a few brand-new extensions rather than throwing 500+ extensions into the fray. We can also try moving just a few existing extensions for more experimentation. Experimenting with a few actively developed extensions would be better than immediately throwing in hundreds of extensions that see few commits.
I don't know enough about Git to know if these things are really an argument against it. This is just a comment on the ideas in this thread.
I'm not so hellbent on deploying Git that I would push a hasty deployment over reasonable, unmitigated objections. It's fair to wait until some time after we're done with 1.17, and wait until we've figured out how this will work with Translatewiki.
Me neither, though it would help resolve the issues with my own workflow.
However, I do believe it would be good to discuss how it would be implemented, do some proof-of-concept implementation, and get things working as a test before scrapping the test data and doing a real deployment once we've tried out the workflow. There are things to figure out, like the items on the [[Git conversion]] page, and deciding how to set up the server. Do we keep the model where we need to send request e-mails for commit access? Or do we try using prior art in git farming, i.e. setting up a copy of Gitorious for ourselves? Some of these things get in the way of others: it's hard to build a tool for TWN when we don't even have extension repos to build it on. We can build a functional prototype before we even decide to deploy.
Rob
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]