On Wed, Oct 5, 2011 at 1:59 PM, Brion Vibber brion@pobox.com wrote:
On Oct 5, 2011 1:03 PM, "Platonides" Platonides@gmail.com wrote:
I know about ControlMaster (which only those of us using ssh+git can benefit from), but just launching a new process and waiting to see if there's something new will slow things down. OTOH git skips the "recurse everything locking all subfolders" step, so it may be equivalent.
Maybe there's some way of fetching updates from aggregate repositories all at once and I am just worrying when everything is already solved, though.
Submodules may actually work well for this, as long as something propagates the ext updates to the composite repo. The checked-out commit id of each submodule is stored in the tree, so if no changes were seen from that one containing repo it shouldn't have to pull anything from the submodule's repo.
(Not yet tested)
Ok, did some quick tests fetching updates for 16 repos sitting on my Gitorious account.
Ping round-trip from my office desktop to Gitorious's server is 173ms, so the theoretical *absolute best* possible time, with a single round-trip for each of the 16 repos, is 2-3 seconds.
Running a simple loop of 'git fetch' over each repo (auth'ing with my ssh key, passphrase already provided) takes 53 seconds (about 3 seconds per repo). This does a separate ssh setup & poke into git for each repo.
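For reference, a minimal sketch of the kind of loop timed above. The function name and the injectable `run` parameter are illustrative assumptions, not the script actually used for the test:

```python
import subprocess
import time
from typing import Callable, Iterable


def fetch_each(repo_dirs: Iterable[str],
               run: Callable[..., object] = subprocess.run) -> float:
    """Run `git fetch` in every repo directory, one after another.

    Returns the elapsed wall-clock time in seconds. Every iteration
    starts a fresh git process (and, without ControlMaster, a fresh
    ssh session), which is where the per-repo cost comes from.
    """
    start = time.monotonic()
    for repo_dir in repo_dirs:
        # check=True raises CalledProcessError if a fetch fails.
        run(["git", "fetch"], cwd=repo_dir, check=True)
    return time.monotonic() - start
```

Calling `fetch_each(["ext/Foo", "ext/Bar"])` (hypothetical paths) would fetch both checkouts in sequence and return the total time.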
Clearly unacceptable for 600+ extensions. :)
Turning on ControlMaster and starting a long-running git clone in the background, then running the same 'git fetch' loop brought the time down to about 10 seconds (<1s per repo). ControlMaster lets those looped 'git fetch's piggyback on the existing SSH connection, but each one still has to start up git and run several round-trips.
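One possible way to get the same connection sharing from a script, sketched under assumptions: instead of keeping the master connection alive with a background clone as in the test above, this uses ssh's ControlPersist option, and it passes the options through the GIT_SSH_COMMAND environment variable, which needs a Git version that supports it. The function names and the socket directory are hypothetical.

```python
import os
import subprocess


def ssh_command_with_control_master(socket_dir: str,
                                    persist_seconds: int = 60) -> str:
    """Build an ssh command line that shares one connection per host.

    Assumes socket_dir exists and contains no spaces.
    """
    # %r, %h and %p are expanded by ssh to remote user, host and port.
    control_path = os.path.join(socket_dir, "cm-%r@%h:%p")
    return (
        "ssh"
        " -o ControlMaster=auto"
        f" -o ControlPath={control_path}"
        f" -o ControlPersist={persist_seconds}"
    )


def fetch_with_shared_connection(repo_dir: str, socket_dir: str) -> None:
    """Run `git fetch` with ssh connection sharing enabled."""
    env = dict(os.environ,
               GIT_SSH_COMMAND=ssh_command_with_control_master(socket_dir))
    subprocess.run(["git", "fetch"], cwd=repo_dir, env=env, check=True)
```

This only removes the ssh setup cost; each fetch still starts git and makes its own round-trips.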
Better, but still doesn't scale to hundreds of extensions: several minutes for a null update is too frustrating!
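To put rough numbers on that, here is a back-of-the-envelope extrapolation from the 16-repo timings above to 600 repos. It assumes the cost scales linearly with the number of repos, which is an assumption rather than something measured:

```python
MEASURED_REPOS = 16   # repos in the test above
TARGET_REPOS = 600    # rough number of extensions


def extrapolate(total_seconds: float,
                measured_repos: int = MEASURED_REPOS,
                target_repos: int = TARGET_REPOS) -> float:
    """Scale a measured total linearly to a larger number of repos."""
    return total_seconds / measured_repos * target_repos


if __name__ == "__main__":
    timings = [("plain 'git fetch' loop", 53.0),
               ("with ControlMaster", 10.0)]
    for label, total in timings:
        minutes = extrapolate(total) / 60
        print(f"{label}: {minutes:.1f} minutes for {TARGET_REPOS} repos")
```

Under that assumption the plain loop lands at roughly 33 minutes and the ControlMaster variant at about 6 minutes for a null update.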
Checking them out as submodules via 'git submodule add' and then issuing a single 'git submodule update' command takes... 0.15 seconds. Nice!
Looks like it does indeed see that there are no changes, so nothing has to be pulled from the upstream repos. Good!
The downside is that maintaining submodules means constantly pushing commits to the containing repo so it knows there are updates. :(
Probably the most user-friendly way to handle this is with a wrapper script that can do a single query to fetch the current branch head positions of a bunch of repos, then does fetch/pull only on the ones that have changed.
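A rough sketch of what such a wrapper could look like. The "single query" is left abstract here: `remote_heads` is assumed to be a mapping from repo name to current head commit id that some hypothetical service returns in one request. The function names and the default tracking ref are also assumptions.

```python
import subprocess
from typing import Dict, List


def tracking_head(repo_dir: str,
                  ref: str = "refs/remotes/origin/master") -> str:
    """Commit id the local remote-tracking ref points at.

    Raises CalledProcessError if the ref does not exist.
    """
    result = subprocess.run(
        ["git", "rev-parse", ref],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def repos_needing_fetch(remote_heads: Dict[str, str],
                        local_heads: Dict[str, str]) -> List[str]:
    """Names of repos whose remote head differs from the local one.

    A repo with no local entry counts as changed.
    """
    return sorted(
        name for name, commit in remote_heads.items()
        if local_heads.get(name) != commit
    )


def update_changed(remote_heads: Dict[str, str],
                   repo_dirs: Dict[str, str]) -> List[str]:
    """Fetch only the checked-out repos whose head moved upstream."""
    # Ignore repos that are listed upstream but not checked out locally.
    known = {name: remote_heads[name]
             for name in repo_dirs if name in remote_heads}
    local = {name: tracking_head(repo_dirs[name]) for name in known}
    changed = repos_needing_fetch(known, local)
    for name in changed:
        subprocess.run(["git", "fetch"], cwd=repo_dirs[name], check=True)
    return changed
```

The comparison itself needs no network access, so a null update costs one query plus a local `git rev-parse` per repo.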
This could still end up pulling from 600+ repos -- if there are actually changes in them all! -- but should make typical cases a *lot* faster.
We should check in a little more detail how Android & other big projects that use multiple git repos handle this in their helper tools, to see if we can just use something that already does this or if we have to build it ourselves. :)
-- brion