2011/3/24 Neil Kandalgaonkar <neilk(a)wikimedia.org>rg>:
- Allows us to deploy trunk. At any time. Eliminate
the production
branch. Any developer in the world should be able to work on the code we
actually have in production without having to decide between trunk and a
production branch.
You're basically arguing for Linux-style pre-commit code review where
people e-mail patches back and forth. However, as long as we still
have SVN, that means that these pre-committed patches ARE NOT
VERSIONED, let alone necessarily public.
I believe this is bad because:
1) Keeping track of patches, collaborating on a larger feature, etc.
become harder (no benefits of a VCS)
2) Resolving conflicts between patches is done by reviewers when they
apply them instead of being conveniently outsourced to the
author-committers
3) If review capacity is low, patches don't get committed, their
authors bug reviewers a few times, give up, get demotivated and leave
the project
I think this workflow could work with a DVCS with Git, but I strongly
oppose implementing it while we're still using a centralized VCS like
Subversion.
Instead, let me outline my recollection of how code review and
deployment worked back when I joined this project, and explain how I
think this process can be resurrected. This was all a long time ago
and I was fairly new to MW, so please correct me where I'm wrong.
* Someone commits something
* A notification is sent to the mediawiki-cvs list. This is still the
case, except back then more than a few people were subscribed to it,
and traffic wasn't as high
* Optionally, a mediawiki-cvs reader (the usual suspects being Brion,
Tim and Rob Church) reads the diff and notices something is wrong with
it. They reply to the commit notification, citing parts of the diff
inline and raising their objections. This reply is automatically sent
to wikitech-l (we didn't have the CodeReview extension yet), which
committers are expected to be subscribed to. A discussion about the
commit takes place, possibly leading to followup commits
* The next Monday, Brion smoketests HEAD. If he finds breakage, he
tracks down the offending revision(s) and reverts things until
everything seems to work. ("Keep trunk runnable" was taken really
seriously, and we mostly had a revert->reapply cycle instead of a
fixme->followup cycle: it was perfectly acceptable to revert broken
things if they couldn't be fixed in 5 minutes, especially if you were
as busy as Brion.)
* In addition to smoketesting, Brion also reviews all revisions to
phase3 and WMF-run extensions (with the level of commit activity we
had back then, this wasn't an unreasonable job for one person to do on
one day) and reverts things as appropriate.
* trunk is now in a state where it seems to run fine on Brion's
laptop. Brion deploys trunk to testwiki, tests a bit more, then
deploys to the cluster
As you know, our workflow has become a bit different over the years.
At some point, CodeReview was written to make revision discussions
nicer and to provide status fields so Brion could outsource some
review work. Later, the WMF branch was introduced to not only track
live hacks and WMF-specific changes, but also to remove the dependency
on a runnable trunk.
The reason this workflow resulted in frequent deployments of trunk was
that review was that review was always close to HEAD (never behind
more than about 2 weeks). The reason it broke down in the end was that
Brion kept having less time to review more things, but that doesn't
mean we can't make it work again by having more than one reviewer. I
think the following conditions are necessary for this to happen:
* We need to have multiple reviewers (duh)
* Every reviewer needs to budget time for code review, and they need
to not get tangled up in other obligations to a degree where they
can't spend enough time on review. This is largely a management thing
* It needs to be clear who is responsible for reviewing what. This
doesn't need to be set in stone, but we have to avoid a situation
where revisions aren't reviewed because no one feels responsible. This
can be accomplished by agreeing on review assignments based on e.g.
path/subsystem, and having some manager-like person (possibly an EPM)
monitor the process and make sure nothing gets left by the wayside. If
conventional review assignments leave too much ambiguity, additional
criteria can be introduced, e.g. the day of the week something was
committed. More on this in a minute
* There needs to be a clear expectation that commits are generally
reviewed within a certain time (say, a week) after having been
committed. The same manager-like person should also be keeping an eye
on this and making sure overdue revs are reviewed pronto
* We need to set a clear policy for reverting problematic revisions
(fixme's) if they aren't addressed quickly enough (again, let's say
within a week). Currently we largely leave them be, but I think we
should go back to something more decisive and closer to the "keep
trunk runnable, or else Brion kicks your ass" paradigm and make it a
bit more formal this time
As for the who-reviews-what criteria, I think they should be designed
to ensure most things will get reviewed without EPM intervention being
necessary, while not getting in the way of getting things done. I
think the best way to do this would be to first assign revisions based
on path or subsystem (e.g. Tim for parser revs, Sam or me for API
revs, etc.) then for everything that doesn't neatly fall into one
subsystem someone's an expert on or can't easily be found by a path
search, we'd divide the week in a number of chunks and make one
reviewer responsible for reviewing the 'misc' revs in each chunk (or
throwing them to someone else if it's over their head). If this is
timed in a way that corresponds nicely with our reviewers'
availability (e.g. schedule around school for people like me, schedule
Tim for Sunday cause that's his Monday, etc.) we could have a system
where the median time to review is like 24-48 hours.
These are my thoughts, and I'm definitely interested in talking about
this in Berlin, as Sumana suggested.
Roan Kattouw (Catrope)