[Wikitech-l] Community vs. centralized development

3 Sep 2010


Over the last couple of years, MediaWiki development has moved from
being almost entirely volunteer-based to having a large contingent of
paid developers.  A lot of people have noted that this has led to a
lot of work being done without much community involvement.  Just for a
basic statistic, in July, I estimate that about 90% of
non-localization commits to extensions/UsabilityInitiative/ were by
paid employees.  (I use "employee" loosely in this post, to include
all paid staff, such as contractors.)  By contrast, about 25%
(ballpark figure) of non-localization commits to phase3/ were by paid
employees, and the number of volunteer commits to phase3/ was much
higher than the total number of commits to UsabilityInitiative, so
this isn't just a matter of community members not doing as much work
overall.
I've commented on this a few times before, but never at length.  I
think there's widespread confusion about what the problem even is,
never mind how to solve it, so I'm writing this to set out at least my
own views on the topic.  Since my shorter remarks in other places
tended to be misunderstood, I'll start at the beginning and go into
considerable detail, which means this post will probably end up pretty
long.  I should say in advance that I'm discussing institutional
problems here, not anything specific to individuals or projects, and
no one should feel slighted if I pick them as an example.  If you
aren't really interested, start skimming.  ;)
Let me begin with definitions.  I will draw a basic distinction
between community development and centralized development.  I'll start
with two motivating examples.
Firefox is developed by a community.  Everything involved in the
project and its development is open.  Most of the work is done by
employees of Mozilla, and all important decisions are made by
employees of Mozilla, but anyone on the Internet can view what's
happening and get involved.  Bugs you open might get ignored forever,
and you might have to poke people a bunch to get patches reviewed, and
you might have to tolerate a considerable amount of bluntness and
follow other people's marching orders if you want to contribute
anything.  But in principle, any random person in the world can make
largely the same contributions as a Mozilla employee.
Internet Explorer is developed by a centralized team.  They have blogs
where they sometimes share detailed info about their development
process and reasoning.  They very carefully read all user feedback
left in the comments.  They have a bug tracker where anyone can file
bugs, and they guarantee that they'll look at and attempt to reproduce
every single bug filed in a timely fashion.  But although they pay
close attention to feedback, giving feedback is the only way you can
really participate without getting hired by Microsoft.  You can't
write any code, or have a voice in discussions at all comparable to an
IE team member.
These examples illustrate some important things:
* Community development does not mean democracy.  Even in a totally
community-oriented project, all decisions might ultimately be made by
a small group of individuals.  (For instance, in the case of the Linux
kernel, one person.)
* Community development does not mean community members do most of the
work.  From what I've heard, employees of Mozilla write most of
Firefox's code, but it's still completely community-oriented
development.
* Listening to feedback is not the same as actually involving the
community.  Even a totally closed project can be extremely attentive
to feedback.  In fact, it's common for community projects to be *less*
receptive to feedback, taking a "we'll listen to you when you write
the code" attitude.
Keeping these in mind, I'll characterize a perfectly community-based
development process like this: your say in the project is proportional
to your contributions, and nothing prevents you from contributing as
much as your time and ability allow.  If you happen to be paid, it
doesn't give you any additional say -- you just happen to be able to
spend more time contributing.  The decision-making process is open and
transparent, and arguments are weighed on the basis of their merits
and the speaker's history of contributions.  This is of course not
fully attainable in practice, but one can see how close or far a
project is from the ideal.
Centralized and community development processes both have advantages
and disadvantages.  Some of the advantages of centralized development
(as relevant to open-source projects) are:
* Paid employees don't have to spend time reviewing code from a lot of
people who will only ever contribute a few patches, so they don't
duplicate effort teaching everyone their project's coding conventions,
or even educating them on basic things like XSS.
* Because discussion can be private and everyone is more likely to be
in similar time zones, it's possible to rely heavily on face-to-face
or voice communication, which a lot of people are more comfortable
with and which is a lot more efficient.
* Since there are many fewer developers, they can socialize and get to
know each other, reducing conflict and argument.
* Full-time developers don't have to try coordinating with volunteers
who may only be available at odd times or who may disappear randomly
for weeks.
In short, centralized development allows employees' time to be spent
more on actual coding, and less on communication.  It's (at least
superficially) more efficient.  On the other hand, community
development has advantages as well:
* You get work done for free.  If it's easier for volunteers to make a
meaningful difference, you'll get many more volunteers.  Once they're
up to speed, you don't have to watch over them much more than you
would an employee, but you get their work for free.
* You can hire community developers.  You already know how good they
are and they don't need to be brought up to speed with your codebase,
saving you a lot of money and trouble compared to advertising for
applicants.
* Your software becomes more versatile, because volunteers will work
on aspects that interest them even if they aren't in the interest of
the controlling organization.  This gets you more users and more
developers.
Although there are superficial efficiency advantages to centralizing
development, experience indicates that community-based development can
be much more cost-effective in practice.  Projects like Mozilla and
Apache (and for that matter Wikimedia until recently) make software
that's very competitive with centrally-developed competitors at a
fraction of the cost.
On top of that, of course, the idea of centralized development is
contrary to Wikimedia's ideals.  Just as the Board is trying to pursue
individual donations over corporate sponsorship, it fits with
Wikimedia's goals and structure to have as community-oriented a
structure as possible.  Projects like Mozilla make it clear that this
is attainable and productive.
Returning to the concept of community development, let's look at two
key things: actual coding, and decision-making.  In community-based
development, anyone who's willing to write good code can get it
submitted and included into the product.  Someone with a greater
history of contributions will be able to get their code included more
easily, but only because the development community is willing to trust
them more.  They get by with less review, and the review is more
readily given because of a greater expectation that it will be
productive.  Similarly, when it comes to decision-making, anyone has
an equal opportunity to try convincing the decision-makers (who might
be only one or a few people) of their point of view.  In the end, the
decision is made by appointed decision-makers, but with great
deference toward the opinions of other established contributors.
...
From my perspective as a volunteer developer since 2006
(notwithstanding a few hours of contracting just now), Wikimedia has
been failing badly on both of these issues for months, at least.
There's a giant code review backlog, so very little code of the last
several months gets synced -- except code by employees.  Some
employees apparently have shell access for the sole purpose of syncing
their own code without going through the normal review process.  No
volunteer has been given such access, to my knowledge -- indeed, AFAIK
it's been years since any non-employee has been given shell access at
all.  This is a bright line that deprives volunteers of any semblance
of parity with staff.
Communication is a serious problem as well.  I can't pin this one down
so well, because I simply have no idea how employees are
communicating, but I can observe that there's a ton of code being
written with no discussion on #mediawiki or wikitech-l or any other
MediaWiki development forum I know of.  There are a lot of paid
developers who I've never seen in either #mediawiki or wikitech-l.  I
infer that they must be communicating somehow, unless they all have a
policy of committing code without speaking to anyone about it.
A lot of employees are in the same office, so I guess there's
face-to-face communication going on.  There's a secret staff IRC
channel, and a staff-only mailing list or list alias or something
(which I know about because a staff member complained about it in the
secret staff IRC channel), and I think I've heard rumor of
teleconferences.  There have also been various nominally public
fora that only particular groups of employees use much in practice,
like the Usability wiki and IRC channel (the latter now kind of
discontinued but not really).  I don't know, but it doesn't matter in
the end.  What it amounts to is that volunteers are often completely
cut out of planning and design.
That's what leads to things like
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/67299.  Some
people said that maybe that could have been phrased better, or
something.  But the revert wasn't the problem, it was a symptom of the
problem.  The problem was that the design was decided on somewhere
that volunteers couldn't or wouldn't participate.  Of course you
revert something that contradicts an agreed-upon design -- the problem
is that the agreed-upon design was only agreed upon by a small group
of employees.  How are volunteers supposed to contribute in that
environment, if they don't know what tune they're supposed to be
dancing to?
The interlanguage link issue perfectly highlights the problem of
mentality we have right now.  (I'm not picking on the Usability
Initiative particularly here, by the way -- it just provides the most
ready examples because it's the largest.)  You just need to look at
this e-mail: http://lists.wikimedia.org/pipermail/foundation-l/2010-June/058936.html
It begins "The Usability team discussed this issue at length this
afternoon."  The Usability Team is a separate body from the community,
which holds its discussions separately.  "We listened closely to the
feedback and have come up with solution which we hope will work for
everyone."  Listening to feedback, not discussing the merits of the
issue with peers.
What should have happened in that case is that each individual
Usability Team member who saw the complaint should have posted their
own individual, unrehearsed thoughts as an individual.  What actually
happened was a quintessentially centralized response: secret internal
discussion followed by an official position statement.  That is not
the way that you treat peers.  It's how you treat customers or
clients.
I've seen this mentality again and again over the last year or two.
One time I was discussing a design issue with a Wikimedia employee in
#mediawiki, and after a brief discussion, he said (paraphrased)
"Sorry, I need to get back to work."  Apparently it's only "work" when
you're talking to other employees.
http://www.mediawiki.org/wiki/Development_process_improvement draws
a clear line at every step between employees and developers.  This is
not the way to attract or keep a healthy volunteer development
community.
The solution is not to increase communication between staff and
volunteers.  It's to make the distinction as irrelevant as possible to
actual development.  They're all developers, and some happen to get
paid.  Specific changes I would propose include:
* Consider what to do about code review.  This is pretty much the
hardest problem on this list, which is why I don't propose a specific
solution here, but there has to be a better solution than "assume a
bunch of employees are trusted enough to sync their own code, force
everyone else to wait months for central review".
* Stop concentrating tech employees in San Francisco.  Either have
most of them work from home, or perhaps establish other small offices
so that they're split up.  The point is, make them rely on
telecommunication, because if you put people in the same office
they'll talk a lot face-to-face, and volunteers simply cannot
participate.  The purpose of putting people together in an office is
so that they work together as a team, and this is exactly what you do
*not* want, because volunteers cannot be part of that team.  This is
the second-hardest problem, or maybe the hardest, and I can't give a
full solution for it either.  I'd suggest checking with Mozilla about
how they do it, because I know they do have offices, but they're a
perfect example of community-oriented development.
* Explicitly encourage all paid developers to do everything in public
and to treat volunteer developers as they would paid ones.  I'm not
saying this should be enforced in any particular manner, but it should
be clearly stated so that everyone knows how things are intended to
be.
* Shut down the secret staff IRC channel.  Development discussion can
take place in #mediawiki, ops in #wikimedia-tech, other stuff in
#wikimedia or whatever.  If users interfere with ops' discussions
sometimes in #wikimedia-tech during outages or such, set all sysadmins
+v and set the channel +m as necessary.  That's worked in the past.
* Shut down #wikimedia-dev (formerly #wikipedia_usability, kind of).
The explicit purpose of the channel is to allow development discussion
with less noise, but "noise" here means community involvement.  In
community development, you do get a lot more discussion, but that's
not something you should try avoiding.  In general, use existing
discussion fora wherever possible, and if you do fragment them, make
sure you don't have too much of a staff-volunteer split in which fora
people use.
* Don't conduct teleconferences about development, ever.  Even if
volunteers are invited (are they?), time zones and non-MediaWiki
obligations make all synchronous communication much harder for
volunteers to participate in.  Rely primarily on mailing lists, and
secondarily on publicly-logged IRC channels (where at least it's easy
to read backscroll).
* Stop using private e-mail for development, at least to any
significant extent.  If there are any internal development mailing
lists or aliases or whatever used for development, retire them.
I don't know how seriously these suggestions will be taken in practice
by the powers that be, but I hope I've made a detailed and cogent
enough case to make at least some impact.
