Over the last couple of years, MediaWiki development has moved from being almost entirely volunteer-based to having a large contingent of paid developers. A lot of people have noted that this has led to a lot of work being done without much community involvement. Just for a basic statistic, in July, I estimate that about 90% of non-localization commits to extensions/UsabilityInitiative/ were by paid employees. (I use "employee" loosely in this post, to include all paid staff, such as contractors.) By contrast, about 25% (ballpark figure) of non-localization commits to phase3/ were by paid employees, and the number of volunteer commits to phase3/ was much higher than the total number of commits to UsabilityInitiative, so this isn't just a matter of community members not doing as much work overall.
I've commented on this a few times before, but never at length. I think there's widespread confusion about what the problem even is, never mind how to solve it, so I'm writing this to set out at least my own views on the topic. Since my shorter remarks in other places tended to be misunderstood, I'll start at the beginning and go into considerable detail, which means this post will probably end up pretty long. I should say in advance that I'm discussing institutional problems here, not anything specific to individuals or projects, and no one should feel slighted if I pick them as an example. If you aren't really interested, start skimming. ;)
Let me begin with definitions. I will draw a basic distinction between community development and centralized development. I'll start with two motivating examples.
Firefox is developed by a community. Everything involved in the project and its development is open. Most of the work is done by employees of Mozilla, and all important decisions are made by employees of Mozilla, but anyone on the Internet can view what's happening and get involved. Bugs you open might get ignored forever, and you might have to poke people a bunch to get patches reviewed, and you might have to tolerate a considerable amount of bluntness and follow other people's marching orders if you want to contribute anything. But in principle, any random person in the world can make largely the same contributions as a Mozilla employee.
Internet Explorer is developed by a centralized team. They have blogs where they sometimes share detailed info about their development process and reasoning. They very carefully read all user feedback left in the comments. They have a bug tracker where anyone can file bugs, and they guarantee that they'll look at and attempt to reproduce every single bug filed in a timely fashion. But although they pay close attention to feedback, giving feedback is the only way you can really participate without getting hired by Microsoft. You can't write any code, or have a voice in discussions at all comparable to an IE team member.
These examples illustrate some important things:
* Community development does not mean democracy. Even in a totally community-oriented project, all decisions might ultimately be made by a small group of individuals. (For instance, in the case of the Linux kernel, one person.)
* Community development does not mean community members do most of the work. From what I've heard, employees of Mozilla write most of Firefox's code, but it's still completely community-oriented development.
* Listening to feedback is not the same as actually involving the community. Even a totally closed project can be extremely attentive to feedback. In fact, it's common for community projects to be *less* receptive to feedback, taking a "we'll listen to you when you write the code" attitude.
Keeping these in mind, I'll characterize a perfectly community-based development process like this: your say in the project is proportional to your contributions, and nothing prevents you from contributing as much as your time and ability allow. If you happen to be paid, it doesn't give you any additional say -- you just happen to be able to spend more time contributing. The decision-making process is open and transparent, and arguments are weighed on the basis of their merits and the speaker's history of contributions. This is of course not fully attainable in practice, but one can see how close or far a project is from the ideal.
Centralized and community development processes both have advantages and disadvantages. Some of the advantages of centralized development (as relevant to open-source projects) are:
* Paid employees don't have to spend time reviewing code from a lot of people who will only ever contribute a few patches, so they don't duplicate effort teaching everyone their project's coding conventions, or even educating them on basic things like XSS.
* Because discussion can be private and everyone is more likely to be in similar time zones, it's possible to rely heavily on face-to-face or voice communication, which a lot of people are more comfortable with and which is a lot more efficient.
* Since there are many fewer developers, they can socialize and get to know each other, reducing conflict and argument.
* Full-time developers don't have to try coordinating with volunteers who may only be available at odd times or who may disappear randomly for weeks.
In short, centralized development allows employees' time to be spent more on actual coding, and less on communication. It's (at least superficially) more efficient. On the other hand, community development has advantages as well:
* You get work done for free. If it's easier for volunteers to make a meaningful difference, you'll get many more volunteers. Once they're up to speed, you don't have to watch over them much more than you would an employee, but you get their work for free.
* You can hire community developers. You already know how good they are and they don't need to be brought up to speed with your codebase, saving you a lot of money and trouble compared to advertising for applicants.
* Your software becomes more versatile, because volunteers will work on aspects that interest them even if they aren't in the interest of the controlling organization. This gets you more users and more developers.
Although there are superficial efficiency advantages to centralizing development, experience indicates that community-based development can be much more cost-effective in practice. Projects like Mozilla and Apache (and for that matter Wikimedia until recently) make software that's very competitive with centrally-developed competitors at a fraction of the cost.
On top of that, of course, the idea of centralized development is contrary to Wikimedia's ideals. Just as the Board is trying to pursue individual donations over corporate sponsorship, it fits with Wikimedia's goals and structure to have as community-oriented a structure as possible. Projects like Mozilla make it clear that this is attainable and productive.
Returning to the concept of community development, let's look at two key things: actual coding, and decision-making. In community-based development, anyone who's willing to write good code can get it submitted and included into the product. Someone with a greater history of contributions will be able to get their code included more easily, but only because the development community is willing to trust them more. They get by with less review, and the review is more readily given because of a greater expectation that it will be productive. Similarly, when it comes to decision-making, anyone has an equal opportunity to try convincing the decision-makers (who might be only one or a few people) of their point of view. In the end, the decision is made by appointed decision-makers, but with great deference toward the opinions of other established contributors.
From my perspective as a volunteer developer since 2006 (notwithstanding a few hours of contracting just now), Wikimedia has been failing badly on both of these issues for months, at least. There's a giant code review backlog, so very little code of the last several months gets synced -- except code by employees. Some employees apparently have shell access for the sole purpose of syncing their own code without going through the normal review process. No volunteer has been given such access, to my knowledge -- indeed, AFAIK it's been years since any non-employee has been given shell access at all. This is a bright line that deprives volunteers of any semblance of parity with staff.
Communication is a serious problem as well. I can't pin this one down so well, because I simply have no idea how employees are communicating, but I can observe that there's a ton of code being written with no discussion on #mediawiki or wikitech-l or any other MediaWiki development forum I know of. There are a lot of paid developers who I've never seen in either #mediawiki or wikitech-l. I infer that they must be communicating somehow, unless they all have a policy of committing code without speaking to anyone about it.
A lot of employees are in the same office, so I guess there's face-to-face communication going on. There's a secret staff IRC channel, and a staff-only mailing list or list alias or something (which I know about because a staff member complained about it in the secret staff IRC channel), and I think I've heard rumor of teleconferences. There have also been various nominally public fora that only particular groups of employees use much in practice, like the Usability wiki and IRC channel (the latter now kind of discontinued but not really). I don't know exactly which of these channels carries most of the discussion, but it doesn't matter in the end. What it amounts to is that volunteers are often completely cut out of planning and design.
That's what leads to things like http://www.mediawiki.org/wiki/Special:Code/MediaWiki/67299. Some people said that maybe that could have been phrased better, or something. But the revert wasn't the problem; it was a symptom of the problem. The problem was that the design was decided on somewhere that volunteers couldn't or wouldn't participate. Of course you revert something that contradicts an agreed-upon design -- the problem is that the agreed-upon design was only agreed upon by a small group of employees. How are volunteers supposed to contribute in that environment, if they don't know what tune they're supposed to be dancing to?
The interlanguage link issue perfectly highlights the problem of mentality we have right now. (I'm not picking on the Usability Initiative particularly here, by the way -- it just provides the most ready examples because it's the largest.) You just need to look at this e-mail: http://lists.wikimedia.org/pipermail/foundation-l/2010-June/058936.html It begins "The Usability team discussed this issue at length this afternoon." The Usability Team is a separate body from the community, which holds its discussions separately. "We listened closely to the feedback and have come up with solution which we hope will work for everyone." Listening to feedback, not discussing the merits of the issue with peers.
What should have happened in that case is that each individual Usability Team member who saw the complaint should have posted their own individual, unrehearsed thoughts as an individual. What actually happened was a quintessentially centralized response: secret internal discussion followed by an official position statement. That is not the way that you treat peers. It's how you treat customers or clients.
I've seen this mentality again and again over the last year or two. One time I was discussing a design issue with a Wikimedia employee in #mediawiki, and after a brief discussion, he said (paraphrased) "Sorry, I need to get back to work." Apparently it's only "work" when you're talking to other employees. http://www.mediawiki.org/wiki/Development_process_improvement draws a clear line at every step between employees and developers. This is not the way to attract or keep a healthy volunteer development community.
The solution is not to increase communication between staff and volunteers. It's to make the distinction as irrelevant as possible to actual development. They're all developers, and some happen to get paid. Specific changes I would propose include:
* Consider what to do about code review. This is pretty much the hardest problem on this list, which is why I don't propose a specific solution here, but there has to be a better solution than "assume a bunch of employees are trusted enough to sync their own code, force everyone else to wait months for central review".
* Stop concentrating tech employees in San Francisco. Either have most of them work from home, or perhaps establish other small offices so that they're split up. The point is, make them rely on telecommunication, because if you put people in the same office they'll talk a lot face-to-face, and volunteers simply cannot participate. The purpose of putting people together in an office is so that they work together as a team, and this is exactly what you do *not* want, because volunteers cannot be part of that team. This is the second-hardest problem, or maybe the hardest, and I can't give a full solution for it either. I'd suggest checking with Mozilla about how they do it, because I know they do have offices, but they're a perfect example of community-oriented development.
* Explicitly encourage all paid developers to do everything in public and to treat volunteer developers as they would paid ones. I'm not saying this should be enforced in any particular manner, but it should be clearly stated so that everyone knows how things are intended to be.
* Shut down the secret staff IRC channel. Development discussion can take place in #mediawiki, ops in #wikimedia-tech, other stuff in #wikimedia or whatever. If users interfere with ops' discussions sometimes in #wikimedia-tech during outages or such, set all sysadmins +v and set the channel +m as necessary. That's worked in the past.
* Shut down #wikimedia-dev (formerly #wikipedia_usability, kind of). The explicit purpose of the channel is to allow development discussion with less noise, but "noise" here means community involvement. In community development, you do get a lot more discussion, but that's not something you should try avoiding. In general, use existing discussion fora wherever possible, and if you do fragment them, make sure you don't have too much of a staff-volunteer split in which fora people use.
* Don't conduct teleconferences about development, ever. Even if volunteers are invited (are they?), time zones and non-MediaWiki obligations make all synchronous communication much harder for volunteers to participate in. Rely primarily on mailing lists, and secondarily on publicly-logged IRC channels (where at least it's easy to read backscroll).
* Stop using private e-mail for development, at least to any significant extent. If there are any internal development mailing lists or aliases or whatever used for development, retire them.
I don't know how seriously these suggestions will be taken in practice by the powers that be, but I hope I've made a detailed and cogent enough case to make at least some impact.