I'd like to think about cost vs. benefit again. Why exactly should we do this?

You are asking me to repeat the wall of text from the first mail? Please read it again and open a discussion on the specific points you disagree with.

A separate extension for each component means maintaining a lot of compatibility
info somehow, somewhere.

In the case of DataValues we already have one extension per component. The compatibility info is also quite manageable. In fact, it becomes much clearer what works together if things are properly kept separate and versioned. Right now I am getting questions from confused users who are using something based on DataValues, and I have to tell them to "get latest master of everything" or even things such as "any revision before somehash".

Maybe having the components as submodules, instead of separate extensions, would
help... Something to ask the Foundation.

That is a question of how we make those components available to Wikibase. Exactly how we do this has little effect on whether the split into multiple git repos is sensible. On this particular topic I have no strong opinions, though I am concerned about submodules, as they do not seem to work well when the repos they point to are needed by multiple components. You'll end up having them multiple times, no?

is a pain for development

I disagree that this is a pain. Or perhaps it is, if you define "pain" as the effects of using type hinting, of not just using globals, and of properly injecting dependencies. All these things force explicitness of some sort, which you have to deal with. This explicitness is there to help you and prevent errors. If you try to ignore it, of course you will end up being frustrated. And if you do not keep it in mind at all, you'll also end up frustrated. Since managing dependencies is one of the most important tasks in software development, you really ought to keep it in mind though.

(there's a major refactoring of the formatter/parser stuff imminent).

Those two components are separate. They do not even know about each other. And that is a very important property. So how does work on them affect a split in any way? I can see several advantages to having a split, such as it being clearer when changes are being made in one component, or being able to release one without being blocked by the other because it is in the middle of a refactor. What are the disadvantages in this case?

Now you can again bring up "oh no, we'll have to constantly make changes in multiple repos, and keeping track of this all will be hell". My answer to this has not changed either: if you split up distinct components and keep in mind all the relevant principles and trade-offs, then having to make changes across multiple repos should be very rare indeed. Almost all of the components other than lib, repo and client have been created by me. I also did most of the work in these. And yet I did not run into significant hassle. If I had, I'd certainly not be advocating going further down this road.

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil. ~=[,,_,,]:3
--