Thanks for your input, Ori!
On 13.01.2013 01:35, Ori Livneh wrote:
> As I said, I found the API well-designed on the whole, but:
> - "getForFoo" (getForModelID, getDefaultModelFor) is a confusing pattern for method names. getDefaultModelFor is especially weird: I get what it does, but I don't know why it is where it is, or what need it is fulfilling.
Yeah, in retrospect, I'm not very happy with the naming of getForModelID either, and getForTitle could just die in favor of Title::getContentHandler.
ContentHandler::getDefaultModelFor determines the model to apply by default to a given title. Maybe this should have been Title::getDefaultContentModel? But I wanted to centralize the factory logic in the ContentHandler class, so I think this is in the right place, at least.
> - I don't have a clear mental model of the dividing line between Content and ContentHandler. The documentation (contenthandler.txt) explains that "all manipulation and analysis of page content must be done via the appropriate methods of the Content object", but it's the ContentHandler class that implements serializeContent, getPageLanguage, getAutoSummary, etc.
The reason for the division between ContentHandler and Content is mostly efficiency: to get a Content object, you have to load the actual content blob from the database. A lot of operations depend on the content model (aka type) but not (necessarily) on the content itself, so they can be performed by the appropriate ContentHandler singleton:
getPageLanguage, for example, will always return "en" for JavaScript content and the wiki's content language for wikitext. It *could* load the content and check whether there's something in there that specifies a different language.
serializeContent could be implemented in Content, but unserializeContent couldn't, since it's what is used to create Content objects. I thought it would be good to have the serialize and unserialize methods in the same place.
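To make the split concrete, here is a minimal sketch using stand-in classes. SketchContent, SketchContentHandler and the WIKI_LANGUAGE constant are hypothetical names for illustration, not the MediaWiki classes, and the "de" wiki language is an assumption. It only shows the shape: one handler per model that answers model-level questions without touching a blob, and serialization living next to unserialization.

```php
<?php
// Stand-in classes for illustration only; these are NOT the MediaWiki classes.

// Holds the actual page data. Creating one requires the loaded blob.
class SketchContent {
    private $model;
    private $text;

    public function __construct( $model, $text ) {
        $this->model = $model;
        $this->text = $text;
    }

    public function getModel() {
        return $this->model;
    }

    public function getText() {
        return $this->text;
    }
}

// One shared handler per content model. It knows about the model,
// not about any particular page's data.
class SketchContentHandler {
    // Assumption: the wiki's content language is German.
    const WIKI_LANGUAGE = 'de';

    private static $handlers = [];
    private $model;

    private function __construct( $model ) {
        $this->model = $model;
    }

    // Factory: returns the shared handler for a model, creating it on first use.
    public static function getForModelID( $model ) {
        if ( !isset( self::$handlers[$model] ) ) {
            self::$handlers[$model] = new self( $model );
        }
        return self::$handlers[$model];
    }

    // Answered from the model alone; no content blob is needed.
    public function getPageLanguage() {
        return $this->model === 'javascript' ? 'en' : self::WIKI_LANGUAGE;
    }

    public function serializeContent( SketchContent $content ) {
        return $content->getText();
    }

    // Has to live on the handler: it is what creates content objects.
    public function unserializeContent( $blob ) {
        return new SketchContent( $this->model, $blob );
    }
}
```

Calling SketchContentHandler::getForModelID( 'javascript' )->getPageLanguage() never touches page data, which is the efficiency argument in miniature.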
> If I think about it, I can sort of understand why things are on one class rather than the other, but it isn't so clear that I know where to look if I need to do something related to content. I usually look both places.
Yes, I suppose the documentation could explain this some more.
> - The way validation is handled is a bit mysterious. Content defines an isValid interface and (if I recall correctly) a return value of false would prevent the content from getting saved. But in such cases you want a helpful error.
You are right, it would be better to have a validate() method that returns a Status object. isValid() could then just call that and return $status->isOK(), for compatibility. If you like, file a bug for that - or just write it :)
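A rough sketch of what that could look like, using stand-ins rather than real MediaWiki code: SketchStatus is a hypothetical, much reduced substitute for a Status object (only fatal(), isOK() and getErrors() are modelled), and SketchJsonContent is an invented content class that treats anything json_decode rejects as invalid.

```php
<?php
// Stand-ins for illustration only; SketchStatus is NOT MediaWiki's Status class.

class SketchStatus {
    private $errors = [];

    // Record an error that makes the status not OK.
    public function fatal( $message ) {
        $this->errors[] = $message;
    }

    public function isOK() {
        return count( $this->errors ) === 0;
    }

    public function getErrors() {
        return $this->errors;
    }
}

class SketchJsonContent {
    private $text;

    public function __construct( $text ) {
        $this->text = $text;
    }

    // New-style check: reports why the content is invalid, not just that it is.
    public function validate() {
        $status = new SketchStatus();
        json_decode( $this->text );
        if ( json_last_error() !== JSON_ERROR_NONE ) {
            $status->fatal( 'Invalid JSON: ' . json_last_error_msg() );
        }
        return $status;
    }

    // Kept for compatibility: same yes/no answer as before, now derived from validate().
    public function isValid() {
        return $this->validate()->isOK();
    }
}
```

Existing callers of isValid() keep working unchanged, while code on the save path could call validate() and show the collected messages to the user.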
> - I would expect something like ContentHandler to provide a generic interface for supplying an editor suitable for a particular Content, in lieu of the default editor.
It actually had that in some early version, but it did not work well with the way MediaWiki handles actions like edit. The correct way is to provide a custom handler class for the edit action via the getActionOverrides method. Wikibase makes extensive use of that mechanism.
This isn't very obvious or pretty, but it is very flexible and fits well with the existing infrastructure.
I suppose the documentation should explain this in detail, though.
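Until the documentation covers it, the idea can be sketched roughly as follows. Everything here is a stand-in: the handler and action classes and the sketchActionFor() dispatcher are hypothetical, MediaWiki's real action dispatch is more involved, and the return shape of getActionOverrides (a map from action name to class name) is an assumption of this sketch.

```php
<?php
// Stand-ins for illustration only; not the MediaWiki action classes.

class SketchDefaultEditAction {
    public function show() {
        return 'default text editor';
    }
}

class SketchFormEditAction {
    public function show() {
        return 'custom form editor';
    }
}

class SketchHandler {
    // Assumed shape: action name => class name. The base handler overrides nothing.
    public function getActionOverrides() {
        return [];
    }
}

class SketchFormHandler extends SketchHandler {
    // Replace only the edit action; every other action keeps its default.
    public function getActionOverrides() {
        return [ 'edit' => 'SketchFormEditAction' ];
    }
}

// Hypothetical dispatcher: the handler's override wins, otherwise the default is used.
function sketchActionFor( $actionName, SketchHandler $handler ) {
    $defaults = [ 'edit' => 'SketchDefaultEditAction' ];
    $overrides = $handler->getActionOverrides();
    if ( isset( $overrides[$actionName] ) ) {
        $class = $overrides[$actionName];
    } elseif ( isset( $defaults[$actionName] ) ) {
        $class = $defaults[$actionName];
    } else {
        throw new InvalidArgumentException( "Unknown action: $actionName" );
    }
    return new $class();
}
```

The content model decides which class handles "edit", while the code that dispatches actions needs no knowledge of any particular content type.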
> - I wasn't sure initially which classes to extend for JsonSchemaContent and JsonSchemaContentHandler. I concluded that for all textual content types it's better to extend WikitextContent / WikitextContentHandler rather than the base or abstract content / content handler classes.
All *textual* (not "text based") content should derive from TextContent and TextContentHandler, respectively. Such content can be edited using the standard edit page, will work in system messages, etc. There are also some extensions and maintenance scripts that only operate on content derived from TextContent (e.g. things that do search-and-replace).
Non-textual content (including anything with a strict syntax, like JSON, XML, whatever) should derive from AbstractContent and the generic ContentHandler. For such content, a custom editor is typically needed. A custom diff engine is also useful.
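As a small illustration of why the base class matters, here is a sketch with stand-in classes (all names are hypothetical, not the MediaWiki classes). A generic search-and-replace tool only touches content derived from the text class and skips everything else.

```php
<?php
// Stand-ins for illustration only; these are NOT the MediaWiki classes.

abstract class SketchAbstractContent {
    abstract public function getModel();
}

// Free-form text: generic text tools can work on it.
class SketchTextContent extends SketchAbstractContent {
    private $text;

    public function __construct( $text ) {
        $this->text = $text;
    }

    public function getModel() {
        return 'text';
    }

    public function getText() {
        return $this->text;
    }
}

// Strictly structured data: it has no free-form text to edit.
class SketchStructuredContent extends SketchAbstractContent {
    private $data;

    public function __construct( array $data ) {
        $this->data = $data;
    }

    public function getModel() {
        return 'structured';
    }

    public function getData() {
        return $this->data;
    }
}

// Hypothetical maintenance helper: returns new content with the replacement
// applied, or null when the content is not textual and must be left alone.
function sketchReplace( SketchAbstractContent $content, $search, $replace ) {
    if ( !( $content instanceof SketchTextContent ) ) {
        return null;
    }
    return new SketchTextContent( str_replace( $search, $replace, $content->getText() ) );
}
```

If JSON content derived from the text class, a tool like this would happily rewrite it as a plain string and could break its syntax, which is one reason to keep strict formats out of that branch of the hierarchy.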
> After working with the API for a while I had a head-explodes moment when I realized that MediaWiki is now a generic framework for collaboratively fashioning and editing content objects, and that it provides a generic implementation of a creative workflow based on the concepts of versioning, diffing, etc. I think it's a fucking amazing model for the web and I hope MediaWiki's code and community is nimble enough to fully realize it.
Yes, that's exactly it! You said that far better than I could have. I suppose I still expect people to just *see* that :P
Spread the word!
Thanks, daniel