The more I look at ContentHandler, the more it seems like a major new feature for MW.
Are there any examples of this in use on Labs?
Is there more information on how to use and extend it for things like WorkingWiki?
There is some talk of how the Gadgets extension may be changed by the use of ContentHandler. Has any work been done on that yet? Do gadget authors on-wiki need to know about these changes?
I like what I'm seeing in ContentHandler, but I'm a little afraid that this will be like the introduction of ResourceLoader -- there is a lot of promise here and new capability, but it sounds very disruptive to current Gadget users (to give one example).
How can we begin to prepare third party users of MW for these changes now?
On 01/12/2013 09:15 AM, Mark A. Hershberger wrote:
The more I look at ContentHandler, the more it seems like a major new feature for MW.
Are there any examples of this in use on Labs?
Is there more information on how to use and extend it for things like WorkingWiki?
There is some talk of how the Gadgets extension may be changed by the use of ContentHandler. Has any work been done on that yet? Do gadget authors on-wiki need to know about these changes?
Gadgets 2.0 could potentially use a new content page type to store information. Currently it looks like https://en.wikipedia.org/wiki/MediaWiki:Gadgets-definition . As you can see, as more information has been added, the lines have gotten increasingly awkwardly formatted.
There is a proposal to use JSON (https://www.mediawiki.org/wiki/Gadgets_2.0#Implementation_proposal) and a custom UI; I believe that could be hooked into ContentHandler. Gadgets 2.0 is also meant to provide very useful features like i18n for gadgets.
Last I heard, significant progress was made on 2.0, but the project is currently on hold. Thus, there's not a need to notify people right away. When the time comes, I don't think initial migration will be overly complicated, because the existing syntax has a clear mapping to the new one.
Matt Flaschen
On 01/12/2013 09:32 AM, Matthew Flaschen wrote:
On 01/12/2013 09:15 AM, Mark A. Hershberger wrote:
Are there any examples of this in use on Labs?
Is there more information on how to use and extend it for things like WorkingWiki?
Matt,
Thanks for your insight on ContentHandler and Gadgets 2.0. Do you have any on the above two questions?
Mark.
On 01/12/2013 09:59 AM, Mark A. Hershberger wrote:
On 01/12/2013 09:32 AM, Matthew Flaschen wrote:
On 01/12/2013 09:15 AM, Mark A. Hershberger wrote:
Are there any examples of this in use on Labs?
Is there more information on how to use and extend it for things like WorkingWiki?
Matt,
Thanks for your insight on ContentHandler and Gadgets 2.0. Do you have any on the above two questions?
The main documentation is at https://www.mediawiki.org/wiki/Manual:ContentHandler . It isn't quite a tutorial, but there are several useful links. It can definitely use some more work.
I also created https://www.mediawiki.org/wiki/Category:ContentHandler and https://www.mediawiki.org/wiki/Category:ContentHandler_extensions . ContentHandler is now a supported type extensions can include in https://www.mediawiki.org/wiki/Template:Extension to show they use it.
That should help track down more information, including working code using it. I am also fairly familiar with https://www.mediawiki.org/wiki/Extension:EventLogging (mostly written by Ori Livneh). People may want to look at its code (https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/extensions/EventLogging.gi...), then ask one of us questions about how it's using ContentHandler.
Matt Flaschen
On 01/12/2013 09:32 AM, Matthew Flaschen wrote:
Last I heard, significant progress was made on 2.0, but the project is currently on hold. Thus, there's not a need to notify people right away. When the time comes, I don't think initial migration will be overly complicated, because the existing syntax has a clear mapping to the new one.
Clear mapping or no, it is a change and the old Gadget 1.0 pages will cease to work unless the people integrating ContentHandler make backwards compatibility a priority.
That will cause problems throughout wikidom.
On 01/12/2013 10:02 AM, Mark A. Hershberger wrote:
On 01/12/2013 09:32 AM, Matthew Flaschen wrote:
Last I heard, significant progress was made on 2.0, but the project is currently on hold. Thus, there's not a need to notify people right away. When the time comes, I don't think initial migration will be overly complicated, because the existing syntax has a clear mapping to the new one.
Clear mapping or no, it is a change and the old Gadget 1.0 pages will cease to work unless the people integrating ContentHandler make backwards compatibility a priority.
That will cause problems throughout wikidom.
Yes, if the issue was ignored. However, I believe a bot could do the one-time conversion.
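To illustrate what such a one-time conversion might look like, here is a minimal Python sketch that parses one line of the existing MediaWiki:Gadgets-definition syntax into a JSON-ready dictionary. The output keys (`name`, `options`, `pages`) and the function name `parse_definition_line` are placeholders made up for this example; they are not the schema from the Gadgets 2.0 proposal, and a real bot would also have to handle section headings, comments and edge cases.

```python
import json
import re

# One definition line looks roughly like:
#   * name[option1|key=value1,value2]|page1.js|page2.css
LINE_RE = re.compile(
    r"^\*\s*([A-Za-z][A-Za-z0-9_.-]*)\s*(?:\[(.*?)\])?\s*((?:\|[^|]*)+)\s*$"
)


def parse_definition_line(line):
    """Return a dict for one gadget definition line, or None if it is not one."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # e.g. a section heading or free text
    name, raw_options, raw_pages = match.groups()

    options = {}
    for option in (raw_options or "").split("|"):
        option = option.strip()
        if not option:
            continue
        key, sep, value = option.partition("=")
        # "key=a,b" becomes a list; a bare flag becomes True.
        options[key.strip()] = (
            [v.strip() for v in value.split(",")] if sep else True
        )

    pages = [p.strip() for p in raw_pages.split("|") if p.strip()]
    # Hypothetical output shape, chosen only for illustration.
    return {"name": name, "options": options, "pages": pages}


if __name__ == "__main__":
    line = "* HotCat[ResourceLoader|rights=edit,move]|HotCat.js|HotCat.css"
    print(json.dumps(parse_definition_line(line), indent=2))
```

Lines that do not match (such as `== editing ==` headings) return None, so a conversion run could log and skip them rather than guess.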
Matt Flaschen
On 01/12/2013 04:33 PM, Matthew Flaschen wrote:
That will cause problems throughout wikidom.
Yes, if the issue was ignored. However, I believe a bot could do the one-time conversion.
This solution is just as good as ignoring the problem for non-WMF users of MediaWiki.
A solution can probably involve tools, but making them happen in, say, update.php will ensure that they happen for other users, too.
Of course, then we need to have users prepared for these sorts of changes so they don't go in and "fix" what was changed in the update.
On 01/12/2013 06:40 PM, Mark A. Hershberger wrote:
On 01/12/2013 04:33 PM, Matthew Flaschen wrote:
That will cause problems throughout wikidom.
Yes, if the issue was ignored. However, I believe a bot could do the one-time conversion.
This solution is just as good as ignoring the problem for non-WMF users of MediaWiki.
A solution can probably involve tools, but making them happen in, say, update.php will ensure that they happen for other users, too.
I agree that could be better, as long as people are aware update.php would be modifying wiki-specific pages in the MW namespace (I'm not sure how common that is).
Matt Flaschen
On 12.01.2013 16:02, Mark A. Hershberger wrote:
On 01/12/2013 09:32 AM, Matthew Flaschen wrote:
Last I heard, significant progress was made on 2.0, but the project is currently on hold. Thus, there's not a need to notify people right away. When the time comes, I don't think initial migration will be overly complicated, because the existing syntax has a clear mapping to the new one.
Clear mapping or no, it is a change and the old Gadget 1.0 pages will cease to work unless the people integrating ContentHandler make backwards compatibility a priority.
That will cause problems throughout wikidom.
Changing the way something is represented always causes compatibility issues. But that's a problem of the respective application (read: MediaWiki Extension), not the framework. In and of itself, ContentHandler does not change anything about how Gadgets are defined or stored. It just *allows* for new ways of storing gadget definitions. If the Gadget extension starts to use the new way, it needs to worry about b/c. The ContentHandler framework provides support for this by recording the content model and, separately, the serialization format for every revision of a page (at least if $wgContentHandlerUseDB is turned on).
So: The introduction of ContentHandler doesn't mean anything for Gadgets. The migration from Gadget 1.0 to 2.0 does.
-- daniel
On 01/13/2013 07:17 AM, Daniel Kinzler wrote:
On 12.01.2013 16:02, Mark A. Hershberger wrote:
That will cause problems throughout wikidom.
In and of itself, ContentHandler does not change anything about how Gadgets are defined or stored. It just *allows* for new ways of storing gadget definitions. If the Gadget extension starts to use the new way, it needs to worry about b/c.
Agreed. And to be clear, this is what I meant. As you said quite succinctly:
So: The introduction of ContentHandler doesn't mean anything for Gadgets. The migration from Gadget 1.0 to 2.0 does.
On Sat, Jan 12, 2013 at 3:15 PM, Mark A. Hershberger mah@everybody.org wrote:
The more I look at ContentHandler, the more it seems like a major new feature for MW.
Are there any examples of this in use on Labs?
It's used in the Wikibase extensions for Wikidata. We have test wikis on Labs, but I'm not sure exactly what you mean or want.
Is there more information on how to use and extend it for things like WorkingWiki?
There is some talk of how the Gadgets extension may be changed by the use of ContentHandler. Has any work been done on that yet? Do gadget authors on-wiki need to know about these changes?
I like what I'm seeing in ContentHandler, but I'm a little afraid that this will be like the introduction of ResourceLoader -- there is a lot of promise here and new capability, but it sounds very disruptive to current Gadget users (to give one example).
ContentHandler is fully backwards compatible, so it is not meant to be disruptive, but it has the potential to make things easier and better in the future for new features. There are a bunch of functions throughout the code that are deprecated and new extensions should use the new functions, of course.
The feature is fully enabled with a configuration variable $wgContentHandlerUseDB to support multiple content formats in the same namespace. (e.g. user/site Javascript and CSS pages without the .js or .css suffix, along with wikitext pages in the MediaWiki namespace)
The configuration setting is set to have this switched off currently for all the wikis, except for wikidata, since it was such a major, new feature. Hopefully it's proved stable enough now and we can think about turning it on in more places soon.
Right now, Wikipedia uses ugly hacks (shocked!) like http://en.wikipedia.org/wiki/Template:Attached_KML. Hopefully with Wikidata and/or elsewhere we can support such content in a nicer way.
Cheers, Katie
How can we begin to prepare third party users of MW for these changes now?
Language will always shift from day to day. It is the wind blowing through our mouths. -- http://hexm.de/np
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On 01/12/2013 10:14 AM, aude wrote:
ContentHandler is fully backwards compatible, so it is not meant to be disruptive, but it has the potential to make things easier and better in the future for new features.
Sure. That makes total sense.
The feature is fully enabled with a configuration variable $wgContentHandlerUseDB to support multiple content formats in the same namespace. (e.g. user/site Javascript and CSS pages without the .js or .css suffix, along with wikitext pages in the MediaWiki namespace)
Right now, I'm focused on non-WMF users of MediaWiki and this sounds like something they should be aware of. If they install a new wiki and have $wgContentHandlerUseDB enabled, then what new risks do they need to be aware of? What are things they should be thinking about?
If someone installs MW and wants to use and expand this feature (as the WorkingWiki people might want to), where do they go to find information on it?
Right now, the on-wiki documentation refers to docs/contenthandler.txt. It seems like this area is ripe for on-wiki documentation, tutorials, and how-tos.
On Sat, Jan 12, 2013 at 4:36 PM, Mark A. Hershberger mah@everybody.org wrote:
On 01/12/2013 10:14 AM, aude wrote:
ContentHandler is fully backwards compatible, so it is not meant to be disruptive, but it has the potential to make things easier and better in the future for new features.
Sure. That makes total sense.
The feature is fully enabled with a configuration variable $wgContentHandlerUseDB to support multiple content formats in the same namespace. (e.g. user/site Javascript and CSS pages without the .js or .css suffix, along with wikitext pages in the MediaWiki namespace)
Right now, I'm focused on non-WMF users of MediaWiki and this sounds like something they should be aware of. If they install a new wiki and have $wgContentHandlerUseDB enabled, then what new risks do they need to be aware of? What are things they should be thinking about?
I don't think there are many impacts, if any, of enabling the content handler to use the database. By default, it stores the type in database as "null". null === default content type (content_model) for the namespace.
It will set content type in the database for JavaScript or CSS pages, as default content type for MediaWiki namespace is wikitext.
One important change with introducing the content handler is that JavaScript and CSS pages don't allow categories and such wiki markup anymore. This is true regardless of how $wgContentHandlerUseDB is set.
If someone installs MW and wants to use and expand this feature (as the WorkingWiki people might want to), where do they go to find information on it?
Right now, the on-wiki documentation refers to docs/contenthandler.txt. It seems like this area is ripe for on-wiki documentation, tutorials, and how-tos.
The information in docs/contenthandler.txt is probably the most useful at this point, along with http://www.mediawiki.org/wiki/ContentHandler
They can look at the Wikibase code to see examples of how we are implementing new content types.
It would certainly be nice to have more examples, tutorials, etc. but I'm not aware of them yet.
Cheers, Katie
Thanks aude for replying to Mark's questions!
On 12.01.2013 17:08, aude wrote:
Right now, I'm focused on non-WMF users of MediaWiki and this sounds like something they should be aware of. If they install a new wiki and have $wgContentHandlerUseDB enabled, then what new risks do they need to be aware of? What are things they should be thinking about?
Not that I can think of, no. ContentHandler itself just encapsulates knowledge about specific kinds of content, so it can easily be replaced by some other kind of content, with the rest of the wiki system still working the same.
One thing to be aware of (regardless of how $wgContentHandlerUseDB is set) is that changing the default content model for a namespace may make content in that namespace inaccessible. Kind of like changing a namespace ID.
This however shouldn't usually happen, since custom content models are generally governed by the extension that introduces them. There's just no reason to mess with them (as there's no reason to mess with the standard namespaces, and I'm sure you could have quite some fun breaking those).
I don't think there are many impacts, if any, of enabling the content handler to use the database. By default, it stores the type in database as "null". null === default content type (content_model) for the namespace.
Slight correction here, about what $wgContentHandlerUseDB does. It's not directly related to namespace. Consider:
* a page's default content model is derived from its title. The namespace is only one factor. For .js and .css pages in the MediaWiki namespace and user subpages, the suffix determines the default model.
* the namespace's default model is used if there are no special rules governing the default content model. There's also a hook that can override this.
* if $wgContentHandlerUseDB is enabled (the default), MediaWiki can handle pages that have different content models for different revisions. It can then also handle pages with content models that are different from the one derived from their title. There is no UI for this atm, but it can happen e.g. through export/import.
* with $wgContentHandlerUseDB disabled, MediaWiki has no record of the page's *actual* content model, but must go solely by the title. That's usually sufficient but less robust. The only reason to do this is to avoid db schema changes in existing large wikis like wikipedia.
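The rules in this list can be sketched roughly as follows. This is a language-neutral Python model, not MediaWiki's actual PHP code: the names `Title`, `default_model_for` and `effective_model` are hypothetical, and the order in which the hook, the suffix rule and the namespace default are consulted is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

NS_MAIN, NS_USER, NS_MEDIAWIKI = 0, 2, 8  # MediaWiki namespace IDs

# Hypothetical per-namespace defaults; anything not listed is wikitext.
NAMESPACE_DEFAULTS: Dict[int, str] = {}


@dataclass(frozen=True)
class Title:
    namespace: int
    text: str


def default_model_for(
    title: Title,
    hook: Optional[Callable[[Title], Optional[str]]] = None,
) -> str:
    """Derive the default content model from the title alone."""
    # A hook may override the default (assumed here to be checked first).
    if hook is not None:
        forced = hook(title)
        if forced is not None:
            return forced
    # Suffix rule: MediaWiki-namespace pages and user subpages.
    is_user_subpage = title.namespace == NS_USER and "/" in title.text
    if title.namespace == NS_MEDIAWIKI or is_user_subpage:
        if title.text.endswith(".js"):
            return "javascript"
        if title.text.endswith(".css"):
            return "css"
    # Otherwise fall back to the namespace default.
    return NAMESPACE_DEFAULTS.get(title.namespace, "wikitext")


def effective_model(title: Title, stored_model: Optional[str], use_db: bool) -> str:
    """With use_db, a recorded model wins; a NULL record means 'use the default'."""
    if use_db and stored_model is not None:
        return stored_model
    return default_model_for(title)
```

Under these assumptions, `effective_model(Title(NS_MEDIAWIKI, "Twiddlefoo"), "css", use_db=True)` yields the CSS model, while the same call with `use_db=False` has nothing but the title to go by and falls back to wikitext.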
It will set content type in the database for JavaScript or CSS pages, as default content type for MediaWiki namespace is wikitext.
No, MediaWiki will use the JS/CSS content type for these pages regardless of $wgContentHandlerUseDB. But if you want a page called MediaWiki:Twiddlefoo to have the CSS content model, you can only do that if $wgContentHandlerUseDB is enabled (and you hack up some UI for this).
One important change with introducing the content handler is that JavaScript and CSS pages don't allow categories and such wiki markup anymore. This is true regardless of how $wgContentHandlerUseDB is set.
Indeed. They also don't allow section editing.
If someone installs MW and wants to use and expand this feature (as the WorkingWiki people might want to), where do they go to find information on it?
It's pretty useless on a vanilla install, unless you want to make a namespace where everything is per default JS or something. Generally, it's a framework to be used by extensions.
Right now, the on-wiki documentation refers to docs/contenthandler.txt. It seems like this area is ripe for on-wiki documentation, tutorials, and how-tos.
The information in docs/contenthandler.txt is probably the most useful at this point, along with http://www.mediawiki.org/wiki/ContentHandler
They can look at the Wikibase code to see examples of how we are implementing new content types.
It would certainly be nice to have more examples, tutorials, etc. but I'm not aware of them yet.
It would be great to have them, but I find it hard to anticipate what people may want or need. In any case, this would be aimed at extension developers, not sysops setting up wikis. As I said, there's not much you can do with it on a vanilla install, it just allows more powerful and flexible extensions.
-- daniel
On Sat, Jan 12, 2013 at 9:24 PM, Daniel Kinzler daniel@brightbyte.de wrote:
Thanks aude for replying to Mark's questions!
On 12.01.2013 17:08, aude wrote:
It will set content type in the database for JavaScript or CSS pages, as default content type for MediaWiki namespace is wikitext.
No, MediaWiki will use the JS/CSS content type for these pages regardless of $wgContentHandlerUseDB.
It will still set the page_content_model field to css or js, even for pages with the .js/.css suffix.
Without using the database, those will still be handled with the .js or .css content models, as content model can be determined by the suffix.
Cheers, Katie
-- daniel
On Saturday, January 12, 2013 at 7:36 AM, Mark A. Hershberger wrote:
If someone installs MW and wants to use and expand this feature (as the WorkingWiki people might want to), where do they go to find information on it?
Right now, the on-wiki documentation refers to docs/contenthandler.txt. It seems like this area is ripe for on-wiki documentation, tutorials, and how-tos.
ContentHandler powers the Schema: namespace on metawiki, with the relevant code residing in Extension:EventLogging. Here's an example:
http://meta.wikimedia.org/wiki/Schema:SavePageAttempts
I found the ContentHandler API to be useful and extensible, and would be happy to be approached on IRC or whatever with questions.
-- Ori Livneh
On 12.01.2013 20:14, Ori Livneh wrote:
ContentHandler powers the Schema: namespace on metawiki, with the relevant code residing in Extension:EventLogging. Here's an example:
http://meta.wikimedia.org/wiki/Schema:SavePageAttempts
I found the ContentHandler API to be useful and extensible, and would be happy to be approached on IRC or whatever with questions.
Oh, cool, I didn't know that!
Perhaps you can tell us what you would have liked more information about when first learning about the ContentHandler? Were there any concepts you had trouble with?
-- daniel
On Saturday, January 12, 2013 at 12:25 PM, Daniel Kinzler wrote:
On 12.01.2013 20:14, Ori Livneh wrote:
ContentHandler powers the Schema: namespace on metawiki, with the relevant code residing in Extension:EventLogging. Here's an example:
http://meta.wikimedia.org/wiki/Schema:SavePageAttempts
I found the ContentHandler API to be useful and extensible, and would be happy to be approached on IRC or whatever with questions.
Oh, cool, I didn't know that!
Perhaps you can tell us what you would have liked more information about when first learning about the ContentHandler? Were there any concepts you had trouble with?
-- daniel
As I said, I found the API well-designed on the whole, but:
* "getForFoo" (getForModelID, getDefaultModelFor) is a confusing pattern for method names. getDefaultModelFor is especially weird: I get what it does, but I don't know why it is where it is, or what need it is fulfilling.
* I don't have a clear mental model of the dividing line between Content and ContentHandler. The documentation (contenthandler.txt) explains that "all manipulation and analysis of page content must be done via the appropriate methods of the Content object", but it's the ContentHandler class that implements serializeContent, getPageLanguage, getAutoSummary, etc. If I think about it, I can sort of understand why things are on one class rather than the other, but it isn't so clear that I know where to look if I need to do something related to content. I usually look both places.
* The way validation is handled is a bit mysterious. Content defines an isValid interface and (if I recall correctly) a return value of false would prevent the content from getting saved. But in such cases you want a helpful error. I ended up implementing a validate() method on the Content class that throws helpful exceptions, calling it from an EditFilterMerged hook handler, but I had the feeling that I was deviating from the design goals of ContentHandler somehow, since it seemed like the sort of thing ContentHandler would handle for you.
* I would expect something like ContentHandler to provide a generic interface for supplying an editor suitable for a particular Content, in lieu of the default editor. For example, it seems clear that the CodeEditor extension should use ContentHandler to associate itself with particular content types (and to load the relevant highlighting / linting support), but I'm not sure what is the best way to implement it. I created a bug for this: https://bugzilla.wikimedia.org/show_bug.cgi?id=42593. There should probably be some kind of GetEditInterface hook that passes handlers a content model, to provide an easy way for extensions to supply enhanced editors for particular types of content.
* I wasn't sure initially which classes to extend for JsonSchemaContent and JsonSchemaContentHandler. I concluded that for all textual content types it's better to extend WikitextContent / WikitextContentHandler rather than the base or abstract content / content handler classes. But I did not feel sufficiently confident to document this as a best practice.
If I remember any more I'll update the thread.
After working with the API for a while I had a head-explodes moment when I realized that MediaWiki is now a generic framework for collaboratively fashioning and editing content objects, and that it provides a generic implementation of a creative workflow based on the concepts of versioning, diffing, etc. I think it's a fucking amazing model for the web and I hope MediaWiki's code and community is nimble enough to fully realize it.
-- Ori Livneh
Thanks for your input, Ori!
On 13.01.2013 01:35, Ori Livneh wrote:
As I said, I found the API well-designed on the whole, but:
- "getForFoo" (getForModelID, getDefaultModelFor) is a confusing pattern for method names. getDefaultModelFor is especially weird: I get what it does, but I don't know why it is where it is, or what need it is fulfilling.
Yea, in retrospect, i'm not very happy with the naming of getForModelID either, and getForTitle could just die in favor of Title::getContentHandler.
ContentHandler::getDefaultModelFor determines the model to apply per default to a given title - maybe this should have been Title::getDefaultContentModel? But I wanted to centralize the factory logic in the ContentHandler class. So I think this is in the right place, at least.
- I don't have a clear mental model of the dividing line between Content and ContentHandler. The documentation (contenthandler.txt) explains that "all manipulation and analysis of page content must be done via the appropriate methods of the Content object", but it's the ContentHandler class that implements serializeContent, getPageLanguage, getAutoSummary, etc.
The reason for the division of ContentHandler and Content is mostly efficiency: to get a Content object, you have to load the actual content blob from the database. However, a lot of operations depend only on the content model (aka type), not (necessarily) on the content itself, so they can be performed by the appropriate ContentHandler singleton:
getPageLanguage for example will always return "en" for JavaScript content and the wiki's content language for wikitext. It *could* load the content and check whether there's something in there that specifies a different language.
serializeContent could be implemented in Content, but unserializeContent couldn't, since it's what is used to create Content objects. I thought it would be good to have the serialize and unserialize methods in the same place.
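A rough Python sketch of this split, for readers who prefer code. It is a simplified model, not the real PHP classes: the registry, `get_for_model` and the `fixed_language` parameter are hypothetical names used only to show that model-level questions can be answered by a per-model handler singleton without loading any content.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Content:
    """Holds the actual page data; creating one means the blob was loaded."""
    model: str
    text: str


class ContentHandler:
    """One shared instance per content model; knows nothing about any page."""

    _registry: Dict[str, "ContentHandler"] = {}

    def __init__(self, model: str, fixed_language: Optional[str] = None):
        self.model = model
        self.fixed_language = fixed_language

    @classmethod
    def register(cls, handler: "ContentHandler") -> None:
        cls._registry[handler.model] = handler

    @classmethod
    def get_for_model(cls, model: str) -> "ContentHandler":
        return cls._registry[model]

    def get_page_language(self, wiki_language: str) -> str:
        # Answered from the model alone, no content blob required.
        return self.fixed_language or wiki_language

    def serialize_content(self, content: Content) -> str:
        return content.text

    def unserialize_content(self, blob: str) -> Content:
        # Must live on the handler: no Content object exists yet at this point.
        return Content(self.model, blob)


ContentHandler.register(ContentHandler("javascript", fixed_language="en"))
ContentHandler.register(ContentHandler("wikitext"))
```

Keeping `serialize_content` next to `unserialize_content` mirrors the reasoning above: the second has to be on the handler, so the first is placed beside it.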
If I think about it, I can sort of understand why things are on one class rather than the other, but it isn't so clear that I know where to look if I need to do something related to content. I usually look both places.
Yes, I suppose the documentation could explain this some more.
- The way validation is handled is a bit mysterious. Content defines an isValid interface and (if I recall correctly) a return value of false would prevent the content from getting saved. But in such cases you want a helpful error.
You are right, it would be better to have a validate() method that returns a Status object. isValid() could then just call that and return $status->isOK(), for compatibility. If you like, file a bug for that - or just write it :)
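A minimal sketch of that idea in Python, assuming a much-simplified `Status` class (the real MediaWiki Status object has a richer interface) and a hypothetical `JsonContent` type used purely as an example:

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Status:
    """Simplified stand-in for a status object: OK unless errors were added."""
    errors: List[str] = field(default_factory=list)

    def fatal(self, message: str) -> None:
        self.errors.append(message)

    def is_ok(self) -> bool:
        return not self.errors


class JsonContent:
    """Hypothetical content type whose text must be well-formed JSON."""

    def __init__(self, text: str):
        self.text = text

    def validate(self) -> Status:
        """Return a Status that carries a helpful message on failure."""
        status = Status()
        try:
            json.loads(self.text)
        except json.JSONDecodeError as err:
            status.fatal(
                f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
            )
        return status

    def is_valid(self) -> bool:
        # Kept for compatibility; simply delegates to validate().
        return self.validate().is_ok()
```

Callers that only need a yes/no answer keep using `is_valid()`, while the save path can call `validate()` and show the recorded messages to the user.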
- I would expect something like ContentHandler to provide a generic interface for supplying an editor suitable for a particular Content, in lieu of the default editor.
It actually had that in some early version, but it did not work well with the way MediaWiki handles actions like edit. The correct way is to provide a custom handler class for the edit action via the getActionOverrides method. Wikibase makes extensive use of that mechanism.
This isn't very obvious or pretty, but very flexible, and fits well with the existing infrastructure.
I suppose the documentation should explain this in detail, though.
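In the meantime, the mechanism can be modelled like this. The sketch is in Python and only imitates the idea behind getActionOverrides; the class names `EditAction`, `StructuredEditAction`, `ItemHandler` and the `action_for` dispatcher are hypothetical and do not correspond to real MediaWiki or Wikibase classes.

```python
class EditAction:
    """Default edit action: the standard text editor."""

    def show(self, title: str) -> str:
        return f"plain textarea for {title}"


class StructuredEditAction(EditAction):
    """Custom edit action supplied by an extension for its own content."""

    def show(self, title: str) -> str:
        return f"structured editor for {title}"


class ContentHandler:
    def get_action_overrides(self) -> dict:
        # Base handler: no overrides, every action uses the default class.
        return {}


class ItemHandler(ContentHandler):
    def get_action_overrides(self) -> dict:
        # Map an action name to the class that should handle it.
        return {"edit": StructuredEditAction}


DEFAULT_ACTIONS = {"edit": EditAction}


def action_for(handler: ContentHandler, name: str) -> EditAction:
    """Pick the handler's override for this action, else the default."""
    action_class = handler.get_action_overrides().get(name, DEFAULT_ACTIONS[name])
    return action_class()
```

The point of the design is that the content handler only names the class to use; the existing action dispatch stays in charge of when and how it runs.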
- I wasn't sure initially which classes to extend for JsonSchemaContent and JsonSchemaContentHandler. I concluded that for all textual content types it's better to extend WikitextContent / WikitextContentHandler rather than the base or abstract content / content handler classes.
All *textual* (not "text based") content should derive from TextContent and TextContentHandler, respectively. Such content can be edited using the standard edit page, will work in system messages, etc. There are also some extensions and maintenance scripts that only operate on content derived from TextContent (e.g. things that do search-and-replace).
Non-textual content (including anything with a strict syntax, like JSON, XML, whatever) should derive from AbstractContent and the generic ContentHandler. For such content, a custom editor is typically needed. A custom diff engine is also useful.
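As a simplified Python illustration of that hierarchy: `AbstractContent` and `TextContent` are named after the classes mentioned above, but their bodies here are invented, and `StructuredContent` and `search_and_replace` are hypothetical names for this example only.

```python
import json
from abc import ABC, abstractmethod


class AbstractContent(ABC):
    """Base for any content; makes no promise that it is free-form text."""

    @abstractmethod
    def serialize(self) -> str:
        ...


class TextContent(AbstractContent):
    """Free-form text that the standard edit page can handle."""

    def __init__(self, text: str):
        self.text = text

    def serialize(self) -> str:
        return self.text


class StructuredContent(AbstractContent):
    """Content with a strict syntax; typically needs its own editor."""

    def __init__(self, data: dict):
        self.data = data

    def serialize(self) -> str:
        return json.dumps(self.data, sort_keys=True)


def search_and_replace(content: AbstractContent, old: str, new: str) -> TextContent:
    """A text tool that, like the scripts mentioned above, only accepts TextContent."""
    if not isinstance(content, TextContent):
        raise TypeError("search-and-replace only works on TextContent")
    return TextContent(content.text.replace(old, new))
```

Deriving strict-syntax content from the abstract base keeps text-only tools from silently rewriting it into something that no longer parses.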
After working with the API for a while I had a head-explodes moment when I realized that MediaWiki is now a generic framework for collaboratively fashioning and editing content objects, and that it provides a generic implementation of a creative workflow based on the concepts of versioning, diffing, etc. I think it's a fucking amazing model for the web and I hope MediaWiki's code and community is nimble enough to fully realize it.
Yes, that's exactly it! You said that far better than I could have, I suppose I still expect people to just *see* that :P
Spread the word!
Thanks, daniel
By the way, in case it wasn't obvious: thank you to Ori, Aude, Matt and Daniel for all your help in understanding ContentHandler.
On 01/12/2013 02:14 PM, Ori Livneh wrote:
ContentHandler powers the Schema: namespace on metawiki, with the relevant code residing in Extension:EventLogging.
Awesome. I'll point to your extension in the release notes for how to use ContentHandler.