Erik Moeller wrote:
I'd like to start a broader conversation about language support in MW core [...]
Mailing lists are good for conversation, but a lot of your e-mail consisted of insightful notes that I want to make sure don't get lost. I hope you'll eventually put together an RFC (https://www.mediawiki.org/wiki/RFC) or equivalent.
[...]
I'll stop there - I'm sure you can think of other issues with the current approach. For third party users, the effort of replicating something like the semi-acceptable Commons or Meta user experience is pretty significant, as well, due to the large number of templates and local hacks employed.
Well, for Commons, clearly the answer is for everyone to write in glyphs. Wingdings, Webdings, that fancy new color Unicode that Apple has. Meta-Wiki, on the other hand, now that's a real problem. ;-)
Would it make sense to add a language property to pages, so it can be used to solve a lot of the above issues, and provide appropriate and consistent user experience built on them? (Keeping in mind that some pages would be multilingual and would need to be identified as such.) If so, this seems like a major architectural undertaking that should only be taken on as a partnership between domain experts (site and platform architecture, language engineering, Visual Editor/Parsoid, etc.).
I'm not sure I'd call what you're proposing a major architectural undertaking, though perhaps I'm defining a much narrower problem scope. Below is my take on where we are currently and where we should head with regard to page properties.
We need better page properties (metadata) support. A few years ago, a page_props table was added to MediaWiki:
* https://www.mediawiki.org/wiki/Manual:Page_props_table
Within the past year, MediaWiki core has seen the info action resuscitated and Special:PagesWithProp implemented:
* https://www.mediawiki.org/w/index.php?title=MediaWiki&action=info
* https://www.mediawiki.org/wiki/Special:PagesWithProp
That is, a lot of the infrastructure needed to support a basic language property field already exists, in my mind.
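To make that concrete, here is a rough sketch of how the same data can be read programmatically. It only builds request URLs for the web API's prop=pageprops and list=pageswithprop modules; the endpoint is assumed from the links above, the helper names are made up for illustration, and I haven't verified the exact output against a live wiki.

```python
from urllib.parse import urlencode

# Assumed endpoint, inferred from the mediawiki.org URLs above.
API_ENDPOINT = "https://www.mediawiki.org/w/api.php"


def page_props_url(title, prop_name=None):
    """Build a URL asking which page properties are set on one page."""
    params = {
        "action": "query",
        "prop": "pageprops",
        "titles": title,
        "format": "json",
    }
    if prop_name is not None:
        # Restrict the answer to a single property, e.g. "displaytitle".
        params["ppprop"] = prop_name
    return API_ENDPOINT + "?" + urlencode(params)


def pages_with_prop_url(prop_name, limit=10):
    """Build a URL listing pages that have a given property set
    (the API counterpart of Special:PagesWithProp)."""
    params = {
        "action": "query",
        "list": "pageswithprop",
        "pwppropname": prop_name,
        "pwpprop": "title|value",
        "pwplimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(page_props_url("MediaWiki", "displaytitle"))
    print(pages_with_prop_url("displaytitle"))
```

A per-page language property would presumably be queryable the same way, with only the property name changing.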
However, where we currently fall short is providing a reasonable interface for adding or modifying page properties. Currently, we use the page text to set nearly any property, via magic words (e.g., __NEWSECTIONLINK__ or {{DISPLAYTITLE:}}). The obvious advantage to doing this is the accountability, transparency, and reversibility of using the same system that edits rely on (text table, revision table). The obvious disadvantage is that the input system is a giant textarea.
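As a toy illustration of what "using the page text to set properties" amounts to, the sketch below pulls a few magic words out of a blob of wikitext and turns them into property name/value pairs. This is emphatically not how MediaWiki's parser works; the regular expressions, the short list of switches, and the function name are all simplifications of mine.

```python
import re

# A small, hand-picked set of behavior switches; the real list is longer.
KNOWN_SWITCHES = {"NEWSECTIONLINK", "NONEWSECTIONLINK", "HIDDENCAT"}

SWITCH_RE = re.compile(r"__([A-Z]+)__")
# Only two value-carrying magic words are handled in this sketch.
FUNCTION_RE = re.compile(r"\{\{(DISPLAYTITLE|DEFAULTSORT):([^{}]*)\}\}")


def extract_page_props(wikitext):
    """Return a dict of property name -> value found in the page text."""
    props = {}
    for match in SWITCH_RE.finditer(wikitext):
        name = match.group(1)
        if name in KNOWN_SWITCHES:
            # Switches are flags: present or absent, with no value.
            props[name.lower()] = ""
    for match in FUNCTION_RE.finditer(wikitext):
        props[match.group(1).lower()] = match.group(2).strip()
    return props


if __name__ == "__main__":
    text = "__NEWSECTIONLINK__\n{{DISPLAYTITLE:iPod}}\nSome article text."
    print(extract_page_props(text))
```

The point is that the property only exists as a side effect of a text edit, which is why its history lives in the revision table and why the only editing interface is the textarea.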
If we could design a sane interface for modifying page properties (such as display title and a default category sort key) that included logging, accountability, and reversibility, adding page content language as an additional page property would be pretty trivial. (MediaWiki could even do neat tricks like taking a hint from the user interface language of the page creator or examining the page contents themselves to make an educated guess about the page content language.) And as a fallback, I believe every site already defines a site-wide content language (even Meta-Wiki and Commons). The info action can then report this information on a per-page basis and Special:PagesWithProp can allow lookups by page property (i.e., by page content language).
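The fallback order I have in mind could look something like the following sketch. Everything here is hypothetical: the "language" property name, the Page class, and the function are mine for illustration, and nothing like this exists in core as far as I know.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Page:
    """Hypothetical stand-in for a wiki page and its metadata."""
    title: str
    props: Dict[str, str] = field(default_factory=dict)
    # Interface language of the user who created the page, if known.
    creator_ui_language: Optional[str] = None


def resolve_content_language(page: Page, site_language: str) -> Tuple[str, str]:
    """Return (language code, where it came from) for a page.

    Order: explicit page property, then a guess based on the creator's
    interface language, then the site-wide content language.
    """
    explicit = page.props.get("language")  # hypothetical property name
    if explicit:
        return explicit, "page property"
    if page.creator_ui_language:
        return page.creator_ui_language, "creator interface language (guess)"
    return site_language, "site default"


if __name__ == "__main__":
    pages = [
        Page("Hauptseite", props={"language": "de"}),
        Page("Accueil", creator_ui_language="fr"),
        Page("Main Page"),
    ]
    for p in pages:
        print(p.title, resolve_content_language(p, site_language="en"))
```

Returning the source alongside the language code would let the info action show whether a page's language was set explicitly or merely guessed.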
MZMcBride