Long! tl;dr is in the first two paragraphs.
With the merging of https://gerrit.wikimedia.org/r/#/c/122787/ , probably the largest patch set for MediaWiki ever (+548314, -714438), MediaWiki core is now using JSON for localisation of interface messages, per a recently adopted RfC[1]. Thanks Krinkle/Timo for reviewing!
Please be aware that if you have open patch sets touching *.i18n.php messages files or MessagesXx.php files, you will have to update your patch sets to match the new file layout and format.
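For readers updating patch sets, here is a rough sketch of the new layout: one JSON file per language (for example en.json and qqq.json), each with an "@metadata" entry followed by the message keys. It is written in Python purely for illustration. The function name `split_messages`, the sample message key, and the author name are all hypothetical, and this is not the conversion script that was actually used for the migration.

```python
import json


def split_messages(messages_by_lang, authors_by_lang=None):
    """Return {"<lang>.json": json_text} for a {lang: {key: text}} mapping.

    Hypothetical helper: it only illustrates the one-file-per-language
    layout, with "@metadata" as the first key of every file.
    """
    authors_by_lang = authors_by_lang or {}
    files = {}
    for lang, messages in messages_by_lang.items():
        data = {"@metadata": {"authors": authors_by_lang.get(lang, [])}}
        data.update(messages)
        # Keep non-ASCII text readable and end the file with a newline.
        text = json.dumps(data, ensure_ascii=False, indent=4) + "\n"
        files[f"{lang}.json"] = text
    return files


# Made-up messages standing in for the contents of an old PHP messages file.
legacy = {
    "en": {"example-greeting": "Hello, $1!"},
    "qqq": {"example-greeting": "Greeting shown to a user. $1 is the user name."},
}
files = split_messages(legacy, {"en": ["Example Author"]})
print(files["en.json"])
```

The message documentation that lived next to the English text ends up in its own qqq.json file, which is why both file names come up later in this thread.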
In December 2013, the first MediaWiki extensions were migrated to the JSON format. Today, Antoine/hashar enabled a JSON linter on the jslint job that runs on many Gerrit repositories' patch sets.
Last week I started migrating MediaWiki extensions to JSON i18n, beginning with all extensions that are used by Wikimedia. At this time, about 50% of the extensions in 1.23wmf20 use the updated format. Migration of two extensions is taking a little longer[2], but Matt Flaschen is helping with that, and I expect that to be resolved soon.
Migration of all extensions has been going very smoothly - it's about 80% done. With the help of reedy/Sam Reed, Raimond Spekking/Raymond, Niklas Laxström/Nikerabbit and Adam Wight, so far 427 patch sets related to this project have already been reviewed and merged[3], 40 await review and I expect some 90 more to be submitted for the project to be completed.
Thanks also go to Roan Kattouw/Catrope for implementation of parts of the RfC together with Niklas, to Niklas for rewriting LocalisationUpdate to support the JSON format and more, and all who helped draft the RfC, including James Forrester, Santhosh Thottingal, David Chan, Ed Sanders, Robert Thomas Moen, and those who deserve credit but I have forgotten to mention.
Once all migrations are complete, I'll be doing a full export from translatewiki.net, which will cause a lot of JSON files to be touched, but will mostly update encoding (full UTF-8) and add a newline at end of file where missing.
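To make the effect of such an export concrete, here is a minimal Python sketch of the two changes described: escaped characters become literal UTF-8 and the file ends with a newline. The function name `normalise_json_text` and the sample key are hypothetical, and the real export is done by translatewiki.net's own tooling, not by this code.

```python
import json


def normalise_json_text(text):
    """Re-serialise JSON text with literal UTF-8 and a trailing newline."""
    data = json.loads(text)
    # ensure_ascii=False writes "ö" instead of the escape sequence \u00f6.
    out = json.dumps(data, ensure_ascii=False, indent=4)
    return out + "\n"


# A file body with an escaped character and no newline at the end.
before = '{"example-key": "Laxstr\\u00f6m"}'
after = normalise_json_text(before)
print(after, end="")
```

The parsed data is identical before and after, so a diff of this kind touches many files without changing any message.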
What's next? With this project almost completed, next order of business is creating an RfC on where to go with the data that now remains in the MessagesXx.php files (like date formatting, fallback, directionality, namespace names, special page names, etc.) and localisation for special page names, magic words and namespace names that are still being implemented using $wgExtensionMessagesDirs. Maybe this is something we could discuss and prototype during the hackathon. Please let me know if this is something you'd like to work on.
Again, thanks for the help, and apologies for the inconvenience these changes may have caused you!
[1] https://www.mediawiki.org/wiki/Requests_for_comment/Localisation_format
[2] https://gerrit.wikimedia.org/r/#/q/status:open+topic:json-i18n-special,n,z
[3] https://gerrit.wikimedia.org/r/#/q/status:merged+topic:json-i18n,n,z
Cheers!
Cool. Thanks for all the work you put into this.
> Once all migrations are complete, I'll be doing a full export from translatewiki.net, which will cause a lot of JSON files to be touched, but will mostly update encoding (full UTF-8) and add a newline at end of file where missing.
If the export is going to touch all the files, is it too late to change to use tabs for indentation instead of spaces to be like the rest of mediawiki?
--bawolff
On Tue, Apr 1, 2014 at 11:59 PM, Brian Wolff bawolff@gmail.com wrote:
> use tabs for indentation instead of spaces to be like the rest of mediawiki?
I was going to say the same thing. Why wasn't that caught in code review?
If these files continue to use spaces, I'd expect people will write patches to en.json and qqq.json using tabs and then l10n-bot will change them to spaces.
Also, I see we've lost the helpful comments that used to be in some of these files to visually divide things into sections.
>> use tabs for indentation instead of spaces to be like the rest of mediawiki?
> I was going to say the same thing. Why wasn't that caught in code review?
Spaces were chosen because that is what we get with FormatJson::encode and there is no way to change it except by post-processing. I'm fine with both tabs and spaces.
Unless you change FormatJson::encode (which is used by many other things as well), you would need to apply this post-processing in multiple places, making it harder to alter these files from PHP code and you would take a small performance hit for the extra processing.
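The post-processing step under discussion can be sketched as follows. This is Python standing in for PHP, and `spaces_to_tabs` is a hypothetical name; it assumes the encoder pretty-prints with four spaces per level and that string values never contain literal line breaks (JSON escapes them), so every leading run of spaces is indentation.

```python
import json
import re


def spaces_to_tabs(pretty_json, width=4):
    """Replace each leading run of `width` spaces with one tab per level."""
    def convert(match):
        return "\t" * (len(match.group(0)) // width)

    # MULTILINE anchors ^ at every line start, so only indentation is
    # rewritten; spaces inside message text are left alone.
    pattern = r"^(?: {%d})+" % width
    return re.sub(pattern, convert, pretty_json, flags=re.MULTILINE)


spaced = json.dumps(
    {"@metadata": {"authors": []}, "example-key": "two  spaces   kept"},
    indent=4,
)
tabbed = spaces_to_tabs(spaced)
print(tabbed)
```

The cost mentioned above is visible here: every place that writes these files would need to run the extra pass over the encoder's output.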
> Also, I see we've lost the helpful comments that used to be in some of these files to visually divide things into sections.
That is true. JSON does not allow comments.
On the other hand, we can now stop updating messages.inc.
-Niklas
On 02.04.2014 15:45, Brad Jorsch (Anomie) wrote:
> Also, I see we've lost the helpful comments that used to be in some of these files to visually divide things into sections.
Some projects, like Etherpad-lite, use a modified JSON format which allows comments.
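The point about comments is easy to check with a strict parser. The sketch below uses Python's standard json module; the sample key and the `strip_line_comments` helper are hypothetical, and the helper only handles comments that sit on a line of their own. It is not a description of how Etherpad-lite or any other project implements its format.

```python
import json

with_comment = '{\n    // section: login messages\n    "example-login": "Log in"\n}'

# A strict JSON parser rejects the comment line.
try:
    json.loads(with_comment)
    accepted = True
except json.JSONDecodeError:
    accepted = False


def strip_line_comments(text):
    """Drop lines whose first non-blank characters are //.

    Assumes comments never share a line with data.
    """
    kept = [
        line for line in text.splitlines()
        if not line.lstrip().startswith("//")
    ]
    return "\n".join(kept)


parsed = json.loads(strip_line_comments(with_comment))
print(accepted, parsed)
```

Any such relaxed format needs a pre-processing pass before standard tools can read the files, which is the trade-off against keeping section comments.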
On Wed, Apr 2, 2014 at 6:45 AM, Brad Jorsch (Anomie) bjorsch@wikimedia.org wrote:
> On Tue, Apr 1, 2014 at 11:59 PM, Brian Wolff bawolff@gmail.com wrote:
>> use tabs for indentation instead of spaces to be like the rest of mediawiki?
> I was going to say the same thing. Why wasn't that caught in code review?
> If these files continue to use spaces, I'd expect people will write patches to en.json and qqq.json using tabs and then l10n-bot will change them to spaces.
I would prefer that the JSON files use tabs instead of spaces even if it requires some post-processing as our coding conventions specify tabs for all code other than Python. I brought this up a couple weeks ago, but was just told that I should teach my IDE to use spaces for JSON files. Rather than having 100 developers waste time messing with their IDEs, many of which I imagine don't have such a preference, it seems like it would be more efficient to implement a post-processing script and keep the MediaWiki codebase consistent regarding indentation.
Ryan Kaldari
Hi Ryan,
Is there a bug open for this request (using tabs instead of spaces)? Please share when you can :-)
Best, Alolita
Alolita Sharma आलोलिता शर्मा Director of Engineering Internationalization & Localization Wikimedia Foundation
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Opened a bug here for the tabs: https://bugzilla.wikimedia.org/show_bug.cgi?id=63444
Otherwise, I'm glad to hear the migration is nearly complete. This should give us a lot more capabilities on the client-side. Great work on this effort!
Ryan Kaldari
Thanks Ryan :-)
Best, Alolita
Alolita Sharma आलोलिता शर्मा Director of Engineering Internationalization & Localization Wikimedia Foundation
On 2 Apr 2014, at 20:22, Ryan Kaldari rkaldari@wikimedia.org wrote:
> I would prefer that the JSON files use tabs instead of spaces even if it requires some post-processing as our coding conventions specify tabs for all code other than Python. I brought this up a couple weeks ago, but was just told that I should teach my IDE to use spaces for JSON files. Rather than having 100 developers waste time messing with their IDEs, many of which I imagine don't have such a preference, it seems like it would be more efficient to implement a post-processing script and keep the MediaWiki codebase consistent regarding indentation.
You mean like this? https://gerrit.wikimedia.org/r/#/c/121632/2/i18n/en.json
DJ
Just a question: Are the messages still able to be accessed like they were before, or will new methods be introduced to gather them and decode the JSON?
2014-04-03 16:53 GMT+03:00 Justin Folvarcik jfolvarcik@gmail.com:
> Just a question: Are the messages still able to be accessed like they were before, or will new methods be introduced to gather them and decode the JSON?
There are no visible changes for developers using the Messages API: https://www.mediawiki.org/wiki/Manual:Messages_API
-Niklas
Good to know, thanks.
This is epic. Thanks a bunch Siebrand!
On Tue, Apr 1, 2014 at 9:22 PM, Jon Robson jdlrobson@gmail.com wrote:
> This is epic. Thanks a bunch Siebrand!
Agreed - really exciting to see this come to fruition! :) Kudos to Siebrand & everyone involved. I'm sure there will be bumps along the road but it's clearly a big architectural step forward. It's also nice to see how the RFC process was used for this.
Erik
On Tue, Apr 1, 2014 at 3:09 PM, Siebrand Mazeland siebrand@kitano.nl wrote:
> With the merging of https://gerrit.wikimedia.org/r/#/c/122787/ , probably the largest patch set for MediaWiki ever (+548314, -714438), MediaWiki core is now using JSON for localisation of interface messages, per a recently adopted RfC[1]. Thanks Krinkle/Timo for reviewing!
Congratulations on getting this done! This is a big deal: MediaWiki has long been flexible enough to handle many human languages, and this is now a huge help for developers using different programming languages. And you know you've hit the big time when the discussion about the feature can expand to the weighty and architecturally-significant topic of tabs versus spaces. :-)
> What's next? With this project almost completed, next order of business is creating an RfC on where to go with the data that now remains in the MessagesXx.php files (like date formatting, fallback, directionality, namespace names, special page names, etc.) and localisation for special page names, magic words and namespace names that are still being implemented using $wgExtensionMessagesDirs. Maybe this is something we could discuss and prototype during the hackathon. Please let me know if this is something you'd like to work on.
I'm looking forward to seeing progress on this. Any initial ideas/biases/etc on this?
Rob
On Wed, Apr 2, 2014 at 8:41 PM, Rob Lanphier robla@wikimedia.org wrote:
> On Tue, Apr 1, 2014 at 3:09 PM, Siebrand Mazeland siebrand@kitano.nl wrote:
>> With the merging of https://gerrit.wikimedia.org/r/#/c/122787/ , probably the largest patch set for MediaWiki ever (+548314, -714438), MediaWiki core is now using JSON for localisation of interface messages, per a recently adopted RfC[1]. Thanks Krinkle/Timo for reviewing!
> Congratulations on getting this done! [..] you know you've hit the big time when the discussion about the feature can expand to the weighty and architecturally-significant topic of tabs versus spaces. :-)
:-)
>> What's next? With this project almost completed, next order of business is creating an RfC on where to go with the data that now remains in the MessagesXx.php files (like date formatting, fallback, directionality, namespace names, special page names, etc.) and localisation for special page names, magic words and namespace names that are still being implemented using $wgExtensionMessagesDirs. Maybe this is something we could discuss and prototype during the hackathon. Please let me know if this is something you'd like to work on.
> I'm looking forward to seeing progress on this. Any initial ideas/biases/etc on this?
The only thing we've very briefly entertained is adding more @metadata keys to hold "things" as associative arrays. James Forrester visualised that on 2013-12-12 in the referenced thread[1]. Because it was explicitly out of scope for the Localisation format RfC, I think we focused on the RfC contents.
Questions:
I. Is the format that James proposed adequate?
II. Should we extend the existing RfC or create a new one?
[1] https://www.mediawiki.org/wiki/Thread:Talk:Requests_for_comment/Localisation...
wikitech-l@lists.wikimedia.org