[Toolserver-l] Toolserver Intuition - Tech spec (Toolserver goes I18N!)

Krinkle krinklemail at gmail.com
Tue Mar 29 00:46:40 UTC 2011


As announced in the previous mail (agreed, I should've sent this one
first...), here are the details of the technical implementation.

First, a few points you are likely aware of that have stopped or
scared people off from doing this until now, i.e. creating a dedicated
solution that scales beyond a single tool or group of tools and is
easy to use for translators as well as developers.

-- Past issues / obstacles

* Separated Toolserver SVN repositories
** For MediaWiki extensions, TranslateWiki can easily translate things  
and can sync in a single commit because it has a partial check out of  
a single SVN repository. Having to add all ts-account repos to TW's  
configuration is way too much work and pretty much not-done. Not to  
mention the fact that most tools are either not in SVN at all, or are  
maintained outside SVN and pushed towards SVN for public source  
viewing every once in a while.

* Language of choice
** Users want to set their choice once and not have to redo it for
every tool independently, hunting for the right place to do so in
everybody's tool. Nor do they want to click the same link again on
every visit. And developers prefer not to have userlang parameters
dangling in the URL, having to make sure they are preserved in every
link throughout the app.
This can be (and sometimes is) solved by using cookies (one example is
Luxo's contributions tool, which sets a cookie).

* Prevent vandalism, but also the slowness and other downsides of a
regular MediaWiki page.
** When translations live on a wiki page (e.g. at Meta-Wiki, as in
Magnus' implementation, which I think is the best implementation so
far) there isn't a good translation-oriented workflow for translators
or developers. Of course pages could be protected by sysops, and then
have to be updated from another page on request. And then there is
FlaggedRevs. But neither is optimized for translators: there's no way
to FUZZY a message, to see translation suggestions from services like
Google Translate, or to see a description of the message while
modifying it (aka "/qqq"). It is also far from ideal (if not
impossible) to keep track of changes to the original message without
checking manually (e.g. marking it "FUZZY"). And there is no easy
workflow for translating many messages at once.
This is all taken care of by Extension:Translate [4] on TranslateWiki.

* Fallback languages
** Aside from the management involved, fallback is also an important
point. A translator shouldn't have to translate everything at once or
nothing at all. Some implementations around would fail if a new
message wasn't available in a translated language yet. Other
implementations created a way to fall back to English if there was no
message-key in the selected language. But I haven't seen any really
fine-grained fallback (such as falling back from NDS (Low German) to
DE (German), or from ACE (Aceh) to ID (Indonesian) for untranslated
messages, like MediaWiki does).

* Universal messages
** Some messages like 'welcome', 'login', 'submit' etc. are generic
and should not have to be duplicated everywhere for each tool. Even
though TranslateWiki has {{Identical}}, it saves work to just have a
group of generic messages. Tools that are mostly data or visually
oriented may not even have to request to be added to the project if
they only need a few of these generic messages to control the input
form.

* Keeping up with latest versions
** Another implementation currently around (the only one that actually  
uses TranslateWiki afaik) is done through a fake extension named  
ToolserverTools [1] in Wikimedia SVN. On the translators' side this
was perfect (since they could use the TranslateWiki workflow they know
and love). But not so for the developers. Messages all had a prefix,
and in order to actually use them in the tools some wheel-reinvention  
took place (like getting the message from the array in the correct  
language, providing fallback, replacing variables like $1/$2, etc.).  
Also they still had to re-create a way for users to choose a language,  
store it, remember it and apply it on the next visit. Lots of wasted
time. And of course staying up to date with the latest version in
Wikimedia SVN was sometimes forgotten, and translators are known to get
especially motivated if there is no work required from the developer  
to put the new translations into use (ie. TranslateWiki having the  
ability to push the updates and there being no extra action required).  
We could have everybody create a cronjob to update their svn-checkout  
of the messages file from the "/extensions/ToolserverTools/" directory  
in SVN, but that's not ideal either.

-- 

All of the above have been solved with my proposal. Either because of  
the fact that it's powered via TranslateWiki, or because it's taken  
care of by the central i18n system.

-- Tool developer workflow:

I'll describe how the system would work from a tool developer's point
of view. [3]

So here's what you'd do to make it work, three easy steps:
1) The Toolserver tool developer includes a single PHP file (e.g.
/p_i18n/ToolStart.php). This makes the i18n class available.
2) A new instance of the class is created, like
$I18N = new TsIntuition( 'mygroup' );
3) Messages can now be returned with either _( 'message-key' ) [2] or
$I18N->msg( 'message-key', $options ).

The msg() function can optionally be passed a text domain name (or  
'group name' if you will) as second argument to get a message from a  
different group eg. the group 'general' for messages like 'welcome',  
'login', 'submit' etc. Or an array if you need multiple options like  
escape, variable replacement etc. (more on that in a minute).
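To illustrate the "string or array" second argument, here is a minimal sketch of how such an argument could be normalized. This is not the actual TsIntuition code: the function name normalizeMsgOptions() and the 'domain' option key are made up for this example; only 'variables' and 'escape' are option names mentioned in this mail.

```php
<?php
// Hypothetical helper, not part of the real class: turns the second
// argument of msg() into one options array with sensible defaults.
function normalizeMsgOptions($options, string $defaultDomain): array
{
    $defaults = [
        'domain'    => $defaultDomain, // assumed key name for the group
        'variables' => [],
        'escape'    => 'plain',        // assumed default: no escaping
    ];

    if (is_string($options)) {
        // A bare string is shorthand for "take the message from this group".
        $options = ['domain' => $options];
    } elseif (!is_array($options)) {
        // Nothing (or something unusable) passed: use the defaults only.
        $options = [];
    }

    // Caller-supplied options override the defaults.
    return array_merge($defaults, $options);
}
```

With this, normalizeMsgOptions( 'general', 'mygroup' ) selects the 'general' group, while passing an array such as array( 'escape' => 'html' ) keeps the tool's own group and only changes the escaping.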

-- Other features

Although the I18N class will be able to do a lot more, this above is  
the core principle. Here's a list of items in no particular order for  
other things that it will have:

* Variable replacement ($1, $2, etc.)

	$welcome = $I18N->msg( 'welcomeback', array( 'variables' =>
array( $username, $lastvisit ) ) );
	from [[Toolserver:Mytool-welcomeback]] which contains "Welcome back
$1 (last visit: $2)".
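For those curious how $1/$2 replacement can be done safely, here is a minimal sketch. It is an illustration only, not the code in SVN; the function name replaceVariables() is made up for this example.

```php
<?php
// Hypothetical helper: replaces $1, $2, ... with the matching entry
// of $variables ($1 is index 0). Unknown placeholders are left as-is.
function replaceVariables(string $message, array $variables): string
{
    return preg_replace_callback(
        '/\$(\d+)/',
        function (array $match) use ($variables) {
            $index = (int) $match[1] - 1;
            return array_key_exists($index, $variables)
                ? (string) $variables[$index]
                : $match[0];
        },
        $message
    );
}

$welcome = replaceVariables(
    'Welcome back $1 (last visit: $2)',
    ['Krinkle', 'yesterday']
);
// $welcome is now "Welcome back Krinkle (last visit: yesterday)"
```

Doing it in a single pass with a callback means a value that itself contains "$2" is not substituted again, and "$10" is not mistaken for "$1" followed by a zero.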

* Fallback languages:
   If a message is not found in the current user language, a fallback  
will be used. And if that one isn't found English is used.
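The lookup order described here (user language, then its fallback, then English) could be sketched roughly like this. The function name, the array layout and the sample data are assumptions for the example; the real system loads its messages from files.

```php
<?php
// Hypothetical lookup: $messages is [langcode => [key => text]],
// $fallbacks is [langcode => fallback langcode].
function resolveMessage(array $messages, array $fallbacks, string $key, string $lang): ?string
{
    // Build the chain: requested language, its fallback (if any), English.
    $chain = [$lang];
    if (isset($fallbacks[$lang])) {
        $chain[] = $fallbacks[$lang];
    }
    $chain[] = 'en';

    foreach (array_unique($chain) as $code) {
        if (isset($messages[$code][$key])) {
            return $messages[$code][$key];
        }
    }

    // Not even English has this key.
    return null;
}

// Sample data for illustration only.
$messages = [
    'en' => ['login' => 'Log in', 'submit' => 'Submit'],
    'de' => ['login' => 'Anmelden'],
];
$fallbacks = ['nds' => 'de'];
```

Here a Low German (nds) user gets the German 'login' message and the English 'submit' message until someone translates them.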

* Getting language names (e.g. de -> 'Deutsch', en -> 'English') is
built in. It currently uses a copy of MediaWiki's Names.php. It could
be made to use sql.toolserver.language if that is preferable, but I
think it's good this way.

* Escaping (i.e. $options = array( 'escape' => 'html' ))

* Automated updates: Since the messages are stored as files in the
messages directory of the tool, there's no need to keep track of or
update anything.
ToolStart.php will load the appropriate class from the correct file
and, when the class is initialized and msg() is used, will load the
needed message files on demand.

* Direction (Get which direction a language is, LTR or RTL). Handy  
when constructing your <html> element:
<html dir="{$I18N->getDir()}" lang="{$I18N->getLang()}">
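As a rough idea of what sits behind a direction lookup, here is a small sketch. The function names are made up, and the list of right-to-left codes is deliberately short and illustrative; a real implementation needs the complete set.

```php
<?php
// Hypothetical helper: returns 'rtl' or 'ltr' for a language code.
function getDirection(string $lang): string
{
    // Illustrative subset only: Arabic, Persian, Hebrew, Urdu.
    $rtlLanguages = ['ar', 'fa', 'he', 'ur'];
    return in_array($lang, $rtlLanguages, true) ? 'rtl' : 'ltr';
}

// Builds the opening <html> tag for a given language code.
function htmlOpenTag(string $lang): string
{
    return sprintf(
        '<html dir="%s" lang="%s">',
        getDirection($lang),
        htmlspecialchars($lang, ENT_QUOTES, 'UTF-8')
    );
}
```

For example, htmlOpenTag( 'he' ) gives <html dir="rtl" lang="he">.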

* Automatic detection and remembering of the right user language:
users can choose a language from a central i18n preferences page.
This is stored in a cookie (or, if no cookies are available, in the
session). It can still be overridden by using the userlang GET
parameter [2]. One can also pass the desired message language to the
msg() function to force a certain language for one message.
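The order of precedence could look roughly like the sketch below. Only the userlang GET parameter is named in this mail; using the same key for the cookie and the session, the function name, and English as the final default are assumptions for the example.

```php
<?php
// Hypothetical detection: GET parameter wins, then cookie, then session.
// Pass $_GET, $_COOKIE and $_SESSION (or test arrays) as arguments.
function detectUserLanguage(
    array $get,
    array $cookie,
    array $session,
    array $available,
    string $default = 'en'
): string {
    $candidates = [
        $get['userlang'] ?? null,     // explicit override in the URL
        $cookie['userlang'] ?? null,  // assumed cookie name
        $session['userlang'] ?? null, // assumed session key
    ];

    foreach ($candidates as $code) {
        // Only accept codes we actually have messages for.
        if (is_string($code) && in_array($code, $available, true)) {
            return $code;
        }
    }

    return $default;
}
```

Passing the arrays in, rather than reading the superglobals inside the function, keeps the logic easy to test.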

* No prefixes or collisions for MediaWiki messages:
To avoid conflicts with other tools, message-keys are automatically  
prefixed with the name of the group. So you won't have to prefix every  
key internally to avoid conflicts with messages of another Toolserver  
tool. Also (still in talks with TranslateWiki) we're planning to put  
them in a dedicated namespace and not in the MediaWiki:-namespace on  
TranslateWiki.
Example:
* A message at [[translatewiki:Toolserver:Luxocontris-usernotfound]]
* will be available through $I18N->msg( 'usernotfound' ), assuming
$I18N = new TsIntuition( 'luxocontris' )
* otherwise $I18N->msg( 'usernotfound', 'luxocontris' );
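The mapping from group and key to the page title in the example can be sketched as below. The function name is made up, and since the namespace is still being discussed with TranslateWiki, 'Toolserver' is only a default here.

```php
<?php
// Hypothetical helper: builds the translation page title for a
// message key within a group, e.g. Toolserver:Luxocontris-usernotfound.
function messagePageTitle(string $group, string $key, string $namespace = 'Toolserver'): string
{
    // The group becomes the prefix, with its first letter capitalized.
    return $namespace . ':' . ucfirst(strtolower($group)) . '-' . $key;
}
```

So messagePageTitle( 'luxocontris', 'usernotfound' ) yields "Toolserver:Luxocontris-usernotfound", while inside the tool the developer only ever writes 'usernotfound'.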

* Localize other front-ends as well: There are several popular tools
out there that have an additional (or only) front-end via a JavaScript
implementation on a wiki. Since the i18n system will have an API that
has a JSON output format with callback (JSONP), you can get the
messages in there as well.

* API: When not in PHP (e.g. JavaScript or Python) you can do queries
(GET or POST) like
api.php?action=getmessages&group=luxo&message=foobar|lorem|ipsum|logout|login&format=json&callback=myTool.initLang
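A query string like the one above can be assembled as sketched below. The parameter names come from the example; the function name is made up, and the full location of api.php is not given here, so it is passed in by the caller.

```php
<?php
// Hypothetical helper: builds a getmessages request URL.
// $apiBase is wherever api.php ends up living (not specified here).
function buildGetMessagesUrl(string $apiBase, string $group, array $keys, string $callback): string
{
    $query = http_build_query(
        [
            'action'   => 'getmessages',
            'group'    => $group,
            'message'  => implode('|', $keys), // pipe-separated message keys
            'format'   => 'json',
            'callback' => $callback,           // JSONP callback name
        ],
        '',
        '&'
    );

    return $apiBase . '?' . $query;
}
```

Note that http_build_query() percent-encodes the pipe, so the keys arrive as foobar%7Clorem%7C... in the URL, which the server decodes back to pipes.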

* More.. (see design specification on Toolserver wiki) [5]

-- TranslateWiki

I'm currently in talks with TranslateWiki about how best to set up the
syncing system. Although an initial chat with Nikerabbit didn't bring
up any foreseeable problems (as it's fairly similar to other projects
they translate), it still needs to be set up. I expect to have
something going within one or two weeks.

The source files have been added to Wikimedia SVN [6] and checked out  
in the TsIntuition directory [7] at the Toolserver.

-- Documentation / design specification

The initial concept for the class has been documented at Toolserver  
Wiki [5]. Most of it has already been implemented in SVN [6] and can  
be tested. The implementation is subject to change based on feedback
from you.

-- Already translated
The following tools have been translated already. Log in at the  
toolserver and look at their source to learn how they work:

* http://toolserver.org/~krinkle/TsIntuition/
* http://toolserver.org/~jarry/svgtranslate/
* http://toolserver.org/~krinkle/getWikiAPI.php


--
Krinkle

[1] http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/ToolserverTools/

[2] Yes, there's a way to disable the global function _() if you don't  
want it or have a function named like that already.

[3] Right now this is only for PHP tools (which are the most common),
but I'm currently working on providing an API to make this available
in other formats as well: a JSONP callback for use in JavaScript
gadgets (some Toolserver tools interact with a wiki-side JavaScript
companion), and a format that can be easily loaded into languages like
Python (XML/JSON). I will focus on that as soon as the initial system
is up and running.

[4] http://www.mediawiki.org/wiki/Extension:Translate

[5] https://wiki.toolserver.org/view/Toolserver_Intuition

[6] http://svn.wikimedia.org/viewvc/mediawiki/trunk/tools/ToolserverI18N/

[7] http://toolserver.org/~krinkle/TsIntuition/

