Merlijn van Deen wrote:
> Would it be possible just to use translatewiki to export python-based
> translations? Pywikipediabot currently uses this for some of its
> translations. It does not support plurals at the moment, though, as it's a
> simple dictionary-based system ({'translatewiki-key': {'lang': 'value'}})
Generally, translatewiki.net can do that, and is already doing it for some
pywikipediabot pieces, with likely more to come.
I expect that we shall have a PLURAL implementation in Python as well
some time in the future.
I have not investigated that, but I assume we already have some similar
functionality in JavaScript in the new resource loader context.
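The dictionary layout mentioned in the quoted text could be read roughly as
follows. This is only a minimal sketch: the message key, the sample strings
and the translate() helper are made up for illustration and are not
pywikipediabot's actual API.

```python
# Hypothetical sample data in the {'translatewiki-key': {'lang': 'value'}} shape.
MESSAGES = {
    "example-welcome": {
        "en": "Welcome",
        "de": "Willkommen",
    },
}


def translate(key, lang, default_lang="en"):
    """Return the text for `key` in `lang`, falling back to `default_lang`.

    Raises KeyError if the key is unknown or has no text in either language.
    """
    by_lang = MESSAGES[key]
    if lang in by_lang:
        return by_lang[lang]
    return by_lang[default_lang]
```

A lookup such as translate("example-welcome", "de") returns the German text,
while a language without an entry falls back to English. Plural handling
would need more than a plain string per language, which is the limitation
the quoted text points out.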
--
Greetings - Purodha
Within the last 7 days or so I have noticed several issues that have cropped
up without any apparent cause. I am wondering if this is isolated or if others
are having unusual issues. The first of the two main issues is that
sendmail functionality either recently changed or broke. I've had a script
that emails me logs every day at 0100 UTC via a Python wrapper to sendmail.
However, the last emails that I received were for the 6th; after that all
I received was error messages (one about no recipient, and the second about
encoding), even though that file was last changed on March 3rd. I've gone
ahead and fixed those issues. The second is a db query that I've
been running since July 2009 with just a few minor tweaks. It used to
execute in 5-10 minutes, but it's now taking over an hour for the same query
to run (this started happening within the last 24 hours). Has anyone else
experienced similar issues or things breaking, or is it just me?
Betacommand
Krinkle wrote:
> * Python
> - Not sure how to pull this one off yet, as a work around it could
> make http request to the api in json (since Python supports that) but
> a way without making http calls would be preferred. I'm not a heavy
> Python developer (I've used it a couple of times to write simple
> ircbots (irclib) and wikibots (pywikibot), but that's about it). Any
> suggestions from python developers how they would like to have it
> delivered, let me know how you want it served and I'll see what I can
> do.
Count me in for supporting Python, too. My experience is similar to yours,
and I cannot start right away, but I think getting the PHP implementation
done as a reference should have precedence anyway.
Btw., I have been thinking about PLURAL implementations. It might
be possible to automatically extract plural routines from MediaWiki's
LanguageZxx.php files and wrap them into a piece of code such
as a big switch on language codes. Since the functions are pretty small,
and not many languages have their own PLURAL
implementations, this might at least be a simple and effective start.
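To illustrate the "big switch on language codes" idea, here is a minimal
Python sketch. The rules below are written by hand from the commonly known
plural rules for French and Russian, not extracted from the MediaWiki
files, and all function and table names are hypothetical.

```python
def plural_default(n, forms):
    """Two forms: singular for exactly 1, plural otherwise."""
    return forms[0] if n == 1 else forms[-1]


def plural_fr(n, forms):
    """French treats 0 and 1 as singular."""
    return forms[0] if n <= 1 else forms[-1]


def plural_ru(n, forms):
    """Russian; expects three forms in the order (one, few, many)."""
    if n % 10 == 1 and n % 100 != 11:
        return forms[0]
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return forms[1]
    return forms[2]


# The "switch": only languages with their own rule need an entry.
PLURAL_RULES = {
    "fr": plural_fr,
    "ru": plural_ru,
}


def plural(lang, n, forms):
    """Pick the plural form for `n` in `lang`, using the default rule if none is listed."""
    rule = PLURAL_RULES.get(lang, plural_default)
    return rule(n, forms)
```

Because most languages would use the default rule, the table stays small,
which matches the observation that only a few languages have their own
PLURAL implementations.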
--
Greetings - Purodha
I've got a little program to index dump files that supports Windows
and Linux but it doesn't compile on the Toolserver with either cc or
gcc due to the lack of the function vasprintf(). It's a GNU extension
so I'm surprised it didn't work even with gcc.
Why doesn't the Toolserver gcc have it, and does anybody know of a workaround?
vasprintf() is a version of vsprintf() which writes to a memory buffer
of just the right size that it allocates and which the caller must
call free() on when done.
Andrew Dunbar (hippietrail)
Are there any suggestions for using Toolserver Intuition in
non-PHP projects, for example, in bash-based ones? For now it seems to
be a bit complicated.
--
Regards,
Павел Селіцкас/Paul Selitskas
Wizardist @ Wikimedia projects
p.selitskas(a)gmail.com, +375257408304
Skype: p.selitskas
On Friday, 8. April 2011 03:35:42 Purodha Blissenbach wrote:
Krinkle wrote:
> * Direction (Get which direction a language is, LTR or RTL). Handy
> when constructing your <html> element:
> <html dir="{$I18N->getDir()}" lang="{$I18N->getLang()}">
The function getDir() does not exist currently.
Since directionality is a function of script, it might be wise, and come in
handy, to associate a script code with languages, too. I know some (very few)
languages have several scripts, even ones with differing directionality.
In these cases, a script subtag or country subtag is needed anyway for the
language, from which one can determine the directionality.
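A minimal Python sketch of deriving directionality from a script code could
look like this. The two tables are tiny hand-picked samples using ISO 15924
script codes, the direction() helper is hypothetical, and defaulting unknown
languages to LTR is an assumption of the sketch. Country subtags are not
handled here.

```python
# Direction is a property of the script, not of the language.
SCRIPT_DIRECTION = {
    "Latn": "ltr",
    "Cyrl": "ltr",
    "Arab": "rtl",
    "Hebr": "rtl",
}

# Default script per language code (small illustrative sample).
DEFAULT_SCRIPT = {
    "en": "Latn",
    "de": "Latn",
    "ru": "Cyrl",
    "kk": "Cyrl",
    "ar": "Arab",
    "he": "Hebr",
}


def direction(tag):
    """Return 'ltr' or 'rtl' for a language tag such as 'he' or 'kk-arab'."""
    subtags = tag.replace("_", "-").split("-")
    lang = subtags[0].lower()
    script = None
    for sub in subtags[1:]:
        # A four-letter alphabetic subtag is taken to be a script code.
        if len(sub) == 4 and sub.isalpha():
            script = sub.title()
            break
    if script is None:
        script = DEFAULT_SCRIPT.get(lang, "Latn")
    # Assumption: unknown scripts are treated as left-to-right.
    return SCRIPT_DIRECTION.get(script, "ltr")
```

With these tables, "kk" resolves to LTR through its default Cyrillic script,
while "kk-arab" resolves to RTL through the explicit script subtag.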
Greetings - Purodha
Hi all!
Wikimedia Deutschland is offering a contract for implementing the GraphServ
component for our Graph Processor project. Anyone interested is invited to
apply, the official call for bids is at
<http://wikimedia.de/wiki/Ausschreibung/GraphServ>.
The Graph Processor project aims to develop an infrastructure for rapidly
analyzing and evaluating Wikipedia's category structure. It's supposed to become
part of the Toolserver infrastructure (and eventually, the WMF search cluster)
that allows CatScan-like queries to run in under a second instead of minutes.
The contract offered here covers the implementation of the GraphServ component,
which is to function as a service by which applications can access the category
structures of different wikis, similar to the way a database server would
provide access to information stored in databases. Technically, GraphServ is a
server that manages TCP connections and attaches them to instances of GraphCore,
which do the actual processing of the category structures
<https://github.com/jkroll20/graphcore/blob/master/spec.rst>. The server will be
accessed by applications via client libraries written in PHP, Python, etc, which
are not in scope of the contract but will be developed in parallel by Wikimedia
Deutschland.
A rough specification of the GraphServ component along with requirements for the
implementations can be found at
<http://wikimedia.de/wiki/Ausschreibung/GraphServ/Spec>.
Note that GraphServ will be released as Open Source Software. While Wikimedia
Deutschland will be the copyright holder for the software developed under
contract, we will include the names of the actual authors in the copyright notice.
Applications should include the following:
* The applicant's prior experience with designing and implementing client/server
software, as well as any other relevant qualifications
* An overview of the intended architecture of the implementation and the
technologies used, along with a rationale for choosing this architecture and
technologies over others.
* A rough road map of the implementation, documentation and testing phases, with
the appropriate milestones.
* Estimate of working hours needed
* Time frame for the implementation (calendar weeks)
* Total expected cost, including taxes
Please send your application to <technik(a)wikimedia.de> by April 29.
Cheers,
Daniel
PS: please forward this to anyone you think could be interested. thanks!
On Friday, 8. April 2011 03:35:42 Purodha Blissenbach wrote:
Krinkle wrote:
> * Direction (Get which direction a language is, LTR or RTL). Handy
> when constructing your <html> element:
> <html dir="{$I18N->getDir()}" lang="{$I18N->getLang()}">
This is not enough. The use of fallback languages requires the same
functions to exist for each message, because its message text may come
from the requested language or any of its fallback languages, or from
English, or from the zxx pseudo-language (no linguistic content) if no
message text exists, not even in English, and a message key is returned.
That means it must be possible to ask for the actual language/locale code
and directionality of each message text returned.
One possible way to do that in one go would be to return an array
and unpack it with list() in PHP, such as:
list( $htmltext, $lang, $dir ) = full_msg( ...);
echo('<div lang="'.$lang.'" dir="'.$dir.'">'.$htmltext.'</div>');
If you're not interested in $dir, e.g. because you know that your language,
its fallbacks, English, and zxx are all LTR, you can use
list( $htmltext, $lang ) = full_msg( ...);
etc.
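The same idea can be sketched in Python, where tuple unpacking plays the
role of PHP's list(). This is an illustration under assumptions only: the
message table, the fallback table and the set of RTL codes are made-up
samples, and this full_msg() is not an existing function of the i18n class.

```python
# Sample data for illustration only.
MESSAGES = {
    "en": {"login": "Log in", "logout": "Log out"},
    "de": {"login": "Anmelden"},
    "nds": {},
}
FALLBACKS = {"nds": ["de"]}      # per-language fallback chain
RTL_LANGUAGES = {"ar", "he"}     # everything else is treated as LTR


def full_msg(key, lang):
    """Return (text, actual_lang, direction) for a message key.

    The text comes from the requested language, then its fallbacks, then
    English. If nothing is found, the key itself is returned with the
    pseudo language 'zxx'.
    """
    chain = [lang] + FALLBACKS.get(lang, []) + ["en"]
    for code in chain:
        text = MESSAGES.get(code, {}).get(key)
        if text is not None:
            direction = "rtl" if code in RTL_LANGUAGES else "ltr"
            return text, code, direction
    return key, "zxx", "ltr"


text, actual_lang, direction = full_msg("login", "nds")
html_div = '<div lang="%s" dir="%s">%s</div>' % (actual_lang, direction, text)
```

Here the request for Low German yields the German text, and the returned
language code and direction describe the text that was actually found
rather than the language that was asked for.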
> So $I18N->msg('blabla') (or _('blabla') ) will return the "[blabla]"
> string.
Just to mention: MediaWiki uses "<blabla>" instead of "[blabla]" in these
cases. I would not mind doing the same, but I also do not really care about
full compatibility on this point.
Greetings - Purodha
Hello all,
because of some corruption (mostly duplicate entries) in the table
toolserver.namespacename I will do maintenance on Wednesday beginning at
21:30 UTC*.
I will truncate the table and regenerate it, which should take between 1 and 2
minutes. During this time the table will be incomplete, so tools which rely
on it may fail. To minimize the effect I will handle each server
independently (truncate table and regenerate it on server a, then truncate
table and regenerate it on server b and so on).
sql-toolserver (the master of the toolserver-db) will be the first one to
be updated, then the others.
The old table toolserver.namespace will not be affected; it is deprecated and
you should NOT use it anymore.
Sincerely,
DaB.
* http://time.tcx.org.uk/utc/2011-04-06/21:30
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
So, as announced in the previous mailing (agreed, I should've sent
this one first...), here are the details
regarding the technical implementation.
First, a few points that you are likely aware of and that have stopped,
hindered or scared people off doing this until now, i.e. creating a dedicated
solution that can scale to more than just one tool or group of tools
and is easy to use for translators as well as for developers.
-- Past issues / obstacles
* Separated Toolserver SVN repositories
** For MediaWiki extensions, TranslateWiki can easily translate things
and can sync in a single commit because it has a partial check out of
a single SVN repository. Having to add all ts-account repos to TW's
configuration is way too much work and pretty much not-done. Not to
mention the fact that most tools are either not in SVN at all, or are
maintained outside SVN and pushed towards SVN for public source
viewing every once in a while.
* Language of choice
** Users want to set their choice once and not have to redo it
for all tools independently, finding the right place to 'do
it' in everybody's tool. Nor do we want to click the same link again
on every visit. And developers prefer not to have userlang parameters
dangling in the URL and to have to make sure they are preserved throughout
the app with every link.
This can be (and sometimes is) solved by using cookies (one example is
Luxo's contributions tool, which sets a cookie).
* Prevent vandalism, but also the slowdown and other downsides of a
regular MediaWiki page.
** When translated on one wiki page (i.e. at Meta-Wiki, such as Magnus'
implementation, which I think is the best implementation so far) there
isn't a good translation-oriented workflow for translators or
developers. Of course pages could be protected by sysops, and then
have to be updated from another page on request. And then there is
FlaggedRevs. But neither is optimized for translators (i.e.
there's no way to FUZZY a message, or to see translation suggestions from
services like Google Translate, or a description of the message while
modifying it (aka "/qqq")). It is also far from ideal (if not impossible) to
keep track of changes to the original message without having to
manually check it (e.g. perform "FUZZY"). And there is no easy workflow to
translate many messages at once.
This is all taken care of by Extension:Translate [4] on TranslateWiki.
* Fallback languages
** Aside from the management involved, fallback is also an important
point. It shouldn't be required that a translator translate
everything at once or nothing at all. Some implementations around
would fail if a new message wasn't available in a translated language
yet. Other implementations created a way to fall back to English if
there was no message key in the selected language. But I haven't seen
any really fine-grained fallback (such as falling back from NDS (Low
German) to DE (German), or from ACE (Aceh) to ID (Indonesian) for
untranslated messages, like MediaWiki does).
* Universal messages
** Some messages like 'welcome', 'login', 'submit' etc. are generic
and should not have to be duplicated everywhere for each
tool. Even though TranslateWiki has {{Identical}}, it saves work to just
have a group of generic messages. Tools that are mostly data or
visually oriented may not even have to request to be added to the
project if they only need a few of these generic messages to control
the input form.
* Keeping up with latest versions
** Another implementation currently around (the only one that actually
uses TranslateWiki afaik) is done through a fake extension named
ToolserverTools [1] in Wikimedia SVN. For the translators side this
was perfect (since they could use the TranslateWiki workflow they know
and love). But not so for the developers. Messages all had a prefix,
and in order to actually use them in the tools some wheel-reinvention
took place (like getting the message from the array in the correct
language, providing fallback, replacing variables like $1/$2, etc.).
Also they still had to re-create a way for users to choose a language,
store it, remember it and apply it the next visit. Lots of wasted
time. And of course staying up to date with the latest version in
Wikimedia SVN was sometimes forgotten, and translators are known to get
especially motivated if there is no work required from the developer
to put the new translations into use (i.e. TranslateWiki having the
ability to push the updates and there being no extra action required).
We could have everybody create a cronjob to update their svn-checkout
of the messages file from the "/extensions/ToolserverTools/" directory
in SVN, but that's not ideal either.
--
All of the above have been solved with my proposal. Either because of
the fact that it's powered via TranslateWiki, or because it's taken
care of by the central i18n system.
-- Tool developer workflow:
I'll describe how the system would work from a tool developer's point
of view. [3]
So here's what you'd do to make it work, three easy steps:
1) The toolserver tool developer includes a single PHP file (e.g.
/p_i18n/ToolStart.php). This makes the i18n class available.
2) A new instance of the class is created like
$I18N = new TsIntuition( 'mygroup' );
3) Messages can now be returned with either _('message-key') [2] or
$I18N->msg( 'message-key', $options ).
The msg() function can optionally be passed a text domain name (or
'group name' if you will) as second argument to get a message from a
different group eg. the group 'general' for messages like 'welcome',
'login', 'submit' etc. Or an array if you need multiple options like
escape, variable replacement etc. (more on that in a minute).
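To make the options idea concrete, here is a minimal Python sketch of what
a msg() call with an options array might do internally. It is not the
actual TsIntuition code: the message table is a sample, the option names
'variables' and 'escape' are taken from the examples in this mail, and
only the options form of the second argument is covered (not the group
name form).

```python
import html
import re

# Sample message store for illustration only.
MESSAGES = {"welcomeback": "Welcome back $1 (last visit: $2)"}


def msg(key, options=None):
    """Return the message for `key`, applying the given options.

    options["variables"]: values substituted for $1, $2, ...
    options["escape"]: if "html", the final text is HTML-escaped.
    Unknown keys are returned as "[key]".
    """
    options = options or {}
    text = MESSAGES.get(key, "[" + key + "]")
    variables = options.get("variables", [])

    def substitute(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(variables):
            return str(variables[index])
        # Leave placeholders without a matching variable untouched.
        return match.group(0)

    text = re.sub(r"\$(\d+)", substitute, text)
    if options.get("escape") == "html":
        # Escaping after substitution also covers the inserted values.
        text = html.escape(text)
    return text
```

Matching the whole number after the dollar sign avoids "$10" being treated
as "$1" followed by a zero, and escaping last means user-supplied variables
are escaped together with the message text.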
-- Other features
Although the I18N class will be able to do a lot more, the above is
the core principle. Here's a list, in no particular order, of
other things that it will have:
* Variable replacement ($1, $2, etc.)
$welcome = $I18N->msg( 'welcomeback', array( 'variables' =>
array( $username, $lastvisit ) ) );
from [[Toolserver:Mytool-welcomeback]] which contains "Welcome back
$1 (last visit: $2)".
* Fallback languages:
If a message is not found in the current user language, a fallback
will be used. And if that one isn't found English is used.
* Getting language names (e.g. de -> 'Deutsch', en -> 'English') is
built in. (Currently it uses a copy of MediaWiki's Names.php; it could be
made to use sql.toolserver.language if that is preferable, but I think it's
good this way.)
* Escaping (i.e. $options = array( 'escape' => 'html' ))
* Automated updates: Since the messages are stored as files in the
messages directory of the tool, there's no need to keep track of or
update anything.
ToolStart.php will load the appropriate class from the correct file
and, when the class is initialized and msg() is used, will load the needed
message files on demand.
* Direction (Get which direction a language is, LTR or RTL). Handy
when constructing your <html> element:
<html dir="{$I18N->getDir()}" lang="{$I18N->getLang()}">
* Automatic detection and remembering of the right user language:
users can choose a language from a central i18n preferences page.
This is stored as a cookie (or, if no cookies are available, in the session).
It can still be overridden by using the userlang GET parameter [2].
One can also pass the desired message language to the getMsg()
function to force a certain language for one message.
* No prefixes or collisions for MediaWiki messages:
To avoid conflicts with other tools, message-keys are automatically
prefixed with the name of the group. So you won't have to prefix every
key internally to avoid conflicts with messages of another Toolserver
tool. Also (still in talks with TranslateWiki) we're planning to put
them in a dedicated namespace and not in the MediaWiki:-namespace on
TranslateWiki.
Example:
* A message at [[translatewiki:Toolserver:Luxocontris-usernotfound]]
* will be available through $I18N->msg( 'usernotfound' ), assuming
$I18N = new TsIntuition( 'luxocontris' )
* otherwise $I18N->msg( 'usernotfound', 'luxocontris' );
* Localize other fronts as well: There are several popular tools out
there that have an additional (or only) front-end implemented in
JavaScript on a wiki. Since the i18n system will have an API with
a JSON output format with callback (JSONP), you can get the
messages in there as well.
* API: When not in PHP (e.g. JavaScript or Python) you can do queries
(GET or POST) like
api.php?action=getmessages&group=luxo&message=foobar|lorem|ipsum|logout|login&format=json&callback=myTool.initLang
* More.. (see design specification on Toolserver wiki) [5]
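For the API item in the list above, a Python client might build the query
as sketched below. The parameter names are the ones from the example
query. The endpoint URL is a placeholder, the helper names are
hypothetical, and the response is assumed to be plain JSON when no callback
is given, which is not confirmed by the specification.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; the real location of api.php is not given above.
API_URL = "https://example.org/api.php"


def build_query_url(group, keys, callback=None, base_url=API_URL):
    """Build a getmessages URL using the parameter names from the example."""
    params = {
        "action": "getmessages",
        "group": group,
        "message": "|".join(keys),  # several keys are separated by '|'
        "format": "json",
    }
    if callback is not None:
        params["callback"] = callback
    return base_url + "?" + urlencode(params)


def fetch_messages(group, keys, base_url=API_URL):
    """Fetch and decode the response, assuming plain JSON without a callback."""
    with urlopen(build_query_url(group, keys, base_url=base_url)) as response:
        return json.load(response)
```

The callback parameter is only useful for JavaScript gadgets (JSONP); a
Python tool would leave it out and decode the JSON directly. Note that
urlencode() percent-encodes the '|' separator as %7C.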
-- TranslateWiki
I'm currently in talks with TranslateWiki about how best to set up the
syncing system. Although an initial chat with Nikerabbit didn't bring up
any expected problems (as it's fairly similar to other projects they
translate), it still needs to be set up. I expect to have something
going within one or two weeks.
The source files have been added to Wikimedia SVN [6] and checked out
in the TsIntuition directory [7] at the Toolserver.
-- Documentation / design specification
The initial concept for the class has been documented at the Toolserver
wiki [5]. Most of it has already been implemented in SVN [6] and can
be tested. The implementation is subject to change based on feedback
from you.
-- Already translated
The following tools have been translated already. Log in at the
toolserver and look at their source to learn how they work:
* http://toolserver.org/~krinkle/TsIntuition/
* http://toolserver.org/~jarry/svgtranslate/
* http://toolserver.org/~krinkle/getWikiAPI.php
--
Krinkle
[1] http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/ToolserverTools/
[2] Yes, there's a way to disable the global function _() if you don't
want it or have a function named like that already.
[3] Right now this is only for PHP tools (which are the most common),
but I'm currently working on providing an API to make this available
in other formats as well, i.e. a JSONP callback for usage in JavaScript
gadgets (some Toolserver tools interact with a wiki-side JavaScript
companion) and a format that can be easily loaded into languages like
Python (XML/JSON). I will focus on that as soon as the initial system
is up and running.
[4] http://www.mediawiki.org/wiki/Extension:Translate
[5] https://wiki.toolserver.org/view/Toolserver_Intuition
[6] http://svn.wikimedia.org/viewvc/mediawiki/trunk/tools/ToolserverI18N/
[7] http://toolserver.org/~krinkle/TsIntuition/