Well, we are building the NAP Wikipedia, and of course there are parts where data can easily be transferred just by translating it from one Wikipedia to the other. In this case we have already uploaded the Calendar - and now it would make sense to transfer the contents of the Italian Wikipedia there by translating them - people and events stay the same. So what I would now like to achieve is:
1) get a dump of the Italian Wikipedia
2) extract all pages of the calendar from it
3) translate them with the help of OmegaT
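For step 2, maybe something like the following could be a starting point. This is only a rough Python sketch, not a finished tool: the dump file name, the export schema version in the namespace, and the pattern for the calendar page titles ("1º gennaio", "2 gennaio", ...) are assumptions that would have to be checked against the real dump.

import re
import xml.etree.ElementTree as ET

# The namespace version depends on the dump - check the first lines of the XML file.
NS = '{http://www.mediawiki.org/xml/export-0.3/}'
# Assumed title pattern for the Italian calendar pages.
DAY_TITLE = re.compile(
    r'^\d{1,2}º? (gennaio|febbraio|marzo|aprile|maggio|giugno|luglio|'
    r'agosto|settembre|ottobre|novembre|dicembre)$')

def extract_calendar_pages(dump_path):
    """Yield (title, wikitext) for every page whose title looks like a day of the year."""
    for _, elem in ET.iterparse(dump_path):
        if elem.tag == NS + 'page':
            title = elem.findtext(NS + 'title')
            text = elem.findtext(NS + 'revision/' + NS + 'text') or ''
            if title and DAY_TITLE.match(title):
                yield title, text
            elem.clear()  # free memory, the dump is big

if __name__ == '__main__':
    for title, text in extract_calendar_pages('itwiki-pages-articles.xml'):
        with open(title.replace(' ', '_') + '.txt', 'w', encoding='utf-8') as out:
            out.write(text)

Each calendar page would then end up in its own text file, which is what the line-break script further below would work on.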
Why OmegaT? Well, in the sections "Born" and "Died", after the name of the person you very often find just "actor, actress, writer, politician" or whatever - this means that there would be quite a lot of 100% matches and the translation would be much faster with the tool than without.
These lines are all built like this: *[[name of the person]] description
Now, to be able to get 100% matches, I need at least a line break after *[[name of the person]], and that break needs to be taken out again once the file has been translated.
Then in a second step, with the help of a bot, the translated parts can be transferred into the articles.
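For that bot step, something along these lines might do - just a sketch that uses the Pywikibot library as an illustration; the idea that the translated wikitext is saved as "<title>.txt" and simply becomes the text of the nap article is an assumption, a real run would have to merge more carefully.

import pywikibot

def upload_day(title, path):
    # Rough sketch: put the translated wikitext onto the corresponding nap page.
    site = pywikibot.Site('nap', 'wikipedia')
    page = pywikibot.Page(site, title)
    with open(path, encoding='utf-8') as f:
        page.text = f.read()
    page.save(summary='import translated calendar content from it.wikipedia')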
Well: what I need now is some advice on how to get this done - and then what can be done for Neapolitan can easily be repeated for other languages.
This means I need some help with this regular expression ... I mean some code that runs through the data, inserts the line break, and takes it out again after translation.
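Maybe something like this could be a starting point - a rough sketch of the two passes in Python; it assumes the lines really all have the form *[[name]] description and that the *[[...]] part itself stays untouched during translation:

import re

# One entry of the "Born"/"Died" sections: "*[[Name]] description".
LINE = re.compile(r'^(\* ?\[\[[^\]]+\]\])[ \t]+(.+)$', re.MULTILINE)

def insert_breaks(wikitext):
    """Before translation: put the description on its own line so OmegaT sees it as a separate segment."""
    return LINE.sub(r'\1\n\2', wikitext)

def remove_breaks(wikitext):
    """After translation: join the description back onto the *[[Name]] line."""
    # Only join when the next line is not empty and not another list entry.
    return re.sub(r'^(\* ?\[\[[^\]]+\]\])[ \t]*\n(?![*\s])',
                  r'\1 ', wikitext, flags=re.MULTILINE)

if __name__ == '__main__':
    import sys
    mode, path = sys.argv[1], sys.argv[2]   # e.g. "split 15_marzo.txt" or "join 15_marzo.txt"
    with open(path, encoding='utf-8') as f:
        text = f.read()
    text = insert_breaks(text) if mode == 'split' else remove_breaks(text)
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)

The idea would be to run it with "split" on each extracted file before loading it into OmegaT, and with "join" on the translated file before handing it to the bot.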
Who can help me with this? By the way, the TMX (translation memory) is going to be available under the GFDL for anyone - well, that should be obvious.
This text was originally posted here (so that how-tos can be collected in one place): http://www.wesolveitnet.com/modules/newbb/viewtopic.php?topic_id=83&post...
Thank you!!!
Ciao, Sabine
*****
Sabine Cretella
http://www.wordsandmore.it
s.cretella@wordsandmore.it
skype: sabinecretella