Kaixo!
On Mon, Apr 11, 2005 at 07:06:19PM -0400, zhengzhu wrote:
- interwiki links should be handled as exceptions too; as the list of valid interwiki domains is known (e.g. the possible xx in [[xx:foo]]), it should be easy to implement
- it would also be nice to detect URLs and not convert them
The conversion happens at a rather late stage of the wiki parser, at which point the input should be largely (X)HTML. I have attempted to avoid converting them but may have missed something. Can you provide an example on the test site where this is not done correctly?
For interwiki links, the test site doesn't recognize them as interwiki links, so I don't know if the code handles it correctly or not...
For URLs, I didn't test enclosed URLs (e.g. [http://wa.wikipedia.org/ foo]) but plain-text ones (e.g. http://wa.wikipedia.org/ or pablo@walon.org in the article, without square brackets around them).
I did some more tests: in order to have a URL that is not transliterated, I have to write:
[http://wa.wikipedia.org/ -{http://wa.wikipedia.org/}-]
and for an email address (which has no special meaning in wiki syntax, unlike the http URL): -{pablo@walon.org}-
well, we can live with it, indeed.
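To illustrate what automatic detection could look like, here is a minimal sketch that leaves bare URLs, email addresses and explicit -{ }- spans untouched and transliterates everything else. The regular expressions, the toy Latin-to-Cyrillic table and the function names are assumptions made up for this example; they are not taken from the actual parser code.

```python
import re

# Hypothetical patterns: rough approximations only, not the parser's real rules.
# A URL followed directly by punctuation would swallow that punctuation.
PROTECTED = re.compile(
    r'-\{.*?\}-'                       # explicit -{ ... }- escape
    r'|https?://[^\s\[\]<>"]+'         # bare URL
    r'|[\w.+-]+@[\w-]+(?:\.[\w-]+)+'   # email address
)

# Toy Latin -> Cyrillic table, for illustration only.
TOY_TABLE = str.maketrans({"a": "а", "b": "б", "l": "л", "o": "о"})


def toy_transliterate(text):
    """Stand-in for the real transliteration step."""
    return text.translate(TOY_TABLE)


def convert(text, transliterate=toy_transliterate):
    """Transliterate text, copying protected spans through unchanged."""
    out = []
    pos = 0
    for m in PROTECTED.finditer(text):
        # Convert the ordinary text before the protected span.
        out.append(transliterate(text[pos:m.start()]))
        span = m.group(0)
        if span.startswith("-{"):
            span = span[2:-2]          # drop the -{ and }- markers
        out.append(span)
        pos = m.end()
    out.append(transliterate(text[pos:]))
    return "".join(out)
```

With this sketch, convert("bla http://wa.wikipedia.org/ bla") gives "бла http://wa.wikipedia.org/ бла", and convert("pablo@walon.org") comes back unchanged, with no -{ }- needed.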
Maybe something like "blablabla ={Latn:Saratxaga|Cyrl:Сарачага}= blabla", which would be displayed as "blablabla Saratxaga blabla" or "блаблаблабла Сарачага блаблабла", but not as "блаблабла Саратхага блаблабла".
Maybe the Latn:/Cyrl: prefixes could be dropped, as the script can be determined from the strings themselves; the syntax would then be easier for editors: "blablabla ={Saratxaga|Сарачага}= blabla" or, if they write in Cyrillic: "блаблаблабла ={Сарачага|Saratxaga}= блаблабла"
This function is built in and is running at ZH. At the BE test site, you can use the following syntax (note how close it is to your suggestion):
-{be-cyrillics: Foo; be-latin: Bar}- This will show "Foo" in Cyrillic mode, and "Bar" in Latin mode.
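As a rough sketch of how such a block could be resolved, the snippet below splits the body on ";" and ":" and picks the text for the current mode. The function names are hypothetical, and the fallback (a body with no entry for the current mode is shown verbatim) is an assumption for the example, not a description of the code running at ZH or BE.

```python
import re

# Matches one -{ ... }- block; the body is captured in group 1.
VARIANT_BLOCK = re.compile(r"-\{(.*?)\}-", re.DOTALL)


def pick_variant(body, mode):
    """Return the text tagged with `mode` in 'code: text; code: text'."""
    choices = {}
    for part in body.split(";"):
        code, sep, text = part.partition(":")
        if sep:                        # only keep parts that have a "code:" tag
            choices[code.strip()] = text.strip()
    # Assumed fallback: no entry for this mode means "show the body as-is".
    return choices.get(mode, body)


def render(text, mode):
    """Replace every -{ ... }- block in `text` by its variant for `mode`."""
    return VARIANT_BLOCK.sub(lambda m: pick_variant(m.group(1), mode), text)
```

For example, render("x -{be-cyrillics: Foo; be-latin: Bar}- y", "be-latin") returns "x Bar y", while render("-{pablo@walon.org}-", "be-latin") returns the address unchanged.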
It would be better imho to standardize on ISO 15924 script codes (with possible site-local aliases, the same way "Talk:" etc. can be translated, but "Talk:" always works in all wikipedias)
and I also think that "|" would be a better separator (as it is used in a lot of other wiki syntax), and it would allow ";" to be included in the -{ }- block.
Auto-detection of the script, so that there is no need to state it explicitly, would be the icing on the cake (it is not always possible for Hans/Hant, indeed; but for most other cases of multi-script needs there is no ambiguity at all). That is also why "|" would be better: in -{Foo;Bar}- you can't be sure that the ";" is used as a separator, while in -{Foo|Bar}- the likelihood of "|" being part of the text is much, much smaller (and in such odd cases, <nowiki>|</nowiki> could be used).
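A minimal sketch of the auto-detection idea, assuming the script can be guessed from the Unicode character name of the first recognisable letter in each alternative. The prefix table and function names are made up for this example, and as noted above this approach cannot tell Hans from Hant.

```python
import unicodedata

# Unicode character-name prefix -> ISO 15924 script code (illustrative subset).
SCRIPT_PREFIXES = {"LATIN": "Latn", "CYRILLIC": "Cyrl", "ARABIC": "Arab"}


def detect_script(text):
    """Guess the script from the first letter whose name has a known prefix."""
    for ch in text:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        for prefix, code in SCRIPT_PREFIXES.items():
            if name.startswith(prefix):
                return code
    return None                        # unknown or ambiguous (e.g. Han)


def pick_by_script(body, wanted):
    """Choose the alternative of 'Foo|Bar' that is written in script `wanted`."""
    for alt in body.split("|"):
        if detect_script(alt) == wanted:
            return alt.strip()
    return None
```

With this, pick_by_script("Saratxaga|Сарачага", "Cyrl") returns "Сарачага" and pick_by_script("Сарачага|Saratxaga", "Latn") returns "Saratxaga", whatever order the editor wrote them in.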
Is it possible to add a test case for a Latin<->Arabic site (Kurdish or Azeri are two likely candidates)? Due to the right-to-left nature of Arabic script, it could reveal some more problems that aren't seen in a Cyrillic<->Latin case.
Thanks