On 06/02/12 07:58, Pavel Tkachenko wrote:
Making markup language-neutral is easy enough: even a single person can carry out the research to find keyboard symbols that are easily accessible across different language standards. From my experience they are ! % * ( ) - = _ and +. This will eliminate the need for layout switches (for example, currently a Russian Wikipedia editor must switch layout 5 times when typing a simple "<u>underline</u>", since neither < nor > is present in the Russian layout;
No. It's not easy. It's painful.
The goal of wikitext is to make HTML editing easy. HTML only needs a few special characters: <>&;=" but it's bothersome. So instead of
<ul>
<li>Dogs
<li>Cats
<li>Hens
</ul>
We define that * is a bullet and serves to make lists:
* Dogs
* Cats
* Hens
It's easier to type, and looks good.
Then we also want numbered lists. Instead of
<ol>
<li>One
<li>Two
<li>Three
</ol>
We define # as the equivalent for numbered lists. Note that many cultures don't use # for numbers, so it's less 'visual' there.
# One
# Two
# Three
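To make the two rules above concrete (* for bullets, # for numbered items), here is a minimal Python sketch of a line-based converter. It is only an illustration, not how MediaWiki's parser works: the names LIST_TAGS and lists_to_html are made up for this example, nesting is ignored, and it emits closing </li> tags where the hand-written HTML above omits them.

```python
import html

# Hypothetical marker-to-tag table: only the two markers discussed above.
LIST_TAGS = {"*": "ul", "#": "ol"}

def lists_to_html(wikitext):
    """Turn lines starting with * or # into flat HTML lists (no nesting)."""
    out = []
    open_tag = None  # list element currently open, if any
    for line in wikitext.splitlines():
        tag = LIST_TAGS.get(line[:1])
        if tag != open_tag:
            # Entering, leaving, or switching list type.
            if open_tag:
                out.append(f"</{open_tag}>")
            if tag:
                out.append(f"<{tag}>")
            open_tag = tag
        if tag:
            out.append(f"<li>{html.escape(line[1:].strip())}</li>")
        else:
            out.append(html.escape(line))
    if open_tag:
        out.append(f"</{open_tag}>")
    return "\n".join(out)

print(lists_to_html("* Dogs\n* Cats\n* Hens"))
```

Even this toy version shows the cost: any ordinary line that happens to start with * or # now becomes a list item, which is exactly the escaping problem.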
You then continue adding tricks of "this looks like", sometimes needing a very crazy mind. But each feature requires new symbols, and when you look at those available on every layout, you get *very* limited...
For example, I could decide to list imagemaps as `Image1´ `Image2´... (grave and acute), but oh, many keyboards don't have both accents. (I happen to have both, but had to copy the acute because it rejected being displayed alone, converting to an apostrophe...)
And obviously, you can't use something that would easily appear in normal text (or you start defining escape codes, which are uglier, too).
My study indicates that the number of available symbols will make it possible to avoid HTML-style tags completely - this will further simplify the markup. For instance, instead of <u> "__" can be used; <ref> can be replaced by "[[*ref]]" for uniformity with links; and so on. I am ready to give an expanded explanation if anyone is interested.
How do you type the *content* of the references?
the same goes for links: "[[link]]" and "[[link|caption]]" - pipe is also not present [in Russian layout]
[]| are some of the very few characters forbidden in page titles. That's why we can take advantage of them for title splitting. How would you differentiate between link target (page title) and link caption?
It's simple to start defining a sensible syntax. But when you want to "go further", you start being limited. The most sane approach is probably to fall back to <tags> and leave them in the "complex" section. Why <> and not anything else? Just because that's what the underlying HTML uses. Some people are already familiar with that, too.
See, MediaWiki didn't define <u> as wikitext for underline.
It allows some HTML, including <i>, <b> and <u>. <i> and <b> have friendly counterparts*, <u> has not. The reason is that the use of underlining is discouraged.
*And that also turned out to have issues: ever tried to write wikitext in Piedmontese?
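Going back to the link syntax: because [, ] and | cannot appear in a page title, the first | inside [[...]] can only be the separator between target and caption. A rough Python sketch of that splitting follows. LINK_RE and parse_links are names invented for this example, and the real set of characters forbidden in titles is larger than the three used here.

```python
import re

# [[target]] or [[target|caption]].
# Simplifying assumption: the target may not contain [ ] or |,
# while the caption may contain further pipes.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")

def parse_links(text):
    """Return (target, caption) pairs; the caption defaults to the target."""
    links = []
    for m in LINK_RE.finditer(text):
        target = m.group(1).strip()
        caption = m.group(2) if m.group(2) is not None else target
        links.append((target, caption))
    return links

print(parse_links("See [[Dog]] and [[Cat|a cat]]"))
```

A space-separated form would not work the same way, since page titles can contain spaces and the parser could no longer tell where the title ends.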
Special tokens like #REDIRECT, {{-}}, <imagemap>, __TOC__, etc. that all use different syntaxes can be uniformized in a way similar to template insertions: {{redir New page}}, {{clear}}, {{imagemap image.png, title x y, ...}}, {{TOC}} and so on. Templates can be called as {{tpl template arg arg arg}} - even if we keep { and } that require layout switch in some languages we eliminate the pipe which just makes things worse and text - less readable.
{{-}} is not a wikitext token. #REDIRECT and __TOC__ are a sad effect of separate building of contents. They are incoherent with the rest of the syntax. Note you can (and some wikis do) use a {{TOC}} template. You can't wrap #REDIRECT in a template, though, because the redirect applies to the template itself (unless you use some odd escaping?)