On 06/02/12 07:58, Pavel Tkachenko wrote:
> Making markup language-neutral is easy enough: even a single person
> can carry on the research to find keyboard symbols that are easily
> accessible across different language standards. From my experience
> they are ! % * ( ) - = _ and +. This will eliminate the need to layout
> switches (for example, currently a Russian Wikipedia editor must
> switch layout 5 times in typing simple "<u>underline</u>" since
> neither < nor > are present in Russian layout;
No. It's not easy. It's painful.
The goal of wikitext is to make html editing easy.
HTML only needs a few special characters: < > & ; = " but typing it is
bothersome.
So instead of
<ul>
<li>Dogs
<li>Cats
<li>Hens
</ul>
We define that * is a bullet and serves to make lists:
* Dogs
* Cats
* Hens
It's easier to type, and looks good.
Then we also want numbered lists. Instead of
<ol>
<li>One
<li>Two
<li>Three
</ol>
We define # as the equivalent for numbered lists. Note that many
cultures don't use # to mean "number", so it's less 'visual' there.
# One
# Two
# Three
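To make the mapping concrete, here is a rough Python sketch of that
translation. The function name lists_to_html is made up for illustration,
it only handles flat (non-nested) lists, and it is not how MediaWiki's
parser actually works; it just shows that * and # carry the same
information as the <ul>/<ol> markup above.

```python
def lists_to_html(text):
    """Turn runs of '* item' / '# item' lines into <ul> / <ol> blocks.

    Hypothetical helper: flat lists only, no nesting, no escaping.
    """
    tags = {"*": "ul", "#": "ol"}
    out = []
    open_tag = None  # list element currently open, if any
    for line in text.splitlines():
        tag = tags.get(line[:1])  # None for ordinary text lines
        if tag != open_tag:
            # The kind of line changed: close the old list, open the new one.
            if open_tag:
                out.append(f"</{open_tag}>")
            if tag:
                out.append(f"<{tag}>")
            open_tag = tag
        if tag:
            out.append(f"<li>{line[1:].strip()}")
        else:
            out.append(line)
    if open_tag:
        out.append(f"</{open_tag}>")
    return "\n".join(out)


print(lists_to_html("* Dogs\n* Cats\n* Hens"))
print(lists_to_html("# One\n# Two\n# Three"))
```

Note that this sketch treats any line starting with * or # as a list
item, so the chosen symbol must not begin an ordinary line of text.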
You then continue adding tricks of "this looks like", sometimes needing
a very crazy mind. But each feature requires new symbols, and when you
look at those available on every layout, you get *very* limited...
For example, I could decide to list imagemaps as `Image1´ `Image2´...
(grave and acute), but oh, many keyboards don't have both accents.
(I happen to have both, but had to copy the acute because it refused to
be displayed alone, converting to an apostrophe...)
And obviously, you can't use something that would easily appear in
normal text (or you start defining escape codes, which are uglier, too).
> My study indicates that the number of available symbols will allow to
> avoid HTML-style tags completely - this will further simplify the
> markup. For instance, instead of <u> "__" can be used; <ref> can be
> replaced by "[[*ref]]" for uniformity with links; and so on. I am
> ready to give expanded explanation if anyone is interested.
How do you type the *content* of the references?
> the same goes for links: "[[link]]" and "[[link|caption]]" - pipe is
> also not present [in Russian layout]
[]| are some of the very few characters forbidden in page titles. That's
why we can take advantage of them for title splitting.
How would you differentiate between link target (page title) and link
caption?
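A small Python sketch of why the forbidden pipe is convenient here. The
function name split_link is hypothetical, and falling back to the target
when no caption is given is an assumption of the sketch. Since | cannot
occur in a title, the first | in the link unambiguously ends the target:

```python
def split_link(wikitext_link):
    """Split '[[target|caption]]' into (target, caption).

    Hypothetical helper: relies on '|' never appearing in a page title,
    so the first '|' always separates target from caption.
    """
    if not (wikitext_link.startswith("[[") and wikitext_link.endswith("]]")):
        raise ValueError("not a [[...]] link")
    inner = wikitext_link[2:-2]
    target, sep, caption = inner.partition("|")
    # Assumption: with no pipe, the target doubles as the caption.
    return target, (caption if sep else target)


print(split_link("[[link]]"))
print(split_link("[[link|caption]]"))
```

With a separator that is allowed in titles (a space, say), the same split
would need guessing or escape codes.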
It's simple to start defining a sensible syntax. But when you want to
"go further", you start being limited. The most sane approach is
probably to fall back to <tags> and leave them in the "complex" section.
Why <> and not anything else? Just because that's what the underlying
html uses. Some people are already familiar with that, too.
See, MediaWiki didn't define <u> as wikitext for underline.
It allows some html as-is, including <i>, <b> and <u>. <i> and <b> have
friendly counterparts*, <u> does not. The reason being that use of
underlining is discouraged.
*And that also turned out to have issues: ever tried to write wikitext
in Piedmontese?
> Special tokens like #REDIRECT, {{-}}, <imagemap>, __TOC__, etc. that
> all use different syntaxes can be uniformized in a way similar to
> template insertions: {{redir New page}}, {{clear}}, {{imagemap
> image.png, title x y, ...}}, {{TOC}} and so on. Templates can be
> called as {{tpl template arg arg arg}} - even if we keep { and } that
> require layout switch in some languages we eliminate the pipe which
> just makes things worse and text - less readable.
{{-}} is not a wikitext token. #REDIRECT and __TOC__ are a sad effect of
features having been built separately. They are inconsistent with the
rest of the syntax.
Note you can (and some wikis do) use a {{TOC}} template. You can't wrap
#REDIRECT in a template, though, because the redirect applies to the
template itself (unless you use some odd escaping?)