I'm afraid the latest mailing-list messages were considered TL;DR by
most of the subscribers, so I will put this at the beginning: is there
any point in continuing our discussion on the subject? Platonides is
constructive company, but he seems to be the only one participating.
Is the community truly interested in reworking the markup?
The discussion is certainly valid.
Honestly, if I'm allowed to speak out my crazy optimistic utopian
dream, then: <crazy-optimistic-utopian-dream>I want the current-style
wiki markup to disappear completely. I'm referring to *, '''''', {{}},
[[]] etc. It was very beneficial in the beginning, because it was for
the most part more intuitive to type than <ul><li></li></ul>,
<strong></strong> and <a href=""></a>, but for people who want
ease, the Visual Editor is supposed to provide it, and after that
most of them should never look back at the markup.
For people who want text-based markup, it should be mostly XHTML.
So, <section>, <poem>, <source>, and <nowiki> are kinda XHTML, so they
can stay. *, '''''' and [[]] are not XHTML, and they can and should be
replaced by XHTML. {{}} does need its own markup, but it should be
XHTML-like: <template name="citation needed" />.
So there. My idea of a bright wikifuture is less home-grown parsers
and more standards. It's easier for the developers and works
organically with the browsers. It's not necessarily easier for people
who want to write articles in plain text with markup, but hey, they
asked for it.</crazy-optimistic-utopian-dream>
--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
“We're living in pieces,
I want to live in peace.” – T. Moore
2012/2/8 Pavel Tkachenko <proger.xp(a)gmail.com>:
(Continuing the crunching. huh? But this message is only 4 pages long.)
Forked into the new thread.
I have some knowledge and code assets that I will be happy to
contribute; I will gladly take part in discussions or help improve the
situation in some other way. But if the Wikimedia team has different
views on how the markup should evolve, it's fruitless to spend so much
time chatting in front of closed doors.
My reply follows.
On 08.02.2012 2:27, Platonides wrote:
Nobody proposed to change the template in that way? :)
You mean that nobody has actually studied markup usability?
If you start creating inline, block and mixed template modes, I
suspect the syntax will end up being chaotic (I'm thinking of
concrete cases in MW syntax).
True, that's why I propose only two modes: block and inline, both with
clear distinctions and features.
That assumes that there's a non-ambiguous way to express that in
natural language (plus that it is easily parseable by a machine).
Yes, with a few simple rules added, an unambiguous language can be
created. I'm sure most of those business e-mails and official documents
can be processed by the machine without much effort. And we're talking
about an even more formalized language here - text markup.
So, how do you split {{About Bijection, injection and surjection}} ?
If that is supposed to be a long caption (4 words and a comma) then
just use quotes - like in natural handwriting:
{{About "Bijection, injection and surjection"}}
The point of using an additional character not used in normal
language is precisely for working the metalanguage.
I disagree, it only means that this subject has not yet been
researched enough.
Also, there are colons as parameters. How would you write as the
parameter the article [[Gypsy: A Musical Fable]] or [[Batman: Year
One]] ? By banning ':' in titles?
Have I said something about colons and links? Links are fine with
colons or any other symbols.
You mentioned colons for template arguments. I'm acting as the
devil's advocate, asking you how to provide those titles as parameters
to a template.
Uh, I mistyped "colon" where I meant "comma". Let me correct this:
1. {{About Something}}
2. {{About Something, of kind}}
3. {{About "Something, something and something", of kind}}
4. {{About "Something, something and something", "of kind, kind and kind"}}
As you can see, no character is banned from the title, while in the
current pipe-centric approach I don't think it's possible to have pipes
there without a headache.
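To make the four cases above concrete, here is a minimal Python sketch of
how such a template call could be split. It is only an illustration, not
the code of any existing parser: the function name parse_template is
hypothetical, and it assumes that the template name is the first
whitespace-delimited word, that arguments are separated by commas, and
that double quotes protect commas inside an argument.

```python
from typing import List, Tuple


def parse_template(source: str) -> Tuple[str, List[str]]:
    """Split '{{Name arg, "quoted, arg"}}' into (name, [args]).

    Hypothetical helper: the rules are assumptions drawn from the
    four examples above, not a specification.
    """
    if not (source.startswith("{{") and source.endswith("}}")):
        raise ValueError("not a template call")
    body = source[2:-2].strip()
    # Assumption: the template name is the first whitespace-delimited word.
    name, _, rest = body.partition(" ")
    args: List[str] = []
    current: List[str] = []
    in_quotes = False
    for ch in rest:
        if ch == '"':
            # Quotes toggle comma protection and are not kept in the value.
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated quote")
    tail = "".join(current).strip()
    if tail or args:
        args.append(tail)
    return name, args


if __name__ == "__main__":
    print(parse_template('{{About "Something, something and something", of kind}}'))
```

Under these assumptions a comma inside quotes stays part of the argument,
so no character has to be banned from a title.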
But since we're touching on this: pipes in links are not that intuitive
either. Pipes are actually absent from many keyboard layouts, but
even apart from that it's more natural to use an equals sign. Or a
double one, for the purposes of text markup.
It's consistent with the use of pipes in templates (which do use
equals signs in that way to name parameters). Although link syntax was
probably earlier.
Right, and pipes should not appear in templates either. It's too
special a symbol.
So is [[Batman Forever]] your syntax for [[Batman Forever|Batman
Forever]] or [[Batman|Forever]] ? So many cases are bad, KISS.
I do not see your point. The processing is straightforward:
1. Link contains == - it separates address from title.
2. Link contains no == but contains a space - the first space separates
address from title.
3. There is neither == nor ' ' - link is titleless. This means that:
* local links get titles from page name, not page address (this is
important and differs from current MediaWiki implementation in a better way)
* remote links can also get their title from <title> after fetching
first 4 KiB of that page or something
Use cases:
* "[[http://google/search?q=%61%62%63 Google it]]" - for external links
== delimiter won't be used at all
* "See this [[page]]" - current wikitext is the same
* "See [[page that page]]" vs. current [[page|that page]]. Looks
cleaner and is easier to type (space is present on all keyboards and is
quite large in size). This covers no fewer than half of local links.
* "See [[Some page==this page]]" vs. current [[Some page|this page]].
This case has less drastic differences than previous 3 but a pipe is
still both special to English layouts and less noticeable to human eye
than double equality sign.
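The three processing rules and the use cases above can be sketched in a
few lines of Python. This is a hedged illustration rather than the
actual PHP processor: the function name split_link is made up for the
example, a missing title is represented as None, and neither the escape
symbol nor the title lookup for titleless links is implemented.

```python
from typing import Optional, Tuple


def split_link(source: str) -> Tuple[str, Optional[str]]:
    """Split '[[address title]]' or '[[address==title]]' into its parts.

    Hypothetical helper following the three rules listed above.
    """
    if not (source.startswith("[[") and source.endswith("]]")):
        raise ValueError("not a link")
    body = source[2:-2]
    if "==" in body:
        # Rule 1: a double equals sign separates address from title.
        address, title = body.split("==", 1)
    elif " " in body:
        # Rule 2: otherwise the first space separates address from title.
        address, title = body.split(" ", 1)
    else:
        # Rule 3: neither is present, so the link is titleless.
        return body, None
    return address.strip(), title.strip()


if __name__ == "__main__":
    for link in ("[[page]]", "[[page that page]]", "[[Some page==this page]]"):
        print(link, "->", split_link(link))
```

Note that a single = inside a URL query string is left alone, because
only the double == is treated as special.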
Does "KISS" mean that every use case should be written with a uniform
but, because of that, equally inconvenient syntax? I agree that more
complex cases should have correspondingly more complex syntax, but this
scaling must be adequate. Placing a pipe everywhere not only reduces
cross-language usability but also ignores the fact that it's redundant
in some cases (#1 and #3 items above).
4. Finally, in the very rare cases when both a space and a double equals
sign are necessary, a special markup-wise (!) escape symbol can be used.
As an example:
[[2 + 2 = 5]]
Your example contains no double equality symbol and is treated as
space-separated title: [[2| + 2 = 5]] in current wikitext.
Would you remove === headings?
No, headings are consistent because the first heading level starts
with a double equals sign.
Currently wikitext uses the terrible "<nowiki>stuff</nowiki>" but it
doesn't always work, and HTMLTidy comes in handy with its &lt; and
&gt;. And some places (such as link titles) cannot be escaped at all.
Really? I think you can.
Give some examples and we will examine their adequacy.
Your proposal for forcing to edit the urls is very bad. You can't
just paste, you need to go changing every = on it (which is a
frequent character) to ~==.
No, no, no, you have got a completely wrong idea. You don't have to
escape a SINGLE = because it is not special. You only need to escape a
double ==. How many double == have you seen in links? I have seen
them being used on my local bookstore site but it's surely an exception.
Pipes are banned from titles.
Great, let's make the machine's life easier.
I'm not sure this is a good analogy. Copy-pasting chunks of code looks
like copying phrases from other articles to make your own. That
should be original. OTOH, reusing the existing LaTeX template is much
more appropriate than writing your own from scratch trying to copy the
style of the provided one.
For such things templates must be created that reduce to a minimum the
number of entities that are identical across all of their use cases. In
MediaWiki this is done using {{templates and=parameters}} and this is
good. If you were talking about copy-pasting these templates, their
parameters and empty values - this is fine. But if it was about
copy-pasting the same code with all rendering tricks (&nbsp;,
{{iejrhgy}} and other cryptic things) - this is bad.
Even if I write a program from scratch, I should make it consistent
with other tools. That means appropriate arguments would be sort
-r --ignore-case --sort=month ./myfile instead of sort<- !case (sort
as month) \\\\./myfile\\\\
Standardizing is fine unless it starts looking unnatural. The following
example might be argued but I can't think of another one quickly:
tar -czf file.tar.gz .
While this uses standard CLI syntax and is in the true *nix ideology,
this is what (among other things) separates POSIX from Windows. For
instance, I could write:
tar file.tar.gz .
...and the program would detect the -czf arguments on its own, based on:
* -f is simply implied because there are 2 unnamed arguments (without
leading -X)
* -c: target file doesn't exist
* -z: target file has extension .gz
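The inference described above can be written down as a small Python
sketch. To be clear, this is not how tar behaves; infer_tar_flags is a
hypothetical function for a made-up "smart tar", and the exists
parameter is only there so the file check can be replaced in examples.

```python
import os
from typing import Callable, List


def infer_tar_flags(
    args: List[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> str:
    """Guess the flags for 'tar TARGET SOURCE' from the arguments alone.

    Hypothetical sketch of the three rules above, not real tar behavior.
    """
    # -f is implied only when there are exactly two unnamed arguments.
    if len(args) != 2 or any(arg.startswith("-") for arg in args):
        raise ValueError("expected exactly two unnamed arguments")
    target, _source = args
    flags = ""
    if not exists(target):
        # -c: the target file does not exist yet, so create it.
        # The email does not say what should happen when the target
        # already exists; this sketch simply leaves -c out in that case.
        flags += "c"
    if target.endswith(".gz"):
        # -z: the target has the .gz extension, so compress with gzip.
        flags += "z"
    flags += "f"
    return "-" + flags


if __name__ == "__main__":
    # Pretend the archive does not exist yet.
    print(infer_tar_flags(["file.tar.gz", "."], exists=lambda path: False))
```

A stricter variant could reject any case the rules do not cover instead
of guessing, which is the trade-off being argued about in this thread.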
It's the same with templates or other markup: while {{About page=Earth
kind=planet}} or something similar is fine, {{About Earth, planet}} or
some other form is more appropriate in this particular use case.
You are giving many attributions to the machine. Personally, I would
spit out an error, just in case they were e.g. in different units.
Yes, this is one of the ways and I would opt for it if we want to have a
strict syntax.
But you are making up your syntax, then requiring the system to adapt
for you.
Can you elaborate more on this point?
The goal of wikitext is to make html editing easy.
HTML editing? I thought wikitext was about text editing. Why not
edit HTML using HTML?
Because it's considered cumbersome. (Actually, it's presentational
editing, but as the presentation is obtained by using HTML as an
intermediate language...)
Indeed, HTML is cumbersome, that's why wikitext and all other text
markups have been invented. But they don't have to copy HTML syntax -
just the opposite.
And you have complicated the originally clean syntax of 1, 2, 3
Clean syntax for whom? For Englishmen? And are hashes actually clean?
If so, why don't we use them in our e-mail messages?
Would html links become italic? (that was a problem of wikicreole, it
was defined as 'italic unless in links')
Not at all, because we are talking about context-specific grammar.
Addresses in links can hold no formatting and thus all but
context-ending tokens (]], space and ==) are ignored there.
And yes, context-specific grammar is more than regular expressions can
handle. Regexps are good but this doesn't mean that anything
incompatible with sed is "too complex".
As already mentioned, I am using my own markup processor written in
PHP on my projects and it implements all markup already described,
including the [[http://italic]] (context-specific grammar) case. And
its parsing loop is under 350 lines of code.
Well, I have to say it seems well thought out, it "doesn't look bad".
Thank you. I have given it a lot of thinking and practice but I'm sure
there are still things to improve. I would be ecstatic if my
experience could help the world's largest free knowledge community.
Thanks again for your mail, Platonides.
Signed,
P. Tkachenko
_______________________________________________
Wikitext-l mailing list
Wikitext-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitext-l