- Current syntax: '''italic''plain => '<i>italic</i>plain
- Proposed syntax: '''italic''plain => <b>italic<i>plain</i></b>
- Previous behaviour with new syntax: <nowiki>'</nowiki>''italic''plain
For the record, I shall say that I fully expect riots on w:fr: if this new behaviour becomes mandatory :)
Some versions, months ago, did require the <nowiki> tag, and it was really awful to manage. We'd also need to *fix* all pages with this syntax, and we can't really fix them automatically; each one would need to be checked on a case-by-case basis.
Nicolas Weeger
On 5/12/05, Nicolas Weeger nicolas.weeger@laposte.net wrote:
- Current syntax: '''italic''plain => '<i>italic</i>plain
- Proposed syntax: '''italic''plain => <b>italic<i>plain</i></b>
- Previous behaviour with new syntax: <nowiki>'</nowiki>''italic''plain
For the record, I shall say that I fully expect riots on w:fr: if this new behaviour becomes mandatory :)
Some versions, months ago, did require the <nowiki> tag, and it was really awful to manage. We'd also need to *fix* all pages with this syntax, and we can't really fix them automatically; each one would need to be checked on a case-by-case basis.
Not quite true; it could be fixed automatically by having the existing parser code apply the rule that it does now, and write a <nowiki> around the ' that it interprets literally. It wouldn't be quite semantically perfect, but articles would keep their appearances (and it could even be applied before/without removing the offending code). I agree that <nowiki> is somewhat unwieldy, but it's a solution to the problem that already exists (and always has) and doesn't require mangling the grammar so that it's impossible to know whether ''' means ''' or not.
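A rough sketch of what such an automatic fix might look like, assuming the "current rule" is the three-way, per-line rule that Timwi spells out later in this thread (another ''' on the line means bold; otherwise a '' on the line means apostrophe plus italics; otherwise three literal apostrophes). The function name and the decision to ignore apostrophe runs of other lengths are assumptions made for illustration; this is not the actual MediaWiki code.

```python
import re

NOWIKI_APOSTROPHE = "<nowiki>'</nowiki>"


def protect_literal_apostrophes(line: str) -> str:
    """Rewrite one line so its appearance survives a switch to "''' is always bold".

    Hypothetical helper: it only looks at apostrophe runs of exactly
    two or three characters and leaves every other run untouched.
    """
    runs = re.findall(r"'+", line)
    triples = sum(1 for run in runs if len(run) == 3)
    doubles = sum(1 for run in runs if len(run) == 2)

    if triples != 1:
        # No ''' at all, or at least two of them (read as bold today).
        return line

    if doubles:
        # Today: literal apostrophe followed by an italic marker.
        replacement = NOWIKI_APOSTROPHE + "''"
    else:
        # Today: three literal apostrophes.
        replacement = "<nowiki>'''</nowiki>"

    # Match a run of exactly three apostrophes (not part of a longer run).
    return re.sub(r"(?<!')'''(?!')", lambda _m: replacement, line, count=1)


print(protect_literal_apostrophes("'''italic''plain"))
```

Applied to Nicolas's example, this yields the <nowiki>'</nowiki>''italic''plain form he lists as the "previous behaviour with new syntax"; lines that already read as bold are left alone.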
A slightly more radical approach that just crossed my mind would be to add a token reminiscent of TeX's "\/" which would produce no output, but break up tokens. For the sake of readability in this email, let's imagine it's ";"
- l';''italic''bold => "l'<i>italic</i>bold"
- l'';'italic''bold => "l<i>'italic</i>bold"
gives you full control without the 17 characters of <nowiki></nowiki>.
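To make the idea concrete, here is a minimal sketch of a renderer in which ";" (the stand-in from this email, not a real wikitext token) produces no output and only splits apostrophe runs. It assumes that '' always toggles italics and ''' always toggles bold; the function name is made up for the example, and a real implementation would need a break token that cannot collide with ordinary semicolons.

```python
import re

BREAK = ";"  # hypothetical stand-in for the break token
TAG_FOR_RUN = {"''": "i", "'''": "b"}


def render_with_break(text: str) -> str:
    """Render '' and ''' as toggles, with BREAK separating apostrophe runs."""
    is_open = {"i": False, "b": False}
    out = []
    # Tokens: a run of apostrophes, a break token, or a stretch of other text.
    for token in re.findall(r"'+|;|[^';]+", text):
        if token == BREAK:
            continue  # emits nothing; its only job was to split the run
        tag = TAG_FOR_RUN.get(token)
        if tag is None:
            out.append(token)  # plain text or a literal apostrophe run
        else:
            out.append(f"</{tag}>" if is_open[tag] else f"<{tag}>")
            is_open[tag] = not is_open[tag]
    return "".join(out)


print(render_with_break("l';''italic''bold"))
print(render_with_break("l'';'italic''bold"))
```

The two calls reproduce the two outputs listed above: the position of the break token alone decides whether the apostrophe ends up inside or outside the italics.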
Moin,
On Thursday 12 May 2005 17:58, Andrew Rodland wrote:
On 5/12/05, Nicolas Weeger nicolas.weeger@laposte.net wrote:
- Current syntax: '''italic''plain => '<i>italic</i>plain
- Proposed syntax: '''italic''plain => <b>italic<i>plain</i></b>
- Previous behaviour with new syntax: <nowiki>'</nowiki>''italic''plain
For the record, I shall say that I fully expect riots on w:fr: if this new behaviour becomes mandatory :)
Some versions, months ago, did require the <nowiki> tag, and it was really awful to manage. We'd also need to *fix* all pages with this syntax, and we can't really fix them automatically; each one would need to be checked on a case-by-case basis.
Not quite true; it could be fixed automatically by having the existing parser code apply the rule that it does now, and write a <nowiki> around the ' that it interprets literally. It wouldn't be quite semantically perfect, but articles would keep their appearances (and it could even be applied before/without removing the offending code). I agree that <nowiki> is somewhat unwieldy, but it's a solution to the problem that already exists (and always has) and doesn't require mangling the grammar so that it's impossible to know whether ''' means ''' or not.
A slightly more radical approach that just crossed my mind would be to add a token reminiscent of TeX's "\/" which would produce no output, but break up tokens. For the sake of readability in this email, let's imagine it's ";"
- l';''italic''bold => "l'<i>italic</i>bold"
- l'';'italic''bold => "l<i>'italic</i>bold"
"\" as in
- l'\''italic''bold
- l''\'italic''bold
? "\" escaping is used a lot elsewhere and "\" would be the first char that springs to my mind.
Best wishes,
Tels
gives you full control without the 17 characters of <nowiki></nowiki>.
- -- Signed on Thu May 12 20:19:22 2005 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
"Spammed if you do, spammed if you don't." - Murphy's Law
"\" as in
- l'\''italic''bold
- l''\'italic''bold
? "\" escaping is used a lot elsewhere and "\" would be the first char that springs to my mind.
That's too tech-savvy. The aim of wikisyntax is to be easy to understand - we can't ask people to put a \ before ' or surround it by <nowiki></nowiki>. I don't mind a slightly more complex syntax for tables, templates & such, "advanced" behaviour. But for your daily usage (and ' is used often, at least in French), something quite simple to use is imo waaaaaaay better.
Best wishes,
Tels
Regards Nicolas Weeger
Moin,
On Thursday 12 May 2005 20:50, Nicolas Weeger wrote:
"\" as in
- l'\''italic''bold
- l''\'italic''bold
? "\" escaping is used a lot elsewhere and "\" would be the first char that springs to my mind.
That's too tech-savvy. The aim of wikisyntax is to be easy to understand - we can't ask people to put a \ before ' or surround it by <nowiki></nowiki>. I don't mind a slightly more complex syntax for tables, templates & such, "advanced" behaviour. But for your daily usage (and ' is used often, at least in French), something quite simple to use is imo waaaaaaay better.
Yes, I agree on the "simple" solution. However, the current situation isn't "simple" for things like l'''italic'' because it is unclear what exactly that should mean. Maybe it means
l<i>'italic</i>
or it means
l'<i>italic</i>
How is the computer (and the human writing it) to know which is which? You need at least one more bit of information to distinguish between these two variants.
Best wishes,
Tels
- -- Signed on Thu May 12 21:26:43 2005 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
This email violates U.S. patent #4,197,590:
for (x = 0; x < width; x++) { for (y = 0; y < height; y++) { setPixel(x+xm, y+ym, getPixel(x+xm, y+ym) ^ getCursorPixel(x, y)); } }
On 5/12/05, Tels nospam-abuse@bloodgate.com wrote:
Moin,
On Thursday 12 May 2005 20:50, Nicolas Weeger wrote:
That's too tech-savvy. The aim of wikisyntax is to be easy to understand - we can't ask people to put a \ before ' or surround it by <nowiki></nowiki>. I don't mind a slightly more complex syntax for tables, templates & such, "advanced" behaviour. But for your daily usage (and ' is used often, at least in French), something quite simple to use is imo waaaaaaay better.
Yes, I agree on the "simple" solution. However, the current situation isn't "simple" for things like l'''italic'' because it is unclear what exactly that should mean. Maybe it means
l<i>'italic</i>
or it means
l'<i>italic</i>
How is the computer (and the human writing it) to know which is which? You need at least one more bit of information to distinguish between these two variants.
Exactly. Now add to that the complication that ''' supposedly means something completely different from _both_ of those when it's not in the middle of a word, and that (as with much of the syntax) there has never been any well-defined rule governing this behavior, only a last-resort hack in PHP code. It's not simple conceptually, and it's far from simple for the computer, and it's bad for parsing. I hate to be argumentative, but I don't believe that the issue is as simple (heh) as you make it out to be, Nicolas.
I don't believe it's simple either, conceptually or computer-ally (whoa, a neologism).
Merely pointing out that:
1) there'd be many broken pages if the syntax were changed, and a huge amount of work to fix everything
2) people (on w:fr:, can't tell for other languages) would be really bothered by the change, and would simply revolt and go hang the people changing everything :)
Nicolas Weeger
Andrew Rodland wrote:
Exactly. Now add to that the complication that ''' supposedly means something completely different from _both_ of those when it's not in the middle of a word, and that (as with much of the syntax) there has never been any well-defined rule governing this behavior,
Just because you don't understand the rule doesn't mean it's not well-defined. It's perfectly well-defined: If there is another ''' in the same line, it means bold. If there isn't, but there's a '', it means apostrophe-plus-italics. If neither, it means three apostrophes. Normal editors don't have to know about or understand this rule in detail as long as the behaviour is what they expect, which apparently on the French Wikipedia it is.
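The rule as stated above can be written down in a few lines. This is only an illustrative sketch of the three cases described in this message, not the MediaWiki or flexbisonparse code; the function name is invented here, and apostrophe runs that are not exactly two or three characters long are simply ignored, which is an assumption.

```python
import re


def classify_triple_apostrophe(line: str) -> str:
    """Say how a ''' on this line is read under the three-way rule."""
    runs = re.findall(r"'+", line)
    triples = sum(1 for run in runs if len(run) == 3)
    doubles = sum(1 for run in runs if len(run) == 2)

    if triples == 0:
        raise ValueError("line contains no ''' to classify")
    if triples >= 2:
        return "bold"  # another ''' on the same line
    if doubles >= 1:
        return "apostrophe + italics"  # no other ''', but a '' is present
    return "three apostrophes"  # neither


print(classify_triple_apostrophe("l'''homme''"))
```

Note that the decision depends only on what else is on the same line, which is why "l'''homme''" comes out as an apostrophe followed by italics.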
only a last-resort hack in PHP code. It's not simple conceptually,
I think you're expecting a simple solution to a complex problem.
and it's far from simple for the computer, and it's bad for parsing.
It's perfectly easy for the computer, just as long as it's programmed right. I have demonstrated this both in the current MediaWiki parser and in flexbisonparse by replicating the same behaviour without trouble.
Timwi
Tels wrote:
Yes, I agree on the "simple" solution. However, the current situation isn't "simple" for things like l'''italic'' because it is unclear what exactly that should mean. Maybe it means
l<i>'italic</i>
or it means
l'<i>italic</i>
How is the computer (and the human writing it) to know which is which? You need at least one more bit of information to distinguish between these two variants.
The computer knows which to output because it's been programmed a certain way.
The human can reasonably expect to get the behaviour that would clearly be more useful. We keep talking about French, but actually a great many languages are affected in the same way. All you need is an apostrophised contraction before a word that can potentially be italicised or bolded.
Therefore, l'''italic'' should always output l'<i>italic</i>, and the "one more bit of information" you talked about should only be added in the unusual case. The only language I have come across where you will commonly want the apostrophe to be italicised or bolded is Klingon, though I could imagine that Hawaiian may also be affected, as a word can begin with an apostrophe in both of these languages.
Greetings, Timwi
Tels nospam-abuse@bloodgate.com writes:
? "\" escaping is used a lot elsewhere and "\" would be the first char that springs to my mind.
Only on hacker keyboards (US flavor) the backslash is easy to enter...
Karl Eichwalder wrote:
Tels nospam-abuse@bloodgate.com writes:
? "\" escaping is used a lot elsewhere and "\" would be the first char that springs to my mind.
Only on hacker keyboards (US flavor) the backslash is easy to enter...
That is a moot argument, because there is no single character that is "easy to type" on every official national keyboard layout. Indeed, we already have a lot of characters ([[]], {{}} and ~~~~) that are a nightmare on German keyboards.
Timwi timwi@gmx.net writes:
That is a moot argument, because there is no single character that is "easy to type" on every official national keyboard layout. Indeed, we already have a lot of characters ([[]], {{}} and ~~~~) that are a nightmare on German keyboards.
This does not mean it does not hurt to add even more strange characters. On the contrary, think again about a sane markup language like XML.
And we'd also need to *fix* all pages with this syntax, and we can't really fix automatically, need to check on a case-by-case basis.
Why not? The current parser manages just fine with it, you could write a bot that changes everything with the same rule.
The proposed change is just that ''' always opens a bold tag and '' always opens an italic tag; not having to check *the entire wikitext* for context (as is done currently) would make things a lot easier, not just for machines but for people too. The only way to tell currently if ''' is really opening a whole lot of bold text or just opening some italic text is to find the matching '' or ''', and that's not intuitive at all.
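For comparison, a context-free reading of this kind can be sketched as a single left-to-right pass with no lookahead. The details below are assumptions for illustration only: the function name is invented, tags still open at the end of the line are closed in reverse order, and no attempt is made to repair mis-nested tags.

```python
import re

TAG_FOR_RUN = {"''": "i", "'''": "b"}


def render_context_free(line: str) -> str:
    """Treat every '' as an italic toggle and every ''' as a bold toggle."""
    out = []
    open_tags = []  # tags opened so far and not yet closed, in order
    for token in re.findall(r"'+|[^']+", line):
        tag = TAG_FOR_RUN.get(token)
        if tag is None:
            out.append(token)  # plain text or an apostrophe run of another length
        elif tag in open_tags:
            open_tags.remove(tag)
            out.append(f"</{tag}>")
        else:
            open_tags.append(tag)
            out.append(f"<{tag}>")
    # Close whatever is still open at the end of the line.
    out.extend(f"</{tag}>" for tag in reversed(open_tags))
    return "".join(out)


print(render_context_free("'''italic''plain"))
print(render_context_free("l'''homme''"))
```

The first call gives <b>italic<i>plain</i></b>, matching the "proposed syntax" line at the top of the thread; the second shows what would happen to the French-style l'''homme'' under the same reading.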
On Fri, 13 May 2005, [ISO-8859-1] Ævar Arnfjörð Bjarmason wrote:
And we'd also need to *fix* all pages with this syntax, and we can't really fix automatically, need to check on a case-by-case basis.
Why not? The current parser manages just fine with it, you could write a bot that changes everything with the same rule.
The proposed change is just that ''' always opens a bold tag and '' always opens an italic tag, not having to check *the entire wikitext* for context (as is done currently)
I'm pretty sure that only the current line of text is checked, not the entire article.
Alfio
Ævar Arnfjörð Bjarmason wrote:
The proposed change is just that ''' always opens a bold tag and '' always opens an italic tag, not having to check *the entire wikitext* for context (as is done currently)
This is not true, neither in MediaWiki's current parser, nor flexbisonparse.
would make things a lot easier, not just for machines but for people too. The only way to tell currently if ''' is really opening a whole lot of bold text or just opening some italic text is to find the matching '' or ''', and that's not intuitive at all.
I can't follow this argumentation at all. The current behaviour is extremely easy for our human editors because it does *what they expect*. The construction "l'''homme''" should *not* open a bold because this is *not what they expect*. It is *way* easier for them than having to type absolutely *anything* extra (even if it's just a "\").
wikitech-l@lists.wikimedia.org