How about this:
Word'''word -> always apostrophe+italics
Word''''word -> always apostrophe+bold
Advantages:
* French and Italian examples work correctly all the time
* You can parse it with single-token lookahead.
* No need to count matched/mismatched bold/italics
* Broken wikitext at the end of the line does not interfere with correct wikitext at the start of the line
* Simpler to understand than the current rule.
Disadvantages:
* You lose the ability to easily apply bold mid-word.
* The ambiguity will arise more often, so more people will have to know that ''' is not always bold. Not that it's always bold at the moment, but you know...
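For illustration only, here is a minimal Python sketch of how the proposed rule might be applied with a look at just the characters on either side of an apostrophe run. The name classify_runs and the token kinds are hypothetical, "word" characters are assumed to be whatever str.isalpha() accepts, and runs of four or more apostrophes outside a word are passed through as "other" because the proposal does not cover them. This is not how the existing parser works.

```python
import re

# Two or more consecutive apostrophes form one markup run.
APOSTROPHE_RUN = re.compile(r"'{2,}")

def classify_runs(line):
    """Split one line into (kind, text) tokens under the proposed rule."""
    tokens = []
    pos = 0
    for match in APOSTROPHE_RUN.finditer(line):
        start, end = match.span()
        if start > pos:
            tokens.append(("text", line[pos:start]))
        run = end - start
        # Look one character to each side; "" (line edge) is not a letter.
        before = line[start - 1] if start > 0 else ""
        after = line[end] if end < len(line) else ""
        mid_word = before.isalpha() and after.isalpha()
        if mid_word and run == 3:
            # Word'''word -> literal apostrophe, then italics
            tokens.append(("text", "'"))
            tokens.append(("italic", "''"))
        elif mid_word and run == 4:
            # Word''''word -> literal apostrophe, then bold
            tokens.append(("text", "'"))
            tokens.append(("bold", "'''"))
        elif run == 2:
            tokens.append(("italic", "''"))
        elif run == 3:
            tokens.append(("bold", "'''"))
        else:
            # Longer runs are outside the scope of this sketch.
            tokens.append(("other", match.group()))
        pos = end
    if pos < len(line):
        tokens.append(("text", line[pos:]))
    return tokens
```

Each run is classified without reference to any other run on the line, which is the property behind the "no counting" and "broken wikitext at the end of the line" advantages.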
Thoughts?
Steve
-----Original Message----- From: wikitext-l-bounces@lists.wikimedia.org [mailto:wikitext-l-bounces@lists.wikimedia.org] On Behalf Of Steve Bennett Sent: 28 November 2007 01:39 To: Wikitext-l Subject: [Wikitext-l] So, a better algorithm for apostrophes?
How about this:
Word'''word -> always apostrophe+italics
Word''''word -> always apostrophe+bold
Advantages:
- French and Italian examples work correctly all the time
- You can parse it with single-token lookahead.
- No need to count matched/mismatched bold/italics
- Broken wikitext at the end of the line does not interfere with correct wikitext at the start of the line
- Simpler to understand than the current rule.
Disadvantages:
- You lose the ability to easily apply bold mid-word.
- The ambiguity will arise more often, so more people will have to know that ''' is not always bold. Not that it's always bold at the moment, but you know...
Thoughts?
Steve
I create a DOM tree as it parses, which eliminates advantages 2 & 3.
2. It just creates elements as it goes, and corrects any assumptions that turn out to be wrong.
E.g. on seeing '''''five''' it first creates <b><i>five; then, on seeing the closing ''', it realises it needs to rewrite this to <i><b>five</b>.
3. On hitting a newline, it only has to check whether it is still inside <b> or <i> elements.
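A rough Python sketch of this build-then-correct idea, for readers who want something concrete. It is an assumption-laden illustration, not the parser described above: render_line, its flat output list, and the pre-split token input are all hypothetical, and a real implementation would work on DOM nodes rather than strings.

```python
TAG_FOR_RUN = {"''": "i", "'''": "b"}

def render_line(tokens):
    """Render one line of pre-split tokens; return (markup, unclosed_tags)."""
    out = []    # output pieces; earlier pieces are rewritten if a guess was wrong
    stack = []  # open elements, outermost first, as [tag, index of "<tag>" in out]

    def open_tag(tag):
        stack.append([tag, len(out)])
        out.append(f"<{tag}>")

    def close_tag(tag):
        if stack[-1][0] != tag:
            # tag is open but not innermost, so the stack is [tag, other].
            outer, inner = stack[-2], stack[-1]
            if inner[1] == outer[1] + 1:
                # Both were opened back to back: the nesting guess was wrong,
                # so swap the two opening tags in the output already built.
                out[outer[1]], out[inner[1]] = out[inner[1]], out[outer[1]]
                outer[0], inner[0] = inner[0], outer[0]
            else:
                # Text sits between the openings: close both, reopen the other.
                other = stack.pop()[0]
                out.append(f"</{other}>")
                stack.pop()
                out.append(f"</{tag}>")
                open_tag(other)
                return
        stack.pop()
        out.append(f"</{tag}>")

    for tok in tokens:
        if tok == "'''''":
            # Close whatever is open (innermost first), then open the rest;
            # with nothing open, guess bold outside italic.
            tags = [t for t, _ in reversed(stack)]
            tags += [t for t in ("b", "i") if t not in tags]
        elif tok in TAG_FOR_RUN:
            tags = [TAG_FOR_RUN[tok]]
        else:
            out.append(tok)
            continue
        for tag in tags:
            if any(t == tag for t, _ in stack):
                close_tag(tag)
            else:
                open_tag(tag)

    # At the newline, anything still on the stack was never closed.
    return "".join(out), [t for t, _ in stack]
```

Calling render_line(["'''''", "five", "'''"]) walks through the example above: the first guess <b><i> is rewritten to <i><b> when the closing ''' arrives, and the italic element is reported as still open at the end of the line.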
Jared
On 11/28/07, Jared Williams jared.williams1@ntlworld.com wrote:
I create a DOM tree as it parses, which eliminates advantages 2 & 3.
Right. I'm building an AST as well. I didn't say, "Let's change the semantics of apostrophes because I can't parse them", I'm saying "Let's change the semantics of apostrophes because they don't work well and because they're hard to parse."
In general, the easier the grammar is to parse, *without* having to build a tree and then manipulate it, the better.
But the real argument here is that the bold->italics conversion works very badly, and could be improved.
Would my suggested change have any downsides?
Steve
Would my suggested change have any downsides?
You already listed two downsides. I can't think of any more. The key thing is if the advantages are worth the disadvantages, and that's a tricky one to decide...
Would this be an accurate summary of your suggestion: Bold can only start immediately after whitespace or an apostrophe? I think that's equivalent to what you said, but I'm not sure. If it is the same, then it might be a slightly easier way to describe it to users.
On 11/29/07, Thomas Dalton thomas.dalton@gmail.com wrote:
Would this be an accurate summary of your suggestion: Bold can only start immediately after whitespace or an apostrophe? I think that's equivalent to what you said, but I'm not sure. If it is the same, then it might be a slightly easier way to describe it to users.
Um. I guess there is whitespace, there are letters, and then there is a lot of grey area which is neither whitespace nor letters.
Since we're really targeting these particular language expressions, I think it makes sense to restrict this behaviour to *letters* rather than *non-whitespace*. Also, I only suggested changing the behaviour in the middle of a word, rather than at the end, which you rule out in your definition.
That is, this''' would be bold in my definition, but not yours.
So, the definition I would go with: Bold can be toggled anywhere except between two letters.
Where "letters" is suitably broad to cover all letters, accented or otherwise, in all languages that are likely to want to use this construct.
(This could need refining, but we'd need input from actual users of these languages. Is a French speaker likely to want to write l'''11'' for instance? Perhaps.)
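The "anywhere except between two letters" definition can be sketched as a small Python predicate. The function name bold_can_toggle is hypothetical, and using str.isalpha() as the test for "letter" is an assumption: it accepts accented and non-Latin letters but not digits, so the l'''11'' case comes out as "bold allowed", which may or may not be what users of these languages want.

```python
def bold_can_toggle(line, start, end):
    """True if the apostrophe run line[start:end] may toggle bold.

    Bold is refused only when the run sits directly between two letters.
    """
    # A run at the very start or end of the line has no neighbour there;
    # "" is used as a stand-in and is never a letter.
    before = line[start - 1] if start > 0 else ""
    after = line[end] if end < len(line) else ""
    return not (before.isalpha() and after.isalpha())
```

For example, the run in this''' (end of word) passes the test, the run in l'''amour does not, and a run between an accented letter and another letter, as in é'''t, is also refused.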
Steve
On 11/28/07, Steve Bennett stevagewp@gmail.com wrote:
How about this:
Word'''word -> always apostrophe+italics
Word''''word -> always apostrophe+bold
As a slight refinement:
Word'''word''' -> treated as bold.
Word'''word'''word -> ?
Disadvantages:
- You lose the ability to easily apply bold mid-word.
Now it's possible to easily have bold midword:
Die'''bold''' Election-rigging Services
And if for some reason you want to start bold midword and continue it after the word:
Ko'''bold''' '''for dinner again?'''
Steve