I've removed the automatic substitution of – and — for hyphen sequences from 1.5, as it seems to simply cause a neverending sequence of breakage of links and markup. The special cases that were added to try to keep it from breaking (some) links and (some) markup of course made it behave fairly inconsistently, and on the whole it seems to have been causing trouble far outweighing the utility of making some dashes slightly more attractive.
Perhaps some future parser that operates in a clean fashion instead of layering regexes on top of each other will be able to do this in a consistent, non-breaking manner. For now it doesn't seem worth it.
-- brion vibber (brion @ pobox.com)
On Sun, 2005-06-19 at 22:05 -0700, Brion Vibber wrote:
I've removed the automatic substitution of – and — for hyphen sequences from 1.5, as it seems to simply cause a neverending sequence of breakage of links and markup...
I was planning to add that feature to the parser code I'm currently working on (as well as something for curly-quotes). I get the impression from your note that the problems are all related to the current parser misinterpreting what is displayed text and what is markup, is that right? If so, my adding that feature to sample code for a spec that eliminates any such ambiguity should not be a problem. Or is there some other problem?
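As a rough illustration of the idea of separating displayed text from markup, here is a minimal Python sketch. It is not the parser code mentioned above: the `MARKUP` pattern, the function names, and the `--`/`---` rules are all assumptions made for the example. The text is first split into markup and non-markup pieces, and the dash substitution is applied only to the non-markup pieces.

```python
import re

# Hypothetical, heavily simplified notion of "markup": wiki links,
# HTML-like tags, and bare URLs. Everything else is displayed text.
MARKUP = re.compile(r"(\[\[.*?\]\]|<[^>]+>|https?://\S+)")

def dashify_text(text: str) -> str:
    """Replace hyphen runs in displayed text only (assumed rules)."""
    # Handle "---" first so it is not consumed by the "--" rule.
    text = text.replace("---", "\u2014")  # em dash
    return text.replace("--", "\u2013")   # en dash

def dashify(source: str) -> str:
    """Apply dash substitution while leaving markup untouched."""
    parts = MARKUP.split(source)
    # With a single capturing group, re.split returns the markup
    # pieces at odd indices and the plain text at even indices.
    return "".join(
        part if index % 2 else dashify_text(part)
        for index, part in enumerate(parts)
    )
```

With this split, a hyphen sequence inside a link target such as `[[Image:a--b.png]]` is left alone, while the same sequence in running text is converted.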
Or is there some other problem?
You also have to consider localization problems. For example, Russian typography has no ndash; there are only the mdash and the hyphen. Curly quotes (and nested curly quotes) are also a complex matter that varies across languages and typographical styles.
In the Russian Wikipedia we use a JavaScript tool to apply "typography" to the text before submitting, i.e. it is a user-side tool (the user can apply certain transformations to the whole text, to the highlighted text only, or not at all). The script is "open", so it can be modified by our sysops, since it sits in the source of ru:MediaWiki:Summary (http://ru.wikipedia.org/wiki/MediaWiki:Summary)
See also:
* http://en.wikipedia.org/wiki/Dash
* http://en.wikipedia.org/wiki/Quotation_mark
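To make the localization point concrete, here is a small Python sketch of per-language dash rules. The table, the function name, and the specific mappings are assumptions for illustration only; the one thing taken from the note above is that a Russian rule set would never produce an ndash.

```python
# Hypothetical per-language rules, applied in order. The "ru" entry
# only encodes "no ndash in Russian"; mapping "--" to an mdash there
# is an assumption, not an established convention.
DASH_RULES = {
    "en": [("---", "\u2014"), ("--", "\u2013")],
    "ru": [("---", "\u2014"), ("--", "\u2014")],
}

def localized_dashes(text: str, lang: str) -> str:
    """Apply the dash rules for `lang`; unknown languages are left as-is."""
    for hyphens, dash in DASH_RULES.get(lang, []):
        text = text.replace(hyphens, dash)
    return text
```

Keeping the rules in a table per language would let each wiki adjust them without touching the substitution code, which is similar in spirit to the sysop-editable script described above.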
On 6/20/05, Lee Daniel Crocker <lee@piclab.com> wrote:
On Sun, 2005-06-19 at 22:05 -0700, Brion Vibber wrote:
I've removed the automatic substitution of – and — for hyphen sequences from 1.5, as it seems to simply cause a neverending sequence of breakage of links and markup...
I was planning to add that feature to the parser code I'm currently working on (as well as something for curly-quotes). I get the impression from your note that the problems are all related to the current parser misinterpreting what is displayed text and what is markup, is that right? If so, my adding that feature to sample code for a spec that eliminates any such ambiguity should not be a problem. Or is there some other problem?
No, there's no other problem. First we had an issue with it because it changed - to – in image names, and of course there was no image under the dash name, so the image failed to render at all. More recently (before Brion took it out) it was changing - to – in ISO 8601 dates, and failing to change the - between two wikilinked years to – (because the links had already been converted to HTML at that point, so the previous regex didn't cut it).
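The failures described above are easy to reproduce with a toy version of the approach. The Python sketch below is not the actual MediaWiki code: `NAIVE_RANGE` and `naive_dashes` are hypothetical stand-ins for one of the layered regexes, assuming a rule that turns a hyphen between two digits into an ndash.

```python
import re

# Hypothetical rule: a hyphen between two digits becomes an ndash,
# intended for ranges such as 1914-1918.
NAIVE_RANGE = re.compile(r"(?<=\d)-(?=\d)")

def naive_dashes(wikitext: str) -> str:
    """Apply the rule blindly, with no idea what is markup."""
    return NAIVE_RANGE.sub("\u2013", wikitext)

# Intended case: a plain year range is converted.
print(naive_dashes("1914-1918"))

# Breakage 1: an ISO 8601 date is mangled.
print(naive_dashes("2005-06-19"))

# Breakage 2: an image name is rewritten, so the lookup would fail.
print(naive_dashes("[[Image:1914-1918.png]]"))

# Miss: once the years are already HTML links, the hyphen is no
# longer between two digits, so nothing is changed.
print(naive_dashes('<a href="/wiki/1914">1914</a>-<a href="/wiki/1918">1918</a>'))
```

Each special case added to work around one of these would be another regex layered on top, which is the inconsistency described at the start of the thread.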
wikitech-l@lists.wikimedia.org