There was some buzz on this list a while back about supporting Usenet syntax - such as *bold*, /italics/ and _underscores_.
I have created an extension which provides this facility, which you can read about here:
http://www.mediawiki.org/wiki/Extension:UsenetSyntax
I welcome all feedback - and thanks in advance to anyone who wants to give it a shot.
-- Jim R. Wilson (jimbojw)
On Tue, Feb 20, 2007 at 05:34:13PM -0600, Jim Wilson wrote:
> There was some buzz on this list a while back about supporting Usenet syntax
> - such as *bold*, /italics/ and _underscores_.
> I have created an extension which provides this facility, which you can read about here:
Woohoo. :-)
> I welcome all feedback - and thanks in advance to anyone who wants to give it a shot.
FWIW, the customary rendering (IME) is _italics_, since handwritten underlining is usually rendered as italics in typesetting, and most computer displays have, traditionally, not had a good way to render underlining...
Cheers, -- jra
> Woohoo. :-)
Thanks - I'm glad you approve!
> FWIW, the customary rendering (IME) is _italics_, since handwritten underlining is usually rendered as italics in typesetting, and most computer displays have, traditionally, not had a good way to render underlining...
Dang. When I asked everybody on the IRC channel, they told me it was *bold*, _underline_, and /italics/.
The change is trivial to make in the code - I just need to know what the "accepted format" is - I've never used Usenet.
-- Jim
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Wed, Feb 21, 2007 at 01:55:13PM -0600, Jim Wilson wrote:
>> Woohoo. :-)
> Thanks - I'm glad you approve!
>> FWIW, the customary rendering (IME) is _italics_, since handwritten underlining is usually rendered as italics in typesetting, and most computer displays have, traditionally, not had a good way to render underlining...
> Dang. When I asked everybody on the IRC channel, they told me it was *bold*, _underline_, and /italics/
> The change is trivial to make in the code - I just need to know what the "accepted format" is - I've never used Usenet.
One of the reasons the mainline isn't interested in this is, I suspect, precisely that it's not well defined. The communities I hang out in (and I was on Usenet since 1984 or so, back when I could still read The Entire Feed) tended to stick to *bold* and _whatever_; I didn't see /these sort of italics/ much, partially because it conflicts with search string notation, I suspect.
I don't think there's A Standard, myself.
Since it's an extension, and the mainline is unlikely to load it, could you make it an install-time configurable option whether _this_ renders as italics or underscore?
Cheers, -- jra
Jay R. Ashworth wrote:
> The communities I hang out in (and I was on Usenet since 1984 or so, back when I could still read The Entire Feed) tended to stick to *bold* and _whatever_; I didn't see /these sort of italics/ much, partially because it conflicts with search string notation, I suspect.
FWIW, [[Crossmark]] uses *boldface*, /italicized/, _underlined_ and Mozilla Thunderbird automatically recognizes the same syntax in plaintext mail.
-- Ivan Krstić krstic@solarsail.hcs.harvard.edu | GPG: 0x147C722D
Jay R. Ashworth wrote:
> Since it's an extension, and the mainline is unlikely to load it, could you make it an install-time configurable option whether _this_ renders as italics or underscore?
Done deal. See the extension page for details on how to configure (you'll need to re-download the extension for the changes to take effect).
-- Jim
On Wed, Feb 21, 2007 at 03:30:26PM -0600, Jim Wilson wrote:
> Jay R. Ashworth wrote:
>> Since it's an extension, and the mainline is unlikely to load it, could you make it an install-time configurable option whether _this_ renders as italics or underscore?
> Done deal. See the extension page for details on how to configure (you'll need to re-download the extension for the changes to take effect)
Well, Jim; y'know; I like you a lot. :-)
I have to upgrade my home MW to something reasonably current; I'll install this when I do.
How did you avoid the confusion everyone warns of with
* list
*item
**marking?
Cheers, -- jra
The extension's hooking point, "ParserBeforeTidy", happens very late in the parsing of a page. So late that by this point all interpretation of MW syntax has already occurred.
For this reason, my extension has no knowledge of <nowiki> delimiters and will go right through them.
Also, I do no special processing for strings that contain tags. So this:
Hi *some bold <span> text*, more text </span>
Becomes this:
Hi <b>some bold <span> text</b>, more text </span>
And I have no idea what Tidy would do to that one (probably something very ugly).
So this is by no means a real actual solution. The "correct solution" (if you are of the party that thinks there's a problem in the first place) would be much more complicated and would need to hook into the Parser much more deeply - so much so that it may not even be possible given current Parser hooking points.
But it works in a pinch - and as long as everybody uses reasonable syntax, it should work just fine.
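To make the mis-nesting above concrete, here is a minimal Python sketch (not the extension's PHP; the function name `naive_bold` and its pattern are assumptions made up for illustration) of a tag-unaware substitution run over already-rendered HTML:

```python
import re

def naive_bold(html: str) -> str:
    """Wrap *...* spans in <b> tags without looking at any HTML in between."""
    # Hypothetical pattern: lazily match anything between two asterisks on one line.
    return re.sub(r"\*(.+?)\*", r"<b>\1</b>", html)

# The <b> element now closes inside the <span>, so the markup is mis-nested.
print(naive_bold("Hi *some bold <span> text*, more text </span>"))
```

Any late, regex-only pass of this kind would presumably have the same weakness unless it refuses to match across a tag.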
-- Jim
On Wed, Feb 21, 2007 at 03:53:25PM -0600, Jim Wilson wrote:
>> How did you avoid the confusion everyone warns of with
>> * list
>> *item
>> **marking?
> The extension's hooking point "ParserBeforeTidy" happens very late in the parsing of a page. So late, that at this point all interpretation of MW syntax has already occurred.
So, in short, you dealt with it by running late enough that MW got precedence, and grabbed all the asterisks it thought it was interested in.
That works for me.
Cheers, -- jra
Well yeah, if you wanna put it that way :)
* *So this will work* just fine
**While this will* still be a 2nd level bullet
On Wed, Feb 21, 2007 at 04:04:29PM -0600, Jim Wilson wrote:
>> So, in short, you dealt with it by running late enough that MW got precedence, and grabbed all the asterisks it thought it was interested in.
>> That works for me.
> Well yeah, if you wanna put it that way :)
> * *So this will work* just fine
> **While this will* still be a 2nd level bullet
Which, as far as I'm concerned, is the proper thing.
How does it handle this* situation?
Cheers, -- jra
* IE: the use of an asterisk as an actual footnote marker
It wouldn't notice. The extension is only concerned with pairs of delimiters which occur at word boundaries (one before and one after).
So we have the following which all work:
Simple _word_ highlighting *is fine*
Usenet syntax within <span>*some tags*</span> works
This sentence *has bold text _and embedded underlined_ text*
And none of these would work:
Trying to _<span>wrap</span>_ some tags.
A sentence spanning *two lines where the second delimiter* occurs on the second line.
A sentence _with only a leading delimiter ... or only_ a trailing delimiter
Mismatched *delimiters_ have no effect either.
Trying to make something *_both underline_* and bold at the same time (leaves asterisks, only makes text underlined)
Or trying to _*have it the other way around*_ (only bold, leaves underscores)
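The rules listed above can be approximated in a few lines. What follows is a rough Python sketch, not the extension's actual PHP: the regular expression, the `TAGS` table and the name `convert` are all assumptions chosen so that the examples in this message behave as described (pairs at word boundaries, on one line, with no tags in between).

```python
import re

# Hypothetical delimiter-to-tag table (the "typesetting" reading discussed in this thread).
TAGS = {"*": "b", "/": "i", "_": "u"}

ALNUM = r"[^\W_]"  # a word character, not counting the underscore itself

PAIR = re.compile(
    rf"(?<!{ALNUM})"        # opening delimiter is not glued to a preceding word
    rf"([*/_])"             # the delimiter itself
    rf"(?={ALNUM})"         # ...and is immediately followed by a word character
    rf"([^<>\n]*?{ALNUM})"  # content: same line, no tags, ends on a word character
    rf"\1"                  # the same delimiter closes the pair
    rf"(?!{ALNUM})"         # closing delimiter is not glued to a following word
)

def convert(text: str) -> str:
    """Replace delimiter pairs with HTML tags, recursing into the content."""
    def repl(match):
        tag = TAGS[match.group(1)]
        return f"<{tag}>{convert(match.group(2))}</{tag}>"
    return PAIR.sub(repl, text)

for sample in (
    "Simple _word_ highlighting *is fine*",
    "This sentence *has bold text _and embedded underlined_ text*",
    "Trying to _<span>wrap</span>_ some tags.",
    "How does it handle this* situation?",
):
    print(convert(sample))
```

Under these assumptions a lone footnote asterisk is left alone because it is attached to the preceding word, and `*_both_*` only converts the inner pair because the outer delimiter is not followed by a word character.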
Hope this helps.
-- Jim
On 2/22/07, Jim Wilson wilson.jim.r@gmail.com wrote:
> It wouldn't notice. The extension is only concerned with pairs of delimiters which occur at word boundaries (one before and one after).
Hmm, so it wouldn't work with languages which don't have explicit word boundaries, like Chinese, Japanese, and Thai, then )-:
Andrew Dunbar (hippietrail)
> Hmm so it wouldn't work with languages which don't have explicit word boundaries like Chinese, Japanese, and Thai then )-:
Nope, not at all. I should probably add this to the documentation.
I won't be much help implementing a solution though, since wo bu hui shuo Zhong-wen ("I can't speak Chinese").
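A small Python sketch of why a word-boundary rule trips over such scripts (the pattern and the name `bold` are assumptions for illustration, not the extension's code): ideographs count as word characters, so a delimiter wedged between two of them never sits at a boundary.

```python
import re

ALNUM = r"[^\W_]"  # any Unicode letter or digit, including CJK ideographs

# Hypothetical boundary-based rule for *bold* only.
BOLD = re.compile(rf"(?<!{ALNUM})\*(?={ALNUM})([^<>\n*]*?{ALNUM})\*(?!{ALNUM})")

def bold(text: str) -> str:
    return BOLD.sub(r"<b>\1</b>", text)

print(bold("this is *bold* text"))   # spaces supply the boundaries
print(bold("这是*粗体*文字"))          # no spaces, so nothing is converted
print(bold("这是 *粗体* 文字"))        # works only if the writer adds spaces
```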
On Wed, Feb 21, 2007 at 04:24:23PM -0600, Jim Wilson wrote:
> A sentence spanning *two lines where the second delimiter* occurs on the second line.
Or, to use the more common terminology in a MediaWiki environment:
A sentence spanning *two paragraphs where a hard CR separates them* and the second delimiter occurs not in the same paragraph.
Since that's uncommon in the CR's-break-paragraphs-not-lines world of MediaWiki, it doesn't seem bothersome, to me at least.
Cheers, -- jra
On 2/22/07, Jim Wilson wilson.jim.r@gmail.com wrote:
> Dang. When I asked everybody on the IRC channel, they told me it was *bold*, _underline_, and /italics/
That would be the simplest typesetting convention. But if it's semantics, not appearance, that you're after, then I believe the convention is _italics_ and *bold*.
Two questions:
* Why *single* punctuation rather than **double**?
* Since when does anyone use or want underline on MediaWiki? :) Didn't underline become evil when it became the web standard for hyperlinks?
Steve
On 2/22/07, Steve Bennett stevagewp@gmail.com wrote:
> That would be the simplest typesetting convention. But if it's semantics, not appearance, that you're after, then I believe the convention is _italics_ and *bold*.
Though the official "typesetting interpretation" is the default, you can get this behavior (and blow away underlining altogether) by adding this to your LocalSettings.php after the extension's "require" statement:
$wgUsenetSyntaxMappings = array( '*' => 'bold', '_' => 'italics' );
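For readers who want to see what such a mapping implies, here is a hedged Python sketch of a converter driven by the same kind of table. It is not the extension's PHP; `make_converter`, `STYLE_TAGS` and the 'underline' style name are assumptions, and only the 'bold' and 'italics' values appear in the setting above.

```python
import re

# Assumed style names, modelled on the values in the setting quoted above.
STYLE_TAGS = {"bold": "b", "italics": "i", "underline": "u"}

def make_converter(mappings):
    """Return a function that converts only the delimiters listed in `mappings`."""
    if not mappings:
        return lambda text: text  # nothing configured, nothing to do
    delims = "".join(re.escape(d) for d in mappings)
    alnum = r"[^\W_]"
    pair = re.compile(
        rf"(?<!{alnum})([{delims}])(?={alnum})([^<>\n]*?{alnum})\1(?!{alnum})"
    )

    def convert(text):
        def repl(match):
            tag = STYLE_TAGS[mappings[match.group(1)]]
            return f"<{tag}>{convert(match.group(2))}</{tag}>"
        return pair.sub(repl, text)

    return convert

# The "semantic" reading: underscores mean italics, slashes are left alone.
semantic = make_converter({"*": "bold", "_": "italics"})
print(semantic("_emphasis_ and *strong* but /untouched/"))
```

Leaving a delimiter out of the table is what disables it, which matches the "blow away underlining altogether" remark.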
> Two questions:
> * Why *single* punctuation rather than **double**?
Fixed. Now supports both *single* and **double** syntax for asterisks. No change has been made for slashes or underscores.
You'll need to download the latest source from here to see the change: http://jimbojw.com/wiki/index.php?title=UsenetSyntax
Note: I have not incremented the extension's version number yet since I still consider this "development time".
> * Since when does anyone use or want underline on MediaWiki? :) Didn't underline become evil when it became the web standard for hyperlinks?
Meh. You can disable this (as per above).
-- Jim
On 2/23/07, Jim Wilson wilson.jim.r@gmail.com wrote:
> Fixed. Now supports both *single* and **double** syntax for asterisks. No change has been made for slashes or underscores.
Heh, I'd assumed you would add // and __, too. But then, I was coming at this from a kind of pure wiki syntax point of view, where as much syntax as possible is just doubled punctuation: ** // __ {{ [[ == . Though I now see that __ would conflict with some magic words like __NOTOC__.
But as long as we're not actually trying to achieve WikiBabel (or whatever it's called), there's no compelling reason to do so.
Steve
You should probably look at reST, a popular wiki-like text format that is based on the same formatting conventions you're trying to emulate.
http://docutils.sourceforge.net/rst.html
~Evan
________________________________________________________________________ Evan Prodromou evan@prodromou.name http://evan.prodromou.name/
On Fri, Feb 23, 2007 at 02:24:09PM +1100, Steve Bennett wrote:
> Two questions:
> * Why *single* punctuation rather than **double**?
Because the goal is to leverage what people *actually do*. I almost *never* see doubled punctuation.
> * Since when does anyone use or want underline on MediaWiki? :) Didn't underline become evil when it became the web standard for hyperlinks?
Pretty much. :-)
Cheers, -- jra
Steve Bennett wrote:
> Heh, I'd assumed you would add // and __, too. But then, I was coming at this from a kind of pure wiki syntax point of view, where as much syntax as possible is just doubled punctuation: ** // __ {{ [[ == . Though I now see that __ would conflict with some magic words like __NOTOC__.
Actually the underscores wouldn't conflict since the hook happens so late in the process. By the time the hook is run, those have already been removed.
And MW is smart enough to, for example __leave this alone__ since "leave this alone" is not a recognized magic word.
Jay R. Ashworth wrote:
> Because the goal is to leverage what people *actually do*. I almost *never* see doubled punctuation.
That's a good point too though - I'm beginning to think I should probably take the doubles back out.
Evan Prodromou wrote:
> You should probably look at reST, a popular wiki-like text format that is based on the same formatting conventions you're trying to emulate.
Yeah - that's interesting. reST is one of many _many_ light markup languages out there - like Markdown, Textile or APT[1].
Maybe what I should do is fix the current problems with UsenetSyntax (like clobbering through tags and affecting preformatted text blocks) by breaking it up into two extensions:
* One extension that doesn't do anything to the text itself, but adds a hook that other extensions leverage to safely parse for their own syntax
* A demo implementation of such a leveraging extension which just so happens to implement Usenet style syntax.
The upside of the previous plan is that I'm less likely to get bogged down in "wouldn't it be cool if's" because extension devs can make their own.
The downside of the plan is that people might still send me "wouldn't it be cool if's", and I'd have gone through the work of abstracting the layers for no benefit.
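The two-layer split described above might look roughly like this. It is a Python sketch under stated assumptions, not MediaWiki code: `apply_outside_tags`, `usenet_bold`, the crude tag matcher and the choice of `pre` as the protected element are all hypothetical. The first function plays the part of the "safe" hook, the second is a demo handler plugged into it.

```python
import re

TAG = re.compile(r"(<[^>]*>)")  # crude tag matcher, good enough for a sketch
TAG_NAME = re.compile(r"</?\s*([A-Za-z][A-Za-z0-9]*)")

def apply_outside_tags(html, handlers, protected=("pre",)):
    """Run each handler on text segments only, skipping protected elements."""
    out = []
    depth = 0  # greater than zero while inside a protected element
    # re.split with a capturing group alternates: text, tag, text, tag, ...
    for i, part in enumerate(TAG.split(html)):
        if i % 2 == 1:  # a tag: track protected elements, never modify it
            m = TAG_NAME.match(part)
            if m and m.group(1).lower() in protected:
                if part.startswith("</"):
                    depth = max(0, depth - 1)
                elif not part.endswith("/>"):
                    depth += 1
        elif depth == 0:  # plain text outside protected elements
            for handler in handlers:
                part = handler(part)
        out.append(part)
    return "".join(out)

def usenet_bold(text):
    """Demo handler (hypothetical): *word or phrase* becomes bold."""
    return re.sub(r"\*(\w(?:[^*\n]*\w)?)\*", r"<b>\1</b>", text)

print(apply_outside_tags("<p>*bold* here</p><pre>*not* here</pre>", [usenet_bold]))
```

Because each handler only ever sees one run of text between tags, a pair that straddles a tag is simply left unconverted instead of producing mis-nested markup. A handler's own output can contain tags, so later handlers in the list would see them; a real implementation would need to re-split between handlers.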
-- Jim
[1] http://en.wikipedia.org/wiki/List_of_lightweight_markup_languages
On Fri, Feb 23, 2007 at 10:17:28AM -0600, Jim Wilson wrote:
> Steve Bennett wrote:
>> Heh, I'd assumed you would add // and __, too. But then, I was coming at this from a kind of pure wiki syntax point of view, where as much syntax as possible is just doubled punctuation: ** // __ {{ [[ == . Though I now see that __ would conflict with some magic words like __NOTOC__.
> Actually the underscores wouldn't conflict since the hook happens so late in the process. By the time the hook is run, those have already been removed.
> And MW is smart enough to, for example __leave this alone__ since "leave this alone" is not a recognized magic word.
Thanks for the addendum; that was my next question. Of course, overlapping markup like that leaves you open to the possibility someone will expand the magic word list in a later release -- it's *still* not the best idea...
> Jay R. Ashworth wrote:
>> Because the goal is to leverage what people *actually do*. I almost *never* see doubled punctuation.
> That's a good point too though - I'm beginning to think I should probably take the doubles back out.
I would say so, myself.
> Evan Prodromou wrote:
>> You should probably look at reST, a popular wiki-like text format that is based on the same formatting conventions you're trying to emulate.
> Yeah - that's interesting. reST is one of many _many_ light markup languages out there - like Markdown, Textile or APT[1].
Markdown. Heh. :-)
> Maybe what I should do is fix the current problems with UsenetSyntax (like clobbering through tags and affecting preformatted text blocks) by breaking it up into two extensions:
> * One extension that doesn't do anything to the text itself, but adds a hook that other extensions leverage to safely parse for their own syntax
> * A demo implementation of such a leveraging extension which just so happens to implement Usenet style syntax.
Hee. I love good factoring.
> The upside of the previous plan is that I'm less likely to get bogged down in "wouldn't it be cool if's" because extension devs can make their own.
Good point.
> The downside of the plan is that people might still send me "wouldn't it be cool if's", and I'd have gone through the work of abstracting the layers for no benefit.
Except that you can then say "see how easy it is to..."
Or point them to Asking Good Questions. :-)
Cheers, -- jra