http://lists.wikimedia.org/mailman/listinfo/wikitext-l
Wikitext-l was formed from a recent discussion on wikitech-l about the need to sanely reimplement the current parser, which is a Horrible Mess and pretty much impossible to reimplement in another language.
The MediaWiki parser definition is literally "whatever the PHP parser does." Some of what it does is arguably very wrong, pathological, magical or just a Stupid Parser Trick. So the list has been formed to come up with a grammar that defines all the useful parts of the present parser, and so can be used by anyone to implement a MediaWiki wikitext parser. This will be useful for other software, for WYSIWYG editing extensions ... all manner of things.
Some of what some people would think of as a "stupid parser trick" is in fact important - e.g. L'''uomo'', which renders as L'<i>uomo</i> (necessary for French and Italian).
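To make the example concrete, here is a minimal sketch in Python of one way such a rule could be written down. It is not the PHP parser's actual algorithm: the function name `render_quotes` and the choice of which ''' run to rebalance (simply the first one) are assumptions made for illustration. The idea shown is that when a line has an odd number of '' runs and an odd number of ''' runs, one ''' is read as a literal apostrophe followed by ''.

```python
import re


def render_quotes(line: str) -> str:
    """Render '' (italic) and ''' (bold) runs in one line of wikitext.

    Simplified sketch: only runs of exactly 2 or 3 apostrophes are
    treated as markup; longer runs are left as literal text.
    """
    # Split into text and apostrophe runs; odd indices hold the runs.
    parts = re.split(r"('{2,})", line)
    runs = range(1, len(parts), 2)
    italics = sum(1 for i in runs if len(parts[i]) == 2)
    bolds = sum(1 for i in runs if len(parts[i]) == 3)

    # Unbalanced case such as L'''uomo'': read one ''' as ' followed by ''.
    # Assumption: the first ''' run is the one to rebalance.
    if italics % 2 == 1 and bolds % 2 == 1:
        for i in runs:
            if len(parts[i]) == 3:
                parts[i - 1] += "'"  # keep one apostrophe as literal text
                parts[i] = "''"      # the remaining two toggle italics
                break

    out = []
    in_italic = in_bold = False
    for i, part in enumerate(parts):
        if i % 2 == 0:
            out.append(part)  # plain text
        elif len(part) == 2:
            out.append("</i>" if in_italic else "<i>")
            in_italic = not in_italic
        elif len(part) == 3:
            out.append("</b>" if in_bold else "<b>")
            in_bold = not in_bold
        else:
            out.append(part)  # runs of 4 or more: left untouched here
    return "".join(out)


print(render_quotes("L'''uomo''"))
```

Under these assumptions the Italian example keeps its apostrophe and italicises only the noun, while balanced input such as '''bold''' and ''italic'' is unaffected by the special case.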
So: we need to know what MediaWiki quirks are supporting important constructs in languages other than English (which is the language the list is in, and is the native language of most of the participants), and particularly in non-European languages.
This list is unlikely to implement new features, e.g. (an example brought up by GerardM) the double-apostrophe in Neapolitan. But we really need to know about present important features that wouldn't be obvious to an English-speaker going through the present parser code.
- d.
Err, WikiCreole (http://wikicreole.org/)? It's already been done. But according to Chunk and Christopher, the MediaWiki developers were not willing to adapt MediaWiki to WikiCreole....
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: http://lists.wikimedia.org/mailman/listinfo/foundation-l
Oops, I meant Christoph, not Christopher. The people from SC1 at Wikimania (http://wikimania2007.wikimedia.org/wiki/Proceedings:SC1), for those who attended.
-- Jon Harald Søby http://meta.wikimedia.org/wiki/User:Jon_Harald_S%C3%B8by
On 17/11/2007, Jon Harald Søby jhsoby@gmail.com wrote:
Err, WikiCreole (http://wikicreole.org/)? It's already been done. But according to Chunk and Christopher, the MediaWiki developers were not willing to adapt MediaWiki to WikiCreole....
No, that's a different wikitext grammar altogether. This is about implementing a proper grammar that does MediaWiki wikitext. Something we could swap in and the users would almost not notice. Hence the request for odd-but-important bits that would look like quirks to those who don't know the language that needs them.
- d.
And WikiCreole is not easier to convert into a grammar than mediawiki's.
On 17/11/2007, Platonides Platonides@gmail.com wrote:
And WikiCreole is not easier to convert into a grammar than mediawiki's.
Yeah. This is about MediaWiki text, not WikiCreole.
Although if we can modularise the wikitext processing sufficiently (and making it independent of the code will help that), then attaching new grammar-processing modules will be much more feasible. But this is getting off-topic for this list :-)
Anyway. What features of the present parser does your language rely on that English doesn't? Please let us know!
- d.
Hoi, When you say that the needs of the Neapolitan language will not be honoured, what is the point of asking? Thanks, GerardM
On 17/11/2007, GerardM gerard.meijssen@gmail.com wrote:
When you say that the needs of the Neapolitan language will not be honoured, what is the point of asking?
I'm saying that if they can be (and I agree it's an important issue, which is why I mentioned it), then this is not the place where it will happen - it'll need something new in wikitext, for the old or the new parser. That's a separate issue from reimplementing the present parser with as many of its quirks and flaws as are needed.
If someone can specify precisely what would allow a naturally-typed '' in Neapolitan to just work (without the parser flagging it as toggling italics) in a way that could go into the old parser, without breaking anything else, I expect good code would go in in the usual manner. And the specification of how to make '' work properly without breaking anything else would be a definite asset to the new parser grammar. So (I presume, not being a dev) what's needed now to solve the '' problem is good new code.
- d.
Actually, I may be wrong - wikitext-l's full description is "MediaWiki's parser and syntax", so it might in fact be a good place to raise important missing features like a '' that Just Works.
Again, though, I expect the first step for a new feature like this is to specify how you want the parser to recognise it without triggering "toggle italics." I recall the people in the discussion on wikitech-l being very annoyed at how complicated the apostrophe-handling code is ...
So: describe precisely what's needed, file a bug (and ideally a code patch) and mention it on wikitext-l :-)
And are there any other missing features that would greatly benefit some languages? Same goes for them.
- d.
Do you know of any analysis of wiki grammars concerning WikiCreole and/or MediaWiki text?
On Nov 17, 2007 4:17 PM, Platonides Platonides@gmail.com wrote:
And WikiCreole is not easier to convert into a grammar than mediawiki's.
Hello,
David Gerard wrote:
Some of what some people would think of as a "stupid parser trick" is in fact important - e.g. L'''uomo'', which renders as L'<i>uomo</i> (necessary for French and Italian).
Actually, the proper French apostrophe should be ’ (Unicode: U+2019, HTML code: &#8217;), not '. On the French Wikisource, we systematically replace ' with ’ in all articles and titles using bots (keeping redirects). So ''' should actually be ’'' in proper French typography.
The issue is that ’ is not on the standard French keyboard, and it does not exist in Latin-1 (the same is true of œ for oe). There is also the problem of broken software, such as copy-pasting through an editor that is not Unicode-compliant. That's why it is so rarely used.
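As an illustration of the kind of replacement described above, here is a minimal Python sketch. It is not the actual Wikisource bot code: the function name `curl_apostrophes` and the matching rule are assumptions. It converts only a lone ' that sits between two word characters, so the '' and ''' runs used for italics and bold are left alone.

```python
import re

# A lone apostrophe between two word characters, e.g. the one in "l'homme".
# The lookbehind and lookahead reject any apostrophe that has another
# apostrophe next to it, so '' and ''' markup is never matched.
_LONE_APOSTROPHE = re.compile(r"(?<=\w)'(?=\w)")


def curl_apostrophes(text: str) -> str:
    """Replace typewriter apostrophes inside words with U+2019."""
    return _LONE_APOSTROPHE.sub("\u2019", text)


print(curl_apostrophes("''l'homme''"))
```

Because this sketch skips every apostrophe adjacent to another one, it leaves L'''uomo'' unchanged. Turning that into ’'' would need the same disambiguation the parser itself has to do.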
Regards,
Yann
On 18/11/2007, Claudio Mastroianni gattonero@gmail.com wrote:
Some of what some people would think of as a "stupid parser trick" is in fact important - e.g. L'''uomo'', which renders as L'<i>uomo</i> (necessary for French and Italian).
That's a "false problem". ' is different than "...
No-one is talking about double-quotes, we're talking about two single-quotes in a row, which is the syntax for italics in wikitext. It *is* a real problem.