Alan Post wrote:
- Interesting. Is the PEG grammar available for this parser?
-Alan
It's at https://github.com/AboutUs/kiwi/blob/master/src/syntax.leg
Get peg/leg from http://piumarta.com/software/peg/
I just tried it and already found a bug with the first Hello World (it wraps headers inside paragraphs). It strangely converts templates into underscored words; they may be expecting some other parser piece to restore them. I'm pretty sure there are corner cases in the preprocessor (e.g. just looking at the peg file, they don't handle mixed-case noincludes), but I don't think that should need to be handled by the parser itself.
The grammar looks elegant. I doubt it can really handle full wikitext. But it would be so nice if it did...
I'm one of the authors of the Kiwi parser and will be presenting it at the Data Summit on Friday. The parser is pretty complete but certainly we could use some community support and we encourage feedback and participation! It is a highly functional tool already but it can use some polish. It does actually handle most wikitext, though not absolutely everything.
From your post I can see that you are experiencing a couple of design decisions we made in writing this parser. We did not set out to match the exact HTML output of MediaWiki, only to output something that will look the same in the browser. This might not be the best approach, but right now this is the case. Our site doesn't have the same needs as Wikipedia, so when in doubt we leaned toward what suited our needs and not necessarily ultimate tolerance of poor syntax (though it is somewhat flexible). Another design decision is that everything that you put in comes out wrapped in paragraph tags. Usually this wraps the whole document, so if your whole document was just a heading, then yes, it is wrapped in paragraph tags. This is probably not the best way to handle this, but it's what it currently does. Feel free to contribute a different solution.
Templates, as you probably know, require full integration with an application to work in the way that MediaWiki handles them, because they require access to the data store, and possibly other configuration information. We built a parser that works independently of the data store (indeed, even on the command line in a somewhat degenerate form). In order to do that, we had to decouple template retrieval from the parse. If you take a look at the Ruby FFI examples, you will see a more elegant handling of templates (though it needs work). When a document is parsed, the parser library makes available a list of the templates that were found, the arguments passed to each template, and the unique replacement tag in the document for inserting the template once rendered. Those underscored tags that come out are not a bug; they are those unique tags. There is a switch to disable templates, and in that case it just swallows them instead. So the template handling workflow (simplistically) is:
1. Parse the original document and generate the list of templates, arguments, and replacement tags.
2. Fetch the first template; if no recursion is needed, insert it into the original document.
3. Fetch the next template, and so on.
We currently recurse 6 templates deep in the bindings we built for AboutUs.org (sysop-only at the moment). Template arguments don't work right now, but it's fairly trivial to do it. We just haven't done it yet.
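To make that concrete, here is a rough sketch of what an application-side expansion loop might look like, in Ruby. This is illustrative only; Kiwi.parse, fetch_template, and the accessors are made-up names, not our actual bindings:

    # Hypothetical application-side template expansion for a decoupled parser.
    # Kiwi.parse and fetch_template stand in for whatever your bindings and
    # data store actually provide.
    MAX_DEPTH = 6  # the recursion limit we use in our AboutUs.org bindings

    def render(wikitext, depth = 0)
      result = Kiwi.parse(wikitext)        # parse; collects template calls
      html = result.html
      result.templates.each do |t|
        # t.name / t.args / t.tag: the template name, its arguments, and the
        # unique replacement tag the parser left in the output
        body = fetch_template(t.name, t.args)
        body = depth < MAX_DEPTH ? render(body, depth + 1) : ""  # swallow past the limit
        html = html.gsub(t.tag, body)      # splice the rendered template in
      end
      html
    end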
Like templates, images require some different solutions if the parser is to be decoupled. Our parser does not re-size images, store them, etc. It just works with image URLs. If your application requires images to be regularized, you would need to implement resizing them at upload, or lazily at load time, or whatever works in your scenario. More work is needed in this area, though if you check out http://kiwi.drasticcode.com you can see that most image support is working (no resizing). You can also experiment with the parser there as needed.
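For instance, an application might regularize images at upload time so that rendering only ever deals with predictable URLs. A rough sketch, where resize_to_width and store are placeholders for whatever your stack provides (not anything we ship):

    # Hypothetical upload hook: pre-generate the sizes you need and store them
    # under predictable URLs, so the parser keeps working with plain URLs.
    THUMB_WIDTHS = [120, 240, 480]

    def handle_upload(file, name)
      THUMB_WIDTHS.each do |w|
        store(resize_to_width(file, w), "#{name}-#{w}px")  # e.g. via ImageMagick
      end
      store(file, name)  # the original, also at a predictable URL
    end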
Hope that at least helps explain what we've done. Again, feedback and particularly code contributions are appreciated!
Cheers, Karl
Please see below for the complete original message. If the archive is to be believed, Mailman truncated the original to only the first paragraph.
Cheers, Karl
---------- Forwarded message ----------
From: Karl Matthias karl@matthias.org
Date: Tue, Feb 1, 2011 at 4:42 PM
Subject: Re: [Wikitext-l] New parser: Kiwi
To: wikitext-l@lists.wikimedia.org

[...snip: the complete original message, reproduced in full above...]
Apologies... even the second attempt was truncated it seems. Here's one final try
Karl

[...snip: the original message again, in full...]
On 2011-02-02 01:48, Karl Matthias wrote:
Apologies... even the second attempt was truncated it seems. Here's one final try
You have been hit by the same problem I was hit by a few days ago on this list. You have a line that starts with "From your" in the text.
/Andreas
Andreas Jonsson wrote:

You have been hit by the same problem I was hit by a few days ago on this list. You have a line that starts with "From your" in the text.

/Andreas
Ah, it's a Mailman bug; I thought it had been your MUA, Andreas. lists.wikimedia.org shows "Pipermail 0.09", which is the same version reported by the latest Mailman (2.1.9), so it's probably something to report upstream. http://lists.wikimedia.org/pipermail/wikitext-l/2011-February/thread.html

I'll reply to you later, Karl.
Ah, and I see that people did receive the original: it's just the archive that is broken. Thanks for that.
Cheers, Karl
FYI: *we* are seeing your entire message, on-list -- j
Karl Matthias wrote:
I'm one of the authors of the Kiwi parser [...snip...] Our site doesn't have the same needs as Wikipedia so when in doubt we leaned toward what suited our needs and not necessarily ultimate tolerance of poor syntax (though it is somewhat flexible).
I felt bad for pointing out issues just after a first try. I understand that you have much less content than Wikipedia, and can use just a subset of the markup without worrying about corner cases. I approach it as a tool which could work for the bigger parser, though. Currently, it looks like just another wiki syntax, similar to the MediaWiki one.
Another design decision is that everything that you put in comes out wrapped in paragraph tags. Usually this wraps the whole document, so if your whole document was just a heading, then yes it is wrapped in paragraph tags. This is probably not the best way to handle this but it's what it currently does. Feel free to contribute a different solution.
It doesn't seem to be legal HTML*, so I wouldn't justify it just as a "design decision". The same could be argued for nested <p> tags.

* opening the <hX> seems to implicitly close the previous <p>, leading to an unmatched </p>.
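To illustrate (paraphrasing, not the exact parser output): a document consisting of just a heading comes out roughly as

    <p><h2>Hello World</h2></p>

and a browser will typically recover that as

    <p></p><h2>Hello World</h2><p></p>

with the stray </p> turned into an empty paragraph.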
Templates, as you probably know, require full integration with an application to work in the way that MediaWiki handles them, because they require access to the data store, and possibly other configuration information. [...snip...] Those underscored tags that come out are not a bug; they are those unique tags.
I supposed that it was something like that, but it was odd that it did such a conversion instead of leaving them as literals in that case. I used just the parser binary. I have been looking at the Ruby code, and despite the foreign language, I'm understanding a bit more of how it works.
Like templates, images require some different solutions if the parser is to be decoupled. Our parser does not re-size images, store them, etc. It just works with image URLs. [...snip...]
A parser shouldn't really need to handle images. At most it would provide a callback so that the app could do something with the image URLs.
More work is needed in this area, though if you check out http://kiwi.drasticcode.com you can see that most image support is working (no resizing). You can also experiment with the parser there as needed.
The URL mapping used there makes some titles impossible to use, such as making an entry for [[Edit]] - http://en.wikipedia.org/wiki/Edit
Hope that at least helps explain what we've done. Again, feedback and particularly code contributions are appreciated!
Cheers, Karl
Just code lurking for now :)
On Wed, Feb 2, 2011 at 3:08 PM, Platonides platonides@gmail.com wrote:
I approach it as a tool which could work for the bigger parser, though. Currently, it looks like just another wiki syntax, similar to the MediaWiki one.
I think it is a tool that shows promise in that regard as well. With regard to "just another syntax": we can probably support all, or at least most, of the important edge cases using this methodology. It will make it much uglier, but it probably can work. The question is to what lengths you go to support poorly formed markup. That answer will probably be different based on the accumulated content at various sites. Our parser isn't too tolerant right now of poorly formed markup. On our site that's OK. If people want to help us make it more tolerant, we'd be interested in seeing how that turns out. I suspect it could at least double the size of the grammar, based on what Ward tells me Dirk Riehle's group found with WikiCreole. But a community effort could probably make it doable.
It doesn't seem to be legal HTML*, so I wouldn't justify it just as a "design decision". The same could be argued for nested <p> tags.
It's not 100% legal right now and the most egregious spot is the paragraph tags. It can be modified but doing it this way got it off the ground faster. Hence it was a design decision. But we probably will modify it to behave better in that regard. If someone wants to contribute the changes to do it, that will make it happen much faster as it's low on the list right now. Fork it on GitHub and go for it! Make the changes and submit a pull request and we'll review it. Note that MediaWiki doesn't generate 100% valid markup (but it's cleaner than ours right now!).
* opening the <hX> seems to implicitly close the previous <p>, leading to an unmatched </p>.
I hadn't noticed this, I'll check that out. Thanks!
Templates [...snip...] I supposed that it was something like that, but it was odd that it did such a conversion instead of leaving them as literals in that case. I used just the parser binary. I have been looking at the Ruby code, and despite the foreign language, I'm understanding a bit more of how it works.
The replacement with the hashed tag is done so that we can use a simple context-unaware string replacement on the output. If we left them in the original form, we would have to know the difference between a template call inside noinclude tags and one that isn't, at render time when we have no state on the document. Given that the help info for many templates shows exact calls to the template placed within noinclude tags, this would be a common bug. It's not the only possible solution, but it's a simple one.
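Schematically, in Ruby (the placeholder format here is invented for illustration, not our actual tag format):

    # After the parse, help text that sat inside noinclude tags survives into
    # the output as plain text, while the real call became a unique tag:
    html = "<p>_tpl_9f3a2c_</p><p>Type {{Infobox}} to use this template.</p>"
    rendered = "<table>...</table>"  # the template, fetched and rendered elsewhere

    html.gsub("_tpl_9f3a2c_", rendered)  # safe: only the real call is replaced
    # whereas html.gsub("{{Infobox}}", rendered) would also clobber the example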
Like templates, images require some different solutions if the parser is to be decoupled. [...snip...]
A parser shouldn't really need to handle images. At most it would provide a callback so that the app could do something with the image URLs.
We don't do callbacks on purpose so that we can separate the parser completely from the calling code. Our design would put the information in a place where a calling application can get to it (e.g. the list of Templates). But consider that MediaWiki actually does handle images and adds markup for height and width, etc. It makes database calls to determine "bad images", etc. This is something a separate parser can't do in the same way. A mechanism needs to be put in place to allow the calling application to do this work if it so chooses. It's fairly straightforward to do it.
More work is needed in this area, though if you check out http://kiwi.drasticcode.com you can see that most image support is working (no resizing). You can also experiment with the parser there as needed.
The URL mapping used there makes some titles impossible to use, such as making an entry for [[Edit]] - http://en.wikipedia.org/wiki/Edit
You are right about that. I'm sure Sam would be happy to accept contributions to change that. The site does support double-click to edit, though, so making links to Edit is kind of unnecessary.
Just code lurking for now :)
No worries, the feedback is appreciated.
Cheers, Karl
Karl Matthias wrote:
The URL mapping used there makes some titles impossible to use, such as making an entry for [[Edit]] - http://en.wikipedia.org/wiki/Edit
You are right about that. I'm sure Sam would be happy to accept contributions to change that. The site does support double-click to edit, though, so making links to Edit is kind of unnecessary.
It's not just edit, but all actions, such as upload. The real solution is to have the wiki items inside a "folder" and the actions outside. You could prefix actions, like MediaWiki does (e.g. Action:Edit, forbidding pages starting with Action:), but you would still have the classic problems for root-folder items such as favicon.ico. See http://www.mediawiki.org/wiki/Manual:Wiki_in_site_root_directory#Reasons_why...
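A minimal sketch of that separation, using Sinatra-style routes purely for illustration:

    require 'sinatra'

    # Actions live under their own prefix, outside the page "folder"...
    get '/action/edit/:title' do
      "edit form for #{params[:title]}"
    end

    # ...while every page title, including "Edit", lives safely under /wiki/
    get '/wiki/:title' do
      "page body for #{params[:title]}"
    end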
Thanks, I'll forward your feedback on to Sam who wrote wikiwiki.
Cheers, Karl