I already got it in your reply here:
Maybe you are underestimating the vast differences in implementation between the current not-really-a-parser and what I am working on.
There is nothing wrong with using a group of templates together, but there *is* something majorly wrong with patching together one object (a table, in this case) using pieces from different places. It works with the current not-really-a-parser because it takes the wiki source texts from the templates, sticks them together somehow, and then converts them to HTML. This kind of practice is exactly what leads to all the problems with our current not-really-a-parser. A proper parser should parse each template individually, and then use its parse tree in the processing of the page that uses it.
It's great that you're working on a different way to do it that's not just dumb text includes.
On Mon, 30 Aug 2004 16:55:01 +0100, Timwi timwi@gmx.net wrote:
Ævar Arnfjörð Bjarmason wrote:
Why would it ever break? I can see it getting slow because it cannot be optimized, but not breaking. All it's doing is including one thing after the other.
{{a}} gets Template:A, which contains "foo", and {{b}} gets Template:B, which contains "bar", hence
{{a}}{{b}} = foobar
Of course, this simple example would still work. But picture this:
Template:A contains: I ''li
Template:B contains: ke'' hamburgers
Currently, {{a}}{{b}} would yield "I <em>like</em> hamburgers", but only because it sticks the pieces together and then tries to make sense of the result.
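To make the stick-together-then-convert behaviour concrete, here is a minimal Python sketch. It is not the actual MediaWiki code; TEMPLATES, expand_templates and convert_italics are hypothetical names chosen for illustration, and the italics rule is reduced to a single regular expression.

```python
import re

# Hypothetical template store, using the two templates from the example.
TEMPLATES = {"a": "I ''li", "b": "ke'' hamburgers"}

def expand_templates(text):
    """Replace each {{name}} with the raw wiki text of that template."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: TEMPLATES[m.group(1).lower()], text)

def convert_italics(text):
    """Naive conversion: turn each ''...'' pair into <em>...</em>."""
    return re.sub(r"''(.*?)''", r"<em>\1</em>", text)

# Text is glued together first, and only then converted.
combined = convert_italics(expand_templates("{{a}}{{b}}"))
print(combined)  # I <em>like</em> hamburgers

# Converted on its own, Template:A has no closing '' and stays unchanged.
alone = convert_italics(TEMPLATES["a"])
print(alone)  # I ''li
```

The italics only exist after the two fragments have been joined; neither template means anything sensible by itself.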
Why is this bad? Picture this:
Template:A contains: {| | nowrap
Template:B contains: | Text |}
Is the "nowrap" a table cell attribute or text in a separate cell? Does this change depending on whether there is a newline after "nowrap"? ... And this is just a simple example.
Why would this break in whatever parser you plan to implement?
Because a parser is not a converter. The current not-really-a-parser is actually a converter: It looks out for particular syntax elements like ''these'' and turns them into <em>HTML tags</em>. This is bad because it means that several of these conversions can interfere with each other:
I ''like [[hamburger|hamburgers'']]
produces invalid HTML. It gets even worse when it tries to locate {{template inclusions}} and replaces them with some other text, not knowing what it is or how it fits into the document structure.
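As a rough illustration of how independent conversions interfere, the following Python sketch applies two unrelated regex passes to that line. The function names and the exact regular expressions are assumptions made for this example; the real code differs, but the failure mode is the same in kind.

```python
import re

def convert_italics(text):
    """Turn each ''...'' pair into <em>...</em>, ignoring everything else."""
    return re.sub(r"''(.*?)''", r"<em>\1</em>", text)

def convert_links(text):
    """Turn [[target|label]] into an anchor tag, ignoring everything else."""
    return re.sub(r"\[\[([^\]|]+)\|(.*?)\]\]", r'<a href="\1">\2</a>', text)

source = "I ''like [[hamburger|hamburgers'']]"
result = convert_links(convert_italics(source))
print(result)
# I <em>like <a href="hamburger">hamburgers</em></a>
```

The <em> is closed inside the <a>, so the tags are misnested. Neither pass is wrong in isolation; the problem is that neither knows about the other.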
A real parser analyses the document's structure. It turns the wiki text into a data structure in memory that actually bears resemblance to the structure of the document. It creates a "heading" element where there is a heading, instead of turning some strategically-placed equals signs into <h#> tags.
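A minimal Python sketch of that idea, assuming a toy grammar with only headings and paragraphs: the Heading and Paragraph classes and the parse and render functions are hypothetical names, not taken from any existing parser.

```python
import html
import re
from dataclasses import dataclass

@dataclass
class Heading:
    level: int
    text: str

@dataclass
class Paragraph:
    text: str

def parse(source):
    """Build a list of document nodes; no HTML is produced here."""
    nodes = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        # A heading is a line wrapped in the same number of equals signs.
        m = re.fullmatch(r"(={1,6})\s*(.+?)\s*\1", line)
        if m:
            nodes.append(Heading(len(m.group(1)), m.group(2)))
        else:
            nodes.append(Paragraph(line))
    return nodes

def render(nodes):
    """Rendering is a separate step that walks the finished tree."""
    out = []
    for node in nodes:
        if isinstance(node, Heading):
            out.append(f"<h{node.level}>{html.escape(node.text)}</h{node.level}>")
        else:
            out.append(f"<p>{html.escape(node.text)}</p>")
    return "\n".join(out)

tree = parse("== History ==\nSome text.")
print(tree)
print(render(tree))
```

Because the tree exists before any HTML does, an included template could be parsed by itself and its nodes attached to the including page's tree.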
The only reason I can see why that would happen is if you were to implement some auto-completion of the table syntax, sort of like tidy (HTML) for wiki syntax, and do it before things get fetched from Template: rather than after everything has been included.
Your terminology "auto-completion" reveals that you are thinking in terms of conversion. Don't think of it as auto-completion; for example, if a '' has no matching '', I can tell the parser what to do independently of what it does when there *is* a matching ''. There are several possibilities: make an italics element (what you would probably call auto-completion); make a text element (i.e. pretend the "''" was actually text); or bail out saying "syntax error". Of course, we don't want the latter. My parser currently does the second: It turns the '' into text. I did that because this is also how the current not-really-a-parser functions. However, I can easily change that.
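The three possibilities can be sketched as a policy switch in a small inline parser. This is an illustrative Python sketch only, not my actual parser; the node classes, the parse_inline function and the on_unmatched parameter are names invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Text:
    value: str

@dataclass
class Italic:
    value: str

def parse_inline(source, on_unmatched="text"):
    """Parse ''italics''; on_unmatched is 'italic', 'text' or 'error'."""
    nodes = []
    pos = 0
    while pos < len(source):
        start = source.find("''", pos)
        if start == -1:
            nodes.append(Text(source[pos:]))
            break
        if start > pos:
            nodes.append(Text(source[pos:start]))
        end = source.find("''", start + 2)
        if end != -1:
            # Matched pair: handled the same way under every policy.
            nodes.append(Italic(source[start + 2:end]))
            pos = end + 2
            continue
        # Unmatched '': the policy alone decides what happens.
        if on_unmatched == "italic":
            nodes.append(Italic(source[start + 2:]))
        elif on_unmatched == "text":
            nodes.append(Text(source[start:]))
        else:
            raise SyntaxError("unmatched ''")
        break
    return nodes

print(parse_inline("I ''li"))
# [Text(value='I '), Text(value="''li")]
print(parse_inline("I ''li", on_unmatched="italic"))
# [Text(value='I '), Italic(value='li')]
```

Switching between the behaviours is a one-argument change, and the matched case is unaffected by it.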
In our specific case, there would be a document (a template) that has a {| with no matching |}. What should it do? Unfortunately, none of the three options make it work the way you have come to expect from the current not-really-a-parser.
Timwi
WikiEN-l mailing list WikiEN-l@Wikipedia.org http://mail.wikipedia.org/mailman/listinfo/wikien-l