While browsing UseMod, as I almost never do, I came across the following interesting comment on http://www.usemod.com/cgi-bin/mb.pl?WikiCreole:
"Several wiki engines have agreed to jump on board (including MediaWiki and the Ward's original WikiWikiWeb)."
We have, have we? I was under the impression that we couldn't even standardise our own markup, let alone support something else; I also got the impression from the recent discussions in response to the "announcement" that several developers did not see a significant benefit in supporting it.
The question, then, is where this impression came from, and whether it is correct; I'm directing this one at the release manager (Brion), and his senior assistant (Tim). If the impression is false, then who on Earth expressed this support?
Rob Church
On 29/07/07, Rob Church robchur@gmail.com wrote:
While browsing UseMod, as I almost never do, I came across the following interesting comment on http://www.usemod.com/cgi-bin/mb.pl?WikiCreole:
"Several wiki engines have agreed to jump on board (including MediaWiki and the Ward's original WikiWikiWeb)."
I seem to remember seeing something similar on WikiCreole's website. I think Brion was in discussions with them at some point (can't remember if it was him or them that said that), so I guess he said something vaguely supportive and it's been exaggerated.
On Sun, Jul 29, 2007 at 11:50:05PM +0100, Thomas Dalton wrote:
On 29/07/07, Rob Church robchur@gmail.com wrote:
While browsing UseMod, as I almost never do, I came across the following interesting comment on http://www.usemod.com/cgi-bin/mb.pl?WikiCreole:
"Several wiki engines have agreed to jump on board (including MediaWiki and the Ward's original WikiWikiWeb)."
I seem to remember seeing something similar on WikiCreole's website. I think Brion was in discussions with them at some point (can't remember if it was him or them that said that), so I guess he said something vaguely supportive and it's been exaggerated.
It was discussed on this list, in a discussion that I believe Brion, Tim, and I thought you, Rob, were all in. Check the archives from about ... May?
Nope, 8/29/06 and 9/07/06, and this July 4th.
Cheers -- jra
Rob Church wrote:
While browsing UseMod, as I almost never do, I came across the following interesting comment on http://www.usemod.com/cgi-bin/mb.pl?WikiCreole:
"Several wiki engines have agreed to jump on board (including MediaWiki and the Ward's original WikiWikiWeb)."
We have, have we? I was under the impression that we couldn't even standardise our own markup, let alone support something else; I also got the impression from the recent discussions in response to the "announcement" that several developers did not see a significant benefit in supporting it.
The question, then, is where this impression came from, and whether it is correct; I'm directing this one at the release manager (Brion), and his senior assistant (Tim). If the impression is false, then who on Earth expressed this support?
It's probably based on my saying I might try to implement an experimental mode for MediaWiki when I had time, which I have never had time to do.
IMHO this creole thing is a dead-end, just as the attempts to do WYSIWYG editing by transcoding back and forth between wikitext and HTML. The long-term future of wiki text handling is most likely going to be WYSIWYG based on an HTML/XML backend.
I see no benefit to mucking about to make a half-compatible, frequently-breaking alternate-wikitext mode.
-- brion vibber (brion @ wikimedia.org)
On Mon, Jul 30, 2007 at 07:08:49AM +0800, Brion Vibber wrote:
IMHO this creole thing is a dead-end, just as the attempts to do WYSIWYG editing by transcoding back and forth between wikitext and HTML. The long-term future of wiki text handling is most likely going to be WYSIWYG based on an HTML/XML backend.
And other people, including me, can't see decommissioning rich-text markup of one form or another, because of the WYSIAYG effect...
but this horse is *long* dead on this list, and people wanting to revive it are invited to read the archives first.
Cheers, -- jra
On 30/07/07, Brion Vibber brion@wikimedia.org wrote:
It's probably based on my saying I might try to implement an experimental mode for MediaWiki when I had time, which I have never had time to do.
Yeah, I came across some page about this on their web site, as it happens, just after the initial post.
IMHO this creole thing is a dead-end, just as the attempts to do WYSIWYG editing by transcoding back and forth between wikitext and HTML. The long-term future of wiki text handling is most likely going to be WYSIWYG based on an HTML/XML backend.
It's a nice idea, but...well, so is Communism, and how well did that work out for Russia? :)
Rob Church
Rob Church wrote:
It's a nice idea, but...well, so is Communism, and how well did that work out for Russia? :)
Faulty analogy. Markup languages are much simpler than political philosophies. :-)
That being said, I imagine the biggest problem with a fully XHTML backend is making it perform efficiently (there are already tools out there that can validate HTML quite well: http://htmlpurifier.org). Parsing and instantiating DOMs is quite expensive.
On 7/29/07, Edward Z. Yang edwardzyang@thewritingpot.com wrote:
That being said, I imagine the biggest problem with a fully XHTML backend is making it perform efficiently (there are already tools out there that can validate HTML quite well: http://htmlpurifier.org). Parsing and instantiating DOMs is quite expensive.
Um . . . you do realize that at last count it takes *800 ms* to parse a page of wikitext? This is not the right context for complaints about XML being complicated. ;)
Did anyone ever publish a performance evaluation of Mediawiki that you could point me to? I'd be curious to learn more about the 800ms etc. --Dirk
On 7/30/07, Dirk Riehle dirk@riehle.org wrote:
Did anyone ever publish a performance evaluation of Mediawiki that you could point me to? I'd be curious to learn more about the 800ms etc.
Profiling data is available at http://noc.wikimedia.org/cgi-bin/report.py . Spoiler: Parser::parse takes >40% of all runtime. Variable replacement (i.e., templates/ParserFunctions) somehow takes even more . . . does that ever get called outside of parse()?
On 7/31/07, Simetrical Simetrical+wikilist@gmail.com wrote:
Variable replacement (i.e., templates/ParserFunctions) somehow takes even more . . . does that ever get called outside of parse()?
I bugged somebody about this on IRC a little while ago. It's supposedly a recursion issue.
On 7/30/07, Andrew Garrett andrew@epstone.net wrote:
On 7/31/07, Simetrical Simetrical+wikilist@gmail.com wrote:
Variable replacement (i.e., templates/ParserFunctions) somehow takes even more . . . does that ever get called outside of parse()?
I bugged somebody about this on IRC a little while ago. It's supposedly a recursion issue.
Ah, I suppose the profiler adds up times for nested function calls. It probably shouldn't do that.
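To illustrate the guess above, here is a minimal Python sketch of a toy profiler that records both inclusive time (including nested calls) and exclusive time (own work only). It is not the profiler behind report.py; the class, the fake tick clock and the labels "parse" and "replace_variables" are all assumptions made for the example. With a recursive callee, the summed inclusive time of the callee can exceed that of its caller.

```python
from collections import defaultdict


class ToyProfiler:
    """Toy profiler driven by a fake clock, so results are deterministic."""

    def __init__(self):
        self.clock = 0
        self.inclusive = defaultdict(int)  # time including nested calls
        self.exclusive = defaultdict(int)  # time excluding nested calls
        self._stack = []  # frames: [name, start_tick, ticks_in_children]

    def work(self, ticks):
        """Pretend the current function did `ticks` units of its own work."""
        self.clock += ticks

    def enter(self, name):
        self._stack.append([name, self.clock, 0])

    def leave(self):
        name, start, in_children = self._stack.pop()
        elapsed = self.clock - start
        self.inclusive[name] += elapsed
        self.exclusive[name] += elapsed - in_children
        if self._stack:
            # Charge this call's elapsed time to the parent's child total.
            self._stack[-1][2] += elapsed


def demo():
    """parse() does 2 ticks, then replace_variables recurses 3 levels deep."""
    p = ToyProfiler()
    p.enter("parse")
    p.work(2)
    for _ in range(3):
        p.enter("replace_variables")
        p.work(6)
    for _ in range(3):
        p.leave()
    p.leave()
    return p
```

In this toy run the whole of parse takes 20 ticks, yet the summed inclusive figure for replace_variables is 36, because each recursion level is counted again by every level above it. The exclusive figures (2 and 18) add up to the real total.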
On 7/29/07, Edward Z. Yang edwardzyang@thewritingpot.com wrote:
Faulty analogy. Markup languages are much simpler than political philosophies. :-)
So say many men before attempting to write a MediaWiki WikiText parser, but the same words are spoken by few men afterwards.
;)
On Monday 30 July 2007 01:08:49 Brion Vibber wrote:
I see no benefit to mucking about to make a half-compatible, frequently-breaking alternate-wikitext mode.
Anyways the linebreak would be still a cool thing we currently don't have in MediaWiki: http://www.wikicreole.org/wiki/AllMarkup#section-AllMarkup-Linebreaks
Arnomane
On Mon, Jul 30, 2007 at 02:20:06AM +0200, Daniel Arnold wrote:
On Monday 30 July 2007 01:08:49 Brion Vibber wrote:
I see no benefit to mucking about to make a half-compatible, frequently-breaking alternate-wikitext mode.
Anyways the linebreak would be still a cool thing we currently don't have in MediaWiki: http://www.wikicreole.org/wiki/AllMarkup#section-AllMarkup-Linebreaks
There's <br>, no?
jens
On Tuesday 31 July 2007 01:10:55 Jens Frank wrote:
On Mon, Jul 30, 2007 at 02:20:06AM +0200, Daniel Arnold wrote:
Anyways the linebreak would be still a cool thing we currently don't have in MediaWiki: http://www.wikicreole.org/wiki/AllMarkup#section-AllMarkup-Linebreaks
There's <br>, no?
Hm well there is "<a href" too... ;-) But no one likes writing it; wiki syntax is much easier (and causes fewer input errors).
The same goes for br. There are several ways of writing br. The correct one is <br /> but you can see <br> all around. A correct br has 6 characters. \\ would be just two and there's no doubt about the syntax.
But there's another heavily used br style:
<br style="clear:both;" />
This is equal to a page break you can often find after a chapter in a book. In Wikipedia, for example, it is often used in order to clearly separate different article objects (navigation bars use it very often) or some article chapters, where free flow of elements into the next chapter would cause layout confusion.
So how about - for chapter break aka <br style="clear:both;" />?
\\ and - would be a cool thing and would be extensively used (even in discussions I sometimes need a br because a full empty line is visually too much separation and makes people think that these were two different people).
Just look at random Wikipedia articles. br in these two variants (very often with wrong HTML syntax) is the most used HTML element not covered by wiki syntax.
Arnomane
On 31/07/07, Daniel Arnold arnomane@gmx.de wrote:
Hm well there is "<a href" too... ;-) But no one likes writing it; wiki syntax is much easier (and causes fewer input errors).
Well, no, <a> is not a whitelisted HTML tag, so the MediaWiki sanitiser won't let it through.
The same goes for br. There are several ways of writing br. The correct one is <br /> but you can see <br> all around. A correct br has 6 characters. \\ would be just two and there's no doubt about the syntax.
There's no "correct" case - <br> is fine in wiki text; it's sanitised to <br /> before being emitted in HTML, and of course, Tidy usually sorts out any other mess.
This is equal to a page break you can often find after a chapter in a book. In Wikipedia, for example, it is often used in order to clearly separate different article objects (navigation bars use it very often) or some article chapters, where free flow of elements into the next chapter would cause layout confusion.
Semantically speaking, <br style="clear: both;" /> is not a page break; it's just a line break which happens to clear all preceding floats.
Rob Church
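As a rough illustration of the point about <br> being rewritten to <br /> on output, here is a minimal Python sketch that normalises attribute-less br variants. It is not MediaWiki's sanitiser and makes no claim about how that code works; the function name and the regular expression are assumptions for the example, and br tags carrying attributes (such as the clear:both style) are deliberately left alone.

```python
import re

# Matches <br>, <br/>, <br />, <BR>, and the malformed </br> and <br\>.
# Tags with attributes are not matched.
BR_VARIANT = re.compile(r"<\s*/?\s*br\s*[/\\]?\s*>", re.IGNORECASE)


def normalise_br(text: str) -> str:
    """Rewrite attribute-less br variants to the XHTML form <br />."""
    return BR_VARIANT.sub("<br />", text)
```

An editor can therefore type any of the short variants and still get one consistent form in the emitted HTML.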
On Tuesday 31 July 2007 12:55:32 Rob Church wrote:
Well, no, <a> is not a whitelisted HTML tag, so the MediaWiki sanitiser won't let it through.
It was an example in order to demonstrate the strength of wiki syntax compared to HTML for hand-written texts. HTML/XML and friends aren't exactly concise (and their long-winded syntax is probably one reason for the real-life HTML parse problems of browsers).
There's no "correct" case - <br> is fine in wiki text; it's sanitised to <br /> before being emitted in HTML, and of course, Tidy usually sorts out any other mess.
Well I guess you have strict coding style guidelines for your source code. Wiki code is source code, too. It should be as clean and simple as possible in order to allow every person editing and improving an article in no time.
Also, there are a hell of a lot of code pedants out there in Wikipedia who love to correct these HTML bits (ever encountered such a bot or person? I encounter them very frequently on my watchlist).
Furthermore, there are other parsers besides MediaWiki that need to parse MediaWiki wiki source code, for example parsers for printed books, for the Wikipedia DVD and others. A MediaWiki sanitizer doesn't make their life easier...
Semantically speaking, <br style="clear: both;" /> is not a page break; it's just a line break which happens to clear all preceding floats.
Well wiki has no pages you can turn. So this is the nearest matching equivalent to page/chapter break for the web world.
Also, I am quite sure that Wikisource could make great use of a wiki element that allows for that, because Wikisource is aligned to the pages of the source textbook, with sometimes more than one book page on one wiki page, and you definitely don't want free flow of elements between these two distinct textbook pages.
So there's a multitude of valid use cases for a convenient br wiki element.
Wiki syntax is all about shortness and simplicity.
Arnomane
On 31/07/07, Daniel Arnold arnomane@gmx.de wrote:
Furthermore there are other parsers beside MediaWiki that need to parse MediaWiki wiki source code. For example parsers for printed books, for the Wikipedia DVD and others. A MediaWiki sanitizer doesn't make their life easier...
No, but a well-defined specification for the markup would make writing alternative parsers quite straightforward.
Well wiki has no pages you can turn. So this is the nearest matching equivalent to page/chapter break for the web world.
You assume that wiki content is web-oriented, and while that's true, it's not the best attitude to have - wiki content does not have to be viewed on a screen at all, it could be printed into a book, or on t-shirts, or bits of toilet paper.
This is the *point* of semantics, as applied to information - "what I mean", not "what I look like".
Rob Church
On Tuesday 31 July 2007 16:03:54 Rob Church wrote:
You assume that wiki content is web-oriented, and while that's true, it's not the best attitude to have - wiki content does not have to be viewed on a screen at all, it could be printed into a book, or on t-shirts, or bits of toilet paper.
This is the *point* of semantics, as applied to information - "what I mean", not "what I look like".
Great! You gave us a perfect reason why my proposed - or = is much much better than <br style="clear:both;" />.
= means: make a full break considering all elements above.
On a web page it would be <br style="clear:both;" />, in a book it would be the page break tag of DocBook XML (or another book format, as long as you have the parser for it), on T-shirts maybe the "turn to backside" tag, and on toilet paper... nah, you don't want to turn to the backside but maybe to the next sheet...
So my suggestion is about maintaining semantics. <br style="clear:both;" /> is bad in wiki source code for your named reason of "what I look like" vs. "what I mean".
Cheers, Arnomane
On 31/07/07, Daniel Arnold arnomane@gmx.de wrote:
= means: make a full break considering all elements above.
How do we know this isn't going to conflict with existing usage in an incompatible manner? Introducing special meanings for characters which don't have them once the markup is well established *might* be problematic in some cases.
Rob Church
On Tuesday 31 July 2007 16:58:10 Rob Church wrote:
On 31/07/07, Daniel Arnold arnomane@gmx.de wrote:
= means: make a full break considering all elements above.
How do we know this isn't going to conflict with existing usage in an incompatible manner? Introducing special meanings for characters which don't have them once the markup is well established *might* be problematic in some cases.
Well, you can easily estimate the possible trouble with an SQL query against our wiki databases for these strings ( \\ + - = ). I am pretty sure that if you really encounter this syntax, it is very likely inside <pre>, <nowiki> or <code>.
But anyway, quite a few parts of MediaWiki syntax have changed a lot in the past. Template syntax, for example, has changed dramatically without any real warning, with quite some implications for existing templates, partly resulting in totally different output (of course I strongly welcomed these bold changes to templates, because they were necessary).
So any of the template changes (and because templates are embedded in one another, the effect of any template syntax change is multiplied) had a larger side effect on existing wiki layout than my proposed br syntax could have.
Arnomane
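The estimate suggested above could be prototyped on page text before touching a database. The following is a minimal Python sketch, assuming the wikitext of a page is already available as a string; the helper name and the marker passed in are hypothetical, and the regular expression only approximates how <pre>, <nowiki> and <code> blocks are delimited.

```python
import re

# Approximate match for protected blocks; nesting and unclosed tags are ignored.
PROTECTED = re.compile(
    r"<(pre|nowiki|code)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)


def count_marker(wikitext: str, marker: str) -> dict:
    """Count occurrences of `marker` outside and inside protected blocks."""
    total = wikitext.count(marker)
    # Replace protected blocks with a space so neighbouring text cannot
    # accidentally join up into a new occurrence of the marker.
    outside = PROTECTED.sub(" ", wikitext).count(marker)
    return {"outside": outside, "inside": total - outside}
```

Run over a dump, the "outside" count gives a rough upper bound on the pages whose rendering a new marker could change.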
Well wiki has no pages you can turn. So this is the nearest matching equivalent to page/chapter break for the web world.
Since we don't know what kinds of pages there are or are not, page breaks are useless. We do have syntax for a chapter break, though: "===".
(NB: A page is a feature of layout and has no logical meaning; a chapter is a feature of logical structure. That's why we have syntax for one and not the other. If you want to worry about pages, the code should go in a skin.)
On Tuesday 31 July 2007 16:36:23 Thomas Dalton wrote:
Since we don't know what kinds of pages there are or are not, page breaks are useless. We do have syntax for a chapter break, though: "===".
No. Please re-read my examples and have a look at some source code in Wikipedia and Wikisource for valid applications of my suggestion.
There is a very good reason for floating elements such as images into the next chapter by default - even on printed books.
But I said: There are certain cases where it is necessary to do so, and many people are doing so right now with crappy syntax. You won't stop them, but you can help them do it better in future.
Arnomane
Since we don't know what kinds of pages there are or are not, page breaks are useless. We do have syntax for a chapter break, though: "===".
No. Please re-read my examples and have a look at some source code in Wikipedia and Wikisource for valid applications of my suggestion.
There is a very good reason for floating elements such as images into the next chapter by default - even on printed books.
But I said: There are certain cases where it is necessary to do so, and many people are doing so right now with crappy syntax. You won't stop them, but you can help them do it better in future.
Ok, so === isn't quite what you want. You probably want <div> tags. A div tag is designed to separate different parts of the content, which is exactly what you are trying to do.
On Tuesday 31 July 2007 17:05:27 Thomas Dalton wrote:
Ok, so === isn't quite what you want. You probably want <div> tags. A div tag is designed to separate different parts of the content, which is exactly what you are trying to do.
No, div is also not what I want. Div makes a (logical) box around everything inside. So partly (for example for navigation bars) it would be better than a full break with br, but in general this is not what you need.
You don't want to embed a paragraph of text in a div. You just sometimes want to separate content at one line, into before and after, and not into before, middle and after.
Arnomane.
No, div is also not what I want. Div makes a (logical) box around everything inside. So partly (for example for navigation bars) it would be better than a full break with br, but in general this is not what you need.
You don't want to embed a paragraph of text in a div. You just sometimes want to separate content at one line, into before and after, and not into before, middle and after.
A logical box is exactly what you want to mark a chapter. I can't see a logical reason to have a break in between 2 items rather than a box around each.
On 7/31/07, Thomas Dalton thomas.dalton@gmail.com wrote:
A logical box is exactly what you want to mark a chapter. I can't see a logical reason to have a break in between 2 items rather than a box around each.
Precisely. <br> is non-semantic markup. A classed div is better: then you can make all sorts of decisions as to how to display chapter breaks, such as how much of a margin you want under it, or if you want page-break-before: always; or whatever the CSS rule is to do that.
On Tue, Jul 31, 2007 at 01:39:55PM -0700, Simetrical wrote:
On 7/31/07, Thomas Dalton thomas.dalton@gmail.com wrote:
A logical box is exactly what you want to mark a chapter. I can't see a logical reason to have a break in between 2 items rather than a box around each.
Precisely. <br> is non-semantic markup. A classed div is better: then you can make all sorts of decisions as to how to display chapter breaks, such as how much of a margin you want under it, or if you want page-break-before: always; or whatever the CSS rule is to do that.
Having spent a lot of time with Ventura, I'm a bit querulous about that idea. A 'chapter' may be *more* than you want to display on a page.
Perhaps 'chapter' is a bad choice of term thereby. But if you wrap entire chapters in DIVs, you may need to physically break a div in the middle when you paginate.
And perhaps I'm completely out of context.
Cheers, -- jra
Perhaps 'chapter' is a bad choice of term thereby. But if you wrap entire chapters in DIVs, you may need to physically break a div in the middle when you paginate.
How you split something into pages is a matter of layout, not content. The div tag is part of the content. How you show the contents of that div tag is another matter entirely. If you are talking about a webpage, then there is no need to split it at arbitrary points (since there is no absolute limit on length), so you should always split at a logical break (perhaps a subsection of a chapter, but that should still be determined with divs), if at all. You have to remember not to think of wiki articles as webpages - they can be displayed in any medium imaginable. If someone is printing them in a book, they will worry about where to put the page breaks based on the logical structure you give the article (primarily using headers, and using divs explicitly for more complicated things).
On Tuesday 31 July 2007 18:47:09 Thomas Dalton wrote:
A logical box is exactly what you want to mark a chapter. I can't see a logical reason to have a break in between 2 items rather than a box around each.
Ok, once again. Take this example:
--begin--
[[Image:bla.jpeg|thumb|blubb]]
lala I am some cool text...

== Chapter 1 ==
blind text blind text blind text blind text blind text blind text blind text blind text blind text

[[Image:foo.jpeg|thumb|bar]]
some more blind text some more blind text some more blind text some more blind text
<br style="clear:both;" />

== Chapter 2 ==
lalala here is the next nonsense text.
--end--
The image bla.jpeg shall flow into Chapter 1 but foo.jpeg shall _not_ flow into Chapter 2. You cannot solve this with a div around Chapter 1, cause in that case bla.jpeg also can't flow into Chapter 1.
This arbitrary break is meant as an exemption from the normal useful concise element positioning. You sometimes simply need it.
Furthermore, HTML has a fundamental design flaw for human editing. Most HTML tags need open and close tags. In contrast, a lot of wiki elements don't need closing tags; they close implicitly. This is very good when editing because you don't need to keep in mind closing a tag from 10 screen pages above. So wiki allows for linear editing while HTML definitely does not.
A div is such a flawed element that prevents linear editing (besides the fact that div is not appropriate in my above example).
(Also take a look at LaTeX. LaTeX is a very good example of semantic page description. Like wiki syntax, LaTeX tries to avoid explicit open and close tags wherever possible in order to allow for linear and concise editing.)
So, as others in this thread have noted, this currently widespread habit of using br in wiki pages is _bad_ because it is not semantic markup and because it has a special meaning in HTML and does not necessarily make perfect sense in other media. That's why I was suggesting a special wiki tag (that is also short and does not have different ways of writing it).
\\ for a simple line break (in HTML <br />, in a word processor the return character)
= for an all elements break (in HTML <br style="clear:both;" />, in a word processor the page return/break).
And if you really want a bit more flexibility you could maybe have these two line breaks, too:
- that would render in HTML as <br style="clear:left;" />
+ that would render in HTML as <br style="clear:right;" />
br in all its valid and invalid variants is currently heavily used in Wikipedia and is the most used HTML element in wiki pages (more than div, sub and sup). You are simply not able to stop people using breaks. You can only improve the situation by introducing a concise and logical wiki element (which also makes life easier when transforming the wiki source code into other media such as books). This would make current wiki source code more human-readable (besides more concise).
Arnomane
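The idea of one semantic break element with medium-specific output can be sketched as a simple lookup. This is only an illustration in Python: the HTML strings come from the proposal above, while the break names, the "text" medium and its outputs (including the form feed standing in for a page break) are assumptions made for the example.

```python
# Break kinds and their rendering per output medium.
# The "html" entries follow the proposal; the "text" entries are assumed.
RENDERINGS = {
    "html": {
        "line": "<br />",
        "all": '<br style="clear:both;" />',
        "left": '<br style="clear:left;" />',
        "right": '<br style="clear:right;" />',
    },
    "text": {
        "line": "\n",
        "all": "\f",  # form feed as a stand-in for a page break
        "left": "\n",
        "right": "\n",
    },
}


def render_break(kind: str, medium: str) -> str:
    """Return the output for a semantic break in the given medium."""
    try:
        return RENDERINGS[medium][kind]
    except KeyError:
        raise ValueError(
            f"unsupported break {kind!r} for medium {medium!r}"
        ) from None
```

A formatter for another medium would add its own table without any change to the wiki source.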
The image bla.jpeg shall flow into Chapter 1 but foo.jpeg shall _not_ flow into Chapter 2. You cannot solve this with a div around Chapter 1, cause in that case bla.jpeg also can't flow into Chapter 1.
Are you sure? I would think you just need to use the right CSS to get the desired result.
On 01/08/07, Thomas Dalton thomas.dalton@gmail.com wrote:
The image bla.jpeg shall flow into Chapter 1 but foo.jpeg shall _not_ flow into Chapter 2. You cannot solve this with a div around Chapter 1, cause in that case bla.jpeg also can't flow into Chapter 1.
Are you sure? I would think you just need to use the right CSS to get the desired result.
PS Well, after a few attempts, I can't find any CSS that works (other than putting <div style="clear:both"> around just chapter 2, but that's equivalent to your way, it doesn't make any logical sense). I'm no CSS expert, though - hopefully someone else can work out what I've missed. It should be possible...
On Wednesday 01 August 2007 15:10:35 Thomas Dalton wrote:
PS Well, after a few attempts, I can't find any CSS that works (other than putting <div style="clear:both"> around just chapter 2, but that's equivalent to your way, it doesn't make any logical sense). I'm no CSS expert, though - hopefully someone else can work out what I've missed. It should be possible...
Even if it would be possible it would be abuse of div tag and furthermore there would be the non-linear-edit problem with such a div.
But the most important thing is: Who would actually use it? Nobody.
Wiki syntax is not there for the sake of yet another markup language. Wiki syntax is meant for simple and concise markup. HTML is anything but concise and simple markup. So my new suggestion will maintain this simple and concise markup. I am not interested in finding the holy grail of true text markup.
Arnomane
Daniel Arnold wrote:
But the most important thing is: Who would actually use it? Nobody.
Wiki syntax is not there for the sake of yet another markup language. Wiki syntax is meant for simple and concise markup. HTML is anything but concise and simple markup. So my new suggestion will maintain this simple and concise markup. I am not interested in finding the holy grail of true text markup.
Arnomane
And why is = better than {{break}} ?
Even if it would be possible it would be abuse of div tag and furthermore there would be the non-linear-edit problem with such a div.
I don't see how it would be an abuse. Keeping internal images in but not keeping external images out seems like a perfectly reasonable thing to do; I just can't work out how to do it.
Linear editing is too restrictive. Where it is obvious when something should close, it's fine not to require it to be explicitly closed. Otherwise, you need to close it yourself. I'm sure you've seen the problems caused by forgetting the closing ''' after bold text - there is no way to handle emphasis linearly.
That said, this could be made to work linearly if the div tags were automatically added around sections with an appropriate class that could be referred to in CSS at the top. (Working on the assumption that a section finishes when the next section of the same level starts - a pretty safe assumption.)
But the most important thing is: Who would actually use it? Nobody.
Wiki syntax is not there for the sake of yet another markup language. Wiki syntax is meant for simple and concise markup. HTML is anything but concise and simple markup. So my new suggestion will maintain this simple and concise markup. I am not interested in finding the holy grail of true text markup.
While WikiSyntax is meant to be simple and concise, it is also meant to be independent of medium, something HTML was never intended to be. If we are going to achieve that goal, we have to sacrifice some simplicity.
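The automatic wrapping of sections mentioned above can be sketched in a few lines of Python. This is an illustration only, not how the MediaWiki parser behaves: it handles level-2 headings alone, works on raw wikitext rather than rendered HTML, and the function name and the "section" class name are assumptions. It relies on the stated assumption that a section ends where the next section of the same level starts.

```python
import re

# A level-2 heading on its own line, e.g. "== Chapter 1 ==".
# The lookahead rejects "===" so deeper headings stay inside their section.
HEADING = re.compile(r"^==(?!=)[ \t]*(.+?)[ \t]*==[ \t]*$", re.MULTILINE)


def wrap_sections(wikitext: str, css_class: str = "section") -> str:
    """Wrap each level-2 section (heading plus body) in a classed div."""
    starts = [m.start() for m in HEADING.finditer(wikitext)]
    if not starts:
        return wikitext
    parts = [wikitext[:starts[0]]]  # lead text before the first heading
    for begin, end in zip(starts, starts[1:] + [len(wikitext)]):
        body = wikitext[begin:end].rstrip("\n")
        parts.append(f'<div class="{css_class}">\n{body}\n</div>\n')
    return "".join(parts)
```

Because the editor never types the div, editing stays linear, and a stylesheet can then decide per class how floats and breaks behave at section boundaries.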
Besides WikiCreole, someone else is implementing pluggable parsers of wiki syntax without an intermediate language, or rather, the intermediate language is flex. This guy, Ping Yeh, will introduce his pilot work at Wikimania: http://wikimania2007.wikimedia.org/wiki/Proceedings:PY1
Actually I'm imagining that if this flexible parser idea, as the input part, were connected to some kind of wikiwyg formatter, as the output part, they might solve a similar problem to WikiCreole's and bring the day of "WYSIWYG <-> XHTML" closer to us; under this interpretation of Brion's comment, I second it.
Sincerely, /Mike/
Hi,
Thanks Mike for introducing me. I have just subscribed to the list.
Yes I'm working on a wiki "engine" with replaceable parts. The replaceable parts include parser (wiki text -> document tree in memory), formatter (document tree -> output text), storage system, authenticator, authorizer, and search engine. User interface is not in it, but rather the whole engine is driven by a UI to be supplied by the developer.
You can think of this "engine" as a "wiki library" that has pre-defined interfaces to replaceable parts. So, with an HTML formatter and a MediaWiki parser (a draft version already exists), it can already show HTML for MediaWiki content. Furthermore, if parsers and formatters of markup A and B are available, we can freely convert wiki content from A to B and vice versa. The "intermediate" thing is actually the document tree in memory.
The whole thing is written in C++. With help from wrapper generators like SWIG, it is possible to be called by many scripting languages supported by SWIG (PHP, Python, Ruby, Tcl, Lua, Java, C#, to name a few). I haven't got around to doing this just yet, but I did similar work with other systems before so this should be doable.
If this works, the wikipedia contents will instantly become available to many systems even if they are not written in PHP, or even not web-based.
I'll attend the Hacking Days Extra in the afternoon tomorrow. Maybe I can show you what I have so far and get your comments. :)
cheers, Ping
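The parser / document tree / formatter split described above can be sketched as follows. The engine in question is written in C++ and its real interfaces are not shown in this thread, so everything here (the Node type, the Parser and Formatter protocols, the toy paragraph-only parser) is a hypothetical Python illustration of the shape of the idea, not of the actual library.

```python
import html
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class Node:
    """One node of the in-memory document tree."""
    kind: str  # e.g. "doc" or "para"
    text: str = ""
    children: List["Node"] = field(default_factory=list)


class Parser(Protocol):
    def parse(self, source: str) -> Node: ...


class Formatter(Protocol):
    def format(self, tree: Node) -> str: ...


class ToyWikiParser:
    """Understands only blank-line-separated paragraphs."""

    def parse(self, source: str) -> Node:
        paras = [p.strip() for p in source.split("\n\n") if p.strip()]
        return Node("doc", children=[Node("para", text=p) for p in paras])


class HtmlFormatter:
    def format(self, tree: Node) -> str:
        return "".join(f"<p>{html.escape(n.text)}</p>" for n in tree.children)


class PlainFormatter:
    def format(self, tree: Node) -> str:
        return "\n\n".join(n.text for n in tree.children)


def convert(source: str, parser: Parser, formatter: Formatter) -> str:
    """Any parser can be paired with any formatter via the shared tree."""
    return formatter.format(parser.parse(source))
```

Converting from markup A to markup B then amounts to pairing A's parser with B's formatter, with the tree as the only shared contract.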
On Wed, Aug 01, 2007 at 10:36:49PM +0800, Ping Yeh wrote:
Yes I'm working on a wiki "engine" with replaceable parts. The replaceable parts include parser (wiki text -> document tree in memory), formatter (document tree -> output text), storage system, authenticator, authorizer, and search engine. User interface is not in it, but rather the whole engine is driven by a UI to be supplied by the developer.
You can think of this "engine" as a "wiki library" that has pre-defined interfaces to replaceable parts. So, with an HTML formatter and a MediaWiki parser (a draft version already exists), it can already show HTML for MediaWiki content. Furthermore, if parsers and formatters of markup A and B are available, we can freely convert wiki content from A to B and vice versa. The "intermediate" thing is actually the document tree in memory.
The whole thing is written in C++. With help from wrapper generators like SWIG, it is possible to be called by many scripting languages supported by SWIG (PHP, Python, Ruby, Tcl, Lua, Java, C#, to name a few). I haven't got around to doing this just yet, but I did similar work with other systems before so this should be doable.
Oooooh! Phase 4!
:-)
Cheers, -- jr 'will rouse rabble for food' a
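The parser -> document tree -> formatter split quoted above can be illustrated with a minimal sketch. This is not Ping's code (his engine is C++); it is a toy in Python, and every name in it (Node, parse_mediawiki, format_html, format_creole, convert) is hypothetical. It assumes a markup with only "== Title ==" headings and plain paragraphs, and a Creole-style output where headings are written "== Title".

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    """One node of the in-memory document tree (the "intermediate" form)."""
    kind: str                       # "doc", "heading", or "para"
    text: str = ""
    children: list = field(default_factory=list)


def parse_mediawiki(source):
    """Toy parser: '== Title ==' lines become headings, other lines paragraphs."""
    doc = Node("doc")
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines only separate blocks
        if line.startswith("==") and line.endswith("==") and len(line) > 4:
            doc.children.append(Node("heading", line[2:-2].strip()))
        else:
            doc.children.append(Node("para", line))
    return doc


def format_html(doc):
    """Toy formatter: document tree -> HTML fragment."""
    parts = []
    for node in doc.children:
        tag = "h2" if node.kind == "heading" else "p"
        parts.append(f"<{tag}>{escape(node.text)}</{tag}>")
    return "\n".join(parts)


def format_creole(doc):
    """Toy formatter: document tree -> Creole-style text (assumed syntax)."""
    lines = []
    for node in doc.children:
        lines.append(f"== {node.text}" if node.kind == "heading" else node.text)
    return "\n\n".join(lines)


def convert(source, parser, formatter):
    """Any parser can be paired with any formatter through the shared tree."""
    return formatter(parser(source))


if __name__ == "__main__":
    text = "== Intro ==\nHello & welcome"
    print(convert(text, parse_mediawiki, format_html))
    print(convert(text, parse_mediawiki, format_creole))
```

Because both formatters only see the tree, adding a parser for another markup would make conversion in either direction possible without touching the existing parts, which is the property the quoted message describes.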
"Ping Yeh" ping.nsr.yeh@gmail.com wrote: So, with an html formatter and a mediawiki parser (a draft version already exist), it can already show HTML for mediawiki contents.
For your information, and possibly some inspiration: I have been working on a Python-based wikitext parser to be used with pywikipedia; the source is available at http://svn.wikimedia.org/viewvc/pywikipedia/trunk/pywikiparser/ . How far has your parser been developed already, and how does it parse? I have been trying to fit wikitext into a grammar parseable by an LL(k) parser, but this was not as easy as it looked. Hence, I have started building a parser from scratch.
I'll attend the Hacking Days Extra in the afternoon tomorrow. Maybe I can show you what I have so far and get your comments. :)
Unfortunately, not all of us are at Wikimania ;) Have you got some on-line resource where we can find more information?
--valhallasw
Thomas Dalton wrote:
PS Well, after a few attempts, I can't find any CSS that works (other than putting <div style="clear:both"> around just chapter 2, but
Why not put an empty div:
<div style="clear:both;"></div>
That avoids wrapping anything. I think this will do what you want but I'm not sure I really understand the problem.
Mike
On Wednesday 01 August 2007 21:45:48 Michael Daly wrote:
Thomas Dalton wrote:
PS Well, after a few attempts, I can't find any CSS that works (other than putting <div style="clear:both"> around just chapter 2, but
Why not put an empty div:
<div style="clear:both;"></div>
That avoids wrapping anything. I think this will do what you want but I'm not sure I really understand the problem.
Erm and what is the point of using an empty div instead of a br?
Arnomane
Why not put an empty div:
<div style="clear:both;"></div>
That avoids wrapping anything. I think this will do what you want but I'm not sure I really understand the problem.
That's the same as using <br>, it makes no semantic sense. The HTML should describe information, not presentation.
On 8/1/07, Daniel Arnold arnomane@gmx.de wrote:
The image bla.jpeg shall flow into Chapter 1 but foo.jpeg shall _not_ flow into Chapter 2. You cannot solve this with a div around Chapter 1, because in that case bla.jpeg also can't flow into Chapter 1.
Why would you want bla.jpeg to flow into the first section on the page but not the second? If you want to break before non-initial chapters, you could do
div.chapter + div.chapter { clear: both; }
Or of course a specific class.
Furthermore HTML has a fundamental design flaw for human editing. Most HTML tags need open and close tags. [etc.]
I never said you should use HTML and not wikitext. Possibly Wikibooks or whatever would like wikitext chapter breaks or something, in the fullness of time. Certainly the existing markup is rather Wikipedia-centric. But line breaks, cleared or not, are not semantic, and you have provided no use case where they're particularly more useful than a more semantic (and therefore more useful) wikitext equivalent. Why doesn't Wikibooks just adopt the convention of h2 = chapter break or something and then do clears based on that, if it's wanted so much? Are there really <br clear="both" />s everywhere?
<br> may be the most used HTML tag in wikitext -- actually I'd be extraordinarily surprised if it beat out <div>, at least if you count template usage; where are your figures from? -- but if that's so, it's got to be the most overused as well. It's really necessary only very rarely if you're going for a consistent site-wide style.
On Wednesday 01 August 2007 21:56:32 Simetrical wrote:
But line breaks, cleared or not, are not semantic, and you have provided no use case where they're particularly more useful than a more semantic (and therefore more useful) wikitext equivalent.
Yes but people still use it for a _very_ good reason sometimes.
Have you ever looked into LaTeX?
* At first you just write your document with semantic elements only.
* Then you tweak the individual elements a bit with global style variables.
* At the end you need to add non-semantic tags at some places in order to get the desired result.
The same workflow applies to wikis, especially Wikipedia. You simply CANNOT stop people adding non-semantic tags to articles; you cannot entirely stop people tweaking articles. Currently they are doing this with crappy HTML/CSS syntax.
As br in every variant is the most used HTML element not covered by wiki syntax, I expanded the WikiCreole idea a bit for a _less_ explicit, more semantic line break sign. Semantics aren't binary.
<br> may be the most used HTML tag in wikitext -- actually I'd be extraordinarily surprised if it beat out <div>, at least if you count template usage; where are your figures from?
Plain articles and talk pages. Templates very often have weird syntax anyway.
Arnomane
Why would you want bla.jpeg to flow into the first section on the page but not the second? If you want to break before non-initial chapters, you could do
What he wants is for the first image to flow anywhere it likes (since it isn't part of any chapter), but the second image (which is part of chapter 1) to stay within chapter 1. Were there an image in chapter 2, it would stay within chapter 2. Were there an image in between chapters 1 and 2, it would flow into chapter 2.
On 8/1/07, Thomas Dalton thomas.dalton@gmail.com wrote:
Why would you want bla.jpeg to flow into the first section on the page but not the second? If you want to break before non-initial chapters, you could do
What he wants is for the first image to flow anywhere it likes (since it isn't part of any chapter), but the second image (which is part of chapter 1) to stay within chapter 1. Were there an image in chapter 2, it would stay within chapter 2. Were there an image in between chapters 1 and 2, it would flow into chapter 2.
Okay, then
div.chapter > :last-child { clear: both; }
Of course, that will only work in Firefox, Opera 9.5, Konqueror, and maybe "MSN for Mac OS X", whatever that is. Practically speaking you'll need some kind of dummy element at the end of every chapter, whether a br or something else.
On 8/1/07, Thomas Dalton thomas.dalton@gmail.com wrote:
That's the same as using <br>, it makes no semantic sense.
It does have the advantage, AFAIK, of not creating an extra line break.
Okay, then
div.chapter > :last-child { clear: both; }
That almost works. It stops the floats one element before the end of the div, though, so you end up with a gap before the last paragraph in most cases.
It does have the advantage, AFAIK, of not creating an extra line break.
I think that depends on the browser; I think some do put a line break around divs.
On 8/2/07, Thomas Dalton thomas.dalton@gmail.com wrote:
Okay, then
div.chapter > :last-child { clear: both; }
That almost works. It stops the floats one element before the end of the div, though, so you end up with a gap before the last paragraph in most cases.
Yes, I realized that after posting. I don't know of anything like "clear-after" in CSS (maybe in CSS3? or is there some way to do it I'm missing?). Basically, as I say, you'd have to use dummy elements anyway for now to account for IE et al.
It does have the advantage, AFAIK, of not creating an extra line break.
I think that depends on the browser; I think some do put a line break around divs.
I find that hard to believe, honestly. CSS support may be flaky in many cases, but even IE long ago got pretty much all of CSS1 down (okay, the box model implementation was slightly wrong until IE7).
I think that depends on the browser; I think some do put a line break around divs.
I find that hard to believe, honestly. CSS support may be flaky in many cases, but even IE long ago got pretty much all of CSS1 down (okay, the box model implementation was slightly wrong until IE7).
It's not CSS support, it's HTML support - div is part of HTML. I just loaded a text file containing the following in Firefox:
<html><body> <div>foo</div><div>bar</div> </body></html>
It printed foo and bar on separate lines. The spec does say that some user agents add line breaks, it doesn't say if they should or not...
On 8/2/07, Thomas Dalton thomas.dalton@gmail.com wrote:
It's not CSS support, it's HTML support - div is part of HTML. I just loaded a text file containing the following in Firefox:
<html><body> <div>foo</div><div>bar</div> </body></html>
It printed foo and bar on separate lines.
Er, yes, the div will break a line, because it's a block element. It will not, however, add any additional vertical whitespace, which <br> will. Try the following:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">
<head><title></title></head>
<body>
<ul>
<li>FooBar</li>
<li>Foo<div></div>Bar</li>
<li>Foo<br />Bar</li>
<li><p>Foo</p><p>Bar</p></li>
<li><p>Foo</p><div></div><p>Bar</p></li>
<li><p>Foo</p><br /><p>Bar</p></li>
</ul>
</body>
</html>
If the previous and next boxes are inline-level, there will be no difference, because each type of markup effectively adds a line break that wasn't present before. (Actually there will probably be subtle differences, because the div will create new block wrappers around the inline elements, while <br> will not, but there won't be very obvious differences.) If either the previous or next box is block-level, however, as in the case being discussed, <br> will add a new line, but <div></div> will not affect layout without extra style rules, making it more suitable for clearing.
Rob Church wrote:
While browsing UseMod, as I almost never do, I came across the following interesting comment on http://www.usemod.com/cgi-bin/mb.pl?WikiCreole:
"Several wiki engines have agreed to jump on board (including MediaWiki and the Ward's original WikiWikiWeb)."
We have, have we? I was under the impression that we couldn't even standardise our own markup, let alone support something else; I also got the impression from the recent discussions in response to the "announcement" that several developers did not see a significant benefit in supporting it.
The question, then, is where this impression came from, and whether it is correct; I'm directing this one at the release manager (Brion), and his senior assistant (Tim). If the impression is false, then who on Earth expressed this support?
I've been to a couple of meetings about WikiCreole this year, one IRL in Montreal and one on IRC. The message I gave them both times was that I am personally not intending to implement any kind of WikiCreole support in MediaWiki, but that I would welcome such support if someone implemented it and submitted it. I supported the model of having an alternative parser as an installation option, not the "easy edit" model.
-- Tim Starling
wikitech-l@lists.wikimedia.org