I know I've done this once before, but this one's worse:
The name Pluto was first suggested by [[Venetia Burney|Venetia Phair (née Burney)]], at the time an eleven-year-old girl from [[Oxford, England|Oxford]], [[England]].<ref>{{cite web |url=http://news.bbc.co.uk/1/hi/sci/tech/4596246.stm |title=The girl who named a planet |first= Paul |last= Rincon |publisher=BBC News |accessdate=2006-03-05}}</ref> Venetia, who was interested in [[Classical mythology]] as well as astronomy, suggested the name, the Roman equivalent of [[Hades]], in a conversation to her grandfather [[Falconer Madan]], a former [[librarian]] of [[Oxford University]]'s [[Bodleian Library]].<ref>{{cite web |url=http://www.amblesideonline.org/PR/PR62p030PlanetPluto.shtml |title=The Planet 'Pluto' |first= K.M |last= Claxton |publisher=Parents' Union School Diamond Jubilee Magazine, 1891-1951 (Ambleside: PUS, 1951), p. 30-32 |accessdate=2006-08-24}}</ref> Madan passed the suggestion to Professor [[Herbert Hall Turner]], Turner then cabled the suggestion to colleagues in America. After favourable consideration which was almost unanimous{{fact}}, the name Pluto was officially adopted and an announcement made by Slipher on [[1930-05-01]].
--- Can you believe that in that chunk of text, there are actually three separate pieces of text, with two references between them? It's totally unmanageable - attempting to actually edit the text that's buried in there as a cohesive whole is next to impossible. Solutions desperately wanted.
Steve
Steve Bennett wrote:
I know I've done this once before, but this one's worse:
The name Pluto was first suggested by [[Venetia Burney|Venetia Phair (née Burney)]], at the time an eleven-year-old girl from [[Oxford, England|Oxford]], [[England]].<ref>{{cite web
[...]
announcement made by Slipher on [[1930-05-01]].
Can you believe that in that chunk of text, there are actually three separate pieces of text, with two references between them? It's totally unmanageable - attempting to actually edit the text that's buried in there as a cohesive whole is next to impossible. Solutions desperately wanted.
It would be nice if the extension allowed for references to be declared after the paragraph that uses them. In the edit window every paragraph would have its own "footnotes" with the full reference information, while in the paragraph only <ref name="xzxz"/> would be used. (In the rendered article, all references would go together, as they do now.) This way, the text would flow uninterrupted, and the reference information would keep its proximity with it (except in very long paragraphs...).
Greetings.
On 8/27/06, Carlos angus@quovadis.com.ar wrote:
It would be nice if the extension allowed for references to be declared after the paragraph that uses them. In the edit window every paragraph would have its own "footnotes" with the full reference information, while in the paragraph only <ref name="xzxz"/> would be used. (In the rendered article, all references would go together, as they do now.) This way, the text would flow uninterrupted, and the reference information would keep its proximity with it (except in very long paragraphs...).
Yeah, I think we've discussed putting them at the end of the article, but probably the best is just allowing them to be defined *anywhere* - before, at, or after the time the link to the reference (the [1]) actually appears.
Steve
Carlos wrote:
Steve Bennett wrote:
I know I've done this once before, but this one's worse:
[...]
It would be nice if the extension allowed for references to be declared after the paragraph that uses them.
Remind me what was wrong with my suggestion to allow users to declare the reference anywhere, but move it to the "References" section upon save?
Timwi
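A minimal sketch of what "declare anywhere, move on save" could look like, assuming only named references are moved and that the collected definitions are written into a <references> block at the end of the page. The function name `move_refs_to_end` and the output layout are assumptions made for illustration, not a description of how the extension behaves.

```python
import re

# Matches a named reference with a body, e.g. <ref name="bbc">...</ref>.
# The self-closing form <ref name="bbc" /> does not match, because of the "/".
REF_DEF = re.compile(r'<ref\s+name="([^"]+)"\s*>(.*?)</ref>', re.DOTALL)

def move_refs_to_end(wikitext):
    """Replace inline named definitions with short tags and collect the bodies."""
    definitions = {}

    def shorten(match):
        name, body = match.group(1), match.group(2)
        # Keep the first definition seen for each name.
        definitions.setdefault(name, body.strip())
        return f'<ref name="{name}" />'

    body = REF_DEF.sub(shorten, wikitext)
    if not definitions:
        return wikitext

    # Hypothetical layout: all definitions gathered in one trailing block.
    lines = ["<references>"]
    lines += [f'<ref name="{n}">{b}</ref>' for n, b in definitions.items()]
    lines.append("</references>")
    return body.rstrip() + "\n\n" + "\n".join(lines) + "\n"
```

Unnamed <ref>...</ref> tags are left where they are in this sketch, since there is no name to refer back to them by.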
On 8/28/06, Timwi timwi@gmx.net wrote:
Remind me what was wrong with my suggestion to allow users to declare the reference anywhere, but move it to the "References" section upon save?
My only concern is that it's slightly disconcerting when stuff moves after save. It would certainly be better than the current situation though, and I'd support seeing that change implemented until an even better solution comes along (if ever).
Steve
On Mon, Aug 28, 2006 at 01:24:36PM +0100, Timwi wrote:
Carlos wrote:
Steve Bennett wrote:
I know I've done this once before, but this one's worse:
[...]
It would be nice if the extension allowed for references to be declared after the paragraph that uses them.
Remind me what was wrong with my suggestion to allow users to declare the reference anywhere, but move it to the "References" section upon save?
The fact that lifting a graf with references and copying it to another article becomes a major hassle.
The problem as *I* see it is that the whitespace necessary to make the wikitext readable screws with the article vspace.
Cheers, -- jra
On 8/28/06, Jay R. Ashworth jra@baylink.com wrote:
The fact that lifting a graf with references and copying it to another article becomes a major hassle.
The problem as *I* see it is that the whitespace necessary to make the wikitext readable screws with the article vspace.
That's part of the problem, but not all of it.
You know, what would really be ideal (in dreamland) would be a column of references next to the edit box. You would type <ref name="foo" /> and the instant you'd closed the >, you'd have a new entry in the list to the right, just begging you to give it some details. The list would probably be stored at the bottom of the article, but you wouldn't even care. You'd always edit it directly in that list. And you could do neat things like collapsing two references together, highlighting any references which were no longer referred to in the text etc...
Steve
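The bookkeeping behind such a reference column is small. Here is a rough sketch, assuming the definitions live in a separate mapping keyed by name and the text only contains <ref name="..." /> tags; `audit_references` is a hypothetical helper, not an existing function.

```python
import re

# Matches a self-closing use of a named reference, e.g. <ref name="foo" />.
REF_USE = re.compile(r'<ref\s+name="([^"]+)"\s*/>')

def audit_references(text, definitions):
    """Return (unused, missing): names defined but never cited, and names
    cited in the text that have no definition yet."""
    used = set(REF_USE.findall(text))
    defined = set(definitions)
    unused = sorted(defined - used)
    missing = sorted(used - defined)
    return unused, missing
```

An editor could highlight the `unused` names in the list and open an empty entry for each `missing` name as soon as the tag is closed.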
On Mon, Aug 28, 2006 at 05:56:42PM +0200, Steve Bennett wrote:
On 8/28/06, Jay R. Ashworth jra@baylink.com wrote:
The fact that lifting a graf with references and copying it to another article becomes a major hassle.
The problem as *I* see it is that the whitespace necessary to make the wikitext readable screws with the article vspace.
That's part of the problem, but not all of it.
You know, what would really be ideal (in dreamland) would be a column of references next to the edit box. You would type <ref name="foo" /> and the instant you'd closed the >, you'd have a new entry in the list to the right, just begging you to give it some details. The list would probably be stored at the bottom of the article, but you wouldn't even care. You'd always edit it directly in that list. And you could do neat things like collapsing two references together, highlighting any references which were no longer referred to in the text etc...
This just keeps converging on Christiaan(?)'s XML in the DBMS approach -- once the backend is tractable, editors in front can be as smart as they like.
It occurs to me that diffing XML wouldn't be pretty.
Cheers, -- jra
On 8/28/06, Jay R. Ashworth jra@baylink.com wrote:
It occurs to me that diffing XML wouldn't be pretty.
Why ever not?
Steve
On Mon, Aug 28, 2006 at 07:44:12PM +0200, Steve Bennett wrote:
On 8/28/06, Jay R. Ashworth jra@baylink.com wrote:
It occurs to me that diffing XML wouldn't be pretty.
Why ever not?
Because in wikitext, everything is in-band; in XML, the structure is out-of-band, on purpose. This requires an entirely different, and I suspect, much more complicated diff algorithm.
Cheers, -- jra
On 8/28/06, Jay R. Ashworth jra@baylink.com wrote:
Because in wikitext, everything is in-band; in XML, the structure is out-of-band, on purpose. This requires an entirely different, and I suspect, much more complicated diff algorithm.
I don't know what "in-band" and "out-of-band" mean ([[Out of band]] doesn't help either), but if the diff engine parses the XML, it can look for a) changes in structure/markup and b) changes in content. Either one should be very easy and fast to diff, given XML-parsing library functions (for the C++ module used on WMF sites, that is). Faster than present, I don't know, but the present differ is hardly a bottleneck.
Simetrical wrote:
doesn't help either), but if the diff engine parses the XML, it can look for a) changes in structure/markup and b) changes in content. Either one should be very easy and fast to diff
XML diffs are generally rather non-trivial (see e.g. http://www.unibw.de/rz/dokumente/getFILE?fid=1076019), though it would be somewhat easier for MediaWiki to get away with them, as we're talking about a strongly constrained XML schema.
In any case, I don't see much point in discussing this; it's been made abundantly clear that even comparatively minor and much saner changes in the syntax will not happen.
On 8/29/06, Simetrical Simetrical+wikitech@gmail.com wrote:
I don't know what "in-band" and "out-of-band" mean ([[Out of band]] doesn't help either), but if the diff engine parses the XML, it can look for a) changes in structure/markup and b) changes in content.
I think what's meant is that with XML, it's basically trivial to separate text from markup - depending on how you receive the XML, that may already have been done for you. The structure and formatting thus occupy a completely separate "band" to the text being formatted.
Whereas Wikitext is a nightmare to parse :)
Steve
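To illustrate the "separate bands" point, here is a small sketch using Python's standard xml.etree.ElementTree parser, assuming a well-formed XML fragment as input. The function name `split_bands` is made up for this example.

```python
import xml.etree.ElementTree as ET

def split_bands(xml_source):
    """Return the element names and the text chunks as two separate lists."""
    root = ET.fromstring(xml_source)
    # Structure band: element tags in document order.
    structure = [element.tag for element in root.iter()]
    # Content band: text chunks in document order, whitespace-only chunks dropped.
    text = [chunk.strip() for chunk in root.itertext() if chunk.strip()]
    return structure, text
```

Once the parser has run, the markup and the text can be compared independently, which is the separation that wikitext does not give you for free.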
On Tue, Aug 29, 2006 at 09:32:48AM +0200, Steve Bennett wrote:
On 8/29/06, Simetrical Simetrical+wikitech@gmail.com wrote:
I don't know what "in-band" and "out-of-band" mean ([[Out of band]] doesn't help either), but if the diff engine parses the XML, it can look for a) changes in structure/markup and b) changes in content.
I think what's meant is that with XML, it's basically trivial to separate text from markup - depending on how you receive the XML, that may already have been done for you. The structure and formatting thus occupy a completely separate "band" to the text being formatted.
Whereas Wikitext is a nightmare to parse :)
My point was that it's a bastard to parse, but it seems intuitively that it would be *easier* to diff.
And as to "it's never going to happen"...
Never's a *long* time; if we wish to engage in gedankenexperiments about how to do an implementation that's "never going to happen"... who cares?
Cheers, -- jra
On Mon, Aug 28, 2006 at 10:52:42PM -0400, Simetrical wrote:
On 8/28/06, Jay R. Ashworth jra@baylink.com wrote:
Because in wikitext, everything is in-band; in XML, the structure is out-of-band, on purpose. This requires an entirely different, and I suspect, much more complicated diff algorithm.
I don't know what "in-band" and "out-of-band" mean ([[Out of band]] doesn't help either),
The current diff engine, with which I'm not intimately familiar (read that as: I haven't looked at the code at all, but I'm assuming it's somewhat similar to the Unix diff internals), is working on one big stream of text. The structural markup is *part* of that stream of text; hence, in-band.
but if the diff engine parses the XML, it can look for a) changes in structure/markup and b) changes in content.
Yep, and those will interact in ways different from the ways that they do now: the current diff engine need not "trip over" the edges of objects in the way that an XML parser will have to.
Either one should be very easy and fast to diff, given XML-parsing library functions (for the C++ module used on WMF sites, that is). Faster than present, I don't know, but the present differ is hardly a bottleneck.
Certainly. I wasn't suggesting that it was; rather, the opposite.
Anyone got any implementation experience with diffing XML trees?
Cheers, -- jra
Jay R. Ashworth wrote:
Anyone got any implementation experience with diffing XML trees?
I haven't done it personally, but the paper I linked to earlier looked at a bunch of implementations.
On Tue, Aug 29, 2006 at 12:06:42PM -0400, Ivan Krstić wrote:
Jay R. Ashworth wrote:
Anyone got any implementation experience with diffing XML trees?
I haven't done it personally, but the paper I linked to earlier looked at a bunch of implementations.
I'll check it out and see if it agrees with my prejudices.
On 8/29/06, Jay R. Ashworth jra@baylink.com wrote:
Anyone got any implementation experience with diffing XML trees?
No, but I'm wondering what happens if you simply flatten it down to text then diff. What's the worst that could happen?
Steve
On 8/29/06, Steve Bennett stevage@gmail.com wrote:
No, but I'm wondering what happens if you simply flatten it down to text then diff. What's the worst that could happen?
The worst that could happen is it's all on one line and your diff engine says "line 1 was changed, here are the lines side-by-side". No, that wouldn't work too well: instead, how about you
1) Compress all whitespace in each document per XML specs,
2) Replace all /(<.*?>)/ with /\n$1\n/ in each document,
3) Run a normal line-based diff, such as the one we use now.
Each tag will then be on one line, and so will the contents of each tag. Perfect? Ideal? No, but definitely usable.
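The three steps above can be sketched as follows, assuming Python's difflib stands in for "a normal line-based diff" and that collapsing all whitespace runs to a single space is an acceptable simplification of the XML whitespace rules. The function names are invented for this example.

```python
import difflib
import re

def flatten_xml(document):
    """Put every tag and every run of text on its own line."""
    # Step 1: collapse whitespace runs (a simplification of XML whitespace handling).
    document = re.sub(r"\s+", " ", document).strip()
    # Step 2: surround each tag with newlines. This naive pattern would be
    # confused by a ">" inside comments, CDATA, or attribute values.
    document = re.sub(r"(<.*?>)", r"\n\1\n", document)
    # Drop the empty lines left between adjacent tags.
    return [line.strip() for line in document.split("\n") if line.strip()]

def xml_line_diff(old, new):
    """Step 3: run an ordinary line-based diff over the flattened documents."""
    return list(
        difflib.unified_diff(flatten_xml(old), flatten_xml(new), lineterm="", n=0)
    )
```

For example, changing one word inside a <b> element shows up as a one-line change to the text line, while the surrounding tag lines are reported as unchanged.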
On Tue, Aug 29, 2006 at 06:42:38PM +0200, Steve Bennett wrote:
On 8/29/06, Jay R. Ashworth jra@baylink.com wrote:
Anyone got any implementation experience with diffing XML trees?
No, but I'm wondering what happens if you simply flatten it down to text then diff. What's the worst that could happen?
Well, that would likely make the job back into what we have now, yes.
Cheers, -- jra
--- Steve Bennett stevage@gmail.com wrote:
I know I've done this once before, but this one's worse: Can you believe that in that chunk of text, there are actually three separate pieces of text, with two references between them? It's totally unmanageable - attempting to actually edit the text that's buried in there as a cohesive whole is next to impossible. Solutions desperately wanted.
Yeah, that's bad. The detailed part of the references really should be under a 'Works cited' subsection of the ==References== section while something like this <ref>bbc.co.uk "The Girl that named Pluto"</ref> should be inline. The whole point of wiki syntax is to make it possible to easily read and edit source text (unlike HTML). But having complete reference info directly inline defeats that.
-- mav
On Sun, Aug 27, 2006 at 12:03:15PM -0700, Daniel Mayer wrote:
--- Steve Bennett stevage@gmail.com wrote:
I know I've done this once before, but this one's worse: Can you believe that in that chunk of text, there are actually three separate pieces of text, with two references between them? It's totally unmanageable - attempting to actually edit the text that's buried in there as a cohesive whole is next to impossible. Solutions desperately wanted.
Yeah, that's bad. The detailed part of the references really should be under a 'Works cited' subsection of the ==References== section while something like this <ref>bbc.co.uk "The Girl that named Pluto"</ref> should be inline. The whole point of wiki syntax is to make it possible to easily read and edit source text (unlike HTML). But having complete reference info directly inline defeats that.
Yes; that's the argument we were having last week.
Cheers, -- jra
wikitech-l@lists.wikimedia.org