You're right Steve - I missed the pound symbol in my retort.
But some good came out of it. These two constructs produce identical HTML:
;#what does: this render?
;#how about this?
;#or this?
;what does
:# this render
:# how about this?
:# or this?
So we conclude that:
;#A: B
is shorthand for:
;A
:# B
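
And if I'm reading the rendered output right, both forms come out as the same
nested structure, roughly like this (a sketch of the structure only - I haven't
diffed the parser's exact byte-for-byte output, so whitespace and tag details
may differ):

<dl>
<dt>A</dt>
<dd>
<ol>
<li>B</li>
</ol>
</dd>
</dl>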
-- Jim
> > On Nov 12, 2007 7:54 PM, Nick Jenkins <nickpj(a)gmail.com> wrote:
> > > * Should be implemented in the same language (i.e. PHP) so that any
> > > comparisons are comparing-apples-with-apples, and so that it can run
> > > on the current installed base of servers as-is. Having other
> > > implementations in other languages is fine (e.g. you could have a
> > > super-fast version in C too) just provide one in PHP that can be
> > > directly compared with the current parser for performance and
> > > backwards-compatibility.
> >
> > That condition seems bizarre. The parser is either faster or it's
> > slower. Whether it's faster because it's implemented in C is
> > irrelevant: it's faster.
> > In any case I thought it had been decided that it had to be in PHP?
> No, I think if we can get a 20:1 speedup for a C version, they'd take
> it. :-0
I don't doubt it in the case of most large wiki farms - but numerically
most installations of MediaWiki are on small wikis, probably running on
shared hosts, and in those situations using a C-based parser is either
not possible, or significantly more complicated than running a PHP
script. So for those installs, if the speed of a PHP parser suddenly
gets much worse, then I expect those admins would complain. So whilst a
faster parser is a faster parser, if it requires running code that you
can't run, then it ain't going to do you much good. A custom super-fast
wiki-farm parser is great, but the general-case parser should have
similar performance characteristics and the same software requirements
(i.e. the test is that nobody should be noticeably worse off).
> > > * Should have a worst-case render time no worse than 2x slower on
> > > any given input.
> > Any given? That's not reasonable. Perhaps "Any given existing
> > Wikipedia page"? It would be too easy to find some construct that is
> > rendered quickly by the existing parser but is slow with the new one,
> > then create a page that contained 5000 examples of that construct.
> Sure; pathological cases are always possible. Let's say "on any 10
> randomly chosen already extant pages of wikitext."
The current parser (from my perspective) seems to cope quite well with malformed
input. So all I'm saying is that if a replacement parser could behave similarly
then that would be good - although I take your point that the input that is
considered pathological could be different for different parsers, so let's say that
the render time on randomly generated malformed input should be equivalent on average.
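
To make that concrete, the sort of comparison I have in mind is roughly the
following (just a sketch - renderWithCurrentParser(), renderWithNewParser()
and loadRandomPages() are placeholder names for whatever the two
implementations and the test harness actually expose, not existing MediaWiki
functions):

<?php
// Rough timing sketch: render the same sample of existing pages with both
// parsers and compare the totals. All three helper functions are placeholders.

function timeParser( $render, $pages ) {
    $total = 0.0;
    foreach ( $pages as $wikitext ) {
        $start = microtime( true );
        call_user_func( $render, $wikitext );
        $total += microtime( true ) - $start;
    }
    return $total;
}

$pages = loadRandomPages( 10 );  // e.g. 10 randomly chosen existing pages

$old = timeParser( 'renderWithCurrentParser', $pages );
$new = timeParser( 'renderWithNewParser', $pages );

printf( "current: %.3fs  candidate: %.3fs  ratio: %.2fx\n",
    $old, $new, $new / $old );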
> The English Wikipedia does provide an excellent environment to test
> the English language environment. It does not do the same for other
> languages. Remember that MediaWiki supports over 250 languages?
Indeed - it's only intended as a test for performance and most functionality. For
a more complete compatibility test with a variety of languages, you'd probably need
to test against all the database dumps at:
http://download.wikimedia.org/
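
For example, something along these lines could be pointed at any of those
pages-articles dumps (again just a sketch - the two render*() functions are
placeholders, and you'd decompress the dump first):

<?php
// Sketch: stream <text> nodes out of a pages-articles XML dump and count
// pages where a candidate parser's output differs from the current parser's.
// renderWithCurrentParser() and renderWithNewParser() are placeholders.

$reader = new XMLReader();
$reader->open( 'pages-articles.xml' );  // same approach for any language's dump

$mismatches = 0;
while ( $reader->read() ) {
    if ( $reader->nodeType == XMLReader::ELEMENT && $reader->name == 'text' ) {
        $wikitext = $reader->expand()->textContent;
        if ( renderWithCurrentParser( $wikitext ) !== renderWithNewParser( $wikitext ) ) {
            $mismatches++;
        }
    }
}
echo "$mismatches pages rendered differently\n";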
> > > * When running parserTests should introduce a net total of no more
> > > than (say) 2 regressions (e.g. if you break 5 parser tests, then
> > > you have to fix 3 or more parser tests that are currently broken).
> > I'm not familiar enough with the current set of tests to comment on that.
The core tests are in maintenance/parserTests.txt
( http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/maintenance/parserTe… )
and generally follow a structure with the name of the test, the wiki text
input, and the expected XHTML output, for example:
!! test
Preformatted text
!! input
 This is some
 Preformatted text
 With ''italic''
 And '''bold'''
 And a [[Main Page|link]]
!! result
<pre>This is some
Preformatted text
With <i>italic</i>
And <b>bold</b>
And a <a href="/wiki/Main_Page" title="Main Page">link</a>
</pre>
!! end
It's probably a pretty good place to start with writing a parser, in terms of what
the expected behaviour is. Then probably after that comes testing against user-generated
input versus the current parser.
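
If it helps, pulling the tests out of that file is only a few lines of PHP -
something like this (a sketch of the sectioning only; the real runner,
maintenance/parserTests.php, does a lot more, e.g. article setup and test
options):

<?php
// Sketch: split parserTests.txt into its !! test / !! input / !! result
// sections. Only handles the basic layout shown above; the real runner in
// maintenance/parserTests.php handles far more (options, articles, etc.).

function readParserTests( $filename ) {
    $tests = array();
    $current = array();
    $section = null;
    foreach ( file( $filename ) as $line ) {
        if ( preg_match( '/^!!\s*(\w+)/', $line, $m ) ) {
            $section = strtolower( $m[1] );
            if ( $section == 'end' ) {
                $tests[] = $current;
                $current = array();
                $section = null;
            } else {
                $current[$section] = '';
            }
        } elseif ( $section !== null ) {
            $current[$section] .= $line;
        }
    }
    return $tests;  // entries keyed by 'test' (name), 'input' and 'result'
}

foreach ( readParserTests( 'maintenance/parserTests.txt' ) as $t ) {
    // feed $t['input'] to the candidate parser and diff against $t['result'] here
}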
--
All the best,
Nick.
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
http://lists.wikimedia.org/mailman/listinfo/wikitech-l