On Mon, May 2, 2011 at 5:28 PM, Tim Starling <tstarling(a)wikimedia.org> wrote:
> On 03/05/11 04:25, Brion Vibber wrote:
>> The most fundamental problem with Wikia's editor remains its fallback
>> behavior when some structure is unsupported:
>>
>> "Source mode required
>> Rich text editing has been disabled because the page contains complex
>> code."
>
> I don't think that's a fundamental problem, I think it's a quick hack
> added to reduce the development time devoted to rare wikitext
> constructs, while maintaining round-trip safety. Like you said further
> down in your post, it can be handled more elegantly by replacing the
> complex code with a placeholder. Why not just do that?
Excellent question -- how hard would it be to change that?
I'm fairly sure that's easier to do with an abstract parse tree generated
from source (don't recognize it? stash it in a dedicated blob); I worry it
may be harder trying to stash that into the middle of a multi-level HTML
translation engine that wasn't meant to be reversible in the first place (do
we even know if there's an opportunity to recognize the problem component
within the annotated HTML or not? Is it seeing things it doesn't recognize
in the HTML, or is it seeing certain structures in the source and aborting
before it even gets there?).
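
To make that concrete, here's roughly the shape I have in mind for the
parse-tree route (purely a sketch for this mail; the node shapes and
function names are made up, not anything that exists in the gadget yet):

  // Anything the parser doesn't understand becomes an opaque node that
  // just carries the original source text verbatim.
  var tree = [
    { type: 'text', text: 'Hello ' },
    { type: 'bold', children: [ { type: 'text', text: 'world' } ] },
    { type: 'unparsed', src: '{{#some-weird-extension: stuff}}' }
  ];

  // Round-tripping back to wikitext is trivially safe for the opaque
  // node: just emit the stashed source unchanged.
  function nodeToSource(node) {
    switch (node.type) {
      case 'text':     return node.text;
      case 'bold':     return "'''" + node.children.map(nodeToSource).join('') + "'''";
      case 'unparsed': return node.src;
    }
  }

  // The editing view can show a placeholder for the opaque node instead
  // of kicking the whole page out to source mode.
  function nodeToHtml(node) {
    switch (node.type) {
      case 'text':     return node.text.replace(/&/g, '&amp;').replace(/</g, '&lt;');
      case 'bold':     return '<b>' + node.children.map(nodeToHtml).join('') + '</b>';
      case 'unparsed': return '<span class="placeholder" contenteditable="false">[complex code]</span>';
    }
  }

tree.map(nodeToSource).join('') hands back exactly the original wikitext,
'{{#some-weird-extension: stuff}}' included; the editor only ever sees a
placeholder for the one bit it couldn't handle.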
Like many such things, this might be better resolved by trying it and seeing
what happens -- I don't want us to lock into a strategy too early when a lot
of ideas are still unresolved.
I'm very interested in making experimentation easy; for my pre-exploratory
work I'm stashing things into a gadget which adds render/parse
tree/inspector modes to the editing page:
http://www.mediawiki.org/wiki/File:Parser_Playground_demo.png (screenshot &
links)
I've got this set up as a gadget on mediawiki.org now and as a user script
on en.wikipedia.org (loaded on User:Brion_VIBBER/vector.js) just for
tossing random pages in and getting a better sense of how things break down.
Currently parser variant choices are:
* the actual MediaWiki parser via API (parse tree shows the preprocessor
XML; side-by-side mode doesn't have a working inspector mode though)
* a really crappy FakeParser class I threw together, able to handle only a
few constructs. Generates a JSON parse tree, and the inspector mode can
match up nodes in side-by-side view of the tree & HTML.
* PegParser using the peg.js parser generator to build the source->tree
parser, plus the same tree->html and tree->source round-trip functions as
FakeParser. The peg source can be edited and rerun to regenerate the parse
tree. It's fun!
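
For flavor, the peg grammar can start out about this small (a toy sketch,
not the grammar actually in the gadget; it only knows plain text and bold):

  start
    = nodes:node* { return { type: 'doc', children: nodes }; }

  node
    = bold / text

  bold
    = "'''" content:plain* "'''"
      { return { type: 'bold', text: content.join('') }; }

  text
    = chars:plain+ { return { type: 'text', text: chars.join('') }; }

  plain
    = !"'''" c:. { return c; }

Feed it "Hello '''world'''" and you get back a little JSON tree with a text
node and a bold node, which the same tree->html and tree->source walkers can
consume. (Unbalanced ''' just makes the toy grammar throw, which is exactly
the kind of case that wants the opaque-blob treatment above.)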
These are a long way off from the level of experimental support we're going
to want, but I think people are going to benefit from trying a few different
things and getting a better feel for how source, parse trees, and resulting
HTML really will look.
(Template expansion isn't yet presented in this system, and that's going to
be where the real fun is. ;)
> Some people in this thread have expressed concerns about the tiny
> breakages in wikitext backwards compatibility introduced by RTE,
> despite the fact that RTE has aimed for, and largely achieved, precise
> backwards compatibility with legacy wikitext.
>
> I find it hard to believe that those people would be comfortable with
> a project which has as its goal a broad reform of wikitext syntax.
>
> Perhaps there are good arguments for wikitext syntax reform, but I
> have trouble believing that WYSIWYG support is one of them, since the
> problem appears to have been solved already by RTE, without any reform.
Well, Wikia's RTE still doesn't work on high-profile Wikipedia article
pages, so that remains unproven...
That said, an RTE that doesn't require changing core parser behavior yet
*WILL BE A HUGE BENEFIT* to getting it into use sooner, and still leaves
future reform efforts open.
I'm *VERY OPEN* to the notion of doing the RTE using either a supplementary
source-level parser (which doesn't have to render all structures 100% the
same as the core parser, but *needs* to always create sensible structures
that are useful for editors and can round-trip cleanly) or an alternate
version of the core parser with annotations and limited transformations (e.g.
like how we don't strip comments out when producing editable source, so we
need to keep them in the output in some way if it's going to be fed into an
HTML-ish editing view).
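
For the comment case specifically, the sort of thing I'd imagine (just a
sketch of the general idea, nothing implemented; the names are made up) is
emitting the comment into the annotated HTML as an inert element that
carries the original source, so the editing view can hand it straight back:

  // Keep a wikitext comment alive through the HTML-ish editing view by
  // stashing its original source in a data attribute on an empty span.
  function commentToAnnotatedHtml(commentSrc) {
    var span = document.createElement('span');
    span.className = 'wiki-comment';
    span.setAttribute('contenteditable', 'false');
    span.setAttribute('data-wiki-src', commentSrc); // e.g. '<!-- fixme -->'
    return span;
  }

  // On the way back out, any annotated node carrying original source is
  // serialized by copying that source through unchanged.
  function annotatedNodeToSource(node) {
    if (node.nodeType === 1 && node.hasAttribute('data-wiki-src')) {
      return node.getAttribute('data-wiki-src');
    }
    // ... handle headings, lists, etc. here ...
    return node.textContent;
  }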
A supplementary parser that deals with all your editing fun, but doesn't
play super nice with open...close templates is probably just fine for a huge
number of purposes.
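
And whichever parser it is, the round-trip requirement is cheap to check
mechanically; a harness along these lines (parseToTree / treeToSource are
placeholders for whatever variant is under test), run over a dump of real
articles, would tell us exactly which constructs still need the opaque-blob
or source-mode treatment:

  // Sketch of a round-trip check: parse, serialize back, compare.
  function roundTripOk(srcText) {
    return treeToSource(parseToTree(srcText)) === srcText;
  }

  // Collect the titles of pages that don't survive the round trip.
  function reportFailures(pages) {
    return pages.filter(function (page) {
      return !roundTripOk(page.text);
    }).map(function (page) {
      return page.title;
    });
  }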
> Now that we have HipHop support, we have the ability to turn
> MediaWiki's core parser into a fast, reusable library. The performance
> reasons for limiting the amount of abstraction in the core parser will
> disappear. How many wikitext parsers does the world really need?
I'm not convinced that a giant blob of MediaWiki is suitable as a reusable
library, but would love to see it tried.
-- brion