On 29 December 2010 02:07, Happy-melon <happy-melon(a)live.com> wrote:
There are some things that we know:
1) as Brion says, MediaWiki currently only presents content in one way: as
wikitext run through the parser. He may well be right that there is a
bigger fish which could be caught than WYSIWYG editing by saying that MW
should present data in other new and exciting ways, but that's actually a
separate question. *If* you wish to solve WYSIWYG editing, your baseline is
wikitext and the parser.
Specifically, it only presents content as HTML. It's not really a
parser because it doesn't create an AST (Abstract Syntax Tree). It's a
wikitext to HTML converter. The flavour of the HTML can be somewhat
modulated by the skin but it could never output directly to something
totally different like RTF or PDF.
2) "guacamole" is one of the more unusual
descriptors I've heard for the
parser, but it's far from the worst. We all agree that it's horribly messy
and most developers treat it like either a sleeping dragon or a *very*
grumpy neighbour. I'd say that the two biggest problems with it are that a)
it's buried so deep in the codebase that literally the only way to get your
wikitext parsed is to fire up the whole of the rest of MediaWiki around it
to give it somewhere comfy to live in,
I have started to advocate the isolation of the parser from the rest
of the innards of MediaWiki for just this reason:
https://bugzilla.wikimedia.org/show_bug.cgi?id=25984
Free it up so that anybody can embed it in their code and get exactly
the same rendering that Wikipedia et al get, guaranteed.
We have to find all the edges where the parser calls other parts of
MediaWiki and all the edges where other parts of MediaWiki call the
parser. We then define these edges as interfaces so that we can drop
an alternative parser into MediaWiki and drop the current parser into,
say, an offline viewer or whatever.
With a freed-up parser, more people will hack on it, more people will
come to grok it and come up with strategies to address some of its
problems. It should also be a boon for unit testing.
(I have a very rough prototype working by the way with lots of stub classes)
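The "edges as interfaces" idea above could be sketched as something like the following. This is purely illustrative: the names (`WikiEnvironment`, `StandaloneEnvironment`, `parse`) are invented for this sketch and are not MediaWiki APIs, and the toy converter stands in for the real parser.

```python
import re

# Hypothetical sketch: every place where the parser would call back
# into MediaWiki becomes a method on an explicit environment interface.

class WikiEnvironment:
    """Everything the parser needs from its host application."""
    def get_template(self, title: str) -> str:
        raise NotImplementedError
    def exists(self, title: str) -> bool:
        raise NotImplementedError

class StandaloneEnvironment(WikiEnvironment):
    """A host with no database behind it - e.g. an offline viewer."""
    def __init__(self, pages):
        self.pages = pages
    def get_template(self, title):
        return self.pages.get(title, "")
    def exists(self, title):
        return title in self.pages

def parse(wikitext: str, env: WikiEnvironment) -> str:
    """Toy converter: expands {{templates}} through the environment,
    then renders '''bold''' - standing in for the real parser."""
    expanded = re.sub(r"\{\{(.+?)\}\}",
                      lambda m: env.get_template(m.group(1).strip()),
                      wikitext)
    return re.sub(r"'''(.+?)'''", r"<b>\1</b>", expanded)

env = StandaloneEnvironment({"stub": "a '''template'''"})
print(parse("Hello {{stub}}!", env))  # Hello a <b>template</b>!
```

The point of the shape, not the details: once every edge is a method on the environment, the same parser can be backed by a live wiki, a test fixture, or an offline dump.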
and b) there is, as David says, no way
of explaining what it's supposed to be doing except by saying "follow the
code; whatever it does is what it's supposed to do". It seems to be generally
accepted that it is *impossible* to represent everything the parser does in
any standard grammar.
I've thought a lot about this too. It certainly is not any type of
standard grammar. But on the other hand it is a pretty common kind of
nonstandard grammar. I call it a "recursive text replacement grammar".
Perhaps this type of grammar has some useful characteristics we can
discover and document. It may be possible to follow the code flow and
document each text replacement in sequence as a kind of parser spec
rather than trying and failing again to shoehorn it into a standard
LALR grammar.
If it is possible to extract such a spec it would then be possible to
implement it in other languages.
Some research may even find that it is possible to transform such a
grammar deterministically into an LALR grammar...
But even if not, I'm certain it would demystify what happens in the
parser so that problems and edge cases would be easier to locate.
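A "recursive text replacement grammar" in the sense described above might look, very roughly, like an ordered list of replacement passes repeated until the text stops changing. The rules below are invented for illustration and are nothing like real wikitext's full rule set; only the overall shape is the point.

```python
import re

# Illustrative only: an ordered sequence of text replacements,
# re-applied until a fixed point is reached. These three rules are
# made up; they do not reflect the actual MediaWiki parser passes.
RULES = [
    (re.compile(r"'''(.+?)'''"), r"<b>\1</b>"),              # bold
    (re.compile(r"''(.+?)''"), r"<i>\1</i>"),                # italics
    (re.compile(r"\[\[(.+?)\]\]"), r'<a href="\1">\1</a>'),  # links
]

def replace_until_fixed(text: str) -> str:
    """Apply each pass in order, repeating the whole sequence
    until no rule fires - the 'recursive' part of the grammar."""
    while True:
        new = text
        for pattern, repl in RULES:
            new = pattern.sub(repl, new)
        if new == text:
            return new
        text = new

print(replace_until_fixed("''[[Home]]'' is '''bold'''"))
```

Documenting the real parser this way would mean cataloguing each actual pass and the order it runs in, rather than forcing the whole thing into a single production-rule grammar.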
Andrew Dunbar (hippietrail)
Those are all standard gripes, and nothing new or
exciting. There are also,
to quote a much-abused former world leader, some known unknowns:
1) we don't know how to explain What You See when you parse wikitext except
by prodding an exceedingly grumpy hundred thousand lines of PHP and *asking
What it thinks* You Get.
2) We don't know how to create a WYSIWYG editor for wikitext.
Now, I'd say we have some unknown unknowns.
1) *is* it because of wikitext's idiosyncrasies that WYSIWYG is so
difficult? Is wikitext *by its nature* not amenable to WYSIWYG editing?
2) would a wikitext which *was* representable in a standard grammar be
amenable to WYSIWYG editing?
3) would a wikitext which had an alternative parser, one that was not buried
in the depths of MW (perhaps a full JS library that could be called in
real-time on the client), be amenable to WYSIWYG editing?
4) are questions 2 and 3 synonymous?
--HM
"David Gerard" <dgerard(a)gmail.com> wrote in
message news:AANLkTimthUx-UndO1CTnexcRqbPP89t2M-PVhA6FkFp8@mail.gmail.com...
[crossposted to foundation-l and wikitech-l]
"There has to be a vision though, of something better. Maybe something
that is an actual wiki, quick and easy, rather than the template
coding hell Wikipedia's turned into." - something Fred Bauder just
said on wikien-l.
Our current markup is one of our biggest barriers to participation.
AIUI, edit rates are about half what they were in 2005, even as our
fame has gone from "popular" through "famous" to "part of the
structure of the world." I submit that this is not a good or healthy
thing in any way and needs fixing.
People who can handle wikitext really just do not understand how
off-putting the computer guacamole is to people who can cope with text
they can see.
We know this is a problem; WYSIWYG that works is something that's been
wanted here forever. There are various hideous technical nightmares in
its way, that make this a big and hairy problem, of the sort where the
hair has hair.
However, I submit that it's important enough we need to attack it with
actual resources anyway.
This is just one data point, where a Canadian government office got
*EIGHT TIMES* the participation in their intranet wiki by putting in a
(heavily locally patched) copy of FCKeditor:
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/034062.html
"I have to disagree with you given my experience. In one government
department where MediaWiki was installed we saw the active user base
spike from about 1000 users to about 8000 users within a month of having
enabled FCKeditor. FCKeditor definitely has its warts, but it very
closely matches the experience non-technical people have gotten used to
while using Word or WordPerfect. Leveraging skills people already have
cuts down on training costs and allows them to be productive almost
immediately."
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/034071.html
"Since a plethora of intelligent people with no desire to learn WikiCode
can now add content, the quality of posts has been in line with the
adoption of wiki use by these people. Thus one would say it has gone up.
"In the beginning there were some hard core users that learned WikiCode,
for the most part they have indicated that when the WYSIWYG fails, they
are able to switch to WikiCode mode to address the problem. This usually
occurs with complex table nesting which is something that few of the
users do anyways. Most document layouts are kept simple. Additionally,
we have a multilingual English/French wiki. As a result the browser
spell-check is insufficient for the most part (not to mention it has
issues with WikiCode). To address this a second spellcheck button was
added to the interface so that both English and French spellcheck could
be available within the same interface (via aspell backend)."
So, the payoffs could be ridiculously huge: eight times the number of
smart and knowledgeable people even being able to *fix typos* on
material they care about.
Here are some problems. (Off the top of my head; please do add more,
all you can think of.)
- The problem:
* Fidelity with the existing body of wikitext. No conversion flag day.
The current body exploits every possible edge case in the regular
expression guacamole we call a "parser". Tim said a few years ago that
any solution has to account for the existing body of text.
* Two-way fidelity. Those who know wikitext will demand to keep it and
will bitterly resist any attempt to take it away from them.
* FCKeditor (now CKeditor) in MediaWiki is all but unmaintained.
* There is no specification for wikitext. Well, there almost is -
compiled as C, it runs a bit slower than the existing PHP parser.
But it's a start!
http://lists.wikimedia.org/pipermail/wikitext-l/2010-August/000318.html
- Attempting to solve it:
* The best brains around Wikipedia, MediaWiki and WMF have dashed
their foreheads against this problem for at least the past five years
and have got *nowhere*. Tim has a whole section in the SVN repository
for "new parser attempts". Sheer brilliance isn't going to solve this
one.
* Tim doesn't scale. Most of our other technical people don't scale.
*We have no resources and still run on almost nothing*.
($14m might sound like enough money to run a popular website, but for
comparison, I work as a sysadmin at a tiny, tiny publishing company
with more money and staff just in our department than that to do
*almost nothing* compared to what WMF achieves. WMF is an INCREDIBLY
efficient organisation.)
- Other attempts:
* Starting from a clear field makes it ridiculously easy. The
government example quoted above is one. Wikia wrote a good WYSIWYG
that works really nicely on new wikis (I'm speaking here as an
experienced wikitext user who happily fixes random typos on Wikia). Of
course, I noted that we can't start from a clear field - we have an
existing body of wikitext.
So, specification of the problem:
* We need good WYSIWYG. The government example suggests that a simple
word-processor-like interface would be enough to give tremendous
results.
* It needs two-way fidelity with almost all existing wikitext.
* We can't throw away existing wikitext, much as we'd love to.
* It's going to cost money in programming the WYSIWYG.
* It's going to cost money in rationalising existing wikitext so that
the most unfeasible formations can be shunted off to legacy for
chewing on.
* It's going to cost money in usability testing and so on.
* It's going to cost money for all sorts of things I haven't even
thought of yet.
This is a problem that would pay off hugely to solve, and that will
take actual money thrown at it.
How would you attack this problem, given actual resources for grunt work?
- d.
_______________________________________________
foundation-l mailing list
foundation-l(a)lists.wikimedia.org
Unsubscribe:
https://lists.wikimedia.org/mailman/listinfo/foundation-l