On 29 December 2010 02:07, Happy-melon <happy-melon(a)live.com> wrote:
There are some things that we know:
1) as Brion says, MediaWiki currently only presents content in one way: as
wikitext run through the parser. He may well be right that there is a
bigger fish which could be caught than WYSIWYG editing by saying that MW
should present data in other new and exciting ways, but that's actually a
separate question. *If* you wish to solve WYSIWYG editing, your baseline is
wikitext and the parser.
Specifically, it only presents content as HTML. It's not really a
parser because it doesn't create an AST (Abstract Syntax Tree). It's a
wikitext to HTML converter. The flavour of the HTML can be somewhat
modulated by the skin but it could never output directly to something
totally different like RTF or PDF.
2) "guacamole" is one of the more unusual
descriptors I've heard for the
parser, but it's far from the worst. We all agree that it's horribly messy
and most developers treat it like either a sleeping dragon or a *very*
grumpy neighbour. I'd say that the two biggest problems with it are that a)
it's buried so deep in the codebase that literally the only way to get your
wikitext parsed is to fire up the whole of the rest of MediaWiki around it
to give it somewhere comfy to live in,
I have started to advocate the isolation of the parser from the rest
of the innards of MediaWiki for just this reason:
https://bugzilla.wikimedia.org/show_bug.cgi?id=25984
Free it up so that anybody can embed it in their code and get exactly
the same rendering that Wikipedia et al get, guaranteed.
We have to find all the edges where the parser calls other parts of
MediaWiki and all the edges where other parts of MediaWiki call the
parser. We then define these edges as interfaces so that we can drop
an alternative parser into MediaWiki and drop the current parser into,
say, an offline viewer or whatever.
With a freed-up parser, more people will hack on it, more people will
come to grok it and come up with strategies to address some of its
problems. It should also be a boon for unit testing.
(I have a very rough prototype working by the way with lots of stub classes)
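The "edges as interfaces" idea above could be sketched as something like the following. This is purely illustrative: the names (`WikiEnvironment`, `StandaloneEnvironment`, `parse`) are invented for this sketch and are not MediaWiki APIs, and the toy converter stands in for the real parser.

```python
import re

# Hypothetical sketch: every place where the parser would call back
# into MediaWiki becomes a method on an explicit environment interface.

class WikiEnvironment:
    """Everything the parser needs from its host application."""
    def get_template(self, title: str) -> str:
        raise NotImplementedError
    def exists(self, title: str) -> bool:
        raise NotImplementedError

class StandaloneEnvironment(WikiEnvironment):
    """A host with no database behind it - e.g. an offline viewer."""
    def __init__(self, pages):
        self.pages = pages
    def get_template(self, title):
        return self.pages.get(title, "")
    def exists(self, title):
        return title in self.pages

def parse(wikitext: str, env: WikiEnvironment) -> str:
    """Toy converter: expands {{templates}} through the environment,
    then renders '''bold''' - standing in for the real parser."""
    expanded = re.sub(r"\{\{(.+?)\}\}",
                      lambda m: env.get_template(m.group(1).strip()),
                      wikitext)
    return re.sub(r"'''(.+?)'''", r"<b>\1</b>", expanded)

env = StandaloneEnvironment({"stub": "a '''template'''"})
print(parse("Hello {{stub}}!", env))  # Hello a <b>template</b>!
```

The point of the shape, not the details: once every edge is a method on the environment, the same parser can be backed by a live wiki, a test fixture, or an offline dump.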
and b) there is, as David says, no way
of explaining what it's supposed to be doing except by saying "follow the
code; whatever it does is what it's supposed to do". It seems to be generally
accepted that it is *impossible* to represent everything the parser does in
any standard grammar.
I've thought a lot about this too. It certainly is not any type of
standard grammar. But on the other hand it is a pretty common kind of
nonstandard grammar. I call it a "recursive text replacement grammar".
Perhaps this type of grammar has some useful characteristics we can
discover and document. It may be possible to follow the code flow and
document each text replacement in sequence as a kind of parser spec
rather than trying and failing again to shoehorn it into a standard
LALR grammar.
If it is possible to extract such a spec it would then be possible to
implement it in other languages.
Some research may even find that it is possible to transform such a
grammar deterministically into an LALR grammar...
But even if not, I'm certain it would demystify what happens in the
parser so that problems and edge cases would be easier to locate.
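A "recursive text replacement grammar" in the sense described above might look, very roughly, like an ordered list of replacement passes repeated until the text stops changing. The rules below are invented for illustration and are nothing like real wikitext's full rule set; only the overall shape is the point.

```python
import re

# Illustrative only: an ordered sequence of text replacements,
# re-applied until a fixed point is reached. These three rules are
# made up; they do not reflect the actual MediaWiki parser passes.
RULES = [
    (re.compile(r"'''(.+?)'''"), r"<b>\1</b>"),              # bold
    (re.compile(r"''(.+?)''"), r"<i>\1</i>"),                # italics
    (re.compile(r"\[\[(.+?)\]\]"), r'<a href="\1">\1</a>'),  # links
]

def replace_until_fixed(text: str) -> str:
    """Apply each pass in order, repeating the whole sequence
    until no rule fires - the 'recursive' part of the grammar."""
    while True:
        new = text
        for pattern, repl in RULES:
            new = pattern.sub(repl, new)
        if new == text:
            return new
        text = new

print(replace_until_fixed("''[[Home]]'' is '''bold'''"))
```

Documenting the real parser this way would mean cataloguing each actual pass and the order it runs in, rather than forcing the whole thing into a single production-rule grammar.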
Andrew Dunbar (hippietrail)
Those are all standard gripes, and nothing new or
exciting. There are also,
to quote a much-abused former world leader, some known unknowns:
1) we don't know how to explain What You See when you parse wikitext except
by prodding an exceedingly grumpy hundred thousand lines of PHP and *asking
What it thinks* You Get.
2) We don't know how to create a WYSIWYG editor for wikitext.
Now, I'd say we have some unknown unknowns.
1) *is* it because of wikitext's idiosyncrasies that WYSIWYG is so
difficult? Is wikitext *by its nature* not amenable to WYSIWYG editing?
2) would a wikitext which *was* representable in a standard grammar be
amenable to WYSIWYG editing?
3) would a wikitext which had an alternative parser, one that was not buried
in the depths of MW (perhaps a full JS library that could be called in
real-time on the client), be amenable to WYSIWYG editing?
4) are questions 2 and 3 synonymous?
--HM
"David Gerard" <dgerard(a)gmail.com> wrote in
message news:AANLkTimthUx-UndO1CTnexcRqbPP89t2M-PVhA6FkFp8@mail.gmail.com...
[crossposted to foundation-l and wikitech-l]
"There has to be a vision though, of something better. Maybe something
that is an actual wiki, quick and easy, rather than the template
coding hell Wikipedia's turned into." - something Fred Bauder just
said on wikien-l.
Our current markup is one of our biggest barriers to participation.
AIUI, edit rates are about half what they were in 2005, even as our
fame has gone from "popular" through "famous" to "part of the
structure of the world." I submit that this is not a good or healthy
thing in any way and needs fixing.
People who can handle wikitext really just do not understand how
off-putting the computer guacamole is to people who can cope with text
they can see.
We know this is a problem; WYSIWYG that works is something that's been
wanted here forever. There are various hideous technical nightmares in
its way, that make this a big and hairy problem, of the sort where the
hair has hair.
However, I submit that it's important enough we need to attack it with
actual resources anyway.
This is just one data point, where a Canadian government office got
*EIGHT TIMES* the participation in their intranet wiki by putting in a
(heavily locally patched) copy of FCKeditor:
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/034062.html
"I have to disagree with you given my experience. In one government
department where MediaWiki was installed we saw the active user base
spike from about 1000 users to about 8000 users within a month of having
enabled FCKeditor. FCKeditor definitely has its warts, but it very
closely matches the experience non-technical people have gotten used to
while using Word or WordPerfect. Leveraging skills people already have
cuts down on training costs and allows them to be productive almost
immediately."
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/034071.html
"Since a plethora of intelligent people with no desire to learn WikiCode
can now add content, the quality of posts has been in line with the
adoption of wiki use by these people. Thus one would say it has gone up.
"In the beginning there were some hard core users that learned WikiCode,
for the most part they have indicated that when the WYSIWYG fails, they
are able to switch to WikiCode mode to address the problem. This usually
occurs with complex table nesting which is something that few of the
users do anyways. Most document layouts are kept simple. Additionally,
we have a multilingual English/French wiki. As a result the browser
spell-check is insufficient for the most part (not to mention it has
issues with WikiCode). To address this a second spellcheck button was
added to the interface so that both English and French spellcheck could
be available within the same interface (via aspell backend)."
So, the payoffs could be ridiculously huge: eight times the number of
smart and knowledgeable people even being able to *fix typos* on
material they care about.
Here are some problems. (Off the top of my head; please do add more,
all you can think of.)
- The problem:
* Fidelity with the existing body of wikitext. No conversion flag day.
The current body exploits every possible edge case in the regular
expression guacamole we call a "parser". Tim said a few years ago that
any solution has to account for the existing body of text.
* Two-way fidelity. Those who know wikitext will demand to keep it and
will bitterly resist any attempt to take it away from them.
* FCKeditor (now CKeditor) in MediaWiki is all but unmaintained.
* There is no specification for wikitext. Well, there almost is -
compiled as C, it runs a bit slower than the existing PHP parser.
But it's a start!
http://lists.wikimedia.org/pipermail/wikitext-l/2010-August/000318.html
- Attempting to solve it:
* The best brains around Wikipedia, MediaWiki and WMF have dashed
their foreheads against this problem for at least the past five years
and have got *nowhere*. Tim has a whole section in the SVN repository
for "new parser attempts". Sheer brilliance isn't going to solve this
one.
* Tim doesn't scale. Most of our other technical people don't scale.
*We have no resources and still run on almost nothing*.
($14m might sound like enough money to run a popular website, but for
comparison, I work as a sysadmin at a tiny, tiny publishing company
with more money and staff just in our department than that to do
*almost nothing* compared to what WMF achieves. WMF is an INCREDIBLY
efficient organisation.)
- Other attempts:
* Starting from a clear field makes it ridiculously easy. The
government example quoted above is one. Wikia wrote a good WYSIWYG
that works really nicely on new wikis (I'm speaking here as an
experienced wikitext user who happily fixes random typos on Wikia). Of
course, I noted that we can't start from a clear field - we have an
existing body of wikitext.
So, specification of the problem:
* We need good WYSIWYG. The government example suggests that a simple
word-processor-like interface would be enough to give tremendous
results.
* It needs two-way fidelity with almost all existing wikitext.
* We can't throw away existing wikitext, much as we'd love to.
* It's going to cost money in programming the WYSIWYG.
* It's going to cost money in rationalising existing wikitext so that
the most unfeasible formations can be shunted off to legacy for
chewing on.
* It's going to cost money in usability testing and so on.
* It's going to cost money for all sorts of things I haven't even
thought of yet.
This is a problem that would pay off hugely to solve, and that will
take actual money thrown at it.
How would you attack this problem, given actual resources for grunt work?
- d.
_______________________________________________
foundation-l mailing list
foundation-l(a)lists.wikimedia.org
Unsubscribe:
https://lists.wikimedia.org/mailman/listinfo/foundation-l