Jared Williams wrote:
SDCHing MediaWiki HTML would take some effort, as the page output is split between the skin classes, OutputPage, etc.
Also would want the translation text from \languages\messages\Messages*.php in there too, I think. Handling the $1 style placeholders is easy; it's just a matter of determining which message goes through which wfMsg*() function, and whether the WikiText translations can be preconverted to HTML.
But most of the HTML comes from article wikitext, so I wonder whether it'd beat gzip by anything significant.
Jared
Note that SDCH output is expected to then be gzipped, as the two fulfill different needs. They aren't incompatible. You would use a dictionary for common skin bits, perhaps also adding some common page features, like the TOC code, 'amp;action=edit&redlink=1" class="new"'...
Having a second dictionary for language-dependent output could also be interesting, but not all messages should be included in it.
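To make the language-dictionary idea concrete, here is a rough sketch in Python (not the PHP code discussed in this thread) of how literal pieces could be pulled out of message strings by splitting at the $1 style placeholders. The message keys, the sample texts and the min_len cutoff are made up for illustration; real messages would come from the Messages*.php files, and choosing which ones are worth including is a separate problem.

```python
import re

# Matches MediaWiki-style positional placeholders such as $1, $2, ...
PLACEHOLDER = re.compile(r"\$\d+")

def dictionary_segments(messages, min_len=4):
    """Split each message at its placeholders and collect the literal pieces.

    Pieces shorter than min_len are dropped, since very short strings
    are unlikely to be worth a dictionary entry (min_len is an assumption).
    """
    segments = []
    for text in messages.values():
        for piece in PLACEHOLDER.split(text):
            if len(piece) >= min_len and piece not in segments:
                segments.append(piece)
    return segments

# Hypothetical messages, not taken from any real Messages*.php file.
messages = {
    "hypothetical-edit": 'Edit the page "$1" now',
    "hypothetical-moved": "$1 moved to $2",
}

if __name__ == "__main__":
    for segment in dictionary_segments(messages):
        print(repr(segment))
```

The literal pieces on either side of a placeholder end up as separate dictionary entries, so the parameter value itself never needs to be in the dictionary.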
Simetrical wrote:
What happens if you have parser functions that depend on the value of $1 (allowed in some messages AFAIK)? What if $1 contains wikitext itself (I wouldn't be surprised if that were true somewhere)? How do you plan to do this substitution anyway, JavaScript? What about clients that don't support JavaScript?
/Usually/, you don't create the dictionary-encoded output by hand, but pass the page to a "dictionary compressor" (or so is expected; this is all still too experimental). If a parser function changed the message completely, the result will just be emitted as literals. If you have a parametrized block, the vcdiff encoder would see, "this piece up to Foo matches this dictionary section, before $1. And this other piece matches the text following Foo..."
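The matching described above can be illustrated with a toy encoder in Python. This is only a sketch of the idea, not a VCDIFF (RFC 3284) implementation: it uses difflib from the standard library to find the matches, emits plain tuples instead of the binary format, and can only copy dictionary sections in order. The function names, the min_copy threshold and the sample strings are assumptions for illustration.

```python
from difflib import SequenceMatcher

def encode(dictionary, target, min_copy=4):
    """Describe target as COPY (from dictionary) and ADD (literal) steps."""
    ops = []
    matcher = SequenceMatcher(None, dictionary, target, autojunk=False)
    pos = 0  # how much of target has been covered so far
    for dict_start, target_start, size in matcher.get_matching_blocks():
        if size < min_copy:
            continue  # too short to be worth a COPY; falls into an ADD
        if target_start > pos:
            ops.append(("ADD", target[pos:target_start]))
        ops.append(("COPY", dict_start, size))
        pos = target_start + size
    if pos < len(target):
        ops.append(("ADD", target[pos:]))
    return ops

def decode(dictionary, ops):
    """Rebuild the target from the dictionary and the instruction list."""
    out = []
    for op in ops:
        if op[0] == "COPY":
            _, start, size = op
            out.append(dictionary[start:start + size])
        else:
            out.append(op[1])
    return "".join(out)

# Hypothetical dictionary entry containing a $1 placeholder, and the
# rendered output where $1 was replaced by "Foo".
dictionary = "The page $1 does not exist yet."
page = "The page Foo does not exist yet."

if __name__ == "__main__":
    ops = encode(dictionary, page)
    print(ops)
    assert decode(dictionary, ops) == page
```

The text before and after the placeholder becomes two COPY steps, and only the substituted value travels as a literal. If a parser function had rewritten the message entirely, the encoder would find no usable match and the whole output would be a single ADD.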
Jared wrote:
I do have working PHP code that can parse PHP templates & language strings to generate the dictionary, and a new set of templates rewritten to output the vcdiff efficiently.
Please share?