On Wed, Nov 27, 2013 at 9:52 AM, Bjoern Hassler <bjohas+mw(a)gmail.com> wrote:
> Could I check whether this new process would pick up formatting inserted via
> CSS styles, e.g. attached to a <span> or <div>?
>
> On our MediaWiki (http://www.oer4schools.org) we use a handful of different
> CSS styles to provide boxes for different types of text (such as facilitator
> notes or background reading). With the PediaPress tools this didn't render
> at all (because the wiki text was parsed directly), but also <blockquote>
> and table background colors did not render nicely, leaving us very few
> options for highlighting blocks of text. (See here for an example of two
> types of boxed text:
> http://orbit.educ.cam.ac.uk/wiki/OER4Schools/ICTs_in_interactive_teaching.)

The current plan is for the LaTeX renderer *NOT* to pick up CSS
styles, in general. The LaTeX renderer will be a 'semantic renderer'
-- it will normalize the formatting to make it conform to house style.
It will be tuned to the needs of the Wikipedias. It knows about
certain CSS classes and Templates, but is not particularly
extensible...
...which is why it won't be the only backend! We also expect to have
an "HTML" renderer, which will apply CSS styles to the Parsoid output
and render to PDF via PhantomJS (which is built on WebKit).
This gives you two options, "faithful" and "beautiful". In my
experience so far, the LaTeX output, when it works, produces superior
output -- the typesetting is better, the ligatures and non-Latin
support ought to be superior, the justification is nicer, and math
rendering should be stellar. We also use a two column layout and
normalize figure sizes to match the column widths, which helps
maintain a clean appearance. However, as you have noted, the LaTeX
renderer isn't particularly extensible, and there are cases where we
need to preserve the author's styling even at the cost of somewhat
less 'clean' output. Some articles can't easily be shoehorned into
our 'house style'. The Parsoid->HTML->WebKit->PDF render path should
be a good solution in these cases, even if (for instance) the
paragraph justification and page splitting aren't quite as pretty.
(Browser technology continues to improve; one day it may be possible
to make the HTML->PDF pipeline just as pretty. So the "faithful"
approach is also our "forward-looking" renderer.)
Our architecture allows multiple 'backends' to be plugged in, so it is
possible there could be other options as well. I hope to refactor the
LaTeX backend at some point, for instance, to make it more extensible
so that you could in theory add special 'tweaks' for your wiki's
"house style". I could also add a CSS engine so that the LaTeX
backend could pick up certain CSS styles -- like table background
color, for instance.
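
To make the plug-in idea concrete, here is a rough sketch in Python. It is
not the actual renderer code: the names (Block, BACKENDS, register,
KNOWN_CLASSES, render) and the tiny block model are all made up for
illustration, and no escaping is done. It only shows the shape of the split:
the "beautiful" backend maps the few classes it knows onto house style and
drops the rest, while the "faithful" backend carries the author's class
through so CSS can be applied later.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Block:
    """Hypothetical stand-in for one block of parsed article content."""
    text: str
    css_class: str = ""


Backend = Callable[[List[Block]], str]

# Hypothetical registry of pluggable backends, keyed by name.
BACKENDS: Dict[str, Backend] = {}


def register(name: str) -> Callable[[Backend], Backend]:
    """Decorator that plugs a backend into the registry under `name`."""
    def wrap(fn: Backend) -> Backend:
        BACKENDS[name] = fn
        return fn
    return wrap


# Assumption: the semantic renderer "knows about" a fixed set of classes.
KNOWN_CLASSES = {"quote": "quote"}


@register("beautiful")
def render_latex(blocks: List[Block]) -> str:
    """Semantic rendering: known classes map to house style, others are dropped."""
    out = []
    for b in blocks:
        env = KNOWN_CLASSES.get(b.css_class)
        if env:
            out.append("\\begin{%s}%s\\end{%s}" % (env, b.text, env))
        else:
            out.append(b.text)  # unknown styling is normalized away
    return "\n".join(out)


@register("faithful")
def render_html(blocks: List[Block]) -> str:
    """Faithful rendering: keep the author's class so CSS can style it."""
    out = []
    for b in blocks:
        attr = ' class="%s"' % b.css_class if b.css_class else ""
        out.append("<div%s>%s</div>" % (attr, b.text))
    return "\n".join(out)


def render(blocks: List[Block], backend: str) -> str:
    """Dispatch to whichever backend was requested."""
    return BACKENDS[backend](blocks)
```

With something like this, a wiki-specific tweak would amount to registering
one more backend (or extending the known-class table) without touching the
others.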
It's all a work in progress, of course! But the
"faithful"/"beautiful" split is the principle we're working with.
--scott
--
(http://cscott.net)