Hello hackers,
I have just taken a look at some Wikimedia pages, and it struck me as odd that inline math was rendered without attention to the baseline.
If you are running your equations through LaTeX, I'd strongly suggest taking a look at the preview.sty package: this package will make it possible to produce either DVI files (for processing with dvipng, _very_ fast), PostScript or PDF files (through LaTeX/dvips or PDFLaTeX) with tight bounding boxes (using the "tightpage" option).
If you use the "lyx" option to preview.sty, the respective image information _including_ the baseline info will be written to the log file and to standard output, where it is easy to parse.
Alternatively, you can tell dvipng to output height and depth information when converting from dvi to PNG files. This info, again, can easily be parsed.
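As a rough illustration of the dvipng route, here is a sketch that builds a dvipng command line and pulls depth and height out of what it prints. The options -D, -T tight, --depth, --height and -o are dvipng options; the "[1 depth=3 height=12]" shape of the report is an assumption that should be checked against the installed dvipng, and the function and file names are made up for the example.

```python
import re

# Assumed report shape, one bracketed group per page, e.g. "[1 depth=3 height=12]".
# The field order inside the brackets is not relied upon.
PAGE_RE = re.compile(r"\[(\d+)([^\]]*)\]")
FIELD_RE = re.compile(r"(depth|height)=(-?\d+)")


def dvipng_command(dvi_path, png_pattern, dpi=100):
    """Build a dvipng call that crops tightly and reports depth and height."""
    return ["dvipng", "-D", str(dpi), "-T", "tight",
            "--depth", "--height", "-o", png_pattern, dvi_path]


def parse_dvipng_report(output):
    """Return {page: {"depth": px, "height": px}} from dvipng's terminal output."""
    pages = {}
    for page, rest in PAGE_RE.findall(output):
        fields = {name: int(value) for name, value in FIELD_RE.findall(rest)}
        if fields:  # skip bracketed groups that carry no depth/height
            pages[int(page)] = fields
    return pages
```

The command list can be handed to subprocess.run(..., capture_output=True, text=True) and its stdout passed to parse_dvipng_report.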
As a result, you can produce appropriate image alignment tags that will make inline math produced with TeX align perfectly with its surroundings.
And this stuff is _fast_, very much so if you use dvipng.
preview-latex <URL:http://preview-latex.sourceforge.net> uses it as the base for WYSIWYG editing of LaTeX in Emacs windows, and it works comfortably with large documents on slow processors.
On Thu, May 13, 2004 at 10:27:40PM +0200, David Kastrup wrote:
Hello hackers,
I have just taken a look at some Wikimedia pages, and it struck me as odd that inline math was rendered without attention to the baseline.
The HTML side is the difficult one, not the TeX side. How do you do that in HTML?
Tomasz Wegrzanowski wrote:
On Thu, May 13, 2004 at 10:27:40PM +0200, David Kastrup wrote:
I have just taken a look at some Wikimedia pages, and it struck me as
odd that inline math was rendered without attention to the baseline.
The HTML side is the difficult one, not the TeX side. How do you do that in HTML?
Well, if we can get the height in pixels of the image (easy) _and_ the Y-pixel position in the image of the baseline (????) it might be possible to use relative positioning on the <img> to bump it down a few pixels to match with the surrounding text baseline, something like:
<img src="/math/123465.png" style="position:relative; top:4px" />
I haven't tested this theory, and that's dependent on getting the information out of the renderer.
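A minimal sketch of how such a tag could be generated once the renderer has reported the depth (pixels below the baseline). It swaps in the CSS vertical-align property instead of position:relative, which is a different technique from the one above: an inline image normally sits with its bottom edge on the baseline, so a negative vertical-align lowers it by the depth. The function name and the sample numbers are hypothetical.

```python
from html import escape


def math_img_tag(src, width, height, depth):
    """Build an <img> tag lowered by `depth` pixels so its baseline matches the text."""
    style = f"vertical-align: {-depth}px"
    return (f'<img src="{escape(src, quote=True)}" '
            f'width="{width}" height="{height}" style="{style}" />')


print(math_img_tag("/math/123465.png", 40, 16, 4))
```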
-- brion vibber (brion @ pobox.com)
Brion Vibber <brion@pobox.com> writes:
Tomasz Wegrzanowski wrote:
On Thu, May 13, 2004 at 10:27:40PM +0200, David Kastrup wrote:
I have just taken a look at some Wikimedia pages, and it struck me as
odd that inline math was rendered without attention to the baseline.
The HTML side is the difficult one, not the TeX side. How do you do that in HTML?
Well, if we can get the height in pixels of the image (easy) _and_ the Y-pixel position in the image of the baseline (????) it might be possible to use relative positioning on the <img> to bump it down a few pixels to match with the surrounding text baseline, something like:
<img src="/math/123465.png" style="position:relative; top:4px" />
I haven't tested this theory, and that's dependent on getting the information out of the renderer.
Well, as I already wrote: the information can be retrieved by using preview.sty and/or dvipng, easily. dvipng has command line options for writing out the ascender/descender information, and preview.sty will output the interesting information if you just use the "lyx" option.
David Kastrup wrote:
Brion Vibber <brion@pobox.com> writes:
Well, if we can get the height in pixels of the image (easy) _and_ the Y-pixel position in the image of the baseline (????) it might be
Well, as I already wrote: the information can be retrieved by using preview.sty and/or dvipng, easily. dvipng has command line options for writing out the ascender/descender information, and preview.sty will output the interesting information if you just use the "lyx" option.
Saying it's easy is nice, providing code to do it is better. ;)
I might try looking into it when I get a chance, but my hands are pretty full already.
-- brion vibber (brion @ pobox.com)
Brion Vibber <brion@pobox.com> writes:
David Kastrup wrote:
Brion Vibber <brion@pobox.com> writes:
Well, if we can get the height in pixels of the image (easy) _and_ the Y-pixel position in the image of the baseline (????) it might be
Well, as I already wrote: the information can be retrieved by using preview.sty and/or dvipng, easily. dvipng has command line options for writing out the ascender/descender information, and preview.sty will output the interesting information if you just use the "lyx" option.
Saying it's easy is nice, providing code to do it is better. ;)
What do you mean, "to do it"?
With preview.sty, the output is written as a single line that starts with a unique keyword phrase, followed by four integers (the number of the image, then the relevant width, depth and height) in units of scaled TeX points (of which there are exactly 65781.76 per PostScript point).
That _is_ the code to do it. Its output is a single identifiable line of completely trivially parseable ASCII.
Something like
Preview: Snippet 3 13200342 455360 36455
And then there is another line of completely trivially parseable ASCII (keyword and 4 numbers) that tells the size of the border.
Something like
Preview: Tightpage 32891 32891 32891 32891
If you did not change the border sizes yourself (and no LaTeX code outside of your control did), the numbers are fixed: 0.50001 PostScript points on each side. So you need not even parse that information.
Extracting these lines from the .log file or the terminal output is a matter of matching with
^Preview: Snippet ([0-9]+) ([0-9]+) ([0-9]+) ([0-9]+)
and then using the \1 \2 \3 \4, with whatever matcher/processor you happen to be working with.
Really, I am at a loss as to what "code" to show for that. It should be a one-liner in most scripting languages.
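For what it is worth, the matching step might look like the following sketch. The regular expression and the 65781.76 factor are the ones given above; the function names, the 100 dpi default and the decision to return the four integers as a plain tuple are assumptions of the example. The meaning and order of the three measurements should be checked against the preview.sty documentation before they are labelled.

```python
import re

SNIPPET_RE = re.compile(
    r"^Preview: Snippet ([0-9]+) ([0-9]+) ([0-9]+) ([0-9]+)", re.MULTILINE)

SP_PER_BP = 65781.76   # scaled points per PostScript point
BP_PER_INCH = 72.0     # PostScript points per inch


def parse_snippets(log_text):
    """Return one tuple of four integers per 'Preview: Snippet' line."""
    return [tuple(int(g) for g in m.groups())
            for m in SNIPPET_RE.finditer(log_text)]


def sp_to_pixels(sp, dpi=100):
    """Convert scaled points to pixels at the given rendering resolution."""
    return sp / SP_PER_BP / BP_PER_INCH * dpi
```

The Tightpage border needs no parsing in the default case: 32891 / 65781.76 comes out at roughly 0.5 PostScript points.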
David Kastrup wrote:
Brion Vibber <brion@pobox.com> writes:
Saying it's easy is nice, providing code to do it is better. ;)
What do you mean, "to do it"?
He means, send us a patch to the MediaWiki software that implements your suggestion.
With preview.sty, the output is written into a single line,
Either I'm dumb, or you have never mentioned how to get it (what anyway? dvips or what?) to output this "single line".
It should be a one-liner in most scripting languages.
Often the difficult thing is not to write the one line, but to figure out where in the code to put it.
Additionally, you're forgetting that the LaTeX is (obviously) not rendered and re-rendered and re-re-rendered for every single pageview. The images are cached. Hence you would also need to come up with a way to cache these extra values, and then tell the parser to output them in the right place.
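To make the caching requirement concrete, here is a sketch of the kind of record that would have to sit next to each cached image. It is written in Python rather than PHP and does not reflect MediaWiki's actual schema: the table name, the columns and the use of a SHA-256 hash of the TeX source as the key are all assumptions made for the illustration.

```python
import hashlib
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a small cache of per-formula image metrics."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS math_cache ("
        "tex_hash TEXT PRIMARY KEY, width INTEGER, height INTEGER, depth INTEGER)")
    return db


def tex_key(tex):
    """Hypothetical cache key: hash of the formula's TeX source."""
    return hashlib.sha256(tex.encode("utf-8")).hexdigest()


def store(db, tex, width, height, depth):
    """Remember the pixel metrics measured when the formula was rendered."""
    db.execute("INSERT OR REPLACE INTO math_cache VALUES (?, ?, ?, ?)",
               (tex_key(tex), width, height, depth))
    db.commit()


def lookup(db, tex):
    """Return (width, height, depth) for a cached formula, or None if absent."""
    return db.execute(
        "SELECT width, height, depth FROM math_cache WHERE tex_hash = ?",
        (tex_key(tex),)).fetchone()
```

The parser would call lookup when emitting the image tag and fall back to rendering (followed by store) on a miss.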
Programming that in PHP into the MediaWiki software is what Brion meant by "the code to do it".
Timwi
Timwi <timwi@gmx.net> writes:
David Kastrup wrote:
Brion Vibber <brion@pobox.com> writes:
Saying it's easy is nice, providing code to do it is better. ;)
What do you mean, "to do it"?
He means, send us a patch to the MediaWiki software that implements your suggestion.
Since I don't even know what language your converters are written in, this would be hard to do.
With preview.sty, the output is written into a single line,
Either I'm dumb, or you have never mentioned how to get it (what anyway? dvips or what?) to output this "single line".
Pass the "lyx" option to the preview package. I won't go looking for Message Ids.
It should be a one-liner in most scripting languages.
Often the difficult thing is not to write the one line, but to figure out where in the code to put it.
Right. This is the reason that I was telling the people that supposedly know the code about this possibility.
Additionally, you're forgetting that the LaTeX is (obviously) not rendered and re-rendered and re-re-rendered for every single pageview.
I am not forgetting anything. Obviously, the rendered final HTML (which includes the ascender information, as well as the image dimensions, as well as math rendered into HTML instead of PNG) has to get cached somewhere already now. Whether the images can be cached depends on whether you want to go for the image-size-fits-browser-font-size hacks of Jan-Åke, or just render at a fixed size. Since dvipng is rather light on resources and can even be run continuously, it would be quite feasible to cache the .dvi files (which are size independent) and rerender the png dynamically.
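If the .dvi files are cached and the PNG is re-rendered per font size, the resolution to pass to dvipng's -D option follows from simple arithmetic: a TeX font of 10pt rendered at D dpi is 10 / 72.27 * D pixels tall, so D = font_px * 72.27 / 10. The sketch below assumes the formulas were typeset at 10pt; the function name and the sample font sizes are made up.

```python
TEX_PT_PER_INCH = 72.27  # TeX points per inch


def dpi_for_font_px(font_px, tex_font_pt=10.0):
    """Resolution at which a tex_font_pt TeX font comes out font_px pixels tall."""
    return round(font_px * TEX_PT_PER_INCH / tex_font_pt)


# Resolutions for a few browser font sizes, given in pixels.
for px in (12, 16, 20):
    print(px, dpi_for_font_px(px))
```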
But of course you need not implement everything at once. Just replacing your current image generating process (which already _is_ there) with a dvipng-based process would be a start.
The images are cached. Hence you would also need to come up with a way to cache these extra values, and then tell the parser to output them in the right place.
Does that mean that the Wiki pages are created dynamically each time and only the images get cached? The usual HTML specifications _strongly_ recommend that the image dimensions are specified already in the HTML so that the page layout engine need not lay the page out again every time an image download completes. So in case you are _not_ already caching the image dimensions, it might be an idea to do so.
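Getting the dimensions of an already rendered PNG does not require an image library, since width and height are stored as big-endian 32-bit integers in the IHDR chunk right after the 8-byte signature. A small sketch (the function name is made up):

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) in pixels from the first bytes of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Bytes 16..23 hold width and height as unsigned big-endian integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

Reading the first 24 bytes of the file is enough, e.g. png_dimensions(open(path, "rb").read(24)).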
Programming that in PHP into the MediaWiki software is what Brion meant by "the code to do it".
Shrug. As you please. I am just telling you that software is here that would quite benefit the appearance and efficiency of your software. I am willing to help with whatever it takes for you to get the most out of our project. Jan-Åke, the author of dvipng, already offered you the respective code to generate the HTML dynamically in Perl.
I can't say anything about him, of course, but I don't have the resources to do your work. I'll help with what it takes on the side of preview.sty and other tools from our project to work for your application (and they should actually work rather out of the box), but that's about it. It would be unfair to the users and developers of my own projects if I invested the time needed to get acquainted with yours, because the benefits are considered so small that nobody in the project itself would want to invest any work on reasonably good-looking math.
On Fri, 2004-05-14 at 22:27 +0200, David Kastrup wrote:
I am not forgetting anything. Obviously, the rendered final HTML (which includes the ascender information, as well as the image dimensions, as well as math rendered into HTML instead of PNG) has to get cached somewhere already now. Whether the images can be cached depends on whether you want to go for the image-size-fits-browser-font-size hacks of Jan-Åke, or just render at a fixed size. Since dvipng is rather light on resources and can even be run continuously, it would be quite feasible to cache the .dvi files (which are size independent) and rerender the png dynamically.
Hm, the easiest way to scale images relative to the font size should be to use ems for height & width (and possibly offsets via position:relative), but browsers usually don't do a very good job of scaling images, so I'm not sure if this is desirable at all.
Gabriel Wicke
Gabriel Wicke wrote:
Hm, the easiest way to scale images relative to the font size should be to use ems for height & width (and possibly offsets via position:relative), but browsers usually don't do a very good job of scaling images, so I'm not sure if this is desirable at all.
Yes, that's not going to look good at all. Images are problematic for a number of reasons: they don't scale well to different screen font sizes (ugly scaling or appearance at wrong size), they don't scale well to print resolution (ugly scaling), and they don't fit in well with alternate color schemes (transparent background has been considered, but would tend to fail if the user has overridden the background color to black, making hardcoded black math text illegible).
The best way to scale math is to render it as inline HTML (only looks decent for very simple stuff, as we do now) or MathML (which is provided for in 1.3, but only works with some browsers, only works in well-formed XHTML mode, and the MathML conversions haven't yet been completed.)
-- brion vibber (brion @ pobox.com)
Gabriel Wicke <lists@wikidev.net> writes:
On Fri, 2004-05-14 at 22:27 +0200, David Kastrup wrote:
I am not forgetting anything. Obviously, the rendered final HTML (which includes the ascender information, as well as the image dimensions, as well as math rendered into HTML instead of PNG) has to get cached somewhere already now. Whether the images can be cached depends on whether you want to go for the image-size-fits-browser-font-size hacks of Jan-Åke, or just render at a fixed size. Since dvipng is rather light on resources and can even be run continuously, it would be quite feasible to cache the .dvi files (which are size independent) and rerender the png dynamically.
Hm, the easiest way to scale images relative to the font size should be to use ems for height & width (and possibly offsets via position:relative), but browsers usually don't do a very good job of scaling images, so I'm not sure if this is desirable at all.
We are running in circles here. Jan-Åke already gave the link <URL:http://www.mai.liu.se/~jalar/dvipng/test.html>, where this and other ways of generating the display are not only discussed, but also demonstrated, and he offered the code generating the images if anybody was interested.
When Angus Leeming (who was not previously familiar with preview.sty and dvipng) integrated previews into LyX, he took about three days. And he had no offer of tested scripts that would already generate the complete HTML, and he also had to integrate the output into a display engine instead of just adapting scripts for a different scripting environment.
Anyway, the converters are there, they are available, the various methods of generating HTML are tested for you, you have a page where this technique is demonstrated, you can take a look at the respective HTML, you can get the Perl scripts for generating them, and you can get support for using them from the respective authors of the LaTeX styles and the dvipng converter. Everything that is released is available under the GPL.
If you find that is not sufficient for you to feel it worth investing even a comparatively small amount of work into getting rendering as shown on the above web page, tough.
One can lead a horse to water. Maybe it will get thirsty eventually.
David Kastrup wrote:
Gabriel Wicke <lists@wikidev.net> writes:
Hm, the easiest way to scale images relative to the font size should be to use ems for height & width (and possibly offsets via position:relative), but browsers usually don't do a very good job of scaling images, so I'm not sure if this is desirable at all.
We are running in circles here. Jan-Åke already gave the link <URL:http://www.mai.liu.se/~jalar/dvipng/test.html>, where this and other ways of generating the display are not only discussed, but also demonstrated, and he offered the code generating the images if anybody was interested.
The images there (the good-looking ones) are not scaled by the browser. The image is rendered at the proper resolution on the server side, and displayed in the browser window.
When/if you want to look at this closer, let me know. /JÅ
Jan-Åke Larsson <jalar@mai.liu.se> writes:
David Kastrup wrote:
We are running in circles here. Jan-Åke already gave the link <URL:http://www.mai.liu.se/~jalar/dvipng/test.html>, where this and other ways of generating the display are not only discussed, but also demonstrated, and he offered the code generating the images if anybody was interested.
The images there (the good-looking ones) are not scaled by the browser. The image is rendered at the proper resolution on the server side, and displayed in the browser window.
Sure. If one wanted the browser to render, one would have to convert the dvi file to SVG. I don't think that there is a usable SVG backend for Ghostscript available right now; that would have been a good solution.
David, your suggestion is appreciated but as you are perhaps aware this is a free software project developed by volunteers in their spare time. Not every suggestion is going to get implemented immediately.
-- brion vibber (brion @ pobox.com)
Brion Vibber <brion@pobox.com> writes:
David, your suggestion is appreciated but as you are perhaps aware this is a free software project developed by volunteers in their spare time. Not every suggestion is going to get implemented immediately.
Sure. It was just a suggestion from the maintainer of free software projects developed by volunteers in their spare time to let others reap the benefits of our efforts, and an offer to support this by whatever steps would be required in the context of our projects.
It is an offer of collaboration on a task that would entirely benefit your project. That is all that I feel qualified to offer. I don't have the resources to offer also doing the part of the work that has to be done on your codebase, even though it would not seem that much work in that area would actually be required before getting results.
Tomasz Wegrzanowski wrote:
On Thu, May 13, 2004 at 10:27:40PM +0200, David Kastrup wrote:
Hello hackers,
I have just taken a look at some Wikimedia pages, and it struck me as odd that inline math was rendered without attention to the baseline.
The HTML side is the difficult one, not the TeX side. How do you do that in HTML?
I have done some testing and come up with a (crude, but mostly working) solution. Point your browser to http://www.mai.liu.se/~jalar/dvipng/test.html
It works with a fairly new Exploder or Mozilla (JS-enabled), which will display math at the right size at the right alignment. If your browser won't redisplay after a font size change, click on one of the images. Sadly there is no "fontSizeChange" JS event.
I haven't done anything about other browsers. Let me know if you want the scripts.
/JÅ
wikitech-l@lists.wikimedia.org