Here is the first version of the TeX rendering extension for Wikipedia.
It's not production code yet.
Please comment.
= How does it work =
A new user preference is introduced that selects whether to:
* always render formulas as PNGs
* render them as HTML if they are simple enough, or as PNGs otherwise
* leave them as pseudo-TeX (mainly for text browsers, where neither PNG
nor HTML rendering would be visible)
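A minimal sketch of how the three preference values could map to an output format. It is Python for illustration only, not the extension's code; MathMode and choose_output are made-up names.

```python
from enum import Enum


class MathMode(Enum):
    """Hypothetical names for the three preference values."""
    PNG_ALWAYS = 0
    HTML_IF_SIMPLE = 1
    SOURCE = 2


def choose_output(mode, html):
    """Return 'source', 'html' or 'png' for one formula.

    `html` is the HTML rendering, or "" if the formula is too
    difficult for HTML.
    """
    if mode is MathMode.SOURCE:
        return "source"
    if mode is MathMode.HTML_IF_SIMPLE and html:
        return "html"
    # PNG is the fallback in every other case.
    return "png"
```

Note that HTML_IF_SIMPLE silently falls back to PNG when no HTML rendering is available.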
ISSUE 1: While HTML reduces bandwidth, it is much uglier, so the default is PNG-only.
ISSUE 2: PNGs are rendered with "a bit too big" font. That's on purpose.
A bit too big works well at high and medium resolutions, and is still readable
at low resolution. But "a bit too small" at high resolution would be very
hard to read.
A new table is also introduced:
CREATE TABLE math (
math_inputhash char(32) NOT NULL,
math_outputhash char(32) NOT NULL,
math_html text NOT NULL,
UNIQUE KEY math_inputhash (math_inputhash)
);
math_inputhash is the MD5 of the input markup, math_outputhash
is the MD5 of the output markup, and math_html is the HTML rendering, or ""
if the formula is too difficult for HTML.
ISSUE 3: MD5 hashes should be stored in binary in the final version.
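For illustration, here is how the 32-character hex hash and the 16-byte binary form from ISSUE 3 relate. This is a Python sketch; input_hash and to_binary are made-up helper names, and UTF-8 encoding of the markup is an assumption.

```python
import hashlib


def input_hash(markup):
    """MD5 of the markup as 32 hex characters (fits char(32))."""
    return hashlib.md5(markup.encode("utf-8")).hexdigest()


def to_binary(hex_hash):
    """The same hash as 16 raw bytes, half the storage size."""
    return bytes.fromhex(hex_hash)
```

The conversion is lossless, so switching the column to binary later would not require re-rendering anything.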
OutputPage.php calls renderMath() for every occurrence of <math></math>
in the markup. If the user has chosen pseudo-TeX, that's the end.
Otherwise it checks the database to see whether the formula is already rendered.
If it is, it either takes the HTML or generates a link to the image.
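The lookup described above can be sketched roughly as follows. This is Python for illustration, not the actual renderMath(): a dict stands in for the math table, and the function name, the allow_html flag, the image_dir default and the "<outputhash>.png" file naming are all assumptions.

```python
import hashlib
import html


def render_from_cache(tex, wants_source, cache, allow_html=True,
                      image_dir="/math"):
    """Return markup for one formula, or None if texvc still has to run.

    `cache` maps math_inputhash -> (math_outputhash, math_html).
    """
    if wants_source:
        # Pseudo-TeX preference: show the escaped source and stop.
        return html.escape(tex)
    input_hash = hashlib.md5(tex.encode("utf-8")).hexdigest()
    row = cache.get(input_hash)
    if row is None:
        return None  # not rendered yet
    output_hash, math_html = row
    if allow_html and math_html:
        return math_html
    # Assumed naming scheme: one PNG per output hash.
    return '<img src="%s/%s.png" alt="%s">' % (
        image_dir, output_hash, html.escape(tex))
```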
ISSUE 4: The directory for math images should be configurable, and it should
also be known to texvc (command line option? compile-time option?).
It should not be the upload directory.
ISSUE 5: Maybe it should use a/ab/ab*.png like other images. Or maybe
the Wikipedia server should move to ReiserFS.
ISSUE 6: The image should have an ALT attribute.
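The a/ab/ab*.png layout from ISSUE 5 could look like this. It is a Python sketch; sharded_path is a made-up name, and using the output hash as the file name is an assumption.

```python
import posixpath


def sharded_path(base_dir, output_hash):
    """Place the PNG under <first char>/<first two chars>/ of its hash."""
    return posixpath.join(base_dir,
                          output_hash[0],
                          output_hash[:2],
                          output_hash + ".png")
```

With hex hashes this spreads files over 16 x 16 = 256 directories, which keeps each directory small on filesystems that handle large directories badly.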
If the image/HTML hasn't been generated yet, texvc is called. If it fails, an error
message is generated.
ISSUE 7: This message should be localized.
ISSUE 8: texvc shouldn't be in cgi-bin, or care should be taken that it can't
be called with any evil options.
Depending on the return value of texvc, results are generated and put into the table
for caching.
ISSUE 9: Failures are not cached. In the final version they should be cached,
but cleared on every upgrade of texvc (which may support more
TeX than the previous version).
Currently texvc takes its input as the first argument.
ISSUE 10: I'd rather use stdin, but proc_open (popen2 for Perl hackers) appears
only in PHP 4.3, and PHP 4.2 is still the standard.
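To make the two calling conventions from ISSUE 10 concrete, here is a Python sketch (the extension itself would do this from PHP). The helper names are made up, and a tiny Python one-liner stands in for texvc so the example runs anywhere.

```python
import subprocess
import sys

# Stand-ins for texvc: one echoes its first argument, one echoes stdin.
DEMO_ECHO_ARG = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
DEMO_ECHO_STDIN = [sys.executable, "-c",
                   "import sys; print(sys.stdin.read())"]


def run_with_argument(program, tex):
    """Pass the formula as one argv element; no shell is involved,
    so shell metacharacters in `tex` are not interpreted."""
    return subprocess.run([*program, tex], capture_output=True,
                          text=True, check=False)


def run_with_stdin(program, tex):
    """Pass the formula on standard input instead."""
    return subprocess.run(program, input=tex, capture_output=True,
                          text=True, check=False)
```

Either way the formula never passes through a shell, which matters because the markup comes from arbitrary users.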
Then it LALR-parses it. What it parses is not real TeX. If HTML has an entity &foo;
and TeX has no corresponding \foo, this pseudo-TeX accepts \foo anyway. This ensures
that it's very easy to use.
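One way to picture the \foo rule is an alias table from HTML entity names to real TeX commands. The sketch below is Python and purely illustrative: the three aliases are examples chosen here, not texvc's actual list, and the real parser is LALR rather than a regex.

```python
import re

# Illustrative aliases only: HTML entity name -> real TeX command.
ENTITY_ALIASES = {
    "alefsym": r"\aleph",
    "rarr": r"\rightarrow",
    "harr": r"\leftrightarrow",
}


def expand_aliases(pseudo_tex):
    """Replace \\foo by its TeX equivalent when foo is a known alias."""
    def replace(match):
        # Unknown commands are passed through unchanged.
        return ENTITY_ALIASES.get(match.group(1), match.group(0))
    return re.sub(r"\\([A-Za-z]+)", replace, pseudo_tex)
```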
Then it is standardized, and the MD5 of the standardized version is computed.
ISSUE 11: A race condition between 2 runs of texvc trying to generate the same PNG
will have to be investigated.
ISSUE 12: texvc should check here whether the output PNG already exists (HTML is fast
to generate, so it doesn't hurt to regenerate it). This may happen not only
in case of a race condition, but also if the PNG was generated from different
input markup (say from "x + y", and we do it from "x+y" now).
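A toy version of the standardize-then-hash step, with the existence check from ISSUE 12. This is Python for illustration: the real standardization comes out of the parser, whereas this sketch only re-tokenizes and drops insignificant spaces, and all names and the "<hash>.png" naming are assumptions.

```python
import hashlib
import os
import re

TOKEN = re.compile(r"\\[A-Za-z]+|\\.|\S")
CONTROL_WORD = re.compile(r"\\[A-Za-z]+")


def standardize(tex):
    """Drop whitespace, keeping one space only where it is needed to
    separate a control word from a following letter."""
    out = []
    prev = ""
    for tok in TOKEN.findall(tex):
        if CONTROL_WORD.fullmatch(prev) and tok[0].isalpha():
            out.append(" ")
        out.append(tok)
        prev = tok
    return "".join(out)


def output_hash(tex):
    """MD5 of the standardized markup."""
    return hashlib.md5(standardize(tex).encode("utf-8")).hexdigest()


def png_missing(image_dir, out_hash):
    """True if the PNG for this hash still has to be generated."""
    return not os.path.exists(os.path.join(image_dir, out_hash + ".png"))
```

Because "x + y" and "x+y" standardize to the same string, they share one output hash and therefore one PNG.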
Then it prints the MD5 and the HTML (if any) on stdout.
ISSUE 13: PHP should not wait for texvc to finish from this point. texvc should
probably fork() here.
Now latex, dvips and convert (which in turn uses Ghostscript) are called.
ISSUE 14: LaTeX creates some temporary files. They should be created in some
tmp/ directory, not in the current directory.
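One possible approach to ISSUE 14 is to run each tool with a private scratch directory as its working directory. The Python sketch below shows the idea; run_in_scratch_dir and the file name formula.tex are made up, and the command is passed in rather than hard-coding latex so the example does not depend on a TeX installation.

```python
import os
import subprocess
import tempfile


def run_in_scratch_dir(command, tex_source, filename="formula.tex"):
    """Write the source into a fresh temp directory and run `command`
    there, so any side files land in that directory.

    Returns the process result and the sorted list of files that
    existed in the scratch directory just before it was removed.
    """
    with tempfile.TemporaryDirectory(prefix="texvc-") as scratch:
        path = os.path.join(scratch, filename)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(tex_source)
        result = subprocess.run(command + [filename], cwd=scratch,
                                capture_output=True, text=True,
                                check=False)
        produced = sorted(os.listdir(scratch))
    return result, produced
```

A real version would copy the final PNG out of the scratch directory before it is cleaned up.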