Tomasz Wegrzanowski wrote:
>"Show "s and "List "s should be removed.
>They are only distracting.
I was just thinking the same thing earlier today.
-- Toby Bartels
<toby+wikitech-l(a)math.ucr.edu>
You can now update Wikipedia using rsync.
Get the wiki-xx/ directory from the tarball,
or create an empty one, and then run:
rsync -rz wikipedia.com::xx/ wiki-xx/
Because rsync works in real time (unlike the daily tarballs)
and transfers only what has actually changed,
it is now quite reasonable to have a mirror of Wikipedia that is
updated by cron every hour, or even more often, without using any
extraordinary bandwidth.
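A minimal sketch of such an hourly job, assuming a POSIX shell. The function name `sync_wiki`, the `RSYNC` override and the paths in the crontab comment are hypothetical; only the rsync invocation itself comes from the message above.

```sh
#!/bin/sh
# sync_wiki CODE: refresh the local mirror for one language code (e.g. pl).
# RSYNC may be overridden (for example RSYNC=echo) to preview the command
# without contacting the server.
sync_wiki() {
    code=$1
    # rsync needs a target directory; an empty one is enough to start with.
    mkdir -p "wiki-$code" || return 1
    "${RSYNC:-rsync}" -rz "wikipedia.com::$code/" "wiki-$code/"
}

# Example crontab entry, refreshing the Polish mirror at minute 0 of every
# hour (directory and file name are assumptions):
#   0 * * * * cd /var/mirror && . ./sync-wiki.sh && sync_wiki pl
```

Running `RSYNC=echo sync_wiki pl` first is a cheap way to check the command line before it goes into cron.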
Only the pl, de and eo Wikipedias are available now.
It should be easy to add, but it still hasn't been added.
If it's just because Jimbo is too busy, could he give some other
person access to the Wikipedia server so that rsync can be set up?
The fact that the tarball is generated only once per day is a serious
problem for the topological analysis that must be done on the Polish
Wikipedia to make an optimal start page.
On Wed, Jul 31, 2002 at 02:02:48PM -0700, lcrocker(a)nupedia.com wrote:
> > I've hacked the phpwiki code to add a new namespace
> > [[math: ]] so you can write formulas like [[math:a^2+b^2=c^2]] or
> > [[math:\sum_{n=1}^{\infty}\frac{1}{n}=\infty]].
> >
> > The code will make TeX create an image (PNG) of the formula. Those
> > images will be cached, they will be created only once and will be
> > shared between articles. ( [[math:E=m c^2]] might be used on many
> > pages ).
>
> That's almost exactly what I had in mind; only I think it would be
> better for the back end to have a separate process
> communicating with the Wiki code over IPC--the "TeX server"--
> which will take formulas and return images, out of a cache if
> they've been rendered already, calling TeX to render them if needed.
> The cache will be indexed by a hash function on a canonicalized
> text of the expression.
>
> The wiki end of this won't be hard. The other server is the real
> work. I'm afraid I can't take on that project right now, but if
> that server gets built, I'll be happy to call it.
Hi,
I moved this to wikitech-l since it will be much too detailed for
wikipedia-l.
The code is ready. It doesn't use IPC or a client/server setup; it only
execs a shell script.
The PHP script creates a temp file with the suffix .tex and calls a
shell script with the name of the temp file as a parameter.
The script
- renders TeX to DVI using LaTeX
- converts DVI to eps using dvips
- renders eps to PNG using convert from ImageMagick
- removes all tempfiles created by LaTeX
The .png has the same name as the .tex. The name is the
MD5 hash of the formula. The .png will be placed in an
image directory of the web server. In my implementation
it's /upload/, but it should probably be different, e.g.
/upload/math/.
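A minimal sketch of deriving such a file name, assuming `md5sum` from GNU coreutils is available. The helper name `formula_key` is hypothetical, and the whitespace normalisation is an assumption borrowed from the "canonicalized text" suggestion quoted earlier; it is not part of the implementation described here.

```sh
#!/bin/sh
# formula_key FORMULA: print a 32-character MD5 hex digest for the formula.
# Runs of whitespace are collapsed to one space and both ends are trimmed,
# so "E=m c^2" and " E=m   c^2 " share a single cached image.
# Whitespace is never removed entirely, because "\alpha x" and "\alphax"
# are different TeX input.
formula_key() {
    printf '%s' "$1" \
        | tr -s '[:space:]' ' ' \
        | sed -e 's/^ //' -e 's/ $//' \
        | md5sum \
        | cut -d ' ' -f 1
}
```

The digest can then be used directly as the base name of both the .tex and the .png file.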
The PHP code then adds the formula and the name of the
image file to a new table in the database. Using this
cache, it decides whether a formula needs to be rendered.
It then replaces [[math:formula]] by
<img src="/upload/imagename.png" alt="formula" align="middle">
(OK, using the TeX formula as alt text is not always very
readable, but it is always better than no alt text at all.)
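Because the formula ends up inside an HTML attribute, characters such as `"`, `<` and `&` should be escaped first. The following is only a sketch in shell (the real substitution is done in PHP), and the helper names `html_escape` and `img_tag` are hypothetical.

```sh
#!/bin/sh
# html_escape TEXT: escape the characters that are unsafe in an HTML attribute.
# The ampersand must be handled first so later entities are not re-escaped.
html_escape() {
    printf '%s' "$1" \
        | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e 's/"/\&quot;/g'
}

# img_tag NAME FORMULA: print the tag that replaces [[math:FORMULA]],
# where NAME is the base name of the cached PNG in /upload/.
img_tag() {
    printf '<img src="/upload/%s.png" alt="%s" align="middle">\n' \
        "$1" "$(html_escape "$2")"
}
```

For example, `img_tag abc 'a<b'` yields a tag whose alt text is `a&lt;b`.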
The only thing still to be done is preparing a TeX preamble to
prevent something like
[[math:\mbox{\include{"/etc/passwd"}}]]
Alternatively, everything could run inside a chroot, but
that would have to include Apache, PHP and LaTeX.
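As a stopgap, obviously suspicious input could be rejected before TeX ever sees it. This is a different technique from the TeX preamble mentioned above, and only a rough sketch: the helper name `looks_unsafe` and the deny list are assumptions, and the list is certainly incomplete, so it cannot replace restricting what the TeX process itself is allowed to do.

```sh
#!/bin/sh
# looks_unsafe FORMULA: succeed (exit status 0) if the formula contains a
# control sequence from a small deny list, or the ^^ character notation
# that could be used to spell such a name indirectly.
looks_unsafe() {
    printf '%s\n' "$1" | grep -Eq \
        '\\(include|input|openin|openout|read|write|immediate|catcode|csname)([^A-Za-z]|$)|\^\^'
}

# Possible use before rendering:
#   if looks_unsafe "$formula"; then echo "formula rejected"; fi
```

Harmless commands that merely start with the same letters, such as `\infty`, are not matched, because the listed name must be followed by a non-letter.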
The shell script looks like this:
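#!/bin/sh
# Append all output, including a trace of each command, to a log file.
exec >> /tmp/tex2png.log
exec 2>&1
set -x
# $1 is the hash code of the formula. Refuse to run without it,
# otherwise the final rm would match every file in /tmp.
[ -n "$1" ] || exit 1
cd /tmp || exit 1
echo "$@"
cat "$1"
echo
# TeX -> DVI -> PostScript -> PNG
latex "$1"
dvips -E -o "$1.ps" "$1"
convert -density 110 "$1.ps" "$1.png"
# Publish the image and make it readable by the web server.
cp "$1.png" /home/jf/wiki/phpwiki/newcodebase/upload/
chmod a+r "/home/jf/wiki/phpwiki/newcodebase/upload/$1.png"
# Remove the source and all intermediate files.
rm -- "$1"*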
$1 is the hashcode of the formula.
Providing the diff of the PHP code would take a little
time, as I have to remove my attempts at wiki table markup
first.
For an example see
http://jeluf.mine.nu/jf/newcodebase/wiki.phtml?title=Triangle
but please keep in mind that it's a 486/DX2 66 with 20 MB RAM,
a machine I normally use only for fetching mail.
Regards,
JeLuF
JeLuF wrote:
> The script
> - renders TeX to DVI using LaTeX
> - converts DVI to eps using dvips
> - renders eps to PNG using convert from ImageMagick
pdfTeX combined with a PDF-to-PNG converter should be faster, since it
skips a conversion step.
Axel
I just realized that TeX by default can also read all files on the
system that the process has permissions to read, and we may want to
restrict that; this is done with the line
openin_any = p
in the file texmf.cnf.
Axel
Neil wrote:
>We should also be really cautious about TeX doing insecure things. Is
>there a subset of TeX syntax we could parse and validate before we
>pass it to TeX?
There are two dangerous commands in TeX: the ability to write to
arbitrary files, and the ability to call shell scripts. Both are
disabled in all standard TeX distributions. Parsing and validating is
thus not necessary (and next to impossible without reimplementing a
good chunk of TeX). We have to start TeX in a temporary directory
which is cleaned out afterwards, and we have to guard against
run-away TeX processes which eat time and/or memory. The TeX process
needs to have its resources limited.
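A minimal sketch of both precautions in shell, assuming `mktemp -d` and a shell whose `ulimit` accepts `-t` and `-v` (bash and dash do). The helper name `run_limited` and the limit values are hypothetical, not tuned figures.

```sh
#!/bin/sh
# run_limited COMMAND [ARGS...]: run a command in a fresh scratch directory
# with CPU time and memory capped, then delete the directory.
run_limited() {
    workdir=$(mktemp -d) || return 1
    (
        cd "$workdir" || exit 1
        ulimit -t 10       # at most 10 seconds of CPU time (assumed value)
        ulimit -v 200000   # at most about 200 MB of virtual memory, in KB (assumed value)
        "$@"
    )
    rc=$?
    # Clean out everything the command left behind, whatever its exit status.
    rm -rf "$workdir"
    return "$rc"
}

# Possible use (the input file would first have to be copied into place):
#   run_limited latex -interaction=batchmode formula.tex
```

The limits are set inside the subshell, so they apply only to the command being run and not to the calling script.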
See also the discussion at http://groups.google.com/groups?threadm=d55ab765.0111091929.1e4b9af4%40post…
Axel
Are there any reasons that the DNS entry for wikipedia.com still
points to the old server rather than to the new? When I type a URL
into a browser, I always omit the "www", expecting that it won't make a
difference.
Axel