Hi,
I released a new version of the Eclipse Wikipedia Plugin [1].
Changes:
* a new context-menu item in the editor for creating all files for a
given category [2].
* a first Export Wizard to convert Wikipedia articles into a single
PDF file [3].
PDF creation works in two steps. In the first step, the Wikipedia
article is rendered into a complete, XML-compliant (well-formed) file.
In the next step, the internal iText PDF library [4] uses a SAX parser
to convert that HTML to PDF.
Because the plugin's current Wikipedia-to-HTML parser cannot always
produce such a file, some articles are left out of the PDF file.
Therefore, IMHO the next step toward "better PDF output" is to improve
the internal parser so that it handles all the special cases of the
Wikipedia tags.
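To illustrate the failure mode described above, here is a minimal sketch
that uses only the JDK's own SAX parser. It is not the plugin's or
iText's actual code; the class name WellFormedCheck and the method
isWellFormed are made up for this example. It shows how output that is
not well-formed is rejected by a SAX parser, which is roughly why such
an article cannot be passed on to the PDF step.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the plugin or of iText.
public class WellFormedCheck {

    // Returns true if the given markup can be parsed by a SAX parser,
    // i.e. if it is well-formed XML.
    public static boolean isWellFormed(String xhtml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // DefaultHandler ignores all content events; we only care
            // whether parsing finishes without a fatal error.
            parser.parse(new InputSource(new StringReader(xhtml)),
                         new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Raised for mismatched or unclosed tags and similar errors.
            return false;
        } catch (Exception e) {
            // ParserConfigurationException or IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Properly nested tags: accepted.
        System.out.println(isWellFormed("<p><b>bold</b> text</p>"));
        // Overlapping tags: rejected.
        System.out.println(isWellFormed("<p><b>bold</p></b>"));
        // Unclosed <br>, fine in old HTML but not in XML: rejected.
        System.out.println(isWellFormed("<p>line one<br>line two</p>"));
    }
}
```

A check like this could be run on the rendered output of each article
in a test, so that the problematic Wikipedia constructs show up before
the PDF export is attempted.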
I created some JUnit tests [5] which I would like to use to improve the parser.
Does anyone have a list of typical "evil" Wikipedia syntax constructs
and how they should be rendered in (X)HTML?
Is there anyone on the list who knows the iText library and could
give some general hints on how to improve the PDF output
(e.g. the best design for chapters, sections and internal links
between the pages)?
[1]
http://prdownloads.sourceforge.net/phpeclipse/phpeclipse.community_tools_1.…
[2]
http://www.plog4u.org/index.php/Using_Eclipse_Wikipedia_Editor:Download_a_W…
[3]
http://www.plog4u.org/index.php/Using_Eclipse_Wikipedia_Editor:Export_to_PD…
[4]
http://www.lowagie.com/iText
[5]
http://cvs.sourceforge.net/viewcvs.py/phpeclipse/org.plog4u.wiki.test/src/o…
--
Axel Kramer
http://www.plog4u.org - Wikipedia Eclipse Plugin