Hi
I released a new version of the Eclipse Wikipedia Plugin [1].
Changes:
* a new context-menu item in the editor for creating all files for a given category [2].
* a first Export Wizard to convert Wikipedia articles into a single PDF file [3].
The PDF creation depends on first rendering the Wikipedia article into a complete, XML-compliant HTML file. In the next step, the internal iText PDF library [4] uses a SAX parser to convert the HTML to PDF. As the plugin's current Wikipedia-to-HTML parser cannot always produce such a file, some articles are not included in the PDF file.
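To illustrate why an article can get dropped, here is a minimal sketch of a well-formedness check using only the SAX parser that ships with the JDK. It is not the plugin's code and it does not touch iText; the class name WellFormedCheck and the method isWellFormed are made up for this example. The assumption is that any rendered HTML which fails such a check would also fail in the SAX-based conversion step.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the plugin: reports whether a rendered
// article is well-formed XML and could therefore be read by a SAX parser.
final class WellFormedCheck {

    static boolean isWellFormed(String xhtml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // DefaultHandler ignores all content events and rethrows fatal
            // errors, so a parse that returns normally means well-formed input.
            parser.parse(new InputSource(new StringReader(xhtml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Unclosed or mismatched tags, undefined entities, etc.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not about the input.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, isWellFormed("<p>a<br>b</p>") returns false because the br element is never closed, while the same text with <br/> passes. Constructs like this are the ones the Wikipedia-to-HTML parser has to get right.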
Therefore the next step towards a "better PDF output" is IMHO to improve the internal parser so that it handles all special cases of the Wikipedia tags. I created some JUnit tests [5] which I would like to use to improve the parser. Does anyone have a list of typical "evil" Wikipedia syntax constructs and how they should be rendered in (X)HTML? Is there someone on the list who knows the iText library and could give some general hints on how to improve the PDF output (e.g. the best design for chapters, sections and internal links between the pages)?
[1] http://prdownloads.sourceforge.net/phpeclipse/phpeclipse.community_tools_1.1...
[2] http://www.plog4u.org/index.php/Using_Eclipse_Wikipedia_Editor:Download_a_Wi...
[3] http://www.plog4u.org/index.php/Using_Eclipse_Wikipedia_Editor:Export_to_PDF...
[4] http://www.lowagie.com/iText
[5] http://cvs.sourceforge.net/viewcvs.py/phpeclipse/org.plog4u.wiki.test/src/or...