Hi,
Is there a way to parse wiki text into simplified plain text (no HTML,
external and internal links replaced by their display text, ...)?
My use case is the following:
- The project Check
Wikipedia<http://de.wikipedia.org/wiki/Benutzer:Stefan_K%C3%BChn/Check_W…
uses a configuration file for each wiki (for example:
en <http://toolserver.org/%7Esk/checkwiki/enwiki/enwiki_translation.txt>)
- It's used, among other things, to generate pages in wiki format (for
example:
en <http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Check_Wikipedia>)
- In the configuration file, you can see for example the description of
error n°1: *error_001_desc_script=This article has no bold title like
<nowiki>'''Title'''</nowiki>*, so the value itself contains wiki
text.
- I am writing a Java program
(
WikiCleaner<http://en.wikipedia.org/wiki/User:NicoV/Wikipedia_Cleaner/Do…)
to help fix the errors reported by this tool. I'd like to display this
text in my program as plain text: *This article has no bold title like
'''Title'''.*
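
To make the expected conversion concrete, here is a rough sketch in Java of
the kind of simplification I mean. The class and method names
(WikiTextSimplifier, simplify) are hypothetical and not taken from any existing
library. It relies only on regular expressions, so it assumes simple input:
it does not handle templates, nested markup, tables or HTML entities, which
is why I am looking for a real parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a regex-based approximation, not a real wikitext parser.
final class WikiTextSimplifier {

    // <nowiki>...</nowiki>: the content must be kept literally.
    private static final Pattern NOWIKI =
            Pattern.compile("<nowiki>(.*?)</nowiki>", Pattern.DOTALL);
    // [[Target|label]] or [[Target]]: keep the label, or the target if no label.
    private static final Pattern INTERNAL_LINK =
            Pattern.compile("\\[\\[(?:[^\\]|]*\\|)?([^\\]]*)\\]\\]");
    // [http://example.org label]: keep the label.
    private static final Pattern EXTERNAL_LINK_WITH_TEXT =
            Pattern.compile("\\[(?:https?|ftp)://[^\\s\\]]+\\s+([^\\]]+)\\]");
    // [http://example.org]: keep the URL.
    private static final Pattern EXTERNAL_LINK_BARE =
            Pattern.compile("\\[((?:https?|ftp)://[^\\s\\]]+)\\]");
    // Any opening or closing HTML-like tag.
    private static final Pattern HTML_TAG = Pattern.compile("</?[a-zA-Z][^>]*>");

    // Private-use character used to mark protected <nowiki> content.
    private static final char MARK = '\uE000';

    static String simplify(String wikiText) {
        // 1. Set <nowiki> content aside so the later steps do not touch it.
        List<String> protectedParts = new ArrayList<>();
        Matcher m = NOWIKI.matcher(wikiText);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            protectedParts.add(m.group(1));
            m.appendReplacement(sb, MARK + String.valueOf(protectedParts.size() - 1) + MARK);
        }
        m.appendTail(sb);
        String text = sb.toString();

        // 2. Replace links by their visible text.
        text = INTERNAL_LINK.matcher(text).replaceAll("$1");
        text = EXTERNAL_LINK_WITH_TEXT.matcher(text).replaceAll("$1");
        text = EXTERNAL_LINK_BARE.matcher(text).replaceAll("$1");

        // 3. Drop remaining HTML tags and bold/italic quote markup.
        text = HTML_TAG.matcher(text).replaceAll("");
        text = text.replaceAll("'{2,}", "");

        // 4. Put the protected <nowiki> content back, unchanged.
        for (int i = 0; i < protectedParts.size(); i++) {
            text = text.replace(MARK + String.valueOf(i) + MARK, protectedParts.get(i));
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(simplify(
                "This article has no bold title like <nowiki>'''Title'''</nowiki>"));
    }
}
```

With the description of error n°1 as input, this gives the text I would like
to display, with the quotes inside <nowiki> preserved. Anything beyond these
simple cases would need a proper parser.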
Thanks,
Nico