Hi Magnus,
It would be really great if I could do that!
Where can I download the "real" parser?
Can I use it in the following way:
=> let's suppose:
- the parser's name is "wiki_to_html_parser",
- I have a "Wikipedia" article in its "Wikitext" version, "article.wikitext",
- I want to generate the corresponding "HTML" file, "article.html"
=> could I execute something like:
-------------------------------------------------------------------------------
command_line> wiki_to_html_parser -wikitext article.wikitext -html article.html
-------------------------------------------------------------------------------
which would generate "article.html" from "article.wikitext" using the "real" parser?
Even better for me would be to do that from inside a Java program. Is that possible?
Thank you for your help.
Sincerely,
--
Léa
On 8/7/2010 8:19 PM, Magnus Manske wrote:
So why not use the "real" parser?
- Get rendered HTML page
- Extract <div id="bodyContent">
- Take the first <p> element in there
Profit!
Magnus
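
For the Java side of the question, the three steps quoted above could be sketched roughly as follows. This is only a minimal illustration, assuming the rendered HTML of the page has already been downloaded into a string. The class name FirstParagraph and the method firstParagraph are hypothetical, not part of MediaWiki or any library. It uses plain regular expressions from the standard library rather than a real HTML parser, and it simply takes the first <p> that appears after the opening <div id="bodyContent"> tag without checking where that div closes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstParagraph {
    // Opening tag of the content container, e.g. <div id="bodyContent">.
    private static final Pattern BODY = Pattern.compile(
            "<div[^>]*\\bid=\"bodyContent\"[^>]*>", Pattern.CASE_INSENSITIVE);

    // A <p> element with optional attributes; the \s guard avoids matching <pre>.
    private static final Pattern PARA = Pattern.compile(
            "<p(?:\\s[^>]*)?>(.*?)</p>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /**
     * Hypothetical helper: returns the inner HTML of the first <p> found
     * after the bodyContent div opens, or null if either is missing.
     */
    public static String firstParagraph(String html) {
        Matcher body = BODY.matcher(html);
        if (!body.find()) {
            return null; // no bodyContent div in this page
        }
        Matcher para = PARA.matcher(html);
        if (!para.find(body.end())) {
            return null; // no paragraph after the div opens
        }
        return para.group(1).trim();
    }

    public static void main(String[] args) {
        // Stand-in for a rendered page; a real page would be fetched first.
        String html = "<div id=\"bodyContent\"><p>First <b>paragraph</b>.</p>"
                + "<p>Second paragraph.</p></div>";
        System.out.println(firstParagraph(html));
    }
}
```

Regular expressions are fragile on real-world HTML, so a proper HTML parsing library would probably be a safer choice for anything beyond a quick experiment.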