On Sat, Feb 23, 2008 at 8:32 PM, Ragib Hasan ragibhasan@gmail.com wrote:
Hi, I need to extract only the text from a Wikipedia page. I.e., I need to remove all wiki markup, section headings, etc., to extract only the text a reader will read.
Get the rendered HTML, and remove all the HTML markup.
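A minimal sketch of the "remove all the HTML markup" step, assuming Python and only its standard-library `html.parser` module. The names `TextExtractor` and `html_to_text` are made up for this illustration, and fetching the rendered page is left out; the function just takes the HTML as a string.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    # Tags whose contents are never shown to a reader (illustrative choice).
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the text content of an HTML string with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the runs of whitespace left behind by the removed tags.
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    sample = "<p>Hello <b>world</b>.</p><script>var x = 1;</script>"
    print(html_to_text(sample))
```

Note that this keeps the text of section headings. If those should go too, adding tags such as `h2` and `h3` to `SKIP` would drop them the same way, assuming the headings are well-formed in the rendered HTML.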