[Wiki-research-l] Library to filter HTML

31 Jan 2008


Hi all.
I'm making some tweaks to the WikiXRay parser for meta-history dumps. I
 now extract internal links, external links, and so on, but I'd also like to
 extract the plain text (without HTML code and, if possible, with the
 wiki tags filtered out as well).
Does anyone know a good Python library for that? I believe there
 should be something out there, since there are bots and crawlers that automate
 the data extraction process from one wiki to another.
Thanks in advance for your comments.
Felipe.
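
For illustration only, here is a minimal sketch of the kind of filtering asked about above, using nothing but the Python 3 standard library (html.parser and re). It is not part of WikiXRay; the names TextExtractor, strip_html, strip_wiki_markup and plain_text are hypothetical, and the regular expressions cover only simple wiki links and bold/italic quotes, not templates, tables or references.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def strip_html(markup):
    """Return the text content of an HTML fragment (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()


def strip_wiki_markup(text):
    """Remove a small, assumed subset of wiki markup (hypothetical helper)."""
    # [[target|label]] -> label, [[target]] -> target
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", text)
    # [http://example.org label] -> label
    text = re.sub(r"\[https?://\S+\s+([^\]]*)\]", r"\1", text)
    # ''italic'' and '''bold''' quote runs
    text = re.sub(r"'{2,}", "", text)
    return text


def plain_text(markup):
    """Strip HTML first, then wiki markup, then collapse whitespace."""
    text = strip_wiki_markup(strip_html(markup))
    return re.sub(r"\s+", " ", text).strip()
```

Calling plain_text() on each revision's text would give a rough plain-text version. A dedicated wikitext parser would be needed for anything beyond these simple cases.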
