Hi all.
I'm adding some tweaks to the WikiXRay parser of meta-history dumps. It now extracts internal and external links, among other things, but I'd also like to extract the plain text (without HTML markup and, if possible, with the wiki tags filtered out as well).
Does anyone know of a Python library to do that? I believe there should be something out there, since there are bots and crawlers that automate the data extraction process from one wiki to another.
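To make the goal concrete, here is a minimal sketch of the kind of filtering I mean, using only Python's standard re and html modules. The function name strip_markup and the regular expressions are illustrative assumptions, not part of WikiXRay, and they only cover the most common markup (tables, parser functions and many edge cases are not handled).

```python
import html
import re


def strip_markup(wikitext: str) -> str:
    """Rough wikitext-to-plain-text filter (hypothetical helper, not WikiXRay code)."""
    # Drop HTML comments.
    text = re.sub(r"<!--.*?-->", "", wikitext, flags=re.DOTALL)

    # Drop templates such as {{citation needed}}; repeat so that nested
    # templates are removed from the innermost one outwards.
    previous = None
    while previous != text:
        previous = text
        text = re.sub(r"\{\{[^{}]*\}\}", "", text)

    # Drop remaining HTML-style tags but keep the text between them.
    text = re.sub(r"<[^>]+>", "", text)

    # Internal links: [[target|label]] -> label, [[target]] -> target.
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", text)

    # External links: [url label] -> label, [url] -> nothing.
    text = re.sub(r"\[(?:https?|ftp)://[^\s\]]+\s+([^\]]+)\]", r"\1", text)
    text = re.sub(r"\[(?:https?|ftp)://[^\s\]]+\]", "", text)

    # Bold and italic quote markers.
    text = re.sub(r"'{2,5}", "", text)

    # Section headings: == Title == -> Title.
    text = re.sub(r"^=+[ \t]*(.*?)[ \t]*=+[ \t]*$", r"\1", text, flags=re.MULTILINE)

    # Decode entities last, so that &lt;b&gt; is not mistaken for a tag.
    return html.unescape(text).strip()


sample = (
    "'''Python''' is a [[programming language|language]] created by "
    "[[Guido van Rossum]].<br/> See [http://python.org the site] "
    "{{citation needed}}&amp; more."
)
print(strip_markup(sample))
# prints: Python is a language created by Guido van Rossum. See the site & more.
```

Something along these lines would do for simple pages, but a maintained library that understands the full wiki syntax would obviously be preferable to a pile of regular expressions.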
Thanks in advance for your comments.
Felipe.