On Sat, Jul 25, 2009 at 6:21 AM, Danny B. <Wikipedia.Danny.B(a)email.cz> wrote:
> I'm looking for any kind of tool which would take the XML dump (most
> probably the pages-meta-current.xml.bz2, at least the
> pages-articles.xml.bz2) and would return the list of page titles (or
> alternatively/configurably page ids) of pages containing a given string.

I have had good luck in the past simply writing little C programs that
use libexpat for this purpose.
- Carl