Danny B. wrote:
Hello,
I'm looking for any kind of tool that would take the XML dump (most probably
pages-meta-current.xml.bz2, or at least pages-articles.xml.bz2) and return the list
of page titles (or alternatively/configurably page ids) of pages containing a given string.
Does anybody have such a tool (or something similar) and is willing to share it? Either
a command-line or a web interface is OK.
Thank you.
Danny B.
I have a program that does this, written 3 years ago. Many of us have probably
reinvented that wheel. I'll send it to you privately.
There's also
http://meta.wikimedia.org/wiki/User:Micke/WikiFind
Not the ultimate solution, but good enough. I have some changes to unify
both versions if you're interested.
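
For anyone else reading along, here is a minimal sketch of that kind of scanner
in Python. It is neither my old program nor WikiFind, just an illustration. It
assumes the usual MediaWiki export layout, where each <page> holds a <title>,
then the page <id>, then a <revision> with a <text> element. The names
find_pages and local are made up for this example, and matching is a plain
case-sensitive substring test.

```python
import bz2
import sys
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def find_pages(stream, needle, want_ids=False):
    """Yield the title (or page id) of each page whose text contains needle.

    Assumption: the first <id> inside a <page> is the page id; revision and
    contributor ids come later in the element.
    """
    title = page_id = None
    matched = False
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        name = local(elem.tag)
        if event == "start":
            if name == "page":
                # New page: reset the per-page state.
                title = page_id = None
                matched = False
            continue
        if name == "title":
            title = elem.text
        elif name == "id" and page_id is None:
            page_id = elem.text
        elif name == "text":
            if needle in (elem.text or ""):
                matched = True
        elif name == "page":
            if matched:
                yield page_id if want_ids else title
            # Drop the page's children so memory use stays modest.
            elem.clear()


if __name__ == "__main__":
    # Usage: python find_pages.py <dump.xml.bz2> <string>
    path, needle = sys.argv[1], sys.argv[2]
    with bz2.open(path, "rb") as dump:
        for hit in find_pages(dump, needle):
            print(hit)
```

The dump is decompressed and parsed as a stream, so it never has to be unpacked
on disk or loaded whole into memory. Passing want_ids=True returns page ids
instead of titles.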