Try the Sweble parser for extracting structured data from Wikitext:
http://sweble.org
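
If a full parser turns out to be more than you need, here is a rough sketch of a much cruder approach. This is plain Python with regular expressions, not Sweble and not its API. It assumes the list is written with ordinary wikitext list markup (lines starting with "*" or "#") and that entries are linked with [[...]] wikilinks. The function name extract_list_links and the sample text are made up for illustration. Lists built from tables or templates will not be picked up this way, which is where a real parser earns its keep.

```python
import re

# Matches [[Target]] or [[Target|label]] (hypothetical helper pattern).
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_list_links(wikitext):
    """Return (target, label) pairs for wikilinks found on list-item lines."""
    results = []
    for line in wikitext.splitlines():
        # Wikitext list items start with '*' (bulleted) or '#' (numbered).
        if not line.startswith(("*", "#")):
            continue
        for match in WIKILINK.finditer(line):
            target = match.group(1).strip()
            # Fall back to the target when no label is given.
            label = (match.group(2) or target).strip()
            results.append((target, label))
    return results


# Made-up sample input, only to show the shape of the data.
sample = """== Presidents ==
* [[George Washington]] (1789-1797)
* [[John Adams|Adams]] (1797-1801)
Some prose with a [[Link]] that is not in a list.
# [[Thomas Jefferson]]
"""

if __name__ == "__main__":
    for target, label in extract_list_links(sample):
        print(target, "|", label)
```

Links in running prose are skipped on purpose, so only the list entries come back, each with its article name and the text shown on the page.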
http://dirkriehle.com, +49 157 8153 4150, +1 650 450 8550
On Nov 22, 2011 9:35 PM, "Fred Zimmerman" <zimzaz.wfz(a)gmail.com> wrote:
Hi,
I want to programmatically extract lists from list pages on Wikipedia. That
is to say, if there is a page that mostly consists of a list (list of
episodes, list of presidents, etc.), I want to be able to extract the list
from the page, with article names/links. Has anyone already done this? Can
anyone suggest a good strategy?
FredZ
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l