Try the Sweble parser for extracting structured data from Wikitext: http://sweble.org
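For the simplest case, the idea can be sketched without a full parser. The following is a rough Python sketch, not Sweble's API: it assumes the list page uses plain wikitext bullets (`*` or `#`) with `[[...]]` links, and the function name `extract_list_links` is made up for illustration. List pages built from tables or templates would not be handled by this and are where a real parser is likely to be needed.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]].
LINK_RE = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")
# Matches wikitext list lines that start with one or more '*' or '#'.
ITEM_RE = re.compile(r"^[*#]+\s*(.*)$")


def extract_list_links(wikitext):
    """Return (target, label) pairs for the first link of each list item.

    Hypothetical helper: assumes one entry per bullet line and that the
    first link on the line is the entry's article.
    """
    results = []
    for line in wikitext.splitlines():
        item = ITEM_RE.match(line)
        if not item:
            continue  # not a list line
        link = LINK_RE.search(item.group(1))
        if not link:
            continue  # list item without an article link
        target = link.group(1).strip()
        label = (link.group(2) or target).strip()
        results.append((target, label))
    return results


if __name__ == "__main__":
    sample = (
        "Intro text with [[Not a list link]].\n"
        "* [[George Washington]] (1789-1797)\n"
        "* [[John Adams|Adams]]\n"
        "# [[Thomas Jefferson#Presidency|Jefferson]]\n"
        "* plain item\n"
    )
    for target, label in extract_list_links(sample):
        print(target, "|", label)
```

Taking only the first link per item is a deliberate simplification; list entries often link to other articles (dates, places) later in the same line.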
http://dirkriehle.com, +49 157 8153 4150, +1 650 450 8550

On Nov 22, 2011 9:35 PM, "Fred Zimmerman" zimzaz.wfz@gmail.com wrote:
Hi,
I want to programmatically extract lists from list pages on Wikipedia. That is, if a page consists mostly of a list (list of episodes, list of presidents, etc.), I want to be able to extract the list from the page, along with the article names/links. Has anyone already done this? Can anyone suggest a good strategy?
FredZ
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l