Hello, in my project I need to retrieve the contents of Wikipedia and then build an inverted index from them. But I don't know how to get the contents efficiently and easily. What can I do? Are there any APIs? Thank you for your time and help.
Best wishes!
In my project I need to retrieve the contents of Wikipedia and then build an inverted index from them. But I don't know how to get the contents efficiently and easily. What can I do? Are there any APIs?
Yes, see http://mediawiki.org/wiki/API
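For a handful of articles, a request through the API might look roughly like the sketch below. It assumes the English Wikipedia endpoint and the standard "query" / "revisions" parameters; the function names and the User-Agent string are made up for illustration, and the result is raw wikitext, not cleaned plain text.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia; other wikis expose the same path on their own host.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(title, api_url=API_URL):
    """Build a URL asking for the current wikitext of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return api_url + "?" + urllib.parse.urlencode(params)


def extract_wikitext(response):
    """Pull the wikitext out of a decoded JSON reply, or return None if the page is missing."""
    pages = response.get("query", {}).get("pages", [])
    if not pages or pages[0].get("missing"):
        return None
    revisions = pages[0].get("revisions")
    if not revisions:
        return None
    return revisions[0]["slots"]["main"]["content"]


def fetch_wikitext(title):
    """Fetch one page over the network (hypothetical helper, not called on import)."""
    request = urllib.request.Request(
        build_query_url(title),
        # Placeholder User-Agent; replace with your own tool name and contact address.
        headers={"User-Agent": "inverted-index-demo/0.1 (you@example.org)"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_wikitext(json.load(reply))
```

Calling fetch_wikitext("Inverted index") would return the page source as one string. This is fine for small numbers of pages; for bulk access the dump is the better route.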
If you need the full text of many or even all articles, you need to download a database dump: http://download.wikimedia.org
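As a rough sketch of the dump route, the code below streams pages out of a bz2-compressed XML dump and builds a simple in-memory inverted index (word to set of page titles). It assumes the usual <page>/<title>/<text> layout of the export format; the function names are made up for illustration, the tokenizer is deliberately naive (it also indexes wikitext markup), and a full dump will not fit in memory this way without sharding or writing postings to disk.

```python
import bz2
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, text) pairs from an XML dump without loading it all at once."""
    title, text = None, ""
    for _, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's children to limit memory use


def build_inverted_index(pages):
    """Map each lowercased word to the set of page titles containing it."""
    index = defaultdict(set)
    for title, text in pages:
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(title)
    return index


def index_dump(path):
    """Index a .xml.bz2 dump file; 'path' is whatever file you downloaded."""
    with bz2.open(path, "rb") as stream:
        return build_inverted_index(iter_pages(stream))
```

After index = index_dump(path), looking up index["wikipedia"] gives the titles of all pages whose source contains that word.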
If the project is useful for other Wikipedians, you may want to look at Wikimedia's Toolserver, which offers a live copy of the database: http://meta.wikimedia.org/wiki/TS
mediawiki-l@lists.wikimedia.org