We used a Python MySQL library to access the article text, and it was quite fast. The critical step for increased speed was adding the following index:
dbCursor.execute("ALTER TABLE page ADD INDEX (page_title);")
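A minimal sketch of why that index matters, under loud assumptions: SQLite (from the Python standard library) stands in for MySQL so the example runs without a server, the two-column `page` table is a toy version of the real one, and the index name `idx_page_title` and the helper functions are made up for illustration. SQLite spells the statement `CREATE INDEX` rather than `ALTER TABLE ... ADD INDEX`, but the effect on a lookup by title is the same idea: the query plan changes from a full table scan to an index search.

```python
import sqlite3


def build_page_table(conn, titles):
    """Create a toy `page` table (hypothetical schema) and fill it with titles."""
    conn.execute(
        "CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT)"
    )
    conn.executemany(
        "INSERT INTO page (page_title) VALUES (?)", [(t,) for t in titles]
    )


def lookup_plan(conn, title):
    """Return SQLite's query-plan description for a lookup by page_title."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT page_id FROM page WHERE page_title = ?",
        (title,),
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
build_page_table(conn, ["Article_%d" % i for i in range(1000)])

# Without an index, the lookup has to scan the whole table.
plan_before = lookup_plan(conn, "Article_500")

# SQLite equivalent of the MySQL ALTER TABLE ... ADD INDEX statement above.
conn.execute("CREATE INDEX idx_page_title ON page (page_title)")

# With the index, the lookup becomes an index search.
plan_after = lookup_plan(conn, "Article_500")

print("before:", plan_before)
print("after: ", plan_after)
```

On MySQL the comparable check would be running `EXPLAIN` on the same `SELECT` before and after adding the index.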
We've got several thousand lines of Python code that are already GPL'ed (since the copyleft goes at the top, it's the first code I write ;) and will be released at Wikimania.
On 7/2/07, GerardM gerard.meijssen@gmail.com wrote:
Hoi, It is nice that the pywikipediabot provides certain functionality. The question is how usable it is in the first place when this info is needed interactively. Also, when there are more tools that do a great job, we will get a situation where advances in one tool will egg on the people behind another tool to do even better.
Thanks, GerardM
On 7/2/07, Brian Brian.Mingus@colorado.edu wrote:
Pywikipediabot provides this functionality under a free license.
/Brian
On 7/2/07, Jimmy Wales <jwales@wikia.com> wrote:
Why don't you release this under a free license so that the Wikimedia Foundation could use it?
On Jul 2, 2007, at 8:33 AM, Torsten Zesch wrote:
JWPL - Java Wikipedia Library version 0.3 beta is now available http://www.ukp.tu-darmstadt.de/software/JWPL
INTRODUCTION
Lately, Wikipedia has been recognized as a promising lexical semantic resource. We present JWPL, a free Java-based Wikipedia application programming interface that enables the use of Wikipedia as an NLP resource by providing efficient programmatic access to the knowledge therein.
FUNCTIONALITY
Fast access to:
- article text
- categories
- redirects
- links between articles (ingoing and outgoing).
Discrimination between
- article pages
- disambiguation pages
- redirect pages.
Available languages:
- English
- German
- Czech
- Ukrainian
Other languages will be added step by step.
DOWNLOAD
JWPL Java library http://www.ukp.tu-darmstadt.de/software/JWPL
Wikipedia data (with database scheme optimized for large-scale NLP tasks) ftp://ftp.tu-darmstadt.de/pub/tud/informatik/JWPL_data
LICENCE
JWPL is free for non-profit and non-commercial use.
ABOUT
JWPL was developed by the Ubiquitous Knowledge Processing Lab at Darmstadt University of Technology.
http://www.ukp.tu-darmstadt.de
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/wiki-research-l