We used a Python MySQL library to access the article text, and it was quite
fast. The critical step for increased speed was adding the following index:
dbCursor.execute("ALTER TABLE page ADD INDEX (page_title);")
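For anyone who wants to see the effect without a MySQL server, here is a minimal sketch. It is an illustration under assumptions, not our actual code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, a toy `page` table with made-up rows, and a hypothetical index name `idx_page_title`. SQLite spells the statement `CREATE INDEX` rather than `ALTER TABLE ... ADD INDEX`. The query plan changes from a full table scan to an index search once the index exists.

```python
import sqlite3


def query_plan(cursor, title):
    """Return SQLite's plan for a lookup by page_title as one string."""
    cursor.execute(
        "EXPLAIN QUERY PLAN SELECT page_id FROM page WHERE page_title = ?",
        (title,),
    )
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[-1] for row in cursor.fetchall())


def build_demo_db():
    """Create an in-memory toy table loosely modelled on a `page` table."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT)"
    )
    # Made-up titles, for illustration only.
    cur.executemany(
        "INSERT INTO page (page_title) VALUES (?)",
        [("Article_%d" % i,) for i in range(1000)],
    )
    conn.commit()
    return conn


conn = build_demo_db()
cur = conn.cursor()

# Without an index, a title lookup has to scan the whole table.
plan_before = query_plan(cur, "Article_42")

# SQLite equivalent of the MySQL ALTER TABLE ... ADD INDEX statement.
cur.execute("CREATE INDEX idx_page_title ON page (page_title)")

# With the index, the same lookup becomes an index search.
plan_after = query_plan(cur, "Article_42")

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text differs between SQLite versions, so the sketch only prints it; the point is that the second plan mentions the index and the first does not.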
We've got several thousand lines of Python code that are already GPL'ed
(since the copyleft notice goes at the top, it's the first code I write ;) and
will be released at Wikimania.
On 7/2/07, GerardM <gerard.meijssen(a)gmail.com> wrote:
> Hoi,
> It is nice that the pywikipediabot provides certain functionality. The
> question is how usable it is in the first place when this info is needed
> interactively. Also, when there are more tools that do a great job, we
> will get a situation where advances in one tool will egg on the people of
> another tool to do even better.
> Thanks,
> GerardM
> On 7/2/07, Brian <Brian.Mingus(a)colorado.edu> wrote:
> > Pywikipediabot provides this functionality under a free license.
> >
> > /Brian
> >
> > On 7/2/07, Jimmy Wales <jwales(a)wikia.com> wrote:
> > > Why don't you release this under a free license so that the Wikimedia
> > > Foundation could use it?
> > >
> > > On Jul 2, 2007, at 8:33 AM, Torsten Zesch wrote:
> > > >
> > > > JWPL - Java Wikipedia Library
> > > > version 0.3 beta is now available
> > > > http://www.ukp.tu-darmstadt.de/software/JWPL
> > > >
> > > > INTRODUCTION
> > > >
> > > > Lately, Wikipedia has been recognized as a promising lexical
> > > > semantic resource. We present JWPL, a free Java-based Wikipedia
> > > > application programming interface that enables the use of
> > > > Wikipedia as an NLP resource by providing efficient programmatic
> > > > access to the knowledge therein.
> > > >
> > > > FUNCTIONALITY
> > > >
> > > > Fast access to:
> > > > * article text
> > > > * categories
> > > > * redirects
> > > > * links between articles (ingoing and outgoing).
> > > >
> > > > Discrimination between
> > > > * article pages
> > > > * disambiguation pages
> > > > * redirect pages.
> > > >
> > > > Available languages:
> > > > * English
> > > > * German
> > > > * Czech
> > > > * Ukrainian
> > > >
> > > > Other languages will be added step by step.
> > > >
> > > > DOWNLOAD
> > > >
> > > > JWPL Java library
> > > > http://www.ukp.tu-darmstadt.de/software/JWPL
> > > >
> > > > Wikipedia data
> > > > (with database scheme optimized for large-scale NLP tasks)
> > > > ftp://ftp.tu-darmstadt.de/pub/tud/informatik/JWPL_data
> > > >
> > > > LICENCE
> > > >
> > > > JWPL is free for non-profit and non-commercial use.
> > > >
> > > > ABOUT
> > > >
> > > > JWPL was developed by the Ubiquitous Knowledge Processing Lab
> > > > at Darmstadt University of Technology.
> > > >
> > > > http://www.ukp.tu-darmstadt.de
> > > >
> > > > _______________________________________________
> > > > Wiki-research-l mailing list
> > > > Wiki-research-l(a)lists.wikimedia.org
> > > > http://lists.wikimedia.org/mailman/listinfo/wiki-research-l