On Fri, Aug 6, 2010 at 10:06 AM, Léa Massiot <lea.massiot(a)ign.fr> wrote:
A colleague told me about that, so we had a look at it.
Unfortunately, the abstracts are incorrect most of the time.
-----------------------------------------------------------------
Example (in French):
-----------------------------------------------------------------
<title>Wikipédia : Arabie saoudite</title>
<url>http://fr.wikipedia.org/wiki/Arabie_saoudite</url>
<abstract>| lien_villes=Villes d'Arabie saoudite</abstract>
<links>
[...]
-----------------------------------------------------------------
Cheers,
--
Lmhelp
In that case, I recommend the parser in mwlib for fast extraction of the text
of the first paragraph:
http://code.pediapress.com/wiki/wiki/mwlib
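
To illustrate why the abstract above goes wrong, here is a rough standalone sketch in Python. It does not use mwlib or reproduce its API; it only uses the standard library, and the names strip_templates, first_paragraph and SAMPLE are made up for this example. The idea is that a line like "| lien_villes=..." is an infobox template parameter, so the templates have to be dropped before looking for the first paragraph of prose. Real wikitext has many more cases (tables, references, HTML comments), which is why a proper parser is the better choice.

```python
import re


def strip_templates(wikitext: str) -> str:
    """Drop {{...}} templates, including nested ones, by tracking brace depth."""
    out = []
    depth = 0
    i = 0
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
        else:
            # Keep characters only when we are outside every template.
            if depth == 0:
                out.append(wikitext[i])
            i += 1
    return "".join(out)


def first_paragraph(wikitext: str) -> str:
    """Return the first prose paragraph as plain text, or "" if none is found."""
    text = strip_templates(wikitext)
    for block in re.split(r"\n\s*\n", text):
        block = block.strip()
        # Skip empty blocks, headings, table rows and stray template parameters.
        if not block or block[0] in "|=!":
            continue
        # Skip image/file links that often precede the lead paragraph.
        if re.match(r"\[\[(File|Image|Fichier):", block, re.IGNORECASE):
            continue
        # Unwrap [[target|label]] and [[target]] to their visible text.
        block = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", block)
        # Remove bold/italic quote markup.
        block = re.sub(r"'{2,}", "", block)
        return " ".join(block.split())
    return ""


# Hypothetical sample: an infobox followed by a lead paragraph and a section.
SAMPLE = """{{Infobox Pays
| nom=Arabie saoudite
| lien_villes=Villes d'Arabie saoudite
}}
'''Saudi Arabia''' is a [[country]] in the [[Middle East|Near East]].

== History ==
Text.
"""

if __name__ == "__main__":
    print(first_paragraph(SAMPLE))
```

With this approach the infobox parameters never reach the output, since everything inside the outer {{ }} pair is discarded before the text is split into paragraphs.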