Dear Ben Sidi Ahmed, DBpedia might have all the data you need, already extracted (also in over 80 languages): http://wiki.dbpedia.org/Downloads37
Here are the first two sentences of every article in a structured format: http://downloads.dbpedia.org/3.7/en/short_abstracts_en.nt.bz2
The long abstracts are here: http://downloads.dbpedia.org/3.7/en/long_abstracts_en.nt.bz2
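Since you are working in Java, here is a minimal sketch of reading such a dump. It assumes the file has already been decompressed (for example with bunzip2, as the JDK has no built-in bzip2 reader) and that each line is an N-Triples statement of the form <subject> <predicate> "text"@lang . The class and method names are made up for illustration, and malformed escape sequences are not handled.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class AbstractDumpReader {

    /**
     * Parses one N-Triples line whose object is a string literal.
     * Returns {subjectUri, literalText}, or null if the line does not fit.
     */
    static String[] parseLine(String line) {
        int subjEnd = line.indexOf('>');
        int litStart = line.indexOf('"');
        // The closing quote is the last quote on the line; the language tag
        // or datatype that follows it contains no quotes.
        int litEnd = line.lastIndexOf('"');
        if (!line.startsWith("<") || subjEnd < 0 || litStart < subjEnd || litEnd <= litStart) {
            return null;
        }
        String subject = line.substring(1, subjEnd);
        String text = unescape(line.substring(litStart + 1, litEnd));
        return new String[] { subject, text };
    }

    /** Decodes N-Triples string escapes (quote, backslash, n, r, t, and hex code points). */
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\' || i + 1 >= s.length()) {
                out.append(c);
                continue;
            }
            char next = s.charAt(++i);
            switch (next) {
                case 'n': out.append('\n'); break;
                case 'r': out.append('\r'); break;
                case 't': out.append('\t'); break;
                case 'u': // four hex digits follow
                    out.appendCodePoint(Integer.parseInt(s.substring(i + 1, i + 5), 16));
                    i += 4;
                    break;
                case 'U': // eight hex digits follow
                    out.appendCodePoint(Integer.parseInt(s.substring(i + 1, i + 9), 16));
                    i += 8;
                    break;
                default: // escaped quote or backslash
                    out.append(next);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java AbstractDumpReader <decompressed .nt file>");
            return;
        }
        // Stream line by line; the dump is too large to hold in memory comfortably.
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parsed = parseLine(line);
                if (parsed != null) {
                    System.out.println(parsed[0] + "\t" + parsed[1]);
                }
            }
        }
    }
}
```

Run against the decompressed short abstracts file, this prints one tab-separated line per article: the resource URI followed by its abstract text.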
If you only want them for single articles, you can also query the DBpedia SPARQL endpoint:
The first two sentences for London (all languages): http://dbpedia.org/snorql/?query=SELECT+*+WHERE+%7B%0D%0A%3Chttp%3A%2F%2Fdbp...
The first two sentences for London (English only): http://dbpedia.org/snorql/?query=SELECT+*+WHERE+%7B%0D%0A%3Chttp%3A%2F%2Fdbp...
All abstracts that contain the keyword "London": http://dbpedia.org/snorql/?query=SELECT+*+WHERE+%7B%0D%0A%3Fs+rdfs%3Acomment...
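From Java, a single-article lookup could be sketched roughly as follows. This is not a reconstruction of the shortened links above. It assumes the short abstract is stored under rdfs:comment, that the article's resource URI is http://dbpedia.org/resource/London, and that the public endpoint is http://dbpedia.org/sparql; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class ShortAbstractQuery {

    // Assumed public SPARQL endpoint; adjust if you use a mirror.
    static final String ENDPOINT = "http://dbpedia.org/sparql";

    /** Builds a query for the rdfs:comment of one resource; lang == null means all languages. */
    static String buildQuery(String resourceUri, String lang) {
        StringBuilder q = new StringBuilder();
        q.append("PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n");
        q.append("SELECT ?comment WHERE {\n");
        q.append("  <").append(resourceUri).append("> rdfs:comment ?comment .\n");
        if (lang != null) {
            q.append("  FILTER (lang(?comment) = \"").append(lang).append("\")\n");
        }
        q.append("}");
        return q.toString();
    }

    /** URL-encodes the query into a GET request URL for the endpoint. */
    static String buildUrl(String query) {
        return ENDPOINT + "?query=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    /** Sends the request and returns the raw response body (SPARQL JSON results are requested). */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Accept", "application/sparql-results+json");
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String url = buildUrl(buildQuery("http://dbpedia.org/resource/London", "en"));
        System.out.println(url);
        // Pass --fetch to actually call the endpoint (needs network access).
        if (args.length > 0 && args[0].equals("--fetch")) {
            System.out.println(fetch(url));
        }
    }
}
```

Passing null as the language gives the all-languages variant. The response is returned as raw JSON text, so you would still need a JSON parser to pull out the abstract itself.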
You can also query them on a synchronized database (which gets updates every 5 minutes from Wikipedia): http://live.dbpedia.org/
Hope that helps, Sebastian
On 11/27/2011 06:02 PM, Khalida BEN SIDI AHMED wrote:
Hello! I don't know whether this question is within the scope of this group. Anyway, I would be pleased to find an answer to it. I'm writing some Java code to perform NLP tasks on texts using Wikipedia. What can I do to extract the first paragraph of a Wikipedia article? Thanks a lot.
Truly yours,
Ben Sidi Ahmed
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l