Dear Ben Sidi Ahmed,
DBpedia might have all the data you need, already extracted (also in
over 80 languages):
http://wiki.dbpedia.org/Downloads37
Here are the first 2 sentences of each article in a structured format:
http://downloads.dbpedia.org/3.7/en/short_abstracts_en.nt.bz2
Here is the full abstract (the introductory text) of each article:
http://downloads.dbpedia.org/3.7/en/long_abstracts_en.nt.bz2
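Since you are working in Java, here is a minimal sketch of how one of these
files could be read once it has been decompressed (for example with bunzip2).
It assumes one N-Triples statement per line, of the form
<article URI> <predicate URI> "abstract text"@lang . The class and method
names (AbstractDumpReader, parseLine, unescape) are made up for illustration,
and the regular expression is a simplification, not a full N-Triples parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads a decompressed abstracts dump line by line.
class AbstractDumpReader {
    // Groups: 1 = subject URI, 2 = predicate URI, 3 = literal text, 4 = language tag.
    // Assumption: every abstract is a language-tagged literal on a single line.
    private static final Pattern TRIPLE = Pattern.compile(
        "^<([^>]+)>\\s+<([^>]+)>\\s+\"(.*)\"@([A-Za-z-]+)\\s*\\.\\s*$");

    /** Returns {subject URI, abstract text, language}, or null if the line does not match. */
    static String[] parseLine(String line) {
        Matcher m = TRIPLE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), unescape(m.group(3)), m.group(4) };
    }

    /** Decodes the common backslash escapes; the 8-digit Unicode form is left untouched. */
    static String unescape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\' || i + 1 >= s.length()) {
                out.append(c);
                continue;
            }
            char next = s.charAt(++i);
            switch (next) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case '"': out.append('"'); break;
                case '\\': out.append('\\'); break;
                case 'u':
                    // Four hex digits follow the escape marker.
                    if (i + 4 < s.length()) {
                        out.append((char) Integer.parseInt(s.substring(i + 1, i + 5), 16));
                        i += 4;
                    } else {
                        out.append("\\u");
                    }
                    break;
                default: out.append('\\').append(next);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java AbstractDumpReader short_abstracts_en.nt");
            return;
        }
        try (BufferedReader in =
                 Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] t = parseLine(line);
                if (t != null) {
                    System.out.println(t[0] + "\t" + t[1]);
                }
            }
        }
    }
}
```

Reading line by line keeps memory use flat, which matters because the
decompressed files are large. If you would rather read the .bz2 directly, you
need a third-party decompression library, since the Java standard library does
not include a bzip2 decoder.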
If you just want them for single articles you can also query the DBpedia
API:
The first 2 sentences for London (all languages):
http://dbpedia.org/snorql/?query=SELECT+*+WHERE+{%0D%0A%3Chttp%3A%2F%2Fdbpe…
The first 2 sentences for London (English only):
http://dbpedia.org/snorql/?query=SELECT+*+WHERE+{%0D%0A%3Chttp%3A%2F%2Fdbpe…
All abstracts that contain the keyword "London":
http://dbpedia.org/snorql/?query=SELECT+*+WHERE+{%0D%0A%3Fs+rdfs%3Acomment+…
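From Java, the same kind of lookup can be sent over plain HTTP. The sketch
below rests on several assumptions: the links above are truncated here, so the
query text is my own reconstruction of an "rdfs:comment for one resource,
filtered by language" query, and not necessarily identical to the one behind
the links; it talks to the SPARQL endpoint at http://dbpedia.org/sparql rather
than the snorql web form; and the class and method names are made up for
illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: fetches the short abstract of one article via SPARQL.
class DbpediaAbstractQuery {
    // Assumed endpoint; snorql is only a browser front end for it.
    static final String ENDPOINT = "http://dbpedia.org/sparql";

    /** Builds a query for the rdfs:comment of one resource in one language. */
    static String buildQuery(String resourceName, String lang) {
        return "SELECT ?comment WHERE { "
            + "<http://dbpedia.org/resource/" + resourceName + "> "
            + "<http://www.w3.org/2000/01/rdf-schema#comment> ?comment . "
            + "FILTER (lang(?comment) = \"" + lang + "\") }";
    }

    /** Puts the URL-encoded query into the "query" parameter of a GET request. */
    static String buildUrl(String query) {
        try {
            return ENDPOINT + "?query=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Sends the request and returns the raw response body (JSON results). */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Accept", "application/sparql-results+json");
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java DbpediaAbstractQuery London [lang]");
            return;
        }
        String lang = args.length > 1 ? args[1] : "en";
        System.out.println(fetch(buildUrl(buildQuery(args[0], lang))));
    }
}
```

The resource name is the article title as it appears in the Wikipedia URL
(underscores instead of spaces). The response is the standard SPARQL JSON
result format, so any JSON library can pull out the ?comment value. For all
languages, drop the FILTER clause from the query.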
You can also query them on a synchronized database (which gets updates
every 5 minutes from Wikipedia):
http://live.dbpedia.org/
Hope that helps,
Sebastian
On 11/27/2011 06:02 PM, Khalida BEN SIDI AHMED wrote:
Hello!
I don't know whether this question falls within the scope of this
group. In any case, I would be grateful for an answer.
I'm writing some Java code to perform NLP tasks on texts using
Wikipedia. How can I extract the first paragraph of a
Wikipedia article? Thanks a lot.
Truly yours
Ben Sidi Ahmed
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects:
http://nlp2rdf.org ,
http://dbpedia.org
Homepage:
http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group:
http://aksw.org