Hi,
I would like to fetch the page text of all wiki pages that belong to a
movie category, e.g.:
http://en.wikipedia.org/wiki/Category:Hindi_songs
From the page text I would like to extract information related to each
song: title, song length, singer, name of movie/album, etc. I am not
interested in extracting images, just the information about the song.
My questions:
1) Is there a way to download only the pages that belong to a particular
category, instead of downloading the entire dump?
2) Is PHP knowledge required to install the database dump on a local
machine?
3) Are there any tools that extract the information and provide the
required data in a form that can be stored in a MySQL database?
If this is not the right forum for my questions, could you please
redirect me to the appropriate one?
Thanks and regards,
Venkatesh Channal