Re: [Xmldatadumps-l] Wikipedia page content dump based on category

9 Oct 2012

On Mon, Oct 8, 2012 at 3:45 PM, Venkatesh Channal <venkateshchannal@gmail.com> wrote:
...
Hi,
I would like to fetch the page text of all wiki pages that belong to a
movie category, e.g.:
http://en.wikipedia.org/wiki/Category:Hindi_songs
From the page text I would like to extract information such as song
title, song length, singer, name of movie/album, etc. I am not interested in
extracting images, just the information about the song.
My questions:

1. Is there a way to download only the pages that belong to a particular
   category, instead of downloading the entire dump?

2. Is PHP knowledge required to install the database dump on a local
   machine?

3. Are there any tools that extract the information and provide the
   required data for storage in a MySQL database?

If this is not the right forum for my questions, could you please
redirect me to the appropriate one?
Thanks and regards,
Venkatesh Channal

Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
-- 
Expertini - the international job board
29th Floor
One Canada Square,
Canary Wharf,
London,
E14 5DY,
United Kingdom,
(Phone)  +44 (0) 207 193 1729
(Mob) +44 (0) 742 5873 580, +44 (0) 7881 346475
Email: Info@SearchLondonJobs.Co.UK
Our sites:
America: www.SearchAmericanJobs.Com
Australia: www.SearchAustralianJobs.Com
Canada: www.SearchCanadaJobs.Com
Europe: www.SearchEuropeanJobs.Com
UK: www.SearchLondonJobs.Co.UK
