Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

9 Mar 2011


On 9 March 2011 16:00, Platonides <Platonides@gmail.com> wrote:
...
...
Dear Members,
I am Ramesh, pursuing my PhD at Monash University, Malaysia. My
research is on blog classification using Wikipedia categories.
For my experiment, I use the 12 main categories of Wikipedia.
I want to identify which of the 12 main categories each article
belongs to.
So I wrote a program that collects the subcategories of each article and
classifies it offline against the 12 categories.
I have already downloaded the wiki dump, which contains around 3 million
article titles.
My program takes these 3 million article titles, goes to the online
Wikipedia website, and fetches the subcategories.

Why do you need to access the live Wikipedia for this?
Using categorylinks.sql and page.sql you should be able to fetch the
same data, and probably faster.

I concur. Everything required for this project should be in the dumps.
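
To make the suggestion concrete, here is a rough sketch of the offline lookup. It is only an illustration, not a tested pipeline: it uses an in-memory SQLite database with a few made-up rows as a stand-in for the imported dumps (the real categorylinks.sql and page.sql are MySQL dumps and would need to be loaded into a database first). The column names page_id, page_namespace, page_title, cl_from and cl_to follow the MediaWiki schema; MAIN_CATEGORIES, the function names, the sample rows and the max_depth cutoff are assumptions made up for this example.

```python
import sqlite3
from collections import deque

# Toy stand-in for the imported dumps, keeping only the columns used below.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY,
                   page_namespace INTEGER, page_title TEXT);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
""")
# Namespace 0 holds articles, namespace 14 holds category pages.
conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
    (1, 0, "Python_(programming_language)"),
    (2, 14, "Programming_languages"),
    (3, 14, "Computing"),
    (4, 14, "Technology"),
])
# cl_from is the page_id of the member page, cl_to is the category title.
conn.executemany("INSERT INTO categorylinks VALUES (?, ?)", [
    (1, "Programming_languages"),
    (2, "Computing"),
    (3, "Technology"),
])

# Hypothetical subset of the 12 main categories.
MAIN_CATEGORIES = {"Technology", "Science"}


def parent_categories(conn, title, namespace):
    """Return the categories that the given page is directly filed under."""
    rows = conn.execute(
        "SELECT cl.cl_to FROM page p "
        "JOIN categorylinks cl ON cl.cl_from = p.page_id "
        "WHERE p.page_namespace = ? AND p.page_title = ?",
        (namespace, title),
    )
    return [row[0] for row in rows]


def main_categories_for(conn, article_title, max_depth=5):
    """Walk up the category graph breadth-first until a main category is hit."""
    found, seen = set(), set()
    queue = deque((c, 1) for c in parent_categories(conn, article_title, 0))
    while queue:
        category, depth = queue.popleft()
        if category in seen:  # the category graph can contain cycles
            continue
        seen.add(category)
        if category in MAIN_CATEGORIES:
            found.add(category)
            continue
        if depth < max_depth:
            queue.extend((parent, depth + 1)
                         for parent in parent_categories(conn, category, 14))
    return found


print(main_categories_for(conn, "Python_(programming_language)"))
```

On the full dumps the same join would run against the imported MySQL tables, and an index on the page title columns would presumably matter for 3 million lookups. The depth cutoff is a judgment call: without one, nearly every article eventually reaches most top-level categories.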
