On 3/10/2011 3:46 AM, David Gerard wrote:
> I feel the program will take 71 days to finish all 3.1 million article titles. Is there any way our university IP address could be given permission, or could we send an official email from our department head to the Wikipedia server administrators, to make clear that the program I run from this particular IP address is not an attack? Then we could be allowed to make faster requests, like one every 0.5 seconds, and I could finish my experiment within 35 days. Expecting your positive reply. Regards, Ramesh
I can say, positively, that you'll get the job done faster by downloading the dump file and working on it directly. I've got scripts that can download and extract what I need from the XML dump in an hour or so. I still have some processes that use the API, but I'm increasingly using the dumps because they're faster and easier to work with.
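To make that concrete, here is a minimal sketch (not my scripts, just an illustration) of streaming article titles out of a pages-articles dump with nothing but the Python standard library. The file name enwiki-latest-pages-articles.xml.bz2 is an assumption about which dump you download from dumps.wikimedia.org, and the export namespace changes between dump schema versions, which is why the code just strips it:

import bz2
import xml.etree.ElementTree as ET

DUMP = "enwiki-latest-pages-articles.xml.bz2"  # assumed dump file from dumps.wikimedia.org

def _local(tag):
    """Strip the export namespace, which changes between dump schema versions."""
    return tag.rsplit("}", 1)[-1]

def iter_titles(path):
    """Stream <page> elements out of the bz2-compressed dump and yield their titles."""
    with bz2.open(path, "rb") as f:
        context = ET.iterparse(f, events=("start", "end"))
        _, root = next(context)          # grab the root so we can clear it as we go
        title = None
        for event, elem in context:
            if event != "end":
                continue
            name = _local(elem.tag)
            if name == "title":
                title = elem.text
            elif name == "page":
                if title:
                    yield title
                title = None
                root.clear()             # drop processed pages so memory stays flat

if __name__ == "__main__":
    for i, t in enumerate(iter_titles(DUMP)):
        print(t)
        if i >= 9:                       # just show the first few titles
            break

A single pass like this touches every page in one read of the file, which is why it beats millions of throttled API calls by such a wide margin.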
Note that many facts about Wikipedia topics have already been extracted by DBpedia and Freebase. These are complementary, and if you're interested in getting results, you should use both. DBpedia has some things that aren't in Freebase, such as Wikipedia's link graph and redirects, but Freebase has a type system with 2x better recall for many of the prevalent types.
You might find that DBpedia + Freebase have the information you need. And if they don't, you'll still find them a useful 'guidance control' system for anything you're doing with Wikipedia data.
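As a taste of what DBpedia already gives you, the sketch below asks its public SPARQL endpoint for the redirects pointing at one article, using the SPARQLWrapper library. The dbo:wikiPageRedirects property and the endpoint URL are the ones DBpedia documents, but treat the prefixes and the example resource as assumptions to verify against the release you actually query:

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?redirect WHERE {
        ?redirect dbo:wikiPageRedirects dbr:Barack_Obama .
    }
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for row in results["results"]["bindings"]:
    print(row["redirect"]["value"])   # each row is the URI of a redirect page

If a query like that, plus Freebase's type data, answers your question directly, you may not need to crawl Wikipedia at all.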