Hello,
My name is Wanyu Chen. I am a postgraduate student at the National University of Defense Technology.
My colleagues and I are trying to collect a certain type of data from Wikipedia and would like some advice on the most efficient and user-friendly way of collecting it.
We are interested in entity research. We haven't found a way to do that through the channels on the web page and were wondering if you have any ideas on how such data could be collected.
Thank you for your help! I look forward to hearing from you.
Best Regards
Wanyu
Hello,
I would like to know if it is possible to get two specific dumps of the
English Wikipedia. The dumps are:
enwiki-20130403-pages-articles.xml.bz2
enwiki-20140502-pages-articles.xml.bz2
Thanks in advance for any help.
Regards.
*Julien Plu*
PhD Student, EURECOM
plu.julien(a)gmail.com | julien.plu(a)eurecom.fr
http://jplu.github.io
Campus SophiaTech
450 route des Chappes
06410 Biot, France
Phone: +33 (0) 4 93008103
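Dated dumps are usually removed from dumps.wikimedia.org after a few months, but the Internet Archive mirrors many older ones. A minimal Python sketch, assuming the Archive item identifier matches the dump name (e.g. `enwiki-20130403` — worth verifying the item exists before scripting against it):

```python
# Sketch: build the download URL for an old pages-articles dump.
# Assumption: the Internet Archive mirrors dated enwiki dumps under item
# identifiers like "enwiki-20130403" -- verify the item exists before use.
ARCHIVE_BASE = "https://archive.org/download"  # assumed mirror location


def dump_url(wiki: str, date: str) -> str:
    """Return the (assumed) Archive URL for a pages-articles dump."""
    filename = f"{wiki}-{date}-pages-articles.xml.bz2"
    return f"{ARCHIVE_BASE}/{wiki}-{date}/{filename}"


# Usage (large file, stream it):
# import urllib.request
# urllib.request.urlretrieve(dump_url("enwiki", "20130403"),
#                            "enwiki-20130403-pages-articles.xml.bz2")
```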
Dumps watchers may have noticed that several zh wiki project dumps failed
the abstract dumps step today. This is probably fixed, tracking here:
https://phabricator.wikimedia.org/T174906
I'll confirm it's fixed once a few more wikis have run without problems.
Ariel
what's this all about??
On Aug 31, 2017 12:45 PM, <xmldatadumps-l-request(a)lists.wikimedia.org>
wrote:
>
> Send Xmldatadumps-l mailing list submissions to
> xmldatadumps-l(a)lists.wikimedia.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
> or, via email, send a message with subject or body 'help' to
> xmldatadumps-l-request(a)lists.wikimedia.org
>
> You can reach the person managing the list at
> xmldatadumps-l-owner(a)lists.wikimedia.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Xmldatadumps-l digest..."
>
>
> Today's Topics:
>
> 1. Collecting data on page revisions over time
> (SEAN CHRISTOPHER BUCHANAN)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 29 Aug 2017 15:39:21 +0000
> From: SEAN CHRISTOPHER BUCHANAN <Sean.Buchanan2(a)umanitoba.ca>
> To: "xmldatadumps-l(a)lists.wikimedia.org"
> <xmldatadumps-l(a)lists.wikimedia.org>
> Subject: [Xmldatadumps-l] Collecting data on page revisions over time
> Message-ID: <535ae4a943364fd59fec90200a1d38d6(a)umanitoba.ca>
> Content-Type: text/plain; charset="utf-8"
>
> Hello,
>
> My name is Sean Buchanan. I am a professor of Business Administration at
> the Asper School of Business at the University of Manitoba.
> My colleagues and I are trying to collect a certain type of data from
> Wikipedia and would like some advice on the most efficient and
> user-friendly way of collecting this data.
>
> We are looking to collect data on the differences between revisions over
> the lifetime of three Wikipedia pages (see attached screenshot).
> We haven't found a way to do that through the channels on the web page
> and were wondering if you have any ideas on how such data could be
> collected.
>
> We are interested in the revision history for the following pages:
>
> 1) Capitalism
> 2) Socialism
> 3) Communism
>
>
> Thank you for your help! I look forward to hearing from you.
> Sincerely,
> Sean Buchanan
>
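The per-revision metadata Sean asks about does not require the full dumps: the MediaWiki API exposes it through `action=query` with `prop=revisions`, and the diff between two specific revisions can then be requested with `action=compare`. A minimal sketch; the parameter values are standard API options, and the network fetch is left as a commented usage line:

```python
# Sketch: pull revision metadata for a page from the MediaWiki API
# (action=query with prop=revisions). Diffs between two specific
# revision IDs can then be requested with action=compare.
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"


def revisions_query(title: str, limit: int = 50) -> str:
    """Build a query URL for a page's revision history (newest first)."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|comment|size",
        "rvlimit": str(limit),
        "format": "json",
    }
    return API + "?" + urllib.parse.urlencode(params)


def fetch_revisions(title: str, limit: int = 50) -> list:
    """Fetch revision metadata for one page title."""
    with urllib.request.urlopen(revisions_query(title, limit)) as resp:
        data = json.load(resp)
    # The result keys pages by page ID; take the single page returned.
    page = next(iter(data["query"]["pages"].values()))
    return page.get("revisions", [])


# Usage (needs network access):
# for rev in fetch_revisions("Capitalism", limit=5):
#     print(rev["timestamp"], rev["user"], rev["size"])
```

For full histories, the API pages through results with a continuation token; for bulk work across all three pages' lifetimes, the `pages-meta-history` dumps are the offline alternative.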