Hi everybody,
We are currently creating a new "Research:" namespace on Meta:
https://bugzilla.wikimedia.org/show_bug.cgi?id=28742
As soon as the request is processed, we will be able to use this dedicated namespace to host all RCom-related activities and documentation. This will allow us to filter and search pages more easily, create shortcuts, and set up special properties for all pages in the namespace.
You may want to wait until the change is implemented; otherwise, we will move any existing research pages to the new namespace as soon as it goes live.
Dario
On May 1, 2011, at 11:22 PM, Yaroslav M. Blanter wrote:
Well, we obviously need to set up a page on Meta linked from the RCom page. If this has not been done yet, I can do it today.
Cheers Yaroslav
On Tue, 19 Apr 2011 12:25:55 -0700, "Fuster, Mayo" Mayo.Fuster@EUI.eu wrote:
Hi! This e-mail was sent to Research_l. Shall we suggest collecting and systematizing these resources somewhere on Meta and linking to them from the RCom page? I don't know exactly where to suggest, though.
Cheers! Mayo
________________________________________
From: wiki-research-l-bounces@lists.wikimedia.org [wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of mohamad mehdi [mohamad_mehdi@hotmail.com]
Sent: 18 April 2011 15:19
To: wiki-research-l@lists.wikimedia.org
Subject: [Wiki-research-l] Wikipedia Literature Review - Tools and Data Sets
Hi everyone,
This is a follow up on a previous thread (Wikipedia data sets) related to the Wikipedia literature review (Chitu Okoli). As I mentioned in my previous email, part of our study is to identify the data collection methods and data sets used for Wikipedia studies. Therefore, we searched for online tools used to extract Wikipedia articles and for pre-compiled Wikipedia articles data sets; we were able to identify the following list.
Please let us know of any other sources you know about. Also, we would like to know if there is any existing Wikipedia page that includes such a list so we can add to it. Otherwise, where do you suggest adding this list so it is noticeable and useful for the community?
http://download.wikimedia.org/ /* official Wikipedia database dumps */
http://datamob.org/datasets/tag/wikipedia /* Multiple data sets (English Wikipedia articles that have been transformed into XML) */
http://wiki.dbpedia.org/Datasets /* Structured information from Wikipedia */
http://labs.systemone.at/wikipedia3 /* Wikipedia³ is a conversion of the English Wikipedia into RDF. It's a monthly updated dataset containing around 47 million triples. */
http://www.scribd.com/doc/9582/integrating-wikipediawordnet /* article talking about integrating WordNet and Wikipedia with YAGO */
http://www.infochimps.com/datasets/taxobox-wikipedia-infoboxes-with-taxonomi...
http://www.infochimps.com/link_frame?dataset=11043 /* Wikipedia Datasets for the Hadoop Hack | Cloudera */
http://www.infochimps.com/link_frame?dataset=11166 /* Wikipedia: Lists of common misspellings/For machines */
http://www.infochimps.com/link_frame?dataset=11028 /* Building a (fast) Wikipedia offline reader */
http://www.infochimps.com/link_frame?dataset=11004 /* Using the Wikipedia page-to-page link database */
http://www.infochimps.com/link_frame?dataset=11285 /* List of films */
http://www.infochimps.com/link_frame?dataset=11598 /* MusicBrainz Database */
http://dammit.lt/wikistats/ /* Wikitech-l page counters */
http://snap.stanford.edu/data/wiki-meta.html /* Complete Wikipedia edit history (up to January 2008) */
http://aws.amazon.com/datasets/2596?_encoding=UTF8&jiveRedirect=1 /* Wikipedia Page Traffic Statistics */
http://aws.amazon.com/datasets/2506 /* Wikipedia XML Data */
http://www-958.ibm.com/software/data/cognos/manyeyes/datasets?q=Wikipedia+ /* list of Wikipedia data sets */
Examples:
http://www-958.ibm.com/software/data/cognos/manyeyes/datasets/top-1000-acces... /* Top 1000 Accessed Wikipedia Articles */
http://www-958.ibm.com/software/data/cognos/manyeyes/datasets/wikipedia-hits... /* Wikipedia Hits */
Tools to extract data from Wikipedia:
http://www.evanjones.ca/software/wikipedia2text.html /* Extracting Text from Wikipedia */
http://www.infochimps.com/link_frame?dataset=11121 /* Wikipedia article traffic statistics */
http://blog.afterthedeadline.com/2009/12/04/generating-a-plain-text-corpus-f... /* Generating a Plain Text Corpus from Wikipedia */
http://www.infochimps.com/datasets/wikipedia-articles-title-autocomplete
Thank you, Mohamad Mehdi
RCom-l mailing list
RCom-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/rcom-l