Create a script that makes a request to Special:Export, using this
category as the feed.
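
A minimal sketch of what such a script could look like, assuming the English Wikipedia endpoints and using only the Python standard library. The category name is left as a parameter, and the User-Agent string and function names are made up for illustration. It lists the category through the API (list=categorymembers) and then POSTs the titles to Special:Export:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumptions: English Wikipedia endpoints and a placeholder User-Agent.
API_URL = "https://en.wikipedia.org/w/api.php"
EXPORT_URL = "https://en.wikipedia.org/wiki/Special:Export"
USER_AGENT = "deleted-article-archiver/0.1 (hypothetical example script)"


def category_members_url(category, limit=50):
    """Build an API URL listing the pages in `category`."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": str(limit),
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def titles_from_response(payload):
    """Extract page titles from a decoded categorymembers response."""
    return [member["title"] for member in payload["query"]["categorymembers"]]


def build_export_request(titles, current_only=True):
    """Build a POST request asking Special:Export for the given titles."""
    form = {"pages": "\n".join(titles), "action": "submit"}
    if current_only:
        form["curonly"] = "1"  # latest revision only, not the full history
    return Request(
        EXPORT_URL,
        data=urlencode(form).encode("utf-8"),
        headers={"User-Agent": USER_AGENT},
    )


def export_category(category):
    """Fetch the XML export for the pages currently in `category`."""
    listing = Request(category_members_url(category),
                      headers={"User-Agent": USER_AGENT})
    with urlopen(listing) as resp:
        titles = titles_from_response(json.load(resp))
    with urlopen(build_export_request(titles)) as resp:
        return resp.read().decode("utf-8")
```

Calling export_category() with the category's full title would return the export XML as a string. The sketch only reads the first batch of category members (no continuation handling), and it can only capture pages while they are still tagged, so it would have to be run regularly.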
Well, I would be happy with items like this:
http://en.wikipedia.org/wiki/Template:Db-a7
Would it be possible to extract them easily?
mike
On Thu, May 17, 2012 at 2:23 PM, Ariel T. Glenn <ariel(a)wikimedia.org>
wrote:
There are a few other reasons articles get deleted: copyright issues,
personal identifying data, etc. This makes maintaining the sort of
mirror you propose problematic, although a similar mirror is here:
http://deletionpedia.dbatley.com/w/index.php?title=Main_Page
The dumps contain only data publicly available at the time of the run,
without deleted data.
The articles aren't permanently deleted, of course. The revision texts
live on in the database, so a query on toolserver, for example, could be
used to get at them, but that would need to be for research purposes.
Ariel
On Thu, 17-05-2012, at 13:30 +0200, Mike Dupont wrote:
> Hi,
> I am thinking about how to collect articles deleted based on the "not
> notable" criteria.
> Is there any way we can extract them from the MySQL binlogs? How are
> these mirrors working? I would be interested in setting up a mirror of
> deleted data, at least that which is not spam/vandalism based on tags.
> mike
>
> On Thu, May 17, 2012 at 1:09 PM, Ariel T. Glenn <ariel(a)wikimedia.org>
> wrote:
> > We now have three mirror sites, yay! The full list is linked to from
> > http://dumps.wikimedia.org/ and is also available at
> > http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Curren…
> >
> > Summarizing, we have:
> >
> > C3L (Brazil) with the last 5 known good dumps,
> > Masaryk University (Czech Republic) with the last 5 known good dumps,
> > Your.org (USA) with the complete archive of dumps, and
> >
> > for the latest version of uploaded media, Your.org with http/ftp/rsync
> > access.
> >
> > Thanks to Carlos, Kevin and Yenya respectively at the above sites for
> > volunteering space, time and effort to make this happen.
> >
> > As people noticed earlier, a series of media tarballs per-project
> > (excluding commons) is being generated. As soon as the first run of
> > these is complete we'll announce its location and start generating
> > them on a semi-regular basis.
> >
> > As we've been getting the bugs out of the mirroring setup, it is
> > getting easier to add new locations. Know anyone interested? Please
> > let us know; we would love to have them.
> >
> > Ariel
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
--
James Michael DuPont
Member of Free Libre Open Source Software Kosova
http://flossk.org
Contributor FOSM, the CC-BY-SA map of the world
http://fosm.org
Mozilla Rep
https://reps.mozilla.org/u/h4ck3rm1k3
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects: AVBOT <http://code.google.com/p/avbot/> |
StatMediaWiki <http://statmediawiki.forja.rediris.es> |
WikiEvidens <http://code.google.com/p/wikievidens/> |
WikiPapers <http://wikipapers.referata.com> |
WikiTeam <http://code.google.com/p/wikiteam/>
Personal website: