Mark Wagner wrote:
I'm working on a bot to deal with the flood of no-source and untagged images on the English Wikipedia. My current design calls for, once a day, downloading the upload log for the previous 24 hours, then checking each image description page and adding a template as appropriate.
Sounds useful. Are you using the Python Wikipedia Bot Framework? If so, we should add it to the repository as soon as your script is working.
About 2000 images are uploaded each day, and only around 15% need tagging. What's the best way of getting the wikitext of an article if there's an 85% chance that you won't be editing it?
My suggestion is that you first add a method called newimages() to the Site class. If you haven't already, you can copy Site.newpages() and modify it to look up new images instead.
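A rough, untested sketch of what that method might look like; the Special:Log parameters and the regular expression are guesses that you'd have to adjust to the actual HTML, so treat the real newpages() code as the better template:

    # wikipedia.py, inside class Site -- untested sketch modelled on newpages().
    # (wikipedia.py already imports re at the top of the module.)
    def newimages(self, number=100):
        """Yield ImagePage objects for the most recently uploaded files."""
        # Assumed URL parameters for the upload log; compare with what newpages() uses.
        path = self.path() + '?title=Special:Log&type=upload&limit=%d' % number
        html = self.getUrl(path)
        # Hypothetical pattern for the linked image description pages.
        entryR = re.compile(r'<a href="[^"]+" title="(Image:[^"]+)">')
        for match in entryR.finditer(html):
            yield ImagePage(self, match.group(1))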
Then you can add a generator called NewImagesGenerator() to pagegenerators.py. It will look a bit like AllpagesPageGenerator.
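That wrapper can be very thin, something like this (assuming the newimages() method sketched above):

    # pagegenerators.py -- thin wrapper, like the other generators in the file.
    # (pagegenerators.py already imports the wikipedia module.)
    def NewImagesGenerator(number=100, site=None):
        """Yield image description pages from the assumed Site.newimages()."""
        if site is None:
            site = wikipedia.getSite()
        for image in site.newimages(number=number):
            yield image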
Once you have such a generator, you can just wrap the existing PreloadingGenerator around it, and it will do all the work for you.
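The bot's main loop could then look roughly like this; the template names and the license check are only placeholders for whatever your tagging logic ends up being:

    import wikipedia, pagegenerators

    gen = pagegenerators.NewImagesGenerator(number=2000)
    for page in pagegenerators.PreloadingGenerator(gen):
        text = page.get()  # already fetched in bulk via Special:Export
        # Placeholder check; the real bot would look for source/license templates.
        if u'{{GFDL' not in text and u'{{cc-by' not in text:
            page.put(text + u'\n{{no license}}',
                     comment=u'Bot: tagging image with no license information')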
Is Special:Export faster than starting an edit, or is there some other method?
Using Special:Export is much faster because it allows you to load several pages at the same time, so you need far fewer requests to the server. That's especially true in this case, where you don't need edit tokens and the like for most of the pages.
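For what it's worth, PreloadingGenerator does this for you, but outside the framework a raw Special:Export request fetching several pages at once looks roughly like this (a minimal Python 3 illustration; field names taken from the standard export form):

    import urllib.parse, urllib.request

    titles = ["Image:Example1.jpg", "Image:Example2.png"]
    data = urllib.parse.urlencode({
        "pages": "\n".join(titles),  # one title per line
        "curonly": "1",              # only the latest revision of each page
        "action": "submit",
    }).encode("utf-8")
    req = urllib.request.Request("https://en.wikipedia.org/wiki/Special:Export", data)
    xml = urllib.request.urlopen(req).read()  # one XML dump containing all the pages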
If you need help, just send me what you've got and I can help you from there.
Daniel