On Wed, Jan 13, 2016 at 12:47 PM, Legoktm legoktm.wikipedia@gmail.com wrote:
When that work completes, we'll have somewhere around half a million
links
which differ only in the URL scheme. What would be the best way to
rewrite
all of those URLs? I'd like to reduce the window during which users
transit
from HTTPS -> HTTP -> HTTPS.
You can use Pywikbot's replace.py[1], which lets you provide regex find/replace and can get a list of pages from the API equivalent of Special:LinkSearch.
Thanks – I gave this a test using our simplest site ( https://gist.github.com/acdha/77354c76bf503b6f455f) to produce a minor edit like this:
https://en.wikipedia.org/w/index.php?title=World_Digital_Library&diff=70...
I had a question about etiquette: is a one-time operation like this considered a bot for the purposes of needing to go through the approval process? I anticipate running this multiple times as each application is migrated but it would be a one-time process and since there will be permanent redirects there won't be a need for this to run automatically in the future since users won't be seeing http: URLs any more.
Chris