On Wed, Jan 13, 2016 at 12:47 PM, Legoktm legoktm.wikipedia@gmail.com wrote:
You can use Pywikibot's replace.py[1], which lets you provide a regex find/replace and can get a list of pages from the API equivalent of Special:LinkSearch.
Thanks - I'll look into that as we get various batches of URLs ready for testing.
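For anyone following along, here is a rough sketch of the kind of regex find/replace this involves. It is plain Python, not Pywikibot's own code, and the function name `upgrade_links` and the host `example.org` are placeholders I made up for illustration. The idea is to rewrite only links that point at the given host or one of its subdomains, and to leave lookalike hosts alone.

```python
import re


def upgrade_links(text, host):
    """Rewrite http:// links to `host` (or a subdomain of it) as https://.

    `host` is a hypothetical placeholder such as "example.org".
    """
    pattern = re.compile(
        r"http://"
        # optional subdomain labels, then the host itself
        r"((?:[\w-]+\.)*" + re.escape(host) + r")"
        # the host must end here, so example.org.evil.com is not touched
        r"(?=[/:?#\s\]|]|$)",
        re.IGNORECASE,
    )
    return pattern.sub(r"https://\1", text)


if __name__ == "__main__":
    sample = "[http://www.example.org/page Example] and http://example.org.evil.com/"
    print(upgrade_links(sample, "example.org"))
```

The lookahead after the host is the part worth testing on a small batch first, since a looser pattern would also rewrite unrelated domains that merely start with the same string.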
You should also consider setting up HSTS[2] so that even if users click on an HTTP link, they'll be sent to the HTTPS version of the site.
Yes, that's in the plan as soon as we finish remediating the older legacy content. I've been using lists from Wikipedia, a sampling of web access logs, etc. to feed a script[1] that finds cases where someone used an absolute URL in a <script> tag, etc. We have a couple of subdomains which should be ready for HSTS quickly, since they were only used for a single application.
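To illustrate the kind of check described above (this is not the script linked at [1], just a minimal sketch using Python's standard html.parser), the following scans a page's HTML for tags that load a resource over an absolute http:// URL. The class and function names, and the short list of tags, are my own assumptions chosen for the example.

```python
from html.parser import HTMLParser


class InsecureResourceFinder(HTMLParser):
    """Collect (tag, url) pairs for resources loaded over plain HTTP."""

    # Tag -> attribute that loads a resource; a small subset for illustration.
    URL_ATTRS = {"script": "src", "img": "src", "iframe": "src"}

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            # Only absolute http:// URLs are a problem once the page is HTTPS.
            if name == wanted and value and value.lower().startswith("http://"):
                self.hits.append((tag, value))


def find_insecure(html):
    """Return a list of (tag, url) for insecure absolute URLs in `html`."""
    finder = InsecureResourceFinder()
    finder.feed(html)
    finder.close()
    return finder.hits


if __name__ == "__main__":
    page = '<script src="http://cdn.example.org/a.js"></script><img src="/logo.png">'
    for tag, url in find_insecure(page):
        print(tag, url)
```

Feeding it the HTML of each URL from the Wikipedia lists or access logs would give a per-page report of what still needs fixing before HSTS is switched on.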
Chris