Max Semenik wrote:
A month ago, the PageImages extension was black-deployed; it is intended to automatically associate images with articles.
I looked at https://www.mediawiki.org/wiki/Extension:PageImages and I'm still having difficulty understanding this extension's purpose. Is there a related bug or request for comment (RFC) for this?
select count(*), avg(page_len) from page where page_namespace=0 and page_is_redirect=0 and page_touched < '20121229000000';
+----------+---------------+
| count(*) | avg(page_len) |
+----------+---------------+
|   977568 |     3172.0948 |
+----------+---------------+
1 row in set (5 min 59.55 sec)
select count(*) from page where page_namespace=0 and page_is_redirect=1 and page_touched < '20120101000000';
+----------+
| count(*) |
+----------+
|       16 |
+----------+
1 row in set (26.61 sec)
I ran a script in December 2012 on the English Wikipedia that updated the page_touched date of every redirect in NS:0 (and a few other namespaces, I believe) where the page_touched date was not like '2012%'. I'd considered running the same script on non-redirects. It turns out that if you take the stored wikitext of pages and echo (post) it back at the wiki via the edit action a few million times, you can discover some interesting bugs.
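A minimal sketch of the approach described above, for illustration only; it is not the script that was actually run. It assumes the selection rule is "page_touched does not start with 2012" and that the wikitext is posted back unchanged through the API's edit action. The function names (needs_touch, null_edit_params) are hypothetical, and fetching the wikitext, obtaining a token, and sending the request are deliberately left out.

```python
from typing import Dict


def needs_touch(page_touched: str, year_prefix: str = "2012") -> bool:
    """Mirror the filter "page_touched not like '2012%'".

    page_touched is a MediaWiki-style timestamp string such as
    '20121229000000' (YYYYMMDDHHMMSS).
    """
    return not page_touched.startswith(year_prefix)


def null_edit_params(title: str, wikitext: str, token: str) -> Dict[str, str]:
    """Build POST parameters that echo a page's stored wikitext back.

    Only the request body is assembled here; nothing is sent.
    'nocreate' guards against creating a page that was deleted
    between reading the wikitext and posting it back.
    """
    return {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": wikitext,  # unchanged text, so no new revision is expected
        "nocreate": "1",
        "token": token,
    }


if __name__ == "__main__":
    # Hypothetical sample rows: (title, page_touched, stored wikitext).
    rows = [
        ("Foo", "20111105120000", "#REDIRECT [[Bar]]"),
        ("Baz", "20121201093000", "#REDIRECT [[Qux]]"),
    ]
    for title, touched, text in rows:
        if needs_touch(touched):
            print(null_edit_params(title, text, token="PLACEHOLDER"))
```

Keeping the filter and the request construction as separate pure functions makes it easy to dry-run the selection against a list of timestamps before anything is posted to the wiki.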
Thus, I would like to populate this data with a script[3]. To reduce the scare, let me remark that these pages have almost no templates and are significantly smaller than average (3172 bytes vs. 5673), so they should mostly be fast to parse.
I don't think there's any reason to be scared here.
MZMcBride