[MediaWiki-l] Need advice fixing corrupted revision history

John phoenixoverride at gmail.com
Mon Feb 29 20:33:13 UTC 2016

Are both the usernames and the user IDs in the revision history corrupt?

On Mon, Feb 29, 2016 at 3:24 PM, Steve Rainwater <srainwater at ncc.com> wrote:

> This is a somewhat complex problem, so bear with me; I'll try to
> describe it as concisely as possible.
>
> Background:
>
> I'm in the process of updating a large (20k pages), old (started ca.
> 2004) MediaWiki site. It was running MediaWiki 1.18.3, and I thought a
> good first step would be to update the site to the 1.19.x LTS version,
> which I did (with the plan of moving to 1.23.x in March). Extensions
> were updated as well. Everything seemed stable, and pre-upgrade backups
> were not kept after a month of no problem reports (big mistake, I
> know!). There are nightly backups, but only a week's worth are kept for
> storage reasons. However, I have located some very old (3+ years)
> database backups.
>
> The site admins use a MW extension called "merge-and-delete" to deal
> with spammers. There is a permanent, blocked user called "spammer", and
> any time a new user account is created by a spammer, the editors
> merge-and-delete that account into the "spammer" account. There are
> fewer than 100 real editors on the site, and this process kept them
> from being overwhelmed by the thousands of spammers creating accounts
> on the site in recent years.
>
> The Problem
>
> About a month after the 1.19.x update, a problem was discovered.
> Somehow, all edits dated 2011 or earlier were altered such that they
> are now credited to the "spammer" user rather than the actual user who
> made the edits. The cause of the corruption appears to be a bug or
> problem related to the merge-and-delete extension and MW 1.19. I'm not
> seeking help for that here; I have turned off and removed the extension.
>
> My problem is how to restore the edit history so that edits are
> credited to the correct users again. The timestamps and page names of
> all edits are still correct; only the user name was corrupted. But the
> only backup old enough to have uncorrupted edit history data is years
> old and from a much older version of MW (maybe 1.16). And I need to fix
> the problem without losing years of edits.
>
> The Solution?
>
> I can't see any easy fix for this. But I have thought of an approach
> that might work. If I write a bot that reads the very old database,
> extracts only the edit history from 2011 or earlier, and then compares
> that data to the corrupted live site, perhaps it could work its way
> through the edit history making corrections. Does this seem plausible?
> And, if so, any advice on what to look out for? If anyone has
> alternative suggestions, I'm up for entertaining just about any idea on
> how to fix this.
>
> -Steve
> _______________________________________________
> MediaWiki-l mailing list
> To unsubscribe, go to:
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
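
The bot approach described in the quoted message could be sketched
roughly as below. This is only an illustration under several
assumptions, not a tested fix: it works on rows already exported from
both databases (no database access is shown), it matches revisions on
namespace, page title, and timestamp because those are the fields
reported to be intact, and it assumes anonymous edits carry user ID 0.
The function name, the dictionary keys, and the cutoff constant are all
made up for this example.

```python
from collections import defaultdict

# MediaWiki stores timestamps as 14-character YYYYMMDDHHMMSS strings,
# so plain string comparison orders them correctly.
CUTOFF = "20120101000000"  # everything before this is "2011 or earlier"


def plan_corrections(old_rows, live_rows, live_users,
                     spammer="spammer", cutoff=CUTOFF):
    """Work out which live revisions to re-credit, without writing anything.

    old_rows:   dicts with namespace, title, timestamp, user_id, user_text
    live_rows:  dicts with rev_id, namespace, title, timestamp, user_text
    live_users: dict of user name -> user id on the live site
    Returns (fixes, unresolved); fixes holds (rev_id, user_id, user_name).
    """
    # Index the backup by the fields reported to be intact on the live site.
    old_index = defaultdict(set)
    for row in old_rows:
        if row["timestamp"] < cutoff:
            key = (row["namespace"], row["title"], row["timestamp"])
            old_index[key].add((row["user_id"], row["user_text"]))

    fixes, unresolved = [], []
    for row in live_rows:
        if row["user_text"] != spammer or row["timestamp"] >= cutoff:
            continue  # not credited to the spammer account, or too recent
        key = (row["namespace"], row["title"], row["timestamp"])
        authors = old_index.get(key, set())
        if len(authors) != 1:
            # Missing from the backup, or ambiguous: leave for a human.
            unresolved.append((row["rev_id"], "no unique match in backup"))
            continue
        old_id, name = next(iter(authors))
        if name == spammer:
            continue  # the backup agrees with the live site
        if old_id == 0:
            fixes.append((row["rev_id"], 0, name))  # anonymous edit (IP)
        elif name in live_users:
            # Use the live user id; ids may differ from the old backup.
            fixes.append((row["rev_id"], live_users[name], name))
        else:
            unresolved.append(
                (row["rev_id"], "user missing on live site: " + name))
    return fixes, unresolved
```

Keeping the planning step separate from any writes means the list of
proposed fixes can be reviewed, and tried against a copy of the live
database, before anything is changed. Revisions that cannot be matched
unambiguously (for example because a page was renamed after the backup
was taken) are reported rather than guessed at.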
