Hi all,
Just as a safety net I'd like to publish copies of the databases powering my MediaWiki installations, so that if I lose all my data and backups, I might still be able to get something back (as well as reassuring my users that their contributions won't be lost forever!)
I don't want to publish the raw database dump because it contains people's e-mail addresses and password hashes, but I'm thinking that if I take a copy of the database and erase the values in those fields, I might be able to export and publish the 'anonymised' copy. I'd rather not omit the user table entirely, because if I ever do need to restore the wiki from this copy I'll need it to associate the right account with each edit.
Does anyone know which fields contain sensitive data that I should remove? I can see these obvious candidates:
  user.user_password
  user.user_newpassword
  user.user_email
  user.user_token
  user.user_email_token
Are there any others that are potentially confidential and should not be made public?
Many thanks, Adam.
I can't think of anything else offhand. You might want to run a database search to make sure none of the values in those user fields also appears elsewhere in the database in cleartext, but otherwise you should be fine.
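As a rough illustration of the two steps discussed here (blanking the listed fields on a copy, then searching for the old values elsewhere), here is a minimal Python sketch. It is not a tested procedure for a real wiki: it uses an in-memory SQLite database with a cut-down `user` table so that it can run on its own, whereas a production MediaWiki usually runs on MySQL/MariaDB with a much larger schema. The `demo_notes` table, the sample row, and the helper names are hypothetical and exist only for the demonstration; the column names are the ones listed in the original post.

```python
import sqlite3

# Columns named in the original post (all in MediaWiki's `user` table).
SENSITIVE_COLUMNS = [
    "user_password",
    "user_newpassword",
    "user_email",
    "user_token",
    "user_email_token",
]


def build_demo_db():
    """Create a tiny stand-in database (hypothetical data, not a real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE "user" (user_id INTEGER PRIMARY KEY, user_name TEXT, '
        "user_password TEXT, user_newpassword TEXT, user_email TEXT, "
        "user_token TEXT, user_email_token TEXT)"
    )
    # Hypothetical extra table that happens to repeat an e-mail address.
    conn.execute("CREATE TABLE demo_notes (note_id INTEGER PRIMARY KEY, note_text TEXT)")
    conn.execute(
        'INSERT INTO "user" VALUES (1, ?, ?, ?, ?, ?, ?)',
        ("Alice", "hash-a", "", "alice@example.org", "token-a", "etoken-a"),
    )
    conn.execute(
        "INSERT INTO demo_notes VALUES (1, ?)",
        ("Contact alice@example.org about the upload",),
    )
    conn.commit()
    return conn


def collect_values(conn, column):
    """Return the non-empty values of one `user` column, read before blanking."""
    if column not in SENSITIVE_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    rows = conn.execute(f'SELECT {column} FROM "user"').fetchall()
    return [r[0] for r in rows if r[0]]


def blank_sensitive_columns(conn):
    """Overwrite the sensitive columns with '' while keeping every user row."""
    assignments = ", ".join(f"{col} = ''" for col in SENSITIVE_COLUMNS)
    conn.execute(f'UPDATE "user" SET {assignments}')
    conn.commit()


def find_leaks(conn, values):
    """Return (table, column, value) wherever a value still appears as a substring."""
    leaks = []
    tables = [
        r[0]
        for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    for table in tables:
        columns = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
        for col in columns:
            for value in values:
                hit = conn.execute(
                    f'SELECT 1 FROM "{table}" '
                    f'WHERE instr(CAST("{col}" AS TEXT), ?) > 0 LIMIT 1',
                    (value,),
                ).fetchone()
                if hit:
                    leaks.append((table, col, value))
    return leaks


if __name__ == "__main__":
    conn = build_demo_db()
    emails = collect_values(conn, "user_email")  # remember values before erasing
    blank_sensitive_columns(conn)
    for table, col, value in find_leaks(conn, emails):
        print(f"still present: {value!r} in {table}.{col}")
```

The user rows, IDs and names are kept so edits can still be matched to accounts after a restore; only the listed fields are emptied. Any hit reported by the search would need to be reviewed by hand before publishing the dump.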
To: mediawiki-l@lists.wikimedia.org
From: a.nielsen@shikadi.net
Date: Sat, 28 Sep 2013 22:51:38 +1000
Subject: [MediaWiki-l] How to anonymise MW database?
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
I suggest you ask Coren about this. He's in charge of the databases at WMF Labs.