Hi! I am starting this thread because Brion's revision r94289 reverted r94289 [0] stating "core schema change with no discussion" [1]. Bugs 21860 [2] and 25312 [3] advocate for the inclusion of a hash column (either md5 or sha1) in the revision table. The primary use case of this column will be to assist detecting reverts. I don't think that data integrity is the primary reason for adding this column. The huge advantage of having such a column is that it will not be longer necessary to analyze full dumps to detect reverts, instead you can look for reverts in the stub dump file by looking for the same hash within a single page. The fact that there is a theoretical chance of a collision is not very important IMHO, it would just mean that in very rare cases in our research we would flag an edit being reverted while it's not. The two bug reports contain quite long discussions and this feature has also been discussed internally quite extensively but oddly enough it hasn't happened yet on the mailinglist.
So let's have a discussion!
[0] http://www.mediawiki.org/wiki/Special:Code/MediaWiki/94289 [1] http://www.mediawiki.org/wiki/Special:Code/MediaWiki/94541 [2] https://bugzilla.wikimedia.org/show_bug.cgi?id=21860 [3] https://bugzilla.wikimedia.org/show_bug.cgi?id=25312
Best,
Diederik