I propose that a checksum of each revision's text be added to the
MediaWiki database and that this checksum be made available via the
database dumps and API calls.
This additional field would allow many computations, such as revert and
no-op detection, without having to ask the system to provide the full text
of revisions. For example, if I were to build a user script to show
users which revisions have been reverted, it would be beneficial to not
have to ask the API for the full text of a large list of revisions. On
that same note, even when I do need the full text of revisions, I could
skip requesting any revision whose checksum matches that of a revision
I have already retrieved, since its content would be exactly the same.
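
To make the idea concrete, here is a rough sketch in Python of how a
client might use such checksums. It assumes the checksums arrive as
(rev_id, checksum) pairs in chronological order; the function names and
that input shape are my own invention for illustration, not an existing
MediaWiki API.

```python
def classify(revisions):
    """Find no-ops and reverts from checksums alone.

    revisions: list of (rev_id, checksum) pairs, oldest first
    (a hypothetical input shape, not a real API response).
    Returns (noops, reverts), where reverts maps each reverting
    rev_id to the list of rev_ids it undid.
    """
    last_index = {}   # checksum -> index of its most recent occurrence
    noops = []
    reverts = {}
    prev = None
    for i, (rev_id, checksum) in enumerate(revisions):
        if checksum == prev:
            # Same content as the immediately preceding revision.
            noops.append(rev_id)
        elif checksum in last_index:
            # Content returned to an earlier state; everything in
            # between was reverted.
            start = last_index[checksum] + 1
            reverts[rev_id] = [r for r, _ in revisions[start:i]]
        last_index[checksum] = i
        prev = checksum
    return noops, reverts


def revisions_to_fetch(revisions):
    """Return only the rev_ids whose content has not been seen yet."""
    seen = set()
    fetch = []
    for rev_id, checksum in revisions:
        if checksum not in seen:
            seen.add(checksum)
            fetch.append(rev_id)
    return fetch
```

With a history whose checksums are a, b, a, a, c (revisions 1 to 5),
classify would report revision 4 as a no-op and revision 3 as a revert
of revision 2, and revisions_to_fetch would ask for only 1, 2 and 5.
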
Such a field should not require appreciably more storage or
computational power: computing an MD5 checksum in PHP is cheap, and
32 hex characters is negligible compared to the size of an article's
text.
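
As a quick back-of-the-envelope check of the storage claim, the sketch
below uses Python's hashlib as a stand-in for PHP's md5(). The helper
name and the 32,000-byte sample text are made up for illustration and
are not measurements of real articles.

```python
import hashlib


def checksum_overhead(text):
    """Return (digest length, digest length / text size in bytes).

    Hypothetical helper for illustration; expects non-empty text.
    """
    encoded = text.encode("utf-8")
    digest = hashlib.md5(encoded).hexdigest()  # always 32 hex characters
    return len(digest), len(digest) / len(encoded)


# An invented 32,000-byte revision: the checksum adds about 0.1%.
length, ratio = checksum_overhead("x" * 32000)
```
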
Thanks,
-Aaron Halfaker