On Sun, Sep 18, 2011 at 6:00 PM, Roan Kattouw roan.kattouw@gmail.com wrote:
On Sun, Sep 18, 2011 at 11:00 PM, Anthony wikimail@inbox.org wrote:
Now I don't know how important the CPU differences in calculating the two versions would be. If they're significant enough, then fine, use MD5, but make sure there are warnings all over the place about its use.
I ran some benchmarks on one of the WMF machines. The input I used is a 137.5 MB (144,220,582 bytes) OGV file that someone asked me to upload to Commons recently. For each benchmark, I hashed the file 25 times and computed the average running time.
MD5: 393 ms
SHA-1: 404 ms
SHA-256: 1281 ms
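For anyone who wants to repeat this kind of measurement, here is a minimal sketch of the procedure described above (hash the same input 25 times, take the mean). It is not the script that produced the numbers quoted here; the function name `benchmark_hashes` and the use of Python's `hashlib` are assumptions made purely for illustration, and absolute timings will differ by machine and implementation.

```python
import hashlib
import time


def benchmark_hashes(data, runs=25):
    """Return the mean wall-clock time in milliseconds to hash `data`.

    Hypothetical helper: hashes the same in-memory bytes `runs` times
    with each algorithm and averages the elapsed time.
    """
    results = {}
    for name in ("md5", "sha1", "sha256"):
        start = time.perf_counter()
        for _ in range(runs):
            hashlib.new(name, data).digest()
        elapsed = time.perf_counter() - start
        results[name] = elapsed / runs * 1000.0
    return results


if __name__ == "__main__":
    # Illustrative payload only; the benchmark in this thread used a
    # 144,220,582-byte OGV file read from disk.
    payload = b"\x00" * (16 * 1024 * 1024)
    for name, ms in benchmark_hashes(payload).items():
        print(f"{name}: {ms:.1f} ms")
```

The data is held in memory before timing starts, so the figures reflect hashing cost rather than disk reads.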
Did you try any of the non-secure hash functions? If you're going to go with MD5, might as well go with the significantly faster CRC-64.
If you're just using it to detect reverts, then you can run the CRC-64 check first, and then confirm with a check of the entire message.
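A rough sketch of that two-stage check, under some loud assumptions: Python's standard library has no CRC-64, so `zlib.crc32` stands in for the cheap checksum here, and `RevisionIndex`, `add`, and `find_identical` are hypothetical names, not anything in MediaWiki. The checksum only narrows the candidates; a byte-for-byte comparison makes the final call, so a checksum collision cannot produce a false revert.

```python
import zlib


class RevisionIndex:
    """Hypothetical index of earlier revisions, keyed by a cheap checksum."""

    def __init__(self, checksum=zlib.crc32):
        # zlib.crc32 is a stand-in for CRC-64 (assumption for this sketch).
        self._checksum = checksum
        self._by_checksum = {}  # checksum -> list of (rev_id, text)

    def add(self, rev_id, text):
        """Record a revision's bytes under its checksum."""
        key = self._checksum(text)
        self._by_checksum.setdefault(key, []).append((rev_id, text))

    def find_identical(self, text):
        """Return the id of an earlier revision with identical bytes, or None."""
        # Stage 1: cheap checksum lookup narrows the candidate set.
        candidates = self._by_checksum.get(self._checksum(text), [])
        # Stage 2: confirm by comparing the entire content.
        for rev_id, old_text in candidates:
            if old_text == text:
                return rev_id
        return None


if __name__ == "__main__":
    index = RevisionIndex()
    index.add(1, b"original text")
    index.add(2, b"vandalized text")
    print(index.find_identical(b"original text"))  # a revert to revision 1
```

Because the full comparison is the authority, the checksum does not need to be collision-resistant for this use; it only needs to be fast and to spread inputs reasonably well.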