On 19.09.2017 at 20:48, Gergo Tisza wrote:
On Tue, Sep 19, 2017 at 6:42 AM, Daniel Kinzler <daniel.kinzler@wikimedia.de> wrote:

Can't you just split it into a separate table? Core would only need to touch it on insert/update, so that should resolve the performance concerns.
Yes, we could put it into a separate table. But that table would be exactly as tall as the content table, and would be keyed to it. I see no advantage. But if DBAs prefer a separate table with a 1:1 relation to the content table, that's fine with me.
Note that the content table is indeed touched a lot less than the revision table.
Also, since content is supposed to be deduplicated (so two revisions with the exact same content will have the same content_address), couldn't that replace content_sha1 for revert detection purposes?
Only if we could detect and track "manual" reverts. And the only reliable way to do this right now is by looking at the sha1.
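To illustrate what I mean by looking at the sha1, here is a minimal sketch of hash-based detection of a "manual" revert. The function names and the in-memory list of hashes are hypothetical and only for illustration; this is not how core implements it, and the plain hex digest is a simplification of whatever encoding is actually stored.

```python
import hashlib


def content_sha1(text: str) -> str:
    """Hash revision text (hex digest; the stored encoding is assumed to differ)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def find_manual_revert(history_hashes, new_text):
    """Return the index of the most recent earlier revision with the same
    content as new_text, or None if there is no such revision.

    history_hashes holds the hashes of all existing revisions, oldest first.
    The last entry is the parent revision; it is skipped, because matching
    the parent means "nothing changed" rather than "revert".
    """
    new_hash = content_sha1(new_text)
    # Walk backwards, starting just before the parent revision.
    for index in range(len(history_hashes) - 2, -1, -1):
        if history_hashes[index] == new_hash:
            return index
    return None


# Hypothetical page history: a good revision followed by vandalism.
history = [content_sha1("good text"), content_sha1("good text + spam")]

# Someone manually restores the old text without using undo or rollback.
reverted_to = find_manual_revert(history, "good text")
```

Here `reverted_to` points at the first revision even though the edit was made by hand, which is exactly the case that cannot be tracked without comparing hashes.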