On Tue, Sep 19, 2017 at 6:42 AM, Daniel Kinzler <daniel.kinzler(a)wikimedia.de> wrote:
> That table will be tall, and the sha1 is the (on average) largest field. If we
> are going to use a different mechanism for tracking reverts soon, my hope was
> that we can do without it.

Can't you just split it into a separate table? Core would only need to
touch it on insert/update, so that should resolve the performance concerns.
Also, since content is supposed to be deduplicated (so two revisions with
the exact same content will have the same content_address), couldn't that
replace content_sha1 for revert detection purposes? That wouldn't work over
long periods of time (when the original revision and the revert live in
different kinds of stores), but maybe that's an acceptable compromise.
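
To make the idea concrete, here is a rough sketch of revert detection by content_address instead of by hash. It is only an illustration under assumptions: the Revision class, the find_reverted_to helper, and the "old-store:" / "new-store:" address strings are all made up for this example and are not taken from core.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Revision:
    rev_id: int
    # Hypothetical address format "<store>:<key>"; assumed to be identical
    # for identical content as long as both copies live in the same store.
    content_address: str


def find_reverted_to(history: Sequence[Revision], new: Revision) -> Optional[int]:
    """Return the id of the most recent earlier revision whose
    content_address equals that of `new`, or None if there is none."""
    for rev in reversed(history):
        if rev.content_address == new.content_address:
            return rev.rev_id
    return None


history = [
    Revision(1, "old-store:100"),
    Revision(2, "old-store:101"),
    Revision(3, "old-store:102"),
]

# A revert written to the same store reuses the deduplicated address.
same_store = find_reverted_to(history, Revision(4, "old-store:101"))

# The same content written to a different kind of store gets a different
# address, so the revert goes undetected (the compromise mentioned above).
other_store = find_reverted_to(history, Revision(4, "new-store:9001"))
```

Here same_store is 2 and other_store is None, which is exactly the limitation described above: detection only works while the original and the revert share a store.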