On Feb 5, 2008 2:29 PM, Roan Kattouw roan.kattouw@home.nl wrote:
does MySQL distinguish between NULL and 0, anyway?
Yes. Everything except Oracle does, as far as I know (and for Oracle maybe it's only empty strings that are NULL, I can't remember).
This would make the prev/next links less expensive to generate (only one DB query rather than 3), but would cause an extra UPDATE when adding revisions
Three very cheap selects versus one (since when do we need three selects for this anyway?) is not a big difference. Nor is an extra update in the extremely rare case of adding a new revision.
Also, filling these fields for existing revisions would be a huge operation (maybe this could be done on the fly?).
Probably not as big a deal as altering the table in the first place. :)
On Feb 5, 2008 2:51 PM, Brion Vibber brion@wikimedia.org wrote:
Everything gets horrifyingly ugly when you try to do anything other than straight timelines, though.
That's exactly what I always said to David about LQT. He's sticking to trees, and look where that's gotten. :P
On Feb 5, 2008 4:45 PM, Magnus Manske magnusmanske@googlemail.com wrote:
If timestamps are not unique, and revisions might not be ordered, wouldn't ORDER BY revision GROUP BY timestamp order it by timestamp, and then by revision number if two or more identical timestamps are found?
GROUP BY x ORDER BY y is a filesort in MySQL, always. No index use is possible for the sorting, in any version to date. You could, however, do ORDER BY rev_timestamp, rev_id. That would work with an appropriate index, one beginning with the columns needed for retrieval and ending with (rev_timestamp, rev_id), *if* the retrieval condition were appropriate -- no range scan on anything but rev_timestamp, and no range scan on rev_timestamp other than comparison, like no IN( ... ). Something of the pattern SELECT ... WHERE rev_page = X AND rev_timestamp >= Y ORDER BY rev_timestamp, rev_id would work as desired with an index on (rev_page, rev_timestamp, rev_id).
Note that I'm not sure if we want rev_timestamp > Y or rev_timestamp
= Y. Either one is, I think, actually wrong, in a case where you
have three values of (rev_id, rev_timestamp) like (1, 20080205000000), (2, 20080205000000), (3, 20080205000000), and are trying to find the revision before or after revision 2. I guess you would need WHERE rev_page = X AND (rev_timestamp > Y OR (rev_timestamp = Y AND rev_id > Z)). That would have to be rewritten as a UNION to permit index use before 5.0ish (and in 5.0ish still, unless you feel lucky), and I *think* the UNION would be perfectly efficient using the same index, but this is getting kind of hairy. :)
On Feb 5, 2008 5:26 PM, Thomas Dalton thomas.dalton@gmail.com wrote:
Can you not create a new index for it?
Every extra index means slower inserts and slower deletes. Unneeded indexes should be avoided. In this case, I wonder if it's not best to just do the logic in PHP in the extremely rare case where two revisions have the same timestamp, rather than adding what amounts to a penalty on *every single query on the revision table* of a particular form.