Inside of Article.php, there is a query that joins the user table
to the revisions table in the function getContributors. The
important bits are:
$fields = array(
"$userTable.*",
'rev_user_text AS user_name',
'MAX(rev_timestamp) AS timestamp',
);
...
'GROUP BY' => array( 'rev_user', 'rev_user_text' ),
...
$res = $dbr->select( $tables, $fields, $conds, __METHOD__, $options, $jconds );
return new UserArrayFromResult( $res );
This breaks for those databases that don't allow for a
non-deterministic GROUP BY (yes, sometimes you can
technically get away with leaving fields out of the GROUP BY,
but this is not one of those cases :). One solution that
has been raised to this general problem is to have MW
recognize 'group by' queries and add fields as needed to the
GROUP BY by parsing the SELECT list. Obviously this is
a non-starter in this case, as we are doing a user.* in the
SELECT clause. Which makes me wonder if we really need that
here in the first place. We definitely don't need all those fields
to display user contributions, but what's the minimum needed
to satisfy UserArrayFromResult? This tracks back to loadFromRow
in User.php. No doubt it is convenient having a complete user
object ready to use anywhere, but do we really want to encourage
pulling back user.* everywhere or is another approach better?
So I'm looking for comments on the best solution here, both
in the specific and general case. As I see it, the options are:
1) Manually add in all the columns from the user table to the
GROUP BY. Painful, and subject to immediate breakage when a
column is added or removed from the user table.
2) Teach MediaWiki how to parse queries and expand the user.*
into a list of columns to feed to the GROUP BY. Actually not
as hard as it sounds, but not very elegant at all.
3) Rewrite the SELECT above to only pull the columns we need
for that section of code. In this case, probably user_id,
user_name, user_real_name, and user_email? This would mean
that the user object returned would have unset fields for
things like $this->mToken. The best short-term solution, but
breaks the underlying promise that a user object is a complete
and populated user object.
4) Something else. Return a new object that's a subset of the
User table? Perhaps with the fields that are commonly used
for displays of other users, but without the you-as-the-user
specific fields such as the password, last touched, etc.?
Seems silly to be slinging around all those attributes when
we don't need to be.
--
Greg Sabino Mullane greg(a)turnstep.com
End Point Corporation
http://www.endpoint.com/
PGP Key: 0x14964AC8 201104271012
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8