[Foundation-l] Commons Usurp issue

Simetrical Simetrical+wikilist at gmail.com
Wed Jun 4 22:39:25 UTC 2008


Thoughts:

1) If you're arguing about licenses, best to argue about CC-BY-SA 3.0,
since it looks like that's what we're going to be switching to pretty
soon.  The GFDL may not be relevant for very long.  (Although the
switch is not, AFAIK, set in stone yet.)

2) It's not very nice to change people's pseudonyms without their
permission.  On the other hand, in practice the pseudonym will
hopefully be changed to something very similar (is there really such
a difference between "Bob" and "Bob 2"?), and any conceivable way that
anyone could contact the person would be updated to use the new name.
Since the pseudonyms in question existed only on Wikimedia projects in
the first place, changing them over on Wikimedia projects means that
not much is lost, in practice.  In theory third parties might now use
a different pseudonym, which could prove confusing, but in practice
most third parties just link back to Wikimedia for the history anyway.

3) It seems likely to be illegal to change people's pseudonyms without
their permission, strictly speaking.  It should be fairly clear that
the legal risk to Wikimedia is negligible, but of course we like to
respect the law whether or not anyone's likely to sue.  In general,
however, Wikimedia projects tend to take a sloppy approach to this
kind of thing, not paying so much attention to the letter of the
licenses but rather the general principle.  (Look, for instance, at
the fact that the MediaWiki interface and default messages are all
GPL, but modified versions of same are purportedly GFDL, and modified
and unmodified versions are mixed together with GFDL content.)  Thus,
see (2).

4) Last I heard, the deployment plan is indeed that eventually all
conflicts are going to be forcibly resolved through some means,
presumably with no consideration for local project policy.

On Wed, Jun 4, 2008 at 1:19 AM, Ryan <wiki.ral315 at gmail.com> wrote:
> That seems really odd.  I don't doubt you, because given problems I've seen
> with buggy renames, it makes sense, but why in the world would we duplicate
> this information?  It seems to violate every principle of database design
> and normalization.

For anonymous users the rev_user_text field stores the IP address in
dotted-quad form, with user id 0.  As for registered users, well,
storing the name alongside the id saves a join against the user table.
I've always wondered if this is worth it, performance-wise.  On the
other hand, other than user renames (which are rare enough to be
negligible), there's not likely to be any consistency problem.  Of
course, user renames also change all the rev_user_text rows, along
with comparable rows in other tables, so any consistency problems that
arise would be transient.
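
To make the rename case concrete, here is a rough sketch in Python
with an in-memory SQLite database standing in for the real one.  The
schema is heavily simplified, and the wiki_user table name and the
rename_user() helper are made up for illustration; this is not the
actual rename code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wiki_user (              -- simplified stand-in for the user table
    user_id   INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL UNIQUE
);
CREATE TABLE revision (
    rev_id        INTEGER PRIMARY KEY,
    rev_user      INTEGER NOT NULL DEFAULT 0,  -- 0 means anonymous
    rev_user_text TEXT NOT NULL                -- user name, or IP if anonymous
);
INSERT INTO wiki_user VALUES (1, 'Bob');
INSERT INTO revision VALUES (1, 1, 'Bob'), (2, 0, '192.0.2.7'), (3, 1, 'Bob');
""")


def rename_user(conn, user_id, new_name):
    """Hypothetical rename: update the user row and every duplicated name."""
    if user_id == 0:
        raise ValueError("user id 0 is shared by all anonymous users")
    with conn:  # one transaction, so the two copies never disagree after commit
        conn.execute(
            "UPDATE wiki_user SET user_name = ? WHERE user_id = ?",
            (new_name, user_id),
        )
        conn.execute(
            "UPDATE revision SET rev_user_text = ? WHERE rev_user = ?",
            (new_name, user_id),
        )


rename_user(conn, 1, "Bob 2")

# Reading history needs no join, because the name is stored on each row.
names_after = [
    row[0]
    for row in conn.execute("SELECT rev_user_text FROM revision ORDER BY rev_id")
]
```

In a real deployment the second UPDATE can touch a lot of rows and
would presumably not run in a single transaction, which is where the
transient inconsistency comes from.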

The logging table doesn't have a username field, incidentally, only a
user id field -- but this causes problems because that gives us
nowhere to put an IP address for anonymous users.  Thus, anonymous
users can't perform any logged action (e.g., page moves) no matter how
you configure the permissions.  Which is pretty annoying.  I wonder if
a user id and a user IP field would be a better idea than user id and
user name (with the IP field being left blank for registered users, at
least on Wikimedia wikis).
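
A rough sketch of what I mean, again in Python with SQLite as a
stand-in.  The log_user_ip column and the log_action() helper are
hypothetical, they don't exist in the current schema, and the table is
cut down to the columns that matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE logging (
    log_id      INTEGER PRIMARY KEY,
    log_type    TEXT NOT NULL,
    log_user    INTEGER NOT NULL DEFAULT 0,  -- 0 means anonymous
    log_user_ip TEXT NOT NULL DEFAULT ''     -- hypothetical: IP for anonymous users
)
""")


def log_action(conn, log_type, user_id=0, ip=""):
    """Hypothetical helper: record a logged action for either kind of user."""
    if user_id == 0 and not ip:
        raise ValueError("an anonymous action needs an IP address")
    if user_id != 0:
        ip = ""  # left blank for registered users
    with conn:
        cur = conn.execute(
            "INSERT INTO logging (log_type, log_user, log_user_ip) VALUES (?, ?, ?)",
            (log_type, user_id, ip),
        )
    return cur.lastrowid


log_action(conn, "move", user_id=1, ip="203.0.113.9")  # IP is dropped
log_action(conn, "move", ip="192.0.2.7")               # anonymous page move

rows = conn.execute(
    "SELECT log_user, log_user_ip FROM logging ORDER BY log_id"
).fetchall()
```

Since no name is duplicated in this layout, a rename wouldn't have to
touch the logging table at all.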

Of course, one possible benefit of having a separate username field is
that you can index it.  I notice that we have separate indexes on
(rev_user, rev_timestamp) and (rev_user_text, rev_timestamp).  Maybe
we order by rev_user_text somewhere?  I haven't looked into it, but
the extra index seems more or less redundant at first glance.
(Although various queries rely on it and would have to be rewritten if
it were removed.)
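
One case where the second index plausibly earns its keep: anonymous
edits all share user id 0, so a lookup by id can't single out one IP.
The sketch below (Python, SQLite as a stand-in, simplified schema,
made-up helper names) shows the difference.  The query plan part is
SQLite's planner, not MySQL's, so take it as illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE revision (
    rev_id        INTEGER PRIMARY KEY,
    rev_user      INTEGER NOT NULL DEFAULT 0,
    rev_user_text TEXT NOT NULL,
    rev_timestamp TEXT NOT NULL
);
CREATE INDEX user_timestamp     ON revision (rev_user, rev_timestamp);
CREATE INDEX usertext_timestamp ON revision (rev_user_text, rev_timestamp);
INSERT INTO revision VALUES
    (1, 1, 'Bob',          '20080601000000'),
    (2, 0, '192.0.2.7',    '20080602000000'),
    (3, 0, '198.51.100.4', '20080603000000'),
    (4, 0, '192.0.2.7',    '20080604000000');
""")

BY_TEXT = (
    "SELECT rev_id FROM revision WHERE rev_user_text = ? "
    "ORDER BY rev_timestamp DESC"
)


def contribs_by_id(conn, user_id):
    """Newest-first revision ids for a user id (hypothetical helper)."""
    sql = "SELECT rev_id FROM revision WHERE rev_user = ? ORDER BY rev_timestamp DESC"
    return [row[0] for row in conn.execute(sql, (user_id,))]


def contribs_by_text(conn, name_or_ip):
    """Newest-first revision ids for a user name or IP (hypothetical helper)."""
    return [row[0] for row in conn.execute(BY_TEXT, (name_or_ip,))]


all_anonymous = contribs_by_id(conn, 0)         # every IP lumped together
one_ip = contribs_by_text(conn, "192.0.2.7")    # just this IP's edits

# The last column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = [
    row[-1]
    for row in conn.execute("EXPLAIN QUERY PLAN " + BY_TEXT, ("192.0.2.7",))
]
```

For registered users the two lookups return the same rows, which is
the sense in which the extra index looks redundant.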


