Aryeh Gregor wrote:
On Wed, Jul 21, 2010 at 7:03 PM, Roan Kattouw
<roan.kattouw(a)gmail.com> wrote:
It doesn't make a great deal of sense and can be changed fairly easily
in Title::isValidMoveTarget().
On Thu, Jul 22, 2010 at 3:01 AM, Tim Starling <tstarling(a)wikimedia.org> wrote:
This restriction is enforced by Title::isValidMoveOperation().
Any objections to changing this so files can't be moved over non-files
or vice versa?
It could even make sense to move the text revisions to/from the file
namespace (with an appropriate warning). I once wanted to move a file
talk page to the file namespace, to archive (delete) them together. Or
you might want to move an overdescription to NS_MAIN.
...
Since we won't be sorting on the plain text form anymore, we could use
some tricks to save space. For instance, if the sort key is the same
as the article title, we could store NULL instead of another copy of
the article title. That should save 95% or so.
It doesn't seem like it would save nearly that much. On the Welsh
Wikipedia (small enough database to be manageable), I get the
following:
(...)
I filtered out the main namespace in the last two to avoid false
positives from namespace prefixes. This suggests savings of maybe
50-75%. The story may be different on larger wikis. It's worth
remembering, though, that a lot of these sortkeys might be set to work
around deficiencies in the current default sortkey generation, so
maybe it would be higher savings in the long term.
It's still not at all clear to me that saving a raw copy in the
database is worth it. If we really need sectioning by first letter on
category pages, we could save the first letter instead, and leave that
NULL when it's the same as the first letter of the page title (all of
this for some locale-specific definition of "first letter"). But I
don't know if we need that.
Note that even if you're not storing it, the database is likely to
reserve the space for it anyway.
It can be useful to discern explicit sortkeys when the rules for
language parsing change, though.
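
For reference, the NULL-when-redundant idea discussed above could be
sketched roughly as follows. This is only an illustration in Python, not
MediaWiki code: the function names and the (title, sortkey) row layout
are made up for the example, and None stands in for a database NULL.

```python
from typing import Iterable, Optional, Tuple


def stored_sortkey(title: str, sortkey: str) -> Optional[str]:
    """Return None (NULL) when the sort key just repeats the title."""
    return None if sortkey == title else sortkey


def effective_sortkey(title: str, stored: Optional[str]) -> str:
    """Recover the sort key on read: NULL means 'same as the title'."""
    return title if stored is None else stored


def fraction_null(rows: Iterable[Tuple[str, str]]) -> float:
    """Share of (title, sortkey) rows that would be stored as NULL."""
    rows = list(rows)
    if not rows:
        return 0.0
    nulls = sum(1 for title, key in rows if stored_sortkey(title, key) is None)
    return nulls / len(rows)


# Hypothetical sample rows, not real wiki data.
sample = [("Aber", "Aber"), ("Bangor", "bangor"), ("Conwy", "Conwy"), ("Dinbych", "Dinbych")]
print(fraction_null(sample))
```

A non-NULL value then also marks a sort key that was set explicitly,
which is the property mentioned above as useful when the rules change.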