On 10/25/07, Simetrical <Simetrical+wikilist(a)gmail.com> wrote:
> This would essentially be like regexes, but defined *without* the
> operation of iteration: only catenation and union are allowed. This
> is a large benefit because it means there are a finite number of
> possible patterns, and so they can be stored in enumerated form.
Yes, I'm undecided whether nesting (aka iteration) is a good idea or
not. Quite possibly it's a good idea to force people to explicitly
state all the variations they intend. If iteration/nesting is not
allowed, then multiple #ALIASES statements *should* be allowed, imho,
for readability.
> > All whitespace is equivalent to a single space. So "Boo [Foo]
> > [Moo] Woo" matches "Boo Woo", rather than
> > "Boo<space><space><space>Woo", for instance.
> Generally speaking I would like to see titles that differ only up to
> compression of whitespace to be considered identical. If this were
> the case, the searchable forms of all titles would be
> whitespace-normalized, and this point would be resolved
> automatically. Until then, I suggest that this aspect of it be
> brushed under the carpet for aliases as for anything.
I think that's what I was trying to say. :)
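To make the intended semantics concrete, here's a minimal sketch in
Python (the `[...]`-means-optional syntax is taken from the examples
above; the function name and everything else are my own assumptions,
not an existing implementation). It enumerates the finite variant set
of a pattern built from catenation and union only, and collapses
whitespace runs as discussed:

```python
import itertools
import re

def expand_alias(pattern):
    """Enumerate every variant of an alias pattern in which each
    [bracketed] segment is independently present or absent.  Only
    catenation and union (present/absent) are used -- no iteration --
    so the result set is always finite and can be stored enumerated."""
    # Split into optional "[...]" segments and literal text between them.
    parts = re.split(r'(\[[^\]]*\])', pattern)
    choices = []
    for part in parts:
        if part.startswith('[') and part.endswith(']'):
            choices.append([part[1:-1], ''])   # present or absent
        else:
            choices.append([part])             # literal, always present
    variants = set()
    for combo in itertools.product(*choices):
        # Collapse all whitespace runs to a single space, so that
        # "Boo [Foo] [Moo] Woo" yields "Boo Woo" and not
        # "Boo<space><space><space>Woo".
        variants.add(' '.join(''.join(combo).split()))
    return variants
```

For example, `expand_alias("Boo [Foo] [Moo] Woo")` yields the four
variants "Boo Woo", "Boo Foo Woo", "Boo Moo Woo", and
"Boo Foo Moo Woo".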
> > - Search term matches one real page, some aliases: takes you to
> >   the real page. (Arguably gives you a "did you mean...?" banner,
> >   but not critical.)
> > - Search term matches one alias, no real page: takes you to the page.
> > - Search term matches several aliases, no real page: either an
> >   automatically generated disambiguation page, or shows you search
> >   results with the matching aliases shown first.
> I see. Possibly this is better than having the aliases be unique, yes.
Yeah. Ultimately, it's helpful for the reader if they *can* search
for "J Smith". Obviously they don't expect it to be unique, but if
that's all they have to go on, it's better than nothing.
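Assuming the resolution rules listed above, the lookup might be
sketched roughly like this (Python; the data structures and function
name are hypothetical stand-ins, not an existing MediaWiki API):

```python
def resolve(term, real_pages, aliases):
    """Sketch of the search-resolution rules discussed above.
    real_pages: set of canonical titles.
    aliases: dict mapping an alias string to the set of titles it
    points at.  Returns an (action, payload) pair."""
    targets = aliases.get(term, set())
    if term in real_pages:
        # A real page always wins over aliases; arguably show a
        # "did you mean...?" banner too, but that's not critical.
        return ('show_page', term)
    if len(targets) == 1:
        # Exactly one alias, no real page: go straight to the target.
        return ('show_page', next(iter(targets)))
    if len(targets) > 1:
        # Several aliases, no real page: an auto-generated
        # disambiguation page, or search results with the matching
        # alias targets listed first.
        return ('disambiguate', sorted(targets))
    return ('search_results', term)   # nothing matched at all
```

Note that the first branch is what makes non-unique aliases workable:
a canonical title can never be shadowed by someone else's alias.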
> It can create exponential database rows in the length of the alias
> string, yes, so that needs to be dealt with -- if we're doing
> explicit storage, anyway. I think 20 is probably too low.
The right number is probably easy to come up with if someone can
decide how big the table can be. I just don't have a feel for whether
1 million, 10 million, or 100 million rows is "too many".
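For a sense of scale: with explicit storage, a pattern with n
independent optional segments expands to 2**n rows, so a per-pattern
cap falls directly out of whatever row budget is chosen. A tiny
illustration (my own framing of the trade-off, not a proposal):

```python
def max_optional_segments(row_budget):
    """Largest n such that one alias pattern with n independent
    optional segments (2**n expansions) fits in the row budget."""
    n = 0
    while 2 ** (n + 1) <= row_budget:
        n += 1
    return n

# 2**20 = 1,048,576: a single pattern with 20 optional segments
# already expands to about a million rows, which is why some cap
# is needed at all once storage is explicit.
```

So whether the cap should be 20 segments, or 20 total expansions, or
something else entirely really does come down to how many rows the
table is allowed to hold.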
> > * The role of redirects once this system is in place. One
> >   possible implementation would simply create and destroy
> >   redirects as required. In any case, they would still be needed
> >   for some licensing issues.
> Why?
Because when articles get merged, one is turned into a redirect with
the history of all the edits that were made. If we kill that
redirect, we lose that history, including attribution. Ergo,
non-compliance with the GFDL.
> aliases into account. (Actually, you seem to have caught on to this
> point in your last post, written after I wrote that.)
Heh, yeah. I don't do much DB programming these days.
> Of course, that wouldn't be quite enough. There would be all sorts
> of things expecting particular behavior of redirects, and so this
> would create a fair amount of backwards incompatibility, and
> generally confuse things. Ideally I would like to see a proposal
> that merges redirects and aliases altogether: do we want them to
> have a corresponding page entry or not? They shouldn't be treated
> as distinct.
That would be even better, but I wasn't that ambitious. Do you have
any ideas? Even better would be something that redefines the concept
of disambiguation, which is, again, a huge amount of manpower to set
up and maintain.
One problem that just occurred to me is what happens when one query
matches two aliases *and* a disambiguation page. Every possible
outcome looks bad:
- Just show the disambiguation page (with two missing entries).
- Show a list of aliased pages plus the disambiguation page (what, I
  have to choose whether I want a real page or a disambiguation page?).
- Attempt to jam the alias links somewhere into the disambiguation
  page (possibly duplicating actual links, or possibly requiring every
  disambiguation page to be updated with an <aliases> section).

Just like with the category/list dilemma, it doesn't seem possible to
create a fully dynamic disambiguation page that will be "as good as" a
hand-edited one. But long term, it would be a very valuable thing if
we could come close.
> What we're looking for is a way to easily create and maintain
> redirects, not some totally new feature, and despite my suggestions
> above and below, I think that's how the problem should be posed. A
> special page to easily manage all redirects to a page, including to
> batch-create and -delete* them, is probably the best way to handle
> this. Grouping on this redirects page by category would be a good
> feature to have, for instance, and category management from it as
> well. But to start with, reversible batch creation and deletion is
> all that's needed.
Are you thinking in terms of a special GUI, or a wikitext language
feature? Say you used the #ALIASES idea, but it constructed actual
pages with #REDIRECT text. Those pages could be marked with an
"automatically generated" flag, so they would be killed when the
corresponding #ALIASES text was modified.

Now, however, you have a different problem with ambiguous redirects:
the user adds an #ALIASES tag pointing at the current page, but the
redirect already exists and points somewhere else. What happens?
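One way to make that lifecycle concrete (a sketch only: #ALIASES
generating flagged #REDIRECT pages is the idea above, but the
function, data structures, and conflict handling here are all my own
assumptions). Conflicting titles are simply reported, not overwritten:

```python
def sync_redirects(page, wanted_aliases, redirects):
    """Reconcile auto-generated redirects with a page's #ALIASES list.
    redirects: dict mapping title -> (target, auto_generated flag),
    mutated in place.  Returns (created, deleted, conflicts);
    redirects that already point somewhere else are left untouched
    for a human to sort out."""
    created, deleted, conflicts = [], [], []
    # Kill auto-generated redirects no longer listed in #ALIASES.
    for title, (target, auto) in list(redirects.items()):
        if auto and target == page and title not in wanted_aliases:
            del redirects[title]
            deleted.append(title)
    for title in sorted(wanted_aliases):
        existing = redirects.get(title)
        if existing is None:
            redirects[title] = (page, True)   # flagged auto-generated
            created.append(title)
        elif existing[0] != page:
            conflicts.append(title)           # the ambiguous case above
    return created, deleted, conflicts
```

Hand-made redirects (flag False) and redirects pointing elsewhere both
survive a sync untouched, which at least makes the operation
reversible and keeps the ambiguous case visible rather than silently
resolved.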
> *(Unprivileged users should indeed ideally be allowed to delete
> redirects in general if they have no substantial content, as
> currently they can during moves. However, history and easy
> reversibility need to be built into this before it can be deployed,
> needless to say.)
Steve