[WikiEN-l] Tool announcement: Alternative names redirect
Steve Bennett
stevagewp at gmail.com
Mon Oct 22 05:42:29 UTC 2007
On 10/22/07, William Pietri <william at scissor.com> wrote:
>
> You know that these things only change on save, so at that point you
> look at the difference between the old aliases and the new and update
> the master set. Computationally, it's only a smidgen more expensive than
> our current approach. And given that we're such a read-heavy
> environment, unnoticeably so.
Yeah, I did some more thinking about this in bed and thought about a
possible implementation.
The wiki code would be a single line, like:
#ALIASES [City of ][Greater ]Melbourne[, Victoria| (Australia)]
Two tables would store all the aliases: one would store the raw patterns,
and another would expand them, possibly only partially. You could expand say
the first 5 characters:
"City ", "of [Greater ]Melbourne[, Victoria| (Australia)]"
"Great", "er Melbourne[, Victoria| (Australia)]"
"Melbo", "urne[, Victoria| (Australia)]"
That gives 3 entries. Once a user types an actual request (say, "Greater
Melbourne"), you just look up the first 5 characters ("Great"), then
iterate over the matches there, checking the rest of the request against
each stored remainder. There are lots of algorithms and data structures
that would help here.
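
To make the two-table idea concrete, here is a rough Python sketch of the
partial expansion and the lookup. It is only an illustration: the function
names are made up, a dict stands in for the second table, and it assumes
every bracket group is optional (zero or one of its alternatives), which is
what the Melbourne example implies.

```python
import re
from collections import defaultdict

PREFIX_LEN = 5  # the "first 5 characters" suggested above


def parse(pattern):
    """Split a pattern into tokens: str = literal text, list = bracket group."""
    tokens = []
    for group, literal in re.findall(r"\[([^\]]*)\]|([^\[]+)", pattern):
        tokens.append(literal if literal else group.split("|"))
    return tokens


def serialize(tokens):
    """Turn tokens back into pattern text (used for the stored remainder)."""
    return "".join(t if isinstance(t, str) else "[" + "|".join(t) + "]"
                   for t in tokens)


def choices(token):
    # Assumption: a bracket group may also be left out entirely.
    return [token] if isinstance(token, str) else token + [""]


def partial_expand(tokens, prefix=""):
    """Yield (prefix, remainder) pairs, expanding only the first PREFIX_LEN chars."""
    if len(prefix) >= PREFIX_LEN or not tokens:
        yield prefix[:PREFIX_LEN], prefix[PREFIX_LEN:] + serialize(tokens)
        return
    for text in choices(tokens[0]):
        yield from partial_expand(tokens[1:], prefix + text)


def expand(tokens):
    """Yield every full string a (remainder) pattern can produce."""
    if not tokens:
        yield ""
        return
    for text in choices(tokens[0]):
        for rest in expand(tokens[1:]):
            yield text + rest


def build_index(patterns):
    """patterns maps article title -> raw alias pattern (the first table)."""
    index = defaultdict(list)  # prefix -> [(remainder, article)], the second table
    for article, pattern in patterns.items():
        for prefix, remainder in partial_expand(parse(pattern)):
            if prefix:  # skip the empty alias
                index[prefix].append((remainder, article))
    return index


def lookup(index, query):
    """Return the articles whose aliases match the query exactly."""
    head, tail = query[:PREFIX_LEN], query[PREFIX_LEN:]
    return sorted({article for remainder, article in index.get(head, [])
                   if tail in expand(parse(remainder))})


index = build_index(
    {"Melbourne": "[City of ][Greater ]Melbourne[, Victoria| (Australia)]"})
print(sorted(index))
print(lookup(index, "Greater Melbourne"))
```

Nested brackets and escaping are not handled, and a real implementation
would probably use database tables rather than an in-memory dict.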
>> (And that's without addressing the issue of
>> duplicates. A redirect can only point to one article, a given string
>> can match the regexps in many articles. Automated disambig pages,
>> perhaps?)
>
>That'd be a great way to solve that. And the main bit could be done as
>automatically updating our redirect pages. As a first pass, anyhow.
Omg, automated disambig pages. Yes please! Maintaining disambiguation pages
is horribly time consuming. You could conceive of another keyword like
"{{disambigtext|Second largest city in Australia.}}" that would be shown
where necessary. But I'm getting ahead of myself.
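
For what it's worth, the automated disambig idea could be sketched roughly
like this in Python. Everything here is hypothetical: the function names,
the sample aliases, and the second blurb are made up for illustration, and
the aliases are assumed to be already expanded into plain strings.

```python
from collections import defaultdict


def find_collisions(aliases_by_article):
    """Return {alias: [articles]} for aliases claimed by more than one article."""
    targets = defaultdict(set)
    for article, aliases in aliases_by_article.items():
        for alias in aliases:
            targets[alias].add(article)
    return {alias: sorted(articles)
            for alias, articles in targets.items() if len(articles) > 1}


def render_disambig(alias, articles, blurbs):
    """Build wikitext for a generated disambiguation page.

    blurbs maps article -> its {{disambigtext|...}} text, if it has one.
    """
    lines = [f"'''{alias}''' may refer to:"]
    for article in articles:
        blurb = blurbs.get(article)
        lines.append(f"* [[{article}]]" + (f": {blurb}" if blurb else ""))
    return "\n".join(lines)


# Made-up sample data: two articles that both claim "Melbourne (city)".
aliases_by_article = {
    "Melbourne": {"Greater Melbourne", "Melbourne (city)"},
    "Melbourne, Florida": {"Melbourne (city)", "Melbourne FL"},
}
blurbs = {
    "Melbourne": "Second largest city in Australia.",
    "Melbourne, Florida": "City in Florida.",
}

for alias, articles in find_collisions(aliases_by_article).items():
    print(render_disambig(alias, articles, blurbs))
```

Aliases with a single target would stay ordinary redirects; only the
collisions would get a generated page.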
>From a user experience perspective, I'd be a little worried about
>putting more mysterious Wiki markup at the top of a page. On another
>wiki I'm working on, we're moving more of this metadata outside the
>markup and to specialized UIs, so that it doesn't clutter the edit box.
So put it at the bottom, next to {{DEFAULTSORT}}. I do agree that
location-independent metadata should
be separated from content though. Categories and interwikis fall into
that category too.
>I think the only real abuse potential comes from either putting in a
>giant list or trying to redirect in a bunch of existing articles. But
>one you can catch with a size limit, and the other you could fix by
>refusing to mess with real articles.
The most likely abuse is a redirect that expands to too many possibilities,
like [A|b|c|d|e][A|b|c|d|e][A|b|c|d|e][A|b|c|d|e]. But that would be easily
catchable. The trouble is what to do about it, besides failing silently.
Perhaps reject it into a special page that admins can browse from time to
time?
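
Catching it doesn't require generating anything: the number of expansions
is just a product over the bracket groups. A small Python sketch, where the
limit of 100, the function names, and the review queue are all made up, and
each group is assumed to be optional (so it contributes its alternatives
plus one):

```python
import re
from math import prod

MAX_ALIASES = 100  # hypothetical limit, not a real MediaWiki setting


def count_expansions(pattern):
    """Upper bound on the aliases a pattern generates, without expanding it."""
    groups = re.findall(r"\[([^\]]*)\]", pattern)
    # Each group offers its alternatives plus the option of being left out.
    return prod(len(group.split("|")) + 1 for group in groups)


def accept_or_queue(article, pattern, review_queue):
    """Accept the pattern, or park it on a list for admins to look at."""
    count = count_expansions(pattern)
    if count > MAX_ALIASES:
        review_queue.append((article, pattern, count))
        return False
    return True


queue = []
print(accept_or_queue("Example", "[A|b|c|d|e]" * 4, queue))
print(queue)
```

The count is an upper bound, since different choices can produce the same
string.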
>So Steve, I'd say it's a great idea. However, I'd want to do some user
>testing. Since I've been doing regular expressions for so long, they
>make instant sense to me, but even this limited version might be too
>mysterious for most of our editors. Perhaps the special UI would show
>them the list of generated alternatives as they edit?
Well the great thing with such a limited expression language is that there's
very little to learn, and very little to stuff up. And even better, users
can just use the most naive approach imaginable. So, while a CS major would
readily write an expression like:
#ALIASES [Dr ]Grace [Smith|Jones]
A beginner user might simply write:
#ALIASES [Dr Grace Smith|Dr Grace Jones|Grace Smith|Grace Jones]
or even:
#ALIASES Dr Grace Smith
#ALIASES Dr Grace Jones
#ALIASES Grace Smith
#ALIASES Grace Jones
A UI tool would obviously help, but that would be a slight departure for
MediaWiki. There's nothing else like that atm (afaik), so it's hard to
picture how it would fit in exactly.
Steve