G'day All,
I'm temporarily putting a link-suggesting tool out there for you all to have a play with,
and to give feedback and comments on.
It takes an article of your choosing from the English Wikipedia and suggests bits of text
in that article that could potentially be linked. You can then accept or reject the
individual suggestions, and save your changes back to the Wikipedia.
It tries to do this in a reasonably pleasant UI: you see the list of suggestions, select
"yes", "no", or "don't know" for each one, and then click "Preview with Added Links".
Quick overview of the UI:
* On the landing page, you type the name of the article that you want to see links for. It
should appear in the list as you type (this bit uses suggestion searching).
* Press enter or click the relevant link, then wait up to 10 seconds while it fetches the
current version and generates suggestions. You'll then be presented with a list of
possible links that you can make.
* To go through the list, you can either use the mouse, or you can use the keyboard.
* For the keyboard, the keys are: Up arrow, Down arrow, "y" for yes,
"n" for no, and "s" or "d" for skip/don't know.
* "Yes" adds the link; "no" and "don't know" both leave the text unlinked. The difference
is that after "don't know" the exact same suggestion will be made again in future, whereas
"yes" and "no" bring closure, in that the same suggestion will no longer be made for that
page.
* If you don't make any choice for a suggestion, that's treated the same as
choosing "don't know".
* Each suggestion has a link that opens in a new tab/window, so if you want to check
whether something is an appropriate link target, you can just click it.
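The choice rules above can be sketched roughly as follows. This is only an illustration of the behaviour described in this email, not the tool's actual code; the names `Choice`, `KEY_TO_CHOICE` and `resolve` are made up for the example.

```python
from enum import Enum


class Choice(Enum):
    YES = "yes"
    NO = "no"
    DONT_KNOW = "don't know"


# Keyboard shortcuts as listed above; "s" and "d" both mean skip/don't know.
KEY_TO_CHOICE = {
    "y": Choice.YES,
    "n": Choice.NO,
    "s": Choice.DONT_KNOW,
    "d": Choice.DONT_KNOW,
}


def resolve(choice):
    """Return (add_link, suggest_again) for one suggestion.

    Passing None means the user made no choice, which is treated
    the same as "don't know".
    """
    if choice is None:
        choice = Choice.DONT_KNOW
    add_link = choice is Choice.YES             # only "yes" adds the link
    suggest_again = choice is Choice.DONT_KNOW  # "yes"/"no" bring closure
    return add_link, suggest_again
```

For example, `resolve(KEY_TO_CHOICE["n"])` gives `(False, False)`: no link is added, and the suggestion is not made again for that page.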
If you want to play with it now, it's at:
http://can-we-link-it.nickj.org/
Some caveats to be aware of:
* Currently only works for the English Wikipedia. Although I haven't tried it yet,
conceptually similar languages like French should probably work (i.e. left-to-right,
spaces between words to separate out ideas [no or quite limited compound words], generally
the same characters used in both article text and article names for the same idea, etc.).
No idea if this can be made to work for languages which differ substantially from this.
* This site will disappear in a few days. It's just a temporary experiment to see what
happens, and is currently running on a
development box which has other duties to perform.
* Super-alpha status. It may blow up, eat your homework, key your vehicle, trash your
favourite article, etc.
* The tool will work much better if you have JavaScript turned on, and the front page
won't work at all if you have JavaScript
turned off.
* It's SLOOOW (e.g. it might take 7 seconds to generate suggestions for a 32 KB page). It
doesn't inherently have to be slow, but it is at the moment: partly because it's behind a
DSL link, but mostly because the code isn't very efficient yet. I'd rather put out an early
version with some rough edges and slowness than wait until I have something perfect (which
I may never get around to doing).
* Currently the suggested links include links to disambiguation pages. It shouldn't really
do this (i.e. disambig pages should ideally be excluded from the results).
* Saving suggestions back to the Wikipedia is a less than optimal process. Currently it
goes to an intermediate page, which saves the user's choices to a local database, and then
uses a JavaScript form submission to transfer the user to a preview on the Wikipedia with
the links added; ideally this intermediate step could be skipped. It would also be nicer to
go to "Show Changes" rather than a preview, but that's not currently possible because Show
Changes is protected by an edit token. So you'll have to manually click the "Show Changes"
button if you want to see a highlight of what's been changed (a request that this be
changed has been logged as bug #7369).
* Someone reported that the "null edit summary" detector complained at them when using
this. Not sure why this happens, as a default edit summary is supplied.
Other things:
* It has "learning from its mistakes" functionality, in that a suggestion which is
regularly rejected will no longer be suggested. The current cut-offs are that a suggestion
must be rejected at least 5 times, and also rejected 50% or more of the time; once both
conditions are met the suggestion will no longer be made. Thus the cruft should be
progressively filtered out as the tool is used more, and what remains should hopefully be
mostly useful.
* There are some hidden switches, which you can add to the URL that shows the link
suggestions, if you want to fiddle with stuff:
** The first is to add "&exhaustive" to the end of the URL, in which case it will stop
trying to be smarter about suggesting links based on grammatical structure (e.g. by
excluding single word links), and will be exhaustive about showing you the links it finds.
This will result in roughly 4 times as many links being found.
** The second switch is that you can specify the number of characters to include in the
"context" before and after the suggested link. The default is 60 characters, but you can
set this to anything between 0 and 100 characters inclusive, such as by adding
"&context=20" to the URL for 20 characters of context.
** Lastly, you can ask it to check only the wiki syntax. It automatically performs some
very simplistic checks on the wiki syntax, all about balance (e.g. it checks that the
number of [[ equals the number of ]], and so forth). If an article's syntax looks invalid
then it'll tell you what's wrong, but deliberately won't give you the link suggestions
until you've fixed the syntax on the Wikipedia :-) However, if you don't want link
suggestions and only want syntax checking, then tack "&onlyCheckSyntax" onto the URL.
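To make two of the points above concrete, here is a rough sketch of the rejection cut-off and of a balance-style syntax check. It is an illustration only, not the tool's real code. The function and constant names are hypothetical, the rejection rate is assumed to be counted over "yes" and "no" decisions only (ignoring "don't know"), and the {{ }} pair is an assumed example of the "and so forth" checks.

```python
# Cut-offs quoted above: at least 5 rejections AND rejected 50% or more of the time.
MIN_REJECTIONS = 5
MIN_REJECTION_RATE = 0.5


def is_suppressed(rejections, acceptances):
    """True once a suggestion has been rejected often enough to stop making it.

    Assumption: the rate is rejections / (rejections + acceptances).
    """
    decided = rejections + acceptances
    if decided == 0:
        return False
    return (rejections >= MIN_REJECTIONS
            and rejections / decided >= MIN_REJECTION_RATE)


# Delimiter pairs to balance. "[[" / "]]" is from the description above;
# "{{" / "}}" is an assumed extra pair for illustration.
BALANCED_PAIRS = [("[[", "]]"), ("{{", "}}")]


def unbalanced_pairs(wikitext):
    """Return (opener, closer, open_count, close_count) for each mismatched pair."""
    problems = []
    for opener, closer in BALANCED_PAIRS:
        opens = wikitext.count(opener)
        closes = wikitext.count(closer)
        if opens != closes:
            problems.append((opener, closer, opens, closes))
    return problems
```

With these cut-offs, a suggestion rejected 5 times and accepted 5 times is suppressed, while one rejected 5 times and accepted 6 times is not. A simple count like this only catches mismatched totals; it says nothing about nesting or ordering, which fits the "very simplistic" description.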
I also want to give a big thank you to Julien Lemoine for writing his Suggestion Search
daemon / server, which this tool uses (or rather, abuses) in a rather cruel way to
determine what's a valid article name and what's not :-) Also the front page uses a
modified version of his web form to help you find the right page that you want to suggest
links for.
All the best,
Nick.