TL;DR: How can we collaboratively put together a list of non-spammy sites
that wikis may want to add to their interwiki tables for whitelisting
purposes; and how can we arrange for the list to be efficiently distributed
and imported?
Nemo bis points out that
Interwiki <https://www.mediawiki.org/wiki/Extension:Interwiki> is "the
easiest way to manage whitelisting" since nofollow isn't applied to
interwiki links. Should we, then, encourage wikis to make more use of
interwiki links? Usually, MediaWiki installations are configured so that
only sysops can add, remove, or modify interwiki prefixes and URLs. If a
user wants to link to another wiki, but it's not on the list, often he will
just use an external link rather than asking a sysop to add the prefix
(since it's a hassle for both parties and often people don't want to bother
the sysops too much in case they might need their help with something else
later). This defeats much of the point of having the interwiki table
available as a potential whitelist, unless the sysops are pretty on top of
their game when it comes to figuring out what new prefixes should be added.
In most cases, they probably aren't; the experience of Nupedia shows that
elitist, top-down systems tend not to work as well as egalitarian,
bottom-up systems.
Currently,
interwiki.sql<https://git.wikimedia.org/blob/mediawiki%2Fcore//maintenan…
100 wikis, and there doesn't seem to be much rhyme or reason to which
ones are included (e.g. Seattlewiki?). I wrote
InterwikiMap <https://www.mediawiki.org/wiki/Extension:InterwikiMap>,
which dumps the contents of the wiki's interwiki table into a backup page
and substitutes in its place the interwiki table of some other wiki (e.g. I
usually use Wikimedia's), with such modifications as the sysops see fit to
make. The extension lets sysops add, remove and modify interwiki prefixes
and URLs in bulk rather than one by one through Special:Interwiki, which is
a pretty tedious endeavor. Unfortunately, as written it is not a very
scalable solution: it can't accommodate many thousands of wiki
prefixes before the backup wikitables it generates exceed the capacity of
wiki pages, or it breaks for other reasons.
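For anyone wondering what "substituting another wiki's table" involves on the
reading side, here is a rough Python sketch (not the extension's actual code,
which is PHP). It assumes the source wiki exposes its table through the
standard siteinfo query (meta=siteinfo, siprop=interwikimap); the helper names
interwikimap_url and parse_interwikimap are made up for illustration, and no
network request is made here.

```python
import json
from urllib.parse import urlencode


def interwikimap_url(api_base):
    """Build the api.php query URL that asks a wiki for its interwiki map."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "interwikimap",
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def parse_interwikimap(response_text):
    """Turn the JSON response body into a {prefix: url} dictionary."""
    data = json.loads(response_text)
    return {entry["prefix"]: entry["url"]
            for entry in data["query"]["interwikimap"]}


if __name__ == "__main__":
    # Hand-written sample in the shape the siteinfo query returns.
    sample = json.dumps({"query": {"interwikimap": [
        {"prefix": "wikipedia", "url": "https://en.wikipedia.org/wiki/$1"},
    ]}})
    print(interwikimap_url("https://example.org/w/api.php"))
    print(parse_interwikimap(sample))
```

Fetching the URL and writing the result into the local table are left out on
purpose; those are the parts where the sysops' modifications would come in.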
I was thinking of developing a tool that WikiIndex (or some other wiki
about wikis) could use to manage its own interwiki table via edits to
pages. Users would add interwiki prefixes to the table by adding a
parameter to a template that would in turn use a parser function that, upon
the saving of the page, would add the interwiki prefix to the table.
InterwikiMap could be modified to do incremental updates, polling the API
to find out what changes have recently been made to the interwiki table,
rather than getting the whole table each time. It would then be possible
for WikiIndex (or whatever other site were to be used) to be the
wikisphere's central repository of canonical interwiki prefixes. See
http://wikiindex.org/index.php?title=User_talk%3AMarkDilley&diff=172654…
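To make the incremental-update idea a bit more concrete, here is a minimal
Python sketch of the client side. It assumes the central wiki can hand back a
list of change records newer than some timestamp; the record format (timestamp,
action, prefix, url) and the function name apply_changes are hypothetical, not
an existing API. Timestamps are assumed to be ISO 8601 strings in UTC, which
sort correctly as plain strings.

```python
def apply_changes(table, changes, since):
    """Apply change records newer than `since` to a {prefix: url} table.

    Returns the updated table and the timestamp of the last change applied,
    which the caller would store and pass as `since` on the next poll.
    """
    updated = dict(table)  # leave the caller's table untouched
    latest = since
    for change in sorted(changes, key=lambda c: c["timestamp"]):
        if change["timestamp"] <= since:
            continue  # already seen on an earlier poll
        if change["action"] == "delete":
            updated.pop(change["prefix"], None)
        else:  # "add" or "edit" both set the URL for the prefix
            updated[change["prefix"]] = change["url"]
        latest = change["timestamp"]
    return updated, latest


if __name__ == "__main__":
    local = {"a": "http://a.example/$1", "b": "http://b.example/$1"}
    log = [
        {"timestamp": "2013-01-02T00:00:00Z", "action": "add",
         "prefix": "c", "url": "http://c.example/$1"},
        {"timestamp": "2013-01-03T00:00:00Z", "action": "delete",
         "prefix": "b"},
    ]
    print(apply_changes(local, log, "2013-01-01T12:00:00Z"))
```

The point is only that each poll transfers the changes since the last sync
rather than the whole table, so a very large table stays cheap to follow.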
But there's been some question as to whether there would be much demand for
a 200,000-prefix interwiki table, or whether it would be desirable. It
could also provide an incentive for spammers to try to add their sites to
WikiIndex. See
https://www.mediawiki.org/wiki/Talk:Canonical_interwiki_prefixes
It's hard to get stuff added to meta-wiki's interwiki
map <https://meta.wikimedia.org/wiki/Interwiki_map> because one of the
criteria is that the prefix has to be one that would be
used a lot on Wikimedia sites. How can we put together a list of non-spammy
sites that wikis would be likely to want to have as prefixes for nofollow
whitelisting purposes, and distribute that list efficiently? I notice that
people are more likely to put together lists of spammy than non-spammy
sites; see e.g. Freakipedia's
list <http://freakipedia.net/index.php5?title=Spam_Site_List>.
(Hmm, I think I'll pimp my websites to that wiki when I get a chance; the
fact that the spam isn't just removed but put on permanent record in a
public denunciation means it's a potential opportunity to gain exposure for
my content. They say there's no such thing as bad publicity. ;) )
--
Nathan Larson <https://mediawiki.org/wiki/User:Leucosticte>
Distribution of my contributions to this email is hereby authorized
pursuant to the CC0
license <http://creativecommons.org/publicdomain/zero/1.0/>.