[Foundation-l] Hotlinked images Was: GLAM-WIKI report

Gregory Maxwell gmaxwell at gmail.com
Wed Aug 12 18:11:28 UTC 2009


On Wed, Aug 12, 2009 at 3:58 AM, Tim Starling<tstarling at wikimedia.org> wrote:
[snip]
> Brianna Laugher was receptive to the idea of having
> Wikimedia projects hotlink or cache images from galleries.

So there have been a number of statements against doing something like
this, but (unsurprisingly) I don't think they have been stated strongly
enough or have hit all the arguments that I think are important.  So
please humour me for a moment.

I think hotlinking images is something we ought not to do for several
independent reasons.

(1)  There is no reason to do so.

The so far cited reasons for GLAM interest in this are Branding and Statistics.

Hotlinking or caching would do nothing to improve branding. Most of
the time a hotlinked image looks just like a local one to users.
Whatever branding we'd find acceptable could be accomplished as well
or better locally.

Statistics gathering is something that is interesting to many of our
contributors. We can and should have good statistics for everything
(and caching would be useless for statistics), so hotlinking would
bring no improvement.

GLAMs have spent money building their own databases, yes. But ours are
an additional copy, our problem, and not a significant cost.

The only other reason I can see for hotlinking would be collecting
resellable marketing data on Wikipedia viewers, and I do not believe
that this would be a use we'd wish to support. (I'm not making a value
judgement here. If that is indeed someone's goal, that's fine; it's
just not one WMF would intentionally support.) See below for more…

(2)  Hotlinking has enormous privacy problems

When the rubber hits the road NDAs are ineffective: People make
mistakes. Governments and ISPs snoop. Privacy policies are often bad
and allow things which would horrify people. Hotlinking would greatly
increase readers' exposure to information leaks.

Some random museum has no business knowing that I loaded the pederasty
article just because some art was placed in it.

Wikimedia's handling of reader privacy ought to be leading-edge
trend-setting stuff. That would be a nearly impossible goal if media
were inlined from many third-party sites.

(3) It significantly reduces the atomicity of the Wikimedia projects.

Today the Wikimedia projects are *things*, objects you can obtain
(± temporary problems with the dump system), archive, data-mine, etc.
I have complete (though not current right now) copies of Wikipedia in
all languages along with all images and other media, as well as the
core software.  Not just partial bits and pieces, but the whole thing.

External links are a clear boundary between what is in Wikipedia and
what isn't, and the stuff *in* Wikipedia is all freely licensed and
available for download.  It is all now tracked with a common revision
control system and has common (if bad…) metadata.

External dependencies would lower reliability and make the projects
generally less tractable. It would become more difficult to retain
backups and historical records.

Perhaps some day Wikipedia will be too big to maintain any singular
copy of for purely technical reasons, but we are a long long way away
from that now!


So basically I think there are a bunch of practical and principled
problems with hotlinking, and hotlinking isn't actually needed anyway.
Really good upload systems that preserve metadata and provide good
links to external resources?  Statistics collection?  These are good
and uncontroversial things. They don't require hotlinking.


Cheers—



