[WikiEN-l] Wikipedia and libraries

Gwern Branwen gwern0 at gmail.com
Wed Feb 9 15:22:00 UTC 2011


On Tue, Feb 8, 2011 at 9:35 AM, Ed Summers <ehs at pobox.com> wrote:
> Linkypedia kind of turns Wikipedia inside out, and lets content
> publishers see what articles reference their content. So for example
> the British Museum can see what Wikipedia articles reference their
> site [3]. And folks who are interested in keeping current with how
> Wikipedia uses their content can subscribe to a feed that lists them
> as they are added [4].
>
> I'd like to scale this project significantly by allowing any domain to
> be looked at, and include links from all language wikipedias [5]. But
> this will require a small (but not insignificant) investment in a
> server with a couple gigabytes of RAM. I was thinking of contacting
> the toolserver people to see if I could potentially work in that
> environment.

Perhaps I am missing something, but aren't there existing SEO tools
for seeing 'where are my domains being linked from'?

I occasionally go into Google's Webmaster Tools
(https://www.google.com/webmasters/tools/home?hl=en) and see where
gwern.net pages are being linked from. (I was surprised to learn that
my [[dual n-back]] FAQ (http://www.gwern.net/N-back%20FAQ.html) had
been linked on the German Wikipedia.) And surely Google is not the
only purveyor of such tools.

I also wonder how much such a server would cost, even if you *had* to
roll your own service. It sounds like it'd be trivial to provide a
browsable web front-end, so I assume the gigabytes you speak of are
needed for analyzing the database dump. But dumps occur so rarely you
don't need a 24/7 server crunching the numbers. For a server with 7.5
GB of RAM, Amazon charges only $0.34/hr
(http://aws.amazon.com/ec2/pricing/), so even a very long
number-crunching session would only cost a few dollars.
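
To make that arithmetic concrete, here is a rough Python sketch. The
$0.34/hr rate is the figure quoted above; the helper name, the sample
session lengths, and the assumption that each started hour is billed in
full are mine for illustration, not taken from Amazon's pricing page.

```python
import math

RATE_CENTS_PER_HOUR = 34  # $0.34/hr, the rate quoted above


def session_cost(hours, rate_cents=RATE_CENTS_PER_HOUR):
    """Return the dollar cost of one on-demand session (hypothetical helper)."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    # Assumption: every started hour is billed as a full hour.
    billed_hours = math.ceil(hours)
    # Work in integer cents and convert to dollars only at the end.
    return billed_hours * rate_cents / 100


if __name__ == "__main__":
    # Illustrative session lengths only; real dump-processing time is unknown.
    for hours in (1, 12, 24):
        print(f"{hours:>3} h session: ${session_cost(hours):.2f}")
    # For contrast: the same instance left running for a 30-day month.
    print(f"always-on month: ${session_cost(24 * 30):.2f}")
```

Under those assumptions a 12-hour session comes to $4.08, which is the
"few dollars" range, while an always-on instance costs far more per month.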

-- 
gwern
http://www.gwern.net


