[WikiEN-l] Ephemeral nature of web pages - a silly idea?
Sherool
jamydlan at online.no
Wed Jan 3 18:36:56 UTC 2007
On Wed, 03 Jan 2007 18:08:22 +0100, Andrew Gray <shimgray at gmail.com> wrote:
<snip>
> archive.org's six-month delay is intentional, but I suppose it could
> be possible for them to display some form of "we have this site
> archived on X date and just not displayed yet" identifier to the
> date-selection page; this would obviate the "not known" problem whilst
> meaning they don't have to publish it. hmm. If anyone wants to propose
> it to them, feel free.
Actually, I don't think they physically get the data until 6 months have
passed. The founder of the archive explained this in an interview with
NerdTV[1]. Basically, back in the day they started Alexa and the Internet
Archive at the same time. Alexa was for-profit and the archive is
non-profit, and the two had a contract that once Alexa was
"done" with the data it collected (a 6-month delay to take the
"commercial edge" off it), the data was handed over to the Internet Archive.
Alexa has since changed ownership, but the contract to supply data to the
archive remains in effect, so until 6 months have passed it's actually
Alexa, not the Internet Archive, that has the data, as I understand it.
1: <http://www.pbs.org/cringely/nerdtv/transcripts/004.html>
P.S. To increase the odds of a particular page getting archived (with the
caveat that some sites include no-archive directives in their
robots.txt or meta HTML headers; such sites are not archived), visit it with
a browser that has the Alexa toolbar installed (yuck), or use Internet
Explorer (yuck) and choose Tools -> What's related (or some such), which
will also cause Alexa to crawl the page as I understand it (they supply
that feature; dunno if it's in IE 7), or just put the URL into the form at
<http://www.alexa.com/site/help/webmasters/#crawl_site>
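For anyone who wants to check the robots.txt caveat before bothering, here is a rough sketch in Python. It assumes the crawler identifies itself as "ia_archiver" (that user-agent name is my assumption, not something stated above), and the function name and sample rules are made up for illustration. It only looks at robots.txt, not at meta tags in the page itself.

```python
from urllib.robotparser import RobotFileParser


def archiver_allowed(robots_txt: str, page_url: str, agent: str = "ia_archiver") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `page_url`.

    `agent` defaults to "ia_archiver", which is an assumed crawler name.
    """
    parser = RobotFileParser()
    # Parse the rules from a string so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, page_url)


# Hypothetical robots.txt that keeps the archive crawler out of /private/.
sample = """\
User-agent: ia_archiver
Disallow: /private/
"""

print(archiver_allowed(sample, "http://example.org/private/page.html"))
print(archiver_allowed(sample, "http://example.org/public/page.html"))
```

To test a real site you would fetch its robots.txt yourself and pass the text in; the sample rules here are only there to show the two outcomes.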
--
[[:en:User:Sherool]]