On Mon, Jul 7, 2014 at 5:21 AM, James Salsman jsalsman@gmail.com wrote:
Kevin Gorman wrote:
Regarding the IA: they have a significant interest in working with the Wikimedia projects, a lot more experience than the Wikimedia projects have in caching absolutely tremendous quantities of data, a willingness to handle a degree of legal risk that would be inappropriate for the Wikimedia projects to take on....
Because they retroactively censor content whenever a new domain owner's robots.txt requests it.
<Note I am replying in my personal capacity as an enwiki editor, and nothing here at all represents the views of WMF or anyone else>
This point shouldn't get lost in the various other issues of more dubious veracity and/or applicability raised in the original message.
I've seen cases where a change of domain ownership or a major corporate restructuring results in a domain being completely reorganized, or even redirected wholesale to some other domain. The robots.txt for the new version of the site then denies everything, likely because the new owners don't want the redirects or other old content showing up in Google searches. But this has the unfortunate side effect that the IA removes all the old content from public access.
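To illustrate the mechanism being described: a blanket-deny robots.txt served by a new domain owner disallows every path for every crawler, including pages that only existed under the previous owner. A minimal sketch using Python's standard-library parser (the domain and path here are hypothetical):

```python
from urllib import robotparser

# A blanket-deny robots.txt, as a new domain owner might serve
# to keep redirects and stale content out of search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Every path on the domain is now disallowed for all user agents,
# including content that predates the ownership change.
print(rp.can_fetch("*", "https://example.com/old-article.html"))  # False
```

Because the rule is evaluated against the current robots.txt rather than the one in force when a page was archived, retroactively honoring it hides the old content along with the new.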
I really wish that IA would reconsider their policy of *automatically* retroactively honoring robots.txt.