[Foundation-l] Long-term archiving of Wikimedia content

Tim Starling tstarling at wikimedia.org
Tue May 5 05:24:55 UTC 2009


Brian wrote:
> Wouldn't the most cost-effective solution be to first fund research in
> compression, so that fewer bits have to be etched in the first place?
> In that case these guys are already on the job: http://prize.hutter1.net/

The obvious reply is that the Rosetta Project aims to make an archive
readable with 17th-century technology, which digital information
compressed with advanced algorithms is not.

They try to make an issue out of the obsolescence of digital
technology, which I think is overwrought. Just because I don't have a
slot in my computer where I can insert a 1970s-era magnetic tape
doesn't mean the tape is unreadable. I don't have a 750x optical
microscope lying around either. Both media are readable using extant
technology.

There have been some problems restoring data where the decoding
software has been lost. But the popular, well-documented digital
formats of the past are as readable as ever: I have a program on my
computer called groff which is largely backwards-compatible with
roff, the Unix descendant of RUNOFF, one of the earliest digital
typesetting systems, dating back to the 1960s.
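
As a rough illustration (sketching from memory; the file name is made
up, and the two-letter requests below come from early-1970s Unix roff
rather than the original RUNOFF, whose verbose command names groff
does not read), a document written in that style still formats today
without modification:

    .ll 40              \" set the line length to 40 characters
    .ce                 \" centre the next input line
    Long-Term Archiving
    .sp 2               \" leave two blank lines
    Ordinary text is filled and adjusted to
    the 40-character line length set above.

Run it through the modern implementation and you get plain formatted
text out:

    $ groff -Tascii doc.roff > doc.txt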

There is still a great deal of extant text dating to ancient times,
despite the fact that copying was fantastically expensive, and that
everything was written on flammable materials in a time when flame was
the only artificial light source. Maybe the future will be more like
Orson Scott Card's Homecoming series than the dark ages: a future with
such a weight of carefully recorded and preserved history that
studying it, even in overview, becomes the work of a lifetime.

Anyone who claims to know what the far future will be like is a
charlatan. But I think it would be foolish to assume that it will be
anything like the past.

-- Tim Starling



