[Foundation-l] Long-term archiving of Wikimedia content

Samuel Klein meta.sj at gmail.com
Thu May 7 09:17:50 UTC 2009

On Thu, May 7, 2009 at 12:16 AM, Tim Starling <tstarling at wikimedia.org> wrote:
> I wouldn't go quite that far. The idea of doing it (or having done it)
> makes people feel good, due to the collective sci-fi-like fantasy
> implicitly promulgated by the project itself -- a future world of
> poverty and decay, saved by the serendipitous discovery of a
> time-capsule sent from the past. It's a spectacle, a stunt, and it has
> PR value.

Producing long-lived snapshots of important projects, and preserving
them for posterity, is more than a feel-good effort -- it is good
practice.   Most of the layout work needed to produce this sort of
copy must be done for any sort of print copy, which is certainly a
useful class of dump.

If this were simply a stunt, then it would be worth doing a few times
for impact.  In this case I think it's a valid and beautiful way to
store data in its own right.  And it is getting cheap enough that
individuals might want to get offline copies in this format.  I
checked in with Alexander Rose of the Rosetta project, and here's the
latest news:

They are working with their etchers to lower the cost of the process.
The current amortized cost of making 10 nickel discs (each with 10,000
pages in a 100x100 grid) is around $500 each.   They can also make
polymer copies for much less that are likely stable for at least a
century.  As they standardize the process, the price may continue to
drop.

The specific process they use involves a few steps: material is
rendered at 300-600dpi (text and images), and laid out in 11x11"
pages.  These are saved as separate image files, numbered 1 to 10,000
in 100 directories, and sent to the etcher, which fits each directory
of 100 images onto one row.  You need a microscope to read the result,
but a decent USB microscope could do it.
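
To make the numbering concrete, here is a rough sketch of how a page
number could map to a row and column on one disc.  It assumes the
100x100 grid described above (100 directories of 100 images, so 10,000
pages per disc).  The zero-padded directory and file names and the
.png extension are my own guesses for illustration, not anything the
Rosetta project has specified.

```python
from pathlib import PurePosixPath

GRID_ROWS = 100  # one directory per row of the disc (from the description)
GRID_COLS = 100  # images per directory, i.e. pages per row


def page_location(page: int) -> tuple[int, int]:
    """Map a 1-based page number to a 1-based (row, column) on the disc."""
    if not 1 <= page <= GRID_ROWS * GRID_COLS:
        raise ValueError(f"page must be in 1..{GRID_ROWS * GRID_COLS}")
    row, col = divmod(page - 1, GRID_COLS)
    return row + 1, col + 1


def page_path(page: int, ext: str = "png") -> PurePosixPath:
    """Hypothetical layout: <row directory>/<page number>.<ext>."""
    row, _ = page_location(page)
    return PurePosixPath(f"{row:03d}") / f"{page:05d}.{ext}"


if __name__ == "__main__":
    for page in (1, 100, 101, 10_000):
        print(page, page_location(page), page_path(page))
```

With this layout, pages 1 to 100 fill the first row, page 101 starts
the second, and page 10,000 lands in the last cell of the last row.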

> I certainly don't begrudge the Long Now Foundation for having done
> this with the Rosetta Project, since their primary goal is to encourage
> long-term thinking, and expensive stunts are obviously a key part of that.
> But Wikimedia's goals are somewhat different, and we could probably
> find some stunts which are more relevant to our mission.

Are the goals so different?  It seems to me long-term thinking is part
and parcel of comprehensively realizing Wikimedia's goals, from
licensing and access to archival revision tracking and
multilingualism.  We should probably discuss this in the original
thread about strategic planning.

Rose says they would be glad to include something like a "top 10,000
articles" selection from as many languages as possible in the Rosetta
library.  (Thanks!)  He suggests this would be a sought-after gift for
major donors.  I think that's probably true, and could pay for such an
initiative, but encouraging long-term thinking and long-term valuation
of knowledge around Wikimedia (not just in general) is a more
important reason to consider it.

