[Foundation-l] Old Wikipedia backups discovered

ViswaPrabha (വിശ്വപ്രഭ) vp2007 at gmail.com
Tue Dec 14 20:35:45 UTC 2010


Right on time! And such an early version too!
Kudos to the diggers and bashers!




On Tue, Dec 14, 2010 at 21:23, Moka Pantages <mpantages at wikimedia.org> wrote:

> This is so exciting!  To Steven's point: we've also started a page
> where folks can add bits of interesting information as they excavate
> the files [1]. Can't wait to dig in!
>
> Congrats, Tim!
>
> [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning
>
>
> Date: Tue, 14 Dec 2010 08:20:10 -0800
> From: Steven Walling <steven.walling at gmail.com>
> Subject: Re: [Foundation-l] Old Wikipedia backups discovered
> To: Wikimedia Foundation Mailing List
>        <foundation-l at lists.wikimedia.org>
>
> This is fantastic, and the timing could not be better.
>
> If anyone finds anything noteworthy, please add it to the timeline of
> Wikipedia that we're building at the 10th anniversary wiki,[1] as well as
> the other tools for cataloging interesting tidbits from our history.[2]
>
> 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
> 2. http://ten.wikipedia.org/wiki/Share
>
> On Tue, Dec 14, 2010 at 8:11 AM, Chad <innocentkiller at gmail.com> wrote:
>
> > On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling <tstarling at wikimedia.org>
> > wrote:
> > > I was looking through some old files in our SourceForge project. I
> > > opened a file called wiki.tar.gz, and inside were three complete
> > > backups of the text of Wikipedia, from February, March and August 2001!
> > >
> > > This is exciting, because there is lots of article history in here
> > > which was assumed to be lost forever.
> > >
> > > I've long been interested in Wikipedia's history, and I've tried in
> > > the past to locate such backups. I asked various people who might have
> > > had one. I had given up hope.
> > >
> > > The history of particularly old Wikipedia articles, as seen in the
> > > present Wikipedia database, is incomplete, due to UseMod's policy of
> > > deleting old revisions of pages after about a month. The script which
> > > Brion wrote to import the article histories from UseMod to MediaWiki
> > > only fetched those revisions which hadn't been purged yet.
> > >
> > > I didn't want to believe that those revisions had been lost forever,
> > > and I even opened the UseMod source code and stared forlornly at the
> > > unlink() call. What I (and Brion before) missed is that UseMod appends
> > > a record of every change made to two files, called diff_log and rclog.
> > > In these two files is a record of every change made to Wikipedia from
> > > January 15 to August 17, 2001.
> > >
> > > I've put the two log files up on the web, at:
> > >
> > > http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z
> > >
> > > The 7-zip archive is only 8.4MB -- much more manageable than today's
> > > backups.
> > >
> > > rclog contains IP addresses. The UseMod software made IP addresses of
> > > logged-in users public, so the people who made these edits had no
> > > expectation that their IP address would be kept private. That, coupled
> > > with the passage of time, makes me think that no harm to user privacy
> > > can come from releasing these files.
> > >
> > > -- Tim Starling
> > >
> >
> > I have to say this is super cool. It's like digging up a time capsule
> > right before the 10th anniversary. One of my favorite early edits:
> >
> > "This is the new WikiPedia!  The idea here is to write a complete
> > encyclopedia from scratch, without peer review process, etc.
> > Some people think that this may be a hopeless endeavor, that
> > the result will necessarily suck.  We aren't so sure.  So, let's get
> > to work!"
> >
> > -Chad
> >
> > _______________________________________________
> > foundation-l mailing list
> > foundation-l at lists.wikimedia.org
> > Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l

