This is so exciting! To Steven's point: we've also started a page where folks can add bits of interesting information as they excavate the files [1]. Can't wait to dig in!
Congrats, Tim!
[1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning
Date: Tue, 14 Dec 2010 08:20:10 -0800
From: Steven Walling steven.walling@gmail.com
Subject: Re: [Foundation-l] Old Wikipedia backups discovered
To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org
Message-ID: AANLkTin9CjXR1S_eCfR3nR6Xmt6C4o=6oHDhTXP4JPzL@mail.gmail.com
Content-Type: text/plain; charset=ISO-8859-1
This is fantastic, and the timing could not be better.
If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2]
1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
2. http://ten.wikipedia.org/wiki/Share
On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkiller@gmail.com wrote:
On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarling@wikimedia.org wrote:
I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001!
This is exciting, because there is lots of article history in here which was assumed to be lost forever.
I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope.
The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to UseMod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet.
I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001.
I've put the two log files up on the web, at:
http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z
The 7-zip archive is only 8.4MB -- much more manageable than today's backups.
rclog contains IP addresses. The UseMod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files.
-- Tim Starling
I have to say this is super cool. It's like digging up a time capsule right before the 10th anniversary. One of my favorite early edits:
"This is the new WikiPedia! The idea here is to write a complete encyclopedia from scratch, without peer review process, etc. Some people think that this may be a hopeless endeavor, that the result will necessarily suck. We aren't so sure. So, let's get to work!"
-Chad
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
FYI, there is an existing timeline at:
http://meta.wikimedia.org/wiki/Wikipedia_timeline
And lots of other Wikipedia history pages on the English Wikipedia, too.
:) Phoebe
On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpantages@wikimedia.org wrote:
This is so exciting! To Steven's point: we've also started a page where folks can add bits of interesting information as they excavate the files [1]. Can't wait to dig in!
Congrats, Tim!
[1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning
See the "See also" section, etc., in [[History of Wikipedia]].
FT2
Here are a couple of quick indexes into the dump file. I didn't venture into the binary revision data. You'll find an alphabetized list of articles that contains all the diffs for each article in the order that they occurred in the dump, and a sorted index into each revision as well.
http://grey.colorado.edu/wikipedia_2001/
Given that it's finals I don't even have enough time to dig through this at all. Guess I just wanted a distraction =)
- Brian
Browsing through the earliest revisions in the revision index (http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather interesting and full of fodder for founder debates. Consider these very early revisions:
"[http://www.nupedia.com Nupedia.com] is an open content, international, peer reviewed project run by LarrySanger, who got the idea of supplementing NuPedia with a less formal "wiki" encyclopedia project. " - http://grey.colorado.edu/wikipedia_2001/979694938.txt
"EditorInChief of NuPedia and instigator of Nupedia's wiki. " http://grey.colorado.edu/wikipedia_2001/979690096.txt
Sanger's claims to have come up with the idea of adding the wiki concept to the online encyclopedia concept clearly go all the way back to the beginning. Of course, that doesn't speak to offline conversations that gave rise to the idea.
And Sanger clearly didn't have much faith in the concept:
None of this is to say that the Nupedia wiki will ''replace'' the main encyclopedia; of course it won't. But it will be an interesting ancillary endeavor! http://grey.colorado.edu/wikipedia_2001/979695982.txt
- Brian
Here is an interesting bit of history - the Wikipedia logo was first an American flag. Then Scott Moonen suggested we make it a globe:
In its first day of existences, because the nearest thing to hand for JimmyWales that was suitable for a logo was an American flag, WikiPedia had the American flag, OldGlory, for a logo.
ScottMoonen sensibly suggested:
I'd recommend you change the American flag logo. Exremely ethno-centric ''et. al.'' I think a globe logo would be much more fitting, if you want to keep with that metaphor. Or perhaps a book.
http://grey.colorado.edu/wikipedia_2001/979773872.txt
- Brian
On 15.12.2010 01:36, Brian J Mingus wrote:
Nice to see that the quality of posts on the mailing lists was low and discussions lame and rapidly off-topicking since ... the very first day! ;-)
Marcus Buck User:Slomox
This discovery is so great! Good work, Tim.
Perhaps we could make a book with the first pages as a souvenir.
/Lennart
Brian J Mingus, 15/12/2010 01:36:
Here is an interesting bit of history - the Wikipedia logo was first an American flag. Then Scott Moonen suggested we make it a globe:
No news, this is already on Meta:
http://meta.wikimedia.org/wiki/Logo_history
http://meta.wikimedia.org/wiki/OldWikiPediaLogo
Nemo
It's not news but AFAIK an actual image of the flag used is missing. So if that turns up, that would be cool :) But I think it was already gone by Feb. 2001.
-- phoebe
phoebe ayers wrote:
It's not news but AFAIK an actual image of the flag used is missing. So if that turns up, that would be cool :) But I think it was already gone by Feb. 2001.
-- phoebe
Isn't it the first piece of http://meta.wikimedia.org/wiki/File:Terribly_wrong.png ?
On 15/12/10 11:17, Brian J Mingus wrote:
Browsing through the earliest revisions in the revision index (http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather interesting and full of fodder for founder debates. Consider these very early revisions:
"[http://www.nupedia.com Nupedia.com] is an open content, international, peer reviewed project run by LarrySanger, who got the idea of supplementing NuPedia with a less formal "wiki" encyclopedia project. " - http://grey.colorado.edu/wikipedia_2001/979694938.txt
"EditorInChief of NuPedia and instigator of Nupedia's wiki. " http://grey.colorado.edu/wikipedia_2001/979690096.txt
Sanger's claims to have come up with the idea of adding the wiki concept to the online encyclopedia concept clearly go all the way back to the beginning. Of course, that doesn't speak to offline conversations that gave rise to the idea.
I've long suspected that the early FAQs and history pages gave Larry Sanger an exaggerated role because he wrote them himself. It will be interesting to see if any such conclusion can be drawn from the archives. Note that 979694938 was by dhcp058.246.lvcm.com, which appears to be Larry.
By the way, the numbers in the revisions, e.g. 979694938, are UNIX timestamps. That one was 17 Jan 2001, 01:28:58 UTC.
-- Tim Starling
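For anyone who wants to check the dates on other revisions, here is a minimal sketch in Python. It assumes, as described above, that the number in each file name under http://grey.colorado.edu/wikipedia_2001/ is a count of seconds since the UNIX epoch. The helper name `revision_time` is made up for illustration and is not part of any of the tools mentioned in this thread.

```python
from datetime import datetime, timezone


def revision_time(revision_id: int) -> datetime:
    """Interpret a revision number as seconds since the UNIX epoch, in UTC."""
    return datetime.fromtimestamp(revision_id, tz=timezone.utc)


# Revision numbers copied from the URLs quoted earlier in this thread.
for rev in (979690096, 979694938, 979695982, 979773872):
    # isoformat() avoids locale-dependent month names.
    print(rev, revision_time(rev).isoformat())
```

For 979694938 this gives 2001-01-17T01:28:58+00:00, which matches the 17 Jan 2001, 01:28:58 UTC figure above. Passing an explicit UTC timezone matters here: without it, `fromtimestamp` would return the time in the local zone of whoever runs the script.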
Larry didn't have an exaggerated role, he really did run the project in the early days.
Right in time! And the rightly early version too! Kudos to the diggers and bashers!