Could someone ask the Internet Archive to open up the histories of the Wikipedias' Main Pages in their Wayback Machine? That is, if they harvested them at all.
Their Wayback Machine is usually the only exact way to find out when the early Wikipedias were created, but the history pages are blocked because of our robots.txt.
Hmmm... a few days ago we had no problem retrieving the history (2001-2011) of the Polish Wikipedia's Main Page: http://www.facebook.com/media/set/?set=a.10150309409533189.363981.1470564731... - it should be similar for the other language versions :>
-- Daniel // Leinad
2011/9/18 Milos Rancic millosh@gmail.com:
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
On Sun, Sep 18, 2011 at 12:44, Daniel ~ Leinad danny.leinad@gmail.com wrote:
Hmmm... a few days ago we had no problem retrieving the history (2001-2011) of the Polish Wikipedia's Main Page: http://www.facebook.com/media/set/?set=a.10150309409533189.363981.1470564731... - it should be similar for the other language versions :>
It is possible to:
- Retrieve the history of the very early Wikipedia (~2001), as there was no robots.txt back then.
- Retrieve a snapshot of a page as it was when the Internet Archive captured it.
It is not possible to click on the "history" link of a page and find the first version of that page, because that is forbidden by robots.txt.
That matters for the sites created in the 2002-2004 period: the Internet Archive has snapshots, while we don't have the revisions (.../index.php?oldid=1 doesn't work for those sites, or if it works, the timestamp is broken or missing).
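To illustrate why article snapshots exist while history views do not, here is a rough sketch using Python's standard urllib.robotparser. The rules, the host example.org and the crawler name ExampleArchiveBot are all made up for illustration and are not the actual Wikimedia robots.txt; the only assumption carried over from the discussion is that script URLs such as index.php are disallowed while plain article URLs are not.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the real Wikimedia robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""


def allowed_by_robots(robots_txt: str, url: str,
                      agent: str = "ExampleArchiveBot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URLs: a plain article view, a history view, and an oldid lookup.
ARTICLE = "https://example.org/wiki/Main_Page"
HISTORY = "https://example.org/w/index.php?title=Main_Page&action=history"
FIRST_REV = "https://example.org/w/index.php?oldid=1"

if __name__ == "__main__":
    for url in (ARTICLE, HISTORY, FIRST_REV):
        verdict = "allowed" if allowed_by_robots(ROBOTS_TXT, url) else "blocked"
        print(f"{verdict}: {url}")
```

Under these assumed rules, a crawler that honours robots.txt can keep the article page but has to skip both index.php URLs, which matches the situation described above.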
Hi;
Oldids are not sorted by date, so the first edits may have large oldids. Which Wikipedias do you want to explore? You can use the dumps or the Toolserver databases to sort the revisions by date and extract the oldest ones. I guess that the Wikipedias created in 2002-2004 preserve their full history (apart from deleted pages, but the first edits are usually on the Main Page). English Wikipedia "lost" some of its first edits during the upgrades to the first MediaWiki versions, but Tim Starling later found them in a very old backup.
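As a minimal sketch of that idea, assuming the revisions have already been extracted from a dump into (oldid, timestamp) pairs with MediaWiki-style 14-digit timestamps: the Revision type, the function names and the sample values below are invented for illustration and are not real data from any wiki.

```python
from datetime import datetime, timezone
from typing import List, NamedTuple


class Revision(NamedTuple):
    oldid: int
    timestamp: str  # MediaWiki-style UTC timestamp, YYYYMMDDHHMMSS; may be empty


def parse_ts(ts: str) -> datetime:
    """Parse a 14-digit MediaWiki timestamp into an aware UTC datetime."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def oldest_revisions(revisions: List[Revision], n: int = 1) -> List[Revision]:
    """Return the n oldest revisions by timestamp, ignoring the oldid order."""
    dated = [r for r in revisions if r.timestamp]  # skip missing timestamps
    return sorted(dated, key=lambda r: parse_ts(r.timestamp))[:n]


# Made-up sample: oldid=1 has no timestamp, and the oldest edit has a large oldid.
SAMPLE = [
    Revision(1, ""),
    Revision(15, "20030105120000"),
    Revision(48211, "20020926083000"),
    Revision(2, "20030101000000"),
]

if __name__ == "__main__":
    for rev in oldest_revisions(SAMPLE, n=3):
        print(rev.oldid, parse_ts(rev.timestamp).isoformat())
```

Sorting on the parsed timestamp rather than on oldid is the whole point here: in the sample, the earliest edit is the one with the largest oldid.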
Regards, emijrp
wikimedia-l@lists.wikimedia.org