A wikipedian has recently been trying to find a good way to cite particular revisions of articles in the bibliography for a paper.
Currently we can give URLs for the _current_ version of an article (current as of whenever it is visited), or for _previous_ versions (those already superseded when the citation was made):
current: http://www.wikipedia.org/wiki/Foobar
old: http://www.wikipedia.org/w/wiki.phtml?title=Foobar&oldid=12345
There are two main problems with this (aside from the ugliness of the old-reference URLs):
* There is no way to reference the current version _as of the time of citation_. Since that revision isn't in the old table, it has no oldid assigned yet.
* oldid values sometimes can change, as when an article is deleted and subsequently restored (done also when recombining histories of articles that have been broken by crude renaming). Possible rearrangements of the database (such as combining all languages into a single table) could require reassigning oldids en masse. They are *not* reliable long-term citations.
One possible solution would be to provide a way of citing articles as of a particular timestamp, for instance:
http://www.wikipedia.org/wiki/Foobar?version=20030224161134
which would pull up either a cur or old version with that timestamp. (It could also be prettified: version=2003-02-24-16:11:34 etc)
Advantages:
* consistent, no fuss, no worries about rearrangement of db structure
* citation URL can be provided in a nice handy link at the bottom of every page
Disadvantages:
* timestamp has 1-second resolution. Generally this is going to be unique (at least per article), but it may occasionally not be, particularly in cases of recombined histories. Some articles had multiple revisions' timestamps set to the same time due to bugs in the rename code and other db tweaks in early '02.
* for this reason it's not suitable as the mainline url for drawing up old history revisions via the history list; so people have to remember to find and use the citation url separately
Alternatively, we could supply _both_ timestamp and oldid in the URL, and let timestamp have priority if an exact match on both is not found.
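To make the lookup rule concrete, here is a minimal sketch of how a `version=` parameter might be resolved, with an optional `oldid` that loses to the timestamp when the two disagree. Everything in it is an assumption for illustration: the `Revision` class, the in-memory `CUR` and `OLD` lists standing in for the cur and old tables, and the function names are hypothetical and are not taken from the wiki software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Revision:
    title: str
    timestamp: str               # 14 digits, YYYYMMDDHHMMSS, as in the example URL
    text: str
    oldid: Optional[int] = None  # None for the current revision (no oldid yet)


# Hypothetical stand-ins for the "cur" and "old" tables.
CUR = [Revision("Foobar", "20030224161134", "current text")]
OLD = [
    Revision("Foobar", "20030101120000", "first text", oldid=12345),
    Revision("Foobar", "20030110093000", "second text", oldid=12350),
]


def normalize_timestamp(version: str) -> str:
    """Accept 20030224161134 or the prettified 2003-02-24-16:11:34."""
    digits = "".join(ch for ch in version if ch.isdigit())
    if len(digits) != 14:
        raise ValueError(f"not a 14-digit timestamp: {version!r}")
    return digits


def find_revision(title: str, version: str,
                  oldid: Optional[int] = None) -> Optional[Revision]:
    """Look in both tables; prefer an exact oldid+timestamp match,
    otherwise let the timestamp decide."""
    ts = normalize_timestamp(version)
    candidates = [r for r in CUR + OLD if r.title == title]

    if oldid is not None:
        for rev in candidates:
            if rev.oldid == oldid and rev.timestamp == ts:
                return rev

    # The oldid was missing or stale: fall back to the timestamp alone.
    matches = [r for r in candidates if r.timestamp == ts]
    if len(matches) > 1:
        # The 1-second resolution problem: several revisions share a timestamp.
        raise LookupError(f"{len(matches)} revisions of {title!r} at {ts}")
    return matches[0] if matches else None
```

For example, `find_revision("Foobar", "2003-02-24-16:11:34")` returns the current revision even though it has no oldid, and `find_revision("Foobar", "20030110093000", oldid=99999)` still finds the second revision after its oldid has been reassigned.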
Thoughts?
-- brion vibber (brion @ pobox.com)
I need this too; indeed, anyone who uses a Wikipedia article as a source needs it. All you can do now is cite the URL, the page title and the date you took it off Wikipedia, like this:
http://www.internet-encyclopedia.info/statistics.html -- Original source: <a href="http://www.wikipedia.org/wiki/Statistics">Statistics - Wikipedia</a>, February, 2003, Revised: February 23, 2003<BR> Copyright © 2003 under the terms of the <a href="http://www.gnu.org/copyleft/fdl.html">GNU Free Documentation License</a> <BR>
By the way, do you think this copyright and link to the GNU license is sufficient?
Fred Bauder
http://wwww.internet-encyclopedia.info
From: Brion Vibber brion@pobox.com
Reply-To: wikitech-l@wikipedia.org
Date: 23 Feb 2003 21:41:13 -0800
To: wikitech-l@wikipedia.org
Subject: [Wikitech-l] Citation of versions by timestamp?
Brion Vibber wrote:
One possible solution would be to provide a way of citing articles as of a particular timestamp, for instance:
http://www.wikipedia.org/wiki/Foobar?version=20030224161134
which would pull up either a cur or old version with that timestamp. (It could also be prettified: version=2003-02-24-16:11:34 etc)
Alternatively, we could supply _both_ timestamp and oldid in the URL, and let timestamp have priority if an exact match on both is not found.
Well, we could also have
http://www.wikipedia.org/wiki/Foobar?md5=1234f53fa34f253f3453abf00f549120
which would identify a unique version with high probability, and also provide a way of verifying the integrity of the old version (otherwise, you're just trusting the owner of the archive). For fanatical levels of caution, you could do:
http://www.wikipedia.org/wiki/Foobar?version=20030224161134&md5=1234f53f...
For the truly paranoid, you could substitute SHA-1 for MD5.
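A rough sketch of the hashing idea, using Python's standard `hashlib`. What exactly gets hashed is not specified in this thread; the sketch assumes it is the revision's text encoded as UTF-8, and the function names `revision_digest` and `verify_revision` are made up for illustration.

```python
import hashlib


def revision_digest(text: str, algorithm: str = "md5") -> str:
    """Hex digest of a revision's text (assumption: UTF-8 encoded text)."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


def verify_revision(text: str, expected_digest: str,
                    algorithm: str = "md5") -> bool:
    """Check that the text an archive hands back matches the cited digest."""
    return revision_digest(text, algorithm) == expected_digest.lower()


if __name__ == "__main__":
    cited = revision_digest("Foobar is a placeholder name.")
    print(cited)                                             # 32 hex digits
    print(verify_revision("Foobar is a placeholder name.", cited))
    print(verify_revision("Foobar is a tampered name.", cited))
```

Passing `algorithm="sha1"` gives a 40-digit SHA-1 digest instead of the 32-digit MD5 one.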
Perhaps we need a "permalink" at the bottom of this page marked "permanent link to this version"?
-- Neil
On Mon, 24 Feb 2003, Neil Harris wrote:
Well, we could also have
http://www.wikipedia.org/wiki/Foobar?md5=1234f53fa34f253f3453abf00f549120
which would identify a unique version with high probability, and also provide a way of verifying the integrity of the old version
Well heck, why use the name at all? (After all, pages can be renamed.) Add a hash field to the table, index it, and presto! Look up an arbitrary version of any page, no matter how it's been shuffled around:
http://www.wikipedia.org/cite/1234f53fa34f253f3453abf00f549120
Crazy perhaps, but a thought. ;)
I'm not much for ugly incomprehensible URLs though...
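As a sketch only, the "hash field plus index" could behave like the dictionary below, where the hash is the sole lookup key and the title plays no part. The `CiteIndex` class and its methods are hypothetical, the hash is assumed to be the MD5 of the UTF-8 revision text, and a real version would be an indexed column in the database rather than an in-memory dict.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Revision:
    title: str
    text: str


class CiteIndex:
    """Hypothetical stand-in for an indexed hash column."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, Revision] = {}

    def add(self, rev: Revision) -> str:
        digest = hashlib.md5(rev.text.encode("utf-8")).hexdigest()
        # Two revisions with identical text share a digest;
        # this sketch simply keeps the first one it saw.
        self._by_hash.setdefault(digest, rev)
        return digest

    def lookup(self, digest: str) -> Optional[Revision]:
        """Resolve the hash part of a /cite/<hash> URL; the title is never consulted."""
        return self._by_hash.get(digest.lower())


if __name__ == "__main__":
    index = CiteIndex()
    rev = Revision("Foobar", "Foobar is a placeholder name.")
    digest = index.add(rev)
    rev.title = "Foo bar"            # the page gets renamed
    found = index.lookup(digest)
    print(found.title if found else None)
```

Because the key depends only on the text, a rename leaves the citation intact; the flip side is that identical texts (a revert, say) cannot be told apart by hash alone.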
-- brion vibber (brion @ pobox.com)
Yep, renaming is a sticky issue here, and the fact that "current" articles and old versions are in different tables makes implementation a pain. As an initial effort, I think a title/date link could be implemented that will work most of the time and only fail on renamed articles; at some later point in the evolution of the software we can make it work correctly in all cases. I don't see any way the hashes will help.
wikitech-l@lists.wikimedia.org