Greetings,
We appreciate the help received from this list during the development of the Memento MediaWiki Extension [1].
We have recently completed work with the World Wide Web Consortium (W3C) to bring Memento TimeTravel [2] capabilities to both their wiki and all W3C specifications - such as those for HTML and CSS [3, 4]. Their wiki runs the Memento MediaWiki Extension.
In 2012, we began a dialog about bringing Memento to Wikipedia [5], which resulted in a decision to conduct a pilot on English Wikipedia, but that effort stalled. The biggest concern at the time was the maturity of the PHP code in our extension. This has since been addressed, with thanks to many on this list. We also conducted performance testing during development to ensure minimal impact on MediaWiki installations [6]. Code for the current version of the extension, 2.1.4, is located in Gerrit [7].
Considering the consensus from the RFC was to start a pilot of Memento on English Wikipedia, how do we start that process again?
Thanks in advance,
Shawn M. Jones
[1] https://www.mediawiki.org/wiki/Extension:Memento
[2] https://tools.ietf.org/html/rfc7089
[3] https://www.w3.org/blog/2016/08/memento-at-the-w3c/
[4] http://blog.dshr.org/2016/09/memento-at-w3c.html
[5] https://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/Memento
[6] http://arxiv.org/abs/1406.3876
[7] https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/Memento
On Mon, Sep 12, 2016 at 10:57 AM, Shawn Jones jones.shawn.m@gmail.com wrote:
Considering the consensus from the RFC was to start a pilot of Memento on English Wikipedia, how do we start that process again?
Hi Shawn,
Thanks for your previous email, with all of the links.
Several of us investigated this in 2013 and responded when we declined it in Bugzilla: https://phabricator.wikimedia.org/T36778#384480
As I recall, older versions of this never fully addressed the caching and infrastructure implications of using HTTP headers. Am I remembering that correctly? Is there something different about current versions?
Rob
Dear Rob,
Thanks for your response.
I would like to start by pointing out that the 2013 Phabricator ticket pertained to a significantly different version of the Memento MediaWiki extension. It’s 2016 by now, and the current version of the extension was developed from scratch with a grant from the Andrew W. Mellon Foundation, and with input from MediaWiki developers, both on and off the wikitech-l list. As a matter of fact, there are currently two Memento extensions: one that only adds the HTTP headers used by the Memento protocol and relies on an external TimeGate, and another that implements all aspects of the Memento protocol. In our work with the W3C, the latter was used, bringing all aspects of Memento to the W3C wiki.
Regarding the issues you raised:
(1) Memento supports caching. The protocol uses two registered HTTP response headers: Link and Memento-Datetime. Just like all HTTP headers, they are cacheable, both for topic pages (which use Link headers) and oldid pages (which use both Link and Memento-Datetime headers). TimeGate responses use the 302 HTTP status code; they have no body and are not cacheable by default [1]. The Memento protocol works with caches, not against them. As a matter of fact, prior to being published as RFC 7089, the specification was thoroughly assessed on behalf of the IETF by Mark Nottingham, an expert on web caching.
(2) Supporting the Memento protocol is scalable. This is exemplified by its adoption by 16 web archives around the world, including the massive Internet Archive, which has exposed Memento TimeGate and TimeMap end-points for many years. We have never heard any concerns regarding scalability from any of these web archives. Quite the contrary: many web archives use OpenWayback software to replay pages, and an effort is ongoing to build the APIs for the new OpenWayback version around the Memento protocol [2]. In addition, we have specifically assessed the performance impact of the MediaWiki extensions for Memento and found it to be negligible [3].
(3) Memento supports various obvious use cases related to “web time travel” that involve Wikipedia resources. It’s really hard to assess how big a user group would be interested in these because we are facing a chicken-and-egg situation. Clearly, the Wikipedia editors who were involved in the Wikipedia RFC about Memento thought that supporting the protocol would be valuable. Anyhow, I list some use cases and want to emphasize that they can be supported both for machine clients and for browsers. Browsers currently do not natively support Memento, but extensions are available, and it is also possible to build Memento functionality into web pages using JavaScript.
3.1 Memento TimeMaps are an RFC-specified (read “standard”) approach to expose a version history for Wikipedia topic pages. The recent W3C Data on the Web Best Practices [4] recommends the use of Memento TimeMaps for exposing resource version history. The same document also recommends using Memento TimeGates for access to temporal resource versions.
3.2 Memento can be used for intra-site time travel, as desired for Wikipedia by Vernor Vinge [5]. This is important for a variety of reasons. Historical researchers can easily determine what the state of (inter-linked) Wikipedia topic pages was at a specific date. Memento can also be used to avoid spoilers for current TV shows and sports, something that has been desired by users and discussed at Wikipedia [6, 7].
3.3 Memento can be used for inter-site time travel that involves Wikipedia pages. A user can set a sticky date in the past and visit pages around that date across web archives and version control systems that support Memento. For example, we recently conducted a study of almost 400,000 academic papers containing URI references and found that more than 3,300 research papers published between 2003 and 2012 reference Wikipedia articles, a number that grows each year. The content of these Wikipedia articles has most likely changed since the time the research paper was published. Using Memento, a user can revisit the state of the Wikipedia page as it was at the time the referencing paper was published by using the paper’s publication date as the sticky date. As per the above, the user can then also keep navigating subject to that date, visiting both version pages in Wikipedia (for internal links) and archived pages in web archives (for external links). This way time travel is seamless, allowing a user to stay fixed to a datetime regardless of which web site they visit. Note that Wikipedia itself uses Memento in this manner to link to archived content in web archives in case of broken external links, using a Memento library that we developed for them [8].
3.4 There is a growing interest in the use of historic web content, with efforts such as Archives Unleashed [9] promoting studies in a variety of fields, including sociology and history. With Memento, a researcher can gather content from a variety of sources as they existed around the same datetime. This data collection process, conducted in a machine-driven or browser-based manner, may include collecting old versions of Wikipedia resources. Without Memento support at Wikipedia, these pages will routinely be harvested from web archives that unfortunately have a very sparse collection of Wikipedia snapshots. Hence, the collected pages will be temporally imprecise. If Wikipedia supported Memento, these researchers would be able to collect the exact version page that was active at the desired datetime.
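To make the header and TimeGate mechanics described in (1) and 3.1 through 3.4 more concrete, here is a minimal Python sketch. It is not the extension's code. The wiki URLs, the revision list, and the function names are all made up for illustration, and the selection rule (pick the latest revision saved at or before the requested datetime) is an assumption about what "the version that was active at the desired datetime" means. It only shows the shape of a TimeGate 302 response and of the Link and Memento-Datetime headers on an oldid page.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Hypothetical wiki and topic page; none of these URLs come from the extension.
WIKI = "https://wiki.example.org"
TITLE = "Memento"
ORIGINAL = f"{WIKI}/wiki/{TITLE}"

# Hypothetical revision history: (oldid, save time), sorted by save time.
REVISIONS = [
    (101, datetime(2012, 3, 1, 12, 0, tzinfo=timezone.utc)),
    (102, datetime(2014, 6, 15, 8, 30, tzinfo=timezone.utc)),
    (103, datetime(2016, 9, 12, 17, 57, tzinfo=timezone.utc)),
]


def select_revision(revisions, accept_datetime):
    """Return the (oldid, timestamp) pair active at accept_datetime.

    Assumption: the active revision is the latest one saved at or before
    the requested datetime. Requests that predate the first revision fall
    back to the first revision.
    """
    wanted = parsedate_to_datetime(accept_datetime)  # parses an HTTP-date
    timestamps = [ts for _, ts in revisions]
    index = bisect_right(timestamps, wanted) - 1
    return revisions[max(index, 0)]


def timegate_response(revisions, accept_datetime):
    """Sketch of a TimeGate answer: a body-less 302 pointing at the oldid page."""
    oldid, _ = select_revision(revisions, accept_datetime)
    headers = {
        "Location": f"{WIKI}/index.php?title={TITLE}&oldid={oldid}",
        "Vary": "accept-datetime",
        "Link": f'<{ORIGINAL}>; rel="original"',
    }
    return 302, headers


def memento_headers(timestamp):
    """Headers an oldid page would carry: Memento-Datetime plus a Link header."""
    return {
        "Memento-Datetime": format_datetime(timestamp, usegmt=True),
        "Link": f'<{ORIGINAL}>; rel="original"',
    }


if __name__ == "__main__":
    status, headers = timegate_response(REVISIONS, "Tue, 01 Jan 2013 00:00:00 GMT")
    print(status, headers["Location"])
    print(memento_headers(REVISIONS[0][1]))
```

A client that sends Accept-Datetime for early 2013 would be redirected to oldid 101 in this made-up history, since that revision was still the current one at that time.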
As indicated, Memento has been widely adopted by web archives around the world and provides a standardized approach to access historic web content. We developed the Memento extensions for MediaWiki (as well as generic TimeGate software [10]) as a means to bring the same time travel power to version control systems. We have successfully reached out to the W3C and, as a result, both their wiki and their specifications now provide an illustration of Memento’s cross-web time travel power. We have unsuccessfully reached out to Wikipedia in the past but are trying our luck again now. We believe that adoption by Wikipedia would be a game changer when it comes to native support for Memento in browsers and hope that the Wikipedia community will be willing to help make that happen.
We are very interested in hearing what the next steps would be and how we could be of help.
Greetings
Shawn Jones
[1] https://tools.ietf.org/html/rfc7231#page-48
[2] https://github.com/iipc/openwayback/issues/305
[3] http://arxiv.org/abs/1406.3876
[4] https://www.w3.org/TR/dwbp/
[5] https://phabricator.wikimedia.org/T7877
[6] https://en.wikipedia.org/wiki/Wikipedia_talk:Spoiler
[7] https://en.wikipedia.org/wiki/Wikipedia:Spoiler
[8] https://github.com/mementoweb/py-memento-client
[9] http://archivesunleashed.com/
[10] https://github.com/mementoweb/timegate
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Mon, 2016-09-12 at 11:57 -0600, Shawn Jones wrote:
We appreciate the help received from this list during the development of the Memento MediaWiki Extension [1]. [1] https://www.mediawiki.org/wiki/Extension:Memento
Interesting, hadn't heard of that.
https://www.mediawiki.org/wiki/Extension:Memento#Usage states that the "best way to experience this extension is by installing Memento Time Travel for the Chrome browser." Is the source code of the Chrome extension available? (I didn't spot it on https://github.com/mementoweb). Also, under which license is it?
Cheers, andre
Andre,
The source code for the Memento for Chrome extension is now at https://github.com/mementoweb/memento_chrome
It is available under a BSD license here: https://github.com/mementoweb/memento_chrome/blob/master/LICENSE
Cheers,
Shawn