Seb35,
I came across your extension a month ago. Ours differs in that it also
implements the Memento protocol as used by the Internet Archive, Archive-It, and
others.
I do, however, appreciate your insight in trying to solve many of the same problems. I,
too, was trying to address the retrieval of old versions of templates, which brought me to
your extension. Your use of BeforeParserFetchTemplateAndtitle inspired parts of our
Template solution.
We're currently trying to figure out how to handle images.
What did you mean by MediaWiki messages? Are you referring to the Messages API as part of
I18N?
Thanks again,
--Shawn
On Nov 1, 2013, at 9:50 PM, Seb35 <seb35wikipedia(a)gmail.com> wrote:
Hi,
I have no answers to your specific questions, but I wanted to mention that some years ago
I worked on an extension [1] that aims to retrieve the as-exact-as-possible display of a
page at a given past datetime, because the current implementation of oldid only gives
"past wikitext with current context (templates, images, etc.)".
I mainly implemented the retrieval of old versions of templates, but a lot of other
smaller improvements could be made (MediaWiki messages, styles/JS, images, etc.). With
this approach, some details are irretrievably lost (e.g. the number of articles at a given
datetime, some tricky delete-and-move actions, etc.), and additional information would
have to be recorded to reproduce past versions more exactly.
[1] https://www.mediawiki.org/wiki/Extension:BackwardsTimeTravel
~ Seb35
On Fri, 01 Nov 2013 20:50:06 +0100, Shawn Jones <sjone(a)cs.odu.edu> wrote:
Hi,
I'm currently working on the Memento Extension for MediaWiki, as announced earlier
today by Herbert Van de Sompel.
The goal of this extension is to work with the Memento framework, which attempts to
display web pages as they appeared at a given date and time in the past.
Our goal is for this to be a collaborative effort focusing on solving issues and
providing functionality in "the Wikimedia Way" as much as possible.
Without further ado, I have the following technical questions (I apologize in advance for
the fire hose):
1. The Memento protocol has a resource called a TimeMap [1] that takes an article name
and returns text formatted as application/link-format. This text contains a
machine-readable list of all of the prior revisions (mementos) of this page. It is
currently implemented as a SpecialPage which can be accessed like
http://www.example.com/index.php/Special:TimeMap/Article_Name. Is this the best method,
or is it preferable for us to extend the Action class and add a new action to
$wgActions in order to return a TimeMap from the regular page like
http://www.example.com/index.php?title=Article_Name&action=gettimemap without using
the SpecialPage? Is there another preferred way of solving this problem?
2. We currently make several database calls using the select method of the Database
object. After some research, we realized that MediaWiki provides some functions that do
what we need without making these database calls directly. One of these needs is to
acquire the oldid and timestamp of the first revision of a page, which can be done using
Title->getFirstRevision()->getId() and
Title->getFirstRevision()->getTimestamp() methods. Is there a way to get the latest
ID and latest timestamp? I see I can do Title->getLatestRevID() to get the latest
revision ID; what is the best way to get the latest timestamp?
3. In order to create the correct headers for use with the Memento protocol, we have to
generate URIs. To accomplish this, we use the $wgServer global variable (through a layer
of abstraction); how should we handle installations where it isn't set?
Is there an alternative? Is there a better way to construct URIs?
4. We use exceptions to indicate when showErrorPage should be run; should the hooks that
catch these exceptions and then run showErrorPage also return false?
5. Is there a way to get previous revisions of embedded content, like images? I tried
using the ImageBeforeProduceHTML hook, but found that setting the $time parameter
didn't return a previous revision of an image. Am I doing something wrong? Is there
a better way?
6. Are there any additional coding standards we should be following besides those on the
"Manual:Coding_conventions" and "Manual:Coding Conventions -
Mediawiki" pages?
7. We have two styles for serving pages back to the user:
   * 302-style [2], which uses a 302 redirect to tell the user's browser to go
     fetch the old revision of the page (e.g.
     http://www.example.com/index.php?title=Article&oldid=12345)
   * 200-style [3], which actually modifies the page content in place so that it
     resembles the old revision of the page
Which of these styles is preferable as a default?
8. Some sites don't wish to have their past Talk/Discussion pages accessible via
Memento. We have the ability to exclude namespaces (Talk, Template, Category, etc.) via
a configurable option. By default it excludes nothing. What namespaces should be excluded
by default?
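
To make question 1 concrete, here is a minimal sketch of the kind of text a TimeMap returns, written in Python rather than as extension code. It only illustrates the application/link-format serialization; it is not how the extension builds its output. The helper name build_timemap, the Special:TimeGate URI, and the revision IDs and datetimes are assumptions made up for the illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def build_timemap(original_uri, timemap_uri, timegate_uri, revisions):
    """Serialize a TimeMap as application/link-format text.

    revisions is a list of (memento_uri, timezone-aware datetime) pairs,
    ordered oldest first. Hypothetical helper, not part of the extension.
    """
    links = [
        f'<{original_uri}>;rel="original"',
        f'<{timemap_uri}>;rel="self";type="application/link-format"',
        f'<{timegate_uri}>;rel="timegate"',
    ]
    last = len(revisions) - 1
    for i, (uri, when) in enumerate(revisions):
        rels = []
        if i == 0:
            rels.append("first")
        if i == last:
            rels.append("last")
        rels.append("memento")
        rel = " ".join(rels)
        # Memento datetimes use the HTTP date format, expressed in GMT.
        stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
        links.append(f'<{uri}>;rel="{rel}";datetime="{stamp}"')
    return ",\n".join(links) + "\n"


# Example values are invented for illustration only.
base = "http://www.example.com/index.php"
timemap = build_timemap(
    original_uri=f"{base}/Article_Name",
    timemap_uri=f"{base}/Special:TimeMap/Article_Name",
    timegate_uri=f"{base}/Special:TimeGate/Article_Name",  # assumed URI
    revisions=[
        (f"{base}?title=Article_Name&oldid=100",
         datetime(2000, 6, 20, 18, 2, 59, tzinfo=timezone.utc)),
        (f"{base}?title=Article_Name&oldid=12345",
         datetime(2013, 11, 1, 19, 50, 6, tzinfo=timezone.utc)),
    ],
)
print(timemap)
```

Whether this text is served from a SpecialPage or from a new action, the body would be the same; only the URI used for the rel="self" link would change.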
Thanks in advance for any advice, assistance, further discussion, and criticism on these
and other topics.
Shawn M. Jones
Graduate Research Assistant
Department of Computer Science
Old Dominion University
[1] http://www.mementoweb.org/guide/rfc/ID/#Pattern6
[2] http://www.mementoweb.org/guide/rfc/ID/#Pattern1.1
[3] http://www.mementoweb.org/guide/rfc/ID/#Pattern1.2
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l