[Foundation-l] Historical wikipedia dumps

Robert Rohde rarohde at gmail.com
Thu Aug 28 19:12:46 UTC 2008


On Thu, Aug 28, 2008 at 5:14 AM, Anthony <wikimail at inbox.org> wrote:
> On Thu, Aug 28, 2008 at 8:12 AM, Anthony <wikimail at inbox.org> wrote:
>
>> Let's see - I'd have to find the right templates, articles, categories,
>> and images, presumably working from the stub dump, and then merge them
>> all together.  Anything else?
>>
>
> Red vs. Blue links...  Boy, I hope there's a standalone parser.

You also need the correct versions of the CSS and JS files, which is a
pain since those file locations have changed over time.  If you wanted
to be really thorough you'd have to look at the MediaWiki namespace as
well (for example, to capture the evolution of the sidebar), but that
has the extra wrinkle that the way the MediaWiki engine parses content
in the MediaWiki namespace has also evolved over time.

Being completely accurate would be nearly impossible, but one could do
a good approximation with enough effort.
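
As a rough illustration of the revision-matching step only, here is a
minimal sketch.  It assumes the full history has already been loaded
into a dict mapping each title to a list of (timestamp, text) pairs
sorted oldest first.  That layout and the names revision_as_of,
snapshot and link_colour are hypothetical, not part of any dump tool.
It treats a link as blue if the target has any revision at or before
the chosen moment, which ignores deletions, moves and parser changes.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def revision_as_of(revisions, when):
    """Return the latest (timestamp, text) pair at or before `when`.

    `revisions` must be sorted by timestamp, oldest first.
    Returns None if the page had no revision yet at `when`.
    """
    timestamps = [ts for ts, _ in revisions]
    i = bisect_right(timestamps, when)
    return revisions[i - 1] if i else None


def snapshot(history, titles, when):
    """Pick the matching revision for each title (article, templates, ...)."""
    return {t: revision_as_of(history.get(t, []), when) for t in titles}


def link_colour(history, title, when):
    """Approximate red vs. blue: blue if the target existed at `when`."""
    return "blue" if revision_as_of(history.get(title, []), when) else "red"


if __name__ == "__main__":
    utc = timezone.utc
    # Hypothetical, hand-made history for illustration only.
    history = {
        "Template:Foo": [
            (datetime(2005, 1, 1, tzinfo=utc), "foo v1"),
            (datetime(2007, 6, 1, tzinfo=utc), "foo v2"),
        ],
    }
    when = datetime(2006, 1, 1, tzinfo=utc)
    print(snapshot(history, ["Template:Foo"], when))
    print(link_colour(history, "Template:Bar", when))
```

The same lookup would apply to articles, templates, image description
pages and MediaWiki namespace pages alike; the CSS and JS files would
need separate handling since they are not necessarily in the dump.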

-Robert Rohde


