Dear Sir/Madam,
Brion Vibber wrote:
Please do *not* use programs such as Webstripper. Especially on dynamic sites like Wikipedia, they create a huge amount of load on the servers by downloading thousands upon thousands of pages extremely rapidly. [...] Use a web browser like everybody else, and please stop attacking the web sites that you enjoy.
Don't we have a nice compressed static HTML tree by now that we could offer people under the "Download Wikipedia" heading on the main page?
I haven't looked at the Wikipedia software, but I expect it's two-tier: the wiki markup is stored in a database and converted to HTML on every request. That is not very efficient.
I designed a similar piece of software, but with three tiers. The wiki markup is converted to XML before being stored in the database, and back again for editing. This conversion may be time-consuming, but it only happens when a page is saved or opened for editing, which is relatively rare. It also allows me to change my wiki markup language (though I don't use that term) if necessary.
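To make the write path concrete, here is a rough sketch in Python rather than my actual code; the table name and the stand-in parser are invented purely for illustration:

    import sqlite3
    from xml.sax.saxutils import escape

    def markup_to_xml(markup):
        # Stand-in for a real wiki-markup parser; a real one would emit
        # structured elements for headings, links, lists and so on.
        paragraphs = "".join("<p>%s</p>" % escape(p)
                             for p in markup.split("\n\n"))
        return '<?xml version="1.0"?><page>%s</page>' % paragraphs

    def save_page(conn, title, markup):
        # Parse once, at save time; the database only ever sees XML.
        conn.execute("INSERT OR REPLACE INTO pages (title, xml) VALUES (?, ?)",
                     (title, markup_to_xml(markup)))
        conn.commit()

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (title TEXT PRIMARY KEY, xml TEXT)")
    save_page(conn, "Example", "First paragraph.\n\nSecond paragraph.")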
When a user requests a page, it is retrieved as XML from the database and transformed to HTML by an XSLT program. The important point is that most users have a browser that can do the XSL transformation itself. So they can download as much as they want and read it offline, with the XSLT program sitting in their browser cache. Each page requested from the server then involves only a couple of table lookups and very little processing. (If the user doesn't have a modern browser, they can elect to have the transformation done on the server, but that's another story.)
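On the read side, the server's job is little more than a lookup plus a one-line splice that points the browser at the stylesheet. A sketch, again in Python and again only illustrative (the stylesheet URL /page.xsl is made up, and a real server would also send an XML content type):

    PI = '<?xml-stylesheet type="text/xsl" href="/page.xsl"?>'

    def to_response(stored_xml):
        # Splice the stylesheet processing instruction in right after the
        # XML declaration; no markup parsing happens on the server.
        head, rest = stored_xml.split("?>", 1)
        return head + "?>\n" + PI + "\n" + rest

    stored = '<?xml version="1.0"?><page><p>First paragraph.</p></page>'
    print(to_response(stored))

The browser fetches /page.xsl once, keeps it in its cache, and applies it to every page it downloads afterwards, which is why bulk downloading stays cheap for the server.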
Just an idea...
Regards, John O'Leary.