Lars Aronsson <lars(a)aronsson.se> writes:
> It sounds so easy. But would you accept this procedure if it
> requires that Wikipedia is unavailable or read-only for one hour?
> for one day? for one week?
It could be done on the fly, even if it takes some weeks: "simply"
also start storing the converted articles in a second table (or
database system)...
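A minimal sketch of what that on-the-fly scheme could look like,
assuming a second table named cur_xml and a wikitext-to-XML converter
(the table name, the converter, and the trivial SQLite stand-in for
MediaWiki's MySQL backend are all my own inventions for illustration):

    import sqlite3

    def convert_to_xml(wikitext):
        # Stand-in for the real wikitext -> XML converter (the hard part).
        escaped = wikitext.replace("&", "&amp;").replace("<", "&lt;")
        return "<article>%s</article>" % escaped

    def get_article(db, title):
        # Serve from the converted table if the row is already migrated;
        # otherwise convert on first read and store the result.
        row = db.execute("SELECT xml FROM cur_xml WHERE title = ?",
                         (title,)).fetchone()
        if row:
            return row[0]
        (wikitext,) = db.execute("SELECT text FROM cur WHERE title = ?",
                                 (title,)).fetchone()
        xml = convert_to_xml(wikitext)
        db.execute("INSERT INTO cur_xml (title, xml) VALUES (?, ?)",
                   (title, xml))
        db.commit()
        return xml

A background job could sweep the remaining unconverted rows at low
priority; once cur_xml is complete, the old table can be dropped
without the site ever going read-only.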
> Assuming these sizes would be the same for an XML dump (ROTFL) and
> that export/import could be done at 1 MB/second (optimistic), this is
> 3500 seconds or about one hour for the "cur" table and 83,000 seconds
> or close to 24 hours for the "old" table. And this is for the sizes
> of February 2005, not for May 2005 or July 2008. You do the math.
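For reference, those figures work out if the February 2005 "cur"
table is roughly 3.5 GB and "old" roughly 83 GB; the sizes are my
inference from the 1 MB/s arithmetic, not numbers stated above:

    # Back-of-the-envelope check of the quoted transfer times.
    MB = 1024 * 1024
    rate = 1 * MB                          # optimistic 1 MB/second
    for name, size_gb in [("cur", 3.5), ("old", 83)]:
        seconds = size_gb * 1024 * MB / rate
        print("%s: %.0f s (~%.1f h)" % (name, seconds, seconds / 3600))

This prints about 3584 s (~1.0 h) for "cur" and about 84992 s
(~23.6 h) for "old", matching the one-hour and near-24-hour figures.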
Of course, you must use a database system designed for holding XML
data ;) If we start using XML properly, we can give up on a lot of
hacks. We will also save resources, because the set of allowed tags
is limited and their usage is well-defined :)
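To illustrate the point about a limited, well-defined tag set, here
is a minimal validation sketch against a small hand-written DTD; the
element names are invented for illustration and are not a real
MediaWiki schema:

    from io import StringIO
    from lxml import etree

    # Invented mini-schema: an article is sections of paragraphs,
    # and paragraphs may contain links with a required target.
    dtd = etree.DTD(StringIO("""
    <!ELEMENT article (section+)>
    <!ELEMENT section (para+)>
    <!ELEMENT para (#PCDATA | link)*>
    <!ELEMENT link (#PCDATA)>
    <!ATTLIST link target CDATA #REQUIRED>
    """))

    doc = etree.fromstring(
        "<article><section><para>See "
        "<link target='XML'>XML</link>.</para></section></article>")
    print(dtd.validate(doc))   # True: only known tags, used as defined

Anything outside the declared tag set is rejected up front, which is
exactly what lets the storage and rendering layers drop their
defensive hacks.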
--
http://www.gnu.franken.de/ke/ | ,__o
| _-\_<,
| (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4 AA7F C90A 35C3 E9D0 5D1C