On 1/4/09 6:20 AM, yegg at alum.mit.edu wrote: The current enwiki database dump (http://download.wikimedia.org/enwiki/20081008/) has been crawling along since 10/15/2008.
The current dump system is not sustainable on very large wikis and is being replaced. You'll hear about it when we have the new one in place. :) -- brion
Following up on this thread: http://lists.wikimedia.org/pipermail/wikitech-l/2009-January/040841.html
Brion,
Can you offer any general timeline estimates (weeks, months, half a year)? Are there any alternatives to retrieving the article data beyond directly crawling the site? I know this is verboten, but we are in dire need of this data and don't know of any alternatives. The current estimate of end of year is too long for us to wait. Unfortunately, Wikipedia is a favored source for students to plagiarize from, which makes out-of-date content a real issue.
Is there any way to help this process along? We can donate disk drives, developer time, ...? There is another possibility that we could offer but I would need to talk with someone at the wikimedia foundation offline. Is there anyone I could contact?
Thanks for any information and/or direction you can give.
Christian
I have a decent server that is dedicated to a Wikipedia project that depends on fresh dumps. Can this be used in any way to speed up the process of generating the dumps?
bilal
The problem, as I understand it (and Brion may come by to correct me) is essentially that the current dump process is designed in a way that can't be sustained given the size of enwiki. It really needs to be re-engineered, which means that developer time is needed to create a new approach to dumping.
The main target for improvement is almost certainly parallelizing the process, so that there wouldn't be a single monolithic dump process but rather a lot of little processes working in parallel. That would also ensure that if a single process gets stuck and dies, the entire dump doesn't need to start over.
By way of observation, dewiki's full history dumps in 26 hours with 96% prefetched (i.e. loaded from previous dumps). That suggests that even starting from scratch (prefetch = 0%) it should dump in ~25 days under the current process. enwiki is perhaps 3-6 times larger than dewiki depending on how you do the accounting, which implies dumping the whole thing from scratch would take ~5 months if the process scaled linearly. Of course it doesn't scale linearly, and we end up with a prediction for completion that is currently 10 months away (which amounts to a 13-month total execution). And of course, if there is any serious error in the next ten months the entire process could die with no result.
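To make the arithmetic behind that estimate explicit, here is a back-of-envelope sketch; the 26-hour, 96%, and 3-6x figures are the ones quoted above, and the assumption that the non-prefetched revisions dominate the runtime is mine.

```python
# Back-of-envelope check of the dump-time estimate above.
dewiki_hours = 26        # observed dewiki full-history dump time
prefetch_ratio = 0.96    # fraction of revision text reused from the previous dump

# If only the non-prefetched 4% has to be pulled from the text store, a cold
# dump (prefetch = 0) takes roughly 1 / (1 - prefetch_ratio) times as long.
dewiki_cold_days = dewiki_hours / (1 - prefetch_ratio) / 24
print(f"dewiki from scratch: ~{dewiki_cold_days:.0f} days")        # ~27 days

for factor in (3, 6):    # enwiki is roughly 3-6x dewiki
    months = dewiki_cold_days * factor / 30
    print(f"enwiki at {factor}x, linear scaling: ~{months:.1f} months")
```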
Whether or not we want to let the current process continue to try and finish, I would seriously suggest someone look into redumping the rest of the enwiki files (i.e. logs, current pages, etc.). I am also among the people who care about having reasonably fresh dumps, and it really is a problem that the other dumps (e.g. stubs-meta-history) are frozen while we wait to see if the full history dump can run to completion.
-Robert Rohde
Even if we do let it finish, I'm not sure a dump of what Wikipedia was like 13 months ago is much use... The way I see it, what we need is to get a really powerful server to do the dump just once at a reasonable speed; then we'll have a previous dump to build on, so future ones would be more reasonable.
On Tue, Jan 27, 2009 at 2:42 PM, Brion Vibber brion@wikimedia.org wrote:
On 1/27/09 2:35 PM, Thomas Dalton wrote:
The way I see it, what we need is to get a really powerful server
Nope, it's a software architecture issue. We'll restart it with the new arch when it's ready to go.
I don't know what your timetable is, but what about doing something to address the other aspects of the dump (logs, stubs, etc.) that are in limbo while the full history dump chugs along? All the other enwiki files are now 3 months old, and that is already enough to inconvenience some people.
The simplest solution is just to kill the current dump job if you have faith that a new architecture can be put in place in less than a year.
-Robert Rohde
We'll probably do that.
-- brion
"Brion Vibber" brion@wikimedia.org wrote in message news:497F9C35.9050500@wikimedia.org...
On 1/27/09 2:55 PM, Robert Rohde wrote:
On Tue, Jan 27, 2009 at 2:42 PM, Brion Vibberbrion@wikimedia.org wrote:
On 1/27/09 2:35 PM, Thomas Dalton wrote:
The way I see it, what we need is to get a really powerful server
Nope, it's a software architecture issue. We'll restart it with the new arch when it's ready to go.
The simplest solution is just to kill the current dump job if you have faith that a new architecture can be put in place in less than a year.
We'll probably do that.
-- brion
FWIW, I'll add my vote for aborting the current dump *now* if we don't expect it ever to actually be finished, so we can at least get a fresh dump of the current pages.
Russ
Probably wise to poke in a hack to skip the history first. :)
-- brion vibber (brion @ wikimedia.org)
On 1/28/09 8:32 AM, Brion Vibber wrote:
Probably wise to poke in a hack to skip the history first. :)
Done in r46545.
Updated dump scripts and canceled the old enwiki dump.
New dumps will also attempt to generate log output as XML which correctly handles the deletion/oversighting options; we'll see how that goes. :)
-- brion
On Thu, Jan 29, 2009 at 11:20 AM, Brion Vibber brion@wikimedia.org wrote:
Is there somewhere that explains (or at least gives an example of) the new logging format and what has changed?
-Robert Rohde
That would be great. I second this notion wholeheartedly.
On Jan 28, 2009, at 7:34 AM, Russell Blau wrote:
"Brion Vibber" brion@wikimedia.org wrote in message news:497F9C35.9050500@wikimedia.org...
On 1/27/09 2:55 PM, Robert Rohde wrote:
On Tue, Jan 27, 2009 at 2:42 PM, Brion Vibberbrion@wikimedia.org wrote:
On 1/27/09 2:35 PM, Thomas Dalton wrote:
The way I see it, what we need is to get a really powerful server
Nope, it's a software architecture issue. We'll restart it with the new arch when it's ready to go.
The simplest solution is just to kill the current dump job if you have faith that a new architecture can be put in place in less than a year.
We'll probably do that.
-- brion
FWIW, I'll add my vote for aborting the current dump *now* if we don't expect it ever to actually be finished, so we can at least get a fresh dump of the current pages.
Russ
Brion,
We are having to resort to crawling en.wikipedia.org while we wait for regular dumps. What is the minimum crawling delay we can get away with? I figure that with a 1-second delay we'd be able to crawl the 2+ million articles in about a month.
I know crawling is discouraged, but it seems a lot of parties still do it; after looking at robots.txt, I have to assume that is how Google et al. are able to keep up to date.
Are there private data feeds? I noticed a wg_enwiki dump listed.
Christian
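A quick sanity check of that one-request-per-second figure, plus a minimal rate-limited fetch sketch against the standard api.php interface. This is an illustration, not an endorsement of crawling; the User-Agent string, the maxlag value, and the idea of reading titles from the (much smaller) all-titles dump are my assumptions, not anything prescribed in the thread.

```python
# Rough throughput check and a minimal polite fetch loop (sketch only).
import time
import urllib.parse
import urllib.request

ARTICLES = 2_500_000
DELAY = 1.0  # seconds between requests
print(f"~{ARTICLES * DELAY / 86400:.0f} days at one request per second")  # ~29 days

def fetch_wikitext(title: str) -> bytes:
    params = urllib.parse.urlencode({
        "action": "query", "prop": "revisions", "rvprop": "content",
        "titles": title, "format": "json", "maxlag": 5,
    })
    req = urllib.request.Request(
        "https://en.wikipedia.org/w/api.php?" + params,
        headers={"User-Agent": "example-fetcher/0.1 (ops@example.org)"},  # hypothetical contact
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# for title in titles_from_all_titles_dump():   # hypothetical title source
#     page = fetch_wikitext(title)
#     time.sleep(DELAY)                          # stay at or below one request per second
```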
Thanks to everyone who got the enwiki dumps going again! Should we expect more regular dumps now? What was the final solution for fixing this?
Christian
Perhaps the toolserver can make you a dump of current en?
Toolserver users don't have access to text.
On 3/25/09 10:08 AM, Christian Storm wrote:
Thanks to everyone who got the enwiki dumps going again! Should we expect more regular dumps now? What was the final solution of fixing this?
Lots of love and upkeep by everyone :)
But really it needs to be more automated and parallelized so that we can spot issues faster, validate inconsistencies, and finish quicker.
Brion and I have met about this and we've even brought it into the Wikimedia dev meetings to brainstorm how the system could change for the better.
I've started drafting some new ideas at http://wikitech.wikimedia.org/view/Data_dump_redesign covering the various problems that we're facing and what kind of job management we can put around it. We're taking this on as a full "should have been done 2 years ago" project, and I'm going to be shepherding it along.
Right now I'm collecting stats on the throughput of the components to see how much of this could be farmed out in parallel under a job management system.
This is a large project that has some distinct problem areas that we'll be isolating and welcoming help on.
--tomasz
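A minimal sketch of the kind of page-ID-range job splitting being discussed. This is my own illustration, not the actual Wikimedia dump code; dump_range(), the chunk size, and the page-ID bound are all hypothetical.

```python
# Splitting a history dump into independent page-ID-range jobs that a job
# manager could schedule, retry, and recombine (illustrative sketch only).
from multiprocessing import Pool

MAX_PAGE_ID = 22_000_000   # assumed upper bound on enwiki page IDs
CHUNK = 100_000            # pages per job

def dump_range(bounds):
    start, end = bounds
    outfile = f"enwiki-pages-meta-history-{start}-{end}.xml.bz2"
    # A real worker would stream all revisions for page_id in [start, end)
    # from the database/text store and write a self-contained XML chunk.
    return outfile

if __name__ == "__main__":
    ranges = [(i, min(i + CHUNK, MAX_PAGE_ID)) for i in range(0, MAX_PAGE_ID, CHUNK)]
    with Pool(processes=8) as pool:
        # If one job dies, only its range is re-run; the rest of the dump survives.
        chunks = pool.map(dump_range, ranges)
    print(f"{len(chunks)} chunk files ready to recombine or publish as-is")
```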
Tomasz Finc wrote:
Quite interesting. Can the images at office.wikimedia.org be moved to somewhere public?
Decompression takes as long as compression with bzip2
I think decompression is *faster* than compression http://tukaani.org/lzma/benchmarks
Let me know if I can help with anything.
On 3/26/09 3:25 PM, Keisial wrote:
Quite interesting. Can the images at office.wikimedia.org be moved to somewhere public?
I've copied those two to the public wiki. :)
Decompression takes as long as compression with bzip2
I think decompression is *faster* than compression http://tukaani.org/lzma/benchmarks
LZMA is nice and fast to decompress... but *insanely* slower to compress, and doesn't seem as parallelizable. :(
-- brion
On 03/27/09 01:14, Brion Vibber wrote:
LZMA is nice and fast to decompress... but *insanely* slower to compress, and doesn't seem as parallelizable. :(
The xz file format should allow for "easy" parallelization, both when compressing and decompressing; see
http://tukaani.org/xz/xz-file-format.txt
3. Block
   3.1. Block Header
        3.1.1. Block Header Size
        3.1.3. Compressed Size
        3.1.4. Uncompressed Size
        3.1.6. Header Padding
   3.3. Block Padding
At least in theory, this "length-prefixing" should make it fairly straightforward to write a multi-threaded decompressor with a splitter that can work from a pipe and is input-bound. I reckon the xz structure will eventually prove useful even for distributed compression/decompression.
lacos
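As an illustration of that block-parallel idea, here is a sketch using Python's lzma module rather than the xz tool itself; it relies on the fact that concatenated .xz streams are still a valid .xz file, trading some compression ratio for parallelism. The block size, preset, and worker count are arbitrary choices.

```python
# Block-parallel xz-style compression via independent streams (sketch only).
import lzma
from multiprocessing import Pool

BLOCK_SIZE = 16 * 1024 * 1024   # 16 MiB per block; bigger blocks compress better

def compress_block(block: bytes) -> bytes:
    return lzma.compress(block, format=lzma.FORMAT_XZ, preset=6)

def parallel_xz(data: bytes, workers: int = 4) -> bytes:
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    with Pool(workers) as pool:
        return b"".join(pool.map(compress_block, blocks))

if __name__ == "__main__":
    payload = b"<revision>sample text</revision>\n" * 2_000_000   # ~66 MB of toy data
    xz_bytes = parallel_xz(payload)
    # lzma.decompress handles the concatenated streams transparently.
    assert lzma.decompress(xz_bytes) == payload
```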
On Thu, Mar 26, 2009 at 8:51 PM, ERSEK Laszlo lacos@elte.hu wrote:
It includes an index for random access too. Cool. I wonder what kind of block size you'd need to get a compression ratio approaching that of 7z.
Brion Vibber wrote:
LZMA is nice and fast to decompress... but *insanely* slower to compress, and doesn't seem as parallelizable. :(
I used the lzma benchmark as evidence that decompressing with bzip2 is faster than compressing.
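A small timing sketch of the compress/decompress asymmetry being discussed. Illustration only: the synthetic repetitive input and the preset are my choices, and real dump data will behave differently.

```python
# Compare compression vs. decompression time for bzip2 and lzma/xz (sketch).
import bz2
import lzma
import time

data = b"<revision><text>sample wikitext, repeated</text></revision>\n" * 200_000

for name, comp, decomp in (
    ("bzip2", bz2.compress, bz2.decompress),
    ("lzma/xz", lambda d: lzma.compress(d, preset=6), lzma.decompress),
):
    t0 = time.time()
    blob = comp(data)
    t1 = time.time()
    decomp(blob)
    t2 = time.time()
    print(f"{name:8s} compress {t1 - t0:6.2f}s  decompress {t2 - t1:6.2f}s  "
          f"ratio {len(data) / len(blob):.0f}x")
```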
Russell Blau <russblau <at> hotmail.com> writes:
FWIW, I'll add my vote for aborting the current dump *now* if we don't expect it ever to actually be finished, so we can at least get a fresh dump of the current pages.
I'd like to third/fourth/(other ordinal) this idea too. I've been using the (in comparison tiny) SQL dumps for various purposes, and it's most vexing that these have to wait until the end (or lack of any end...) of the larger XML dumps. (The same data is replicated on the toolserver, of course, but I'd get beaten to death if I tried to run some of the data collection scripts I've been running offline there.)
Cheers, Alai.
Hoi, Two things:
- if we abort the backup now, we do not know if we WILL have something at the time it would have ended
- if the toolserver data can provide a service as a stopgap measure, why not provide that in the meantime
Thanks, GerardM
On Thu, Jan 29, 2009 at 1:52 AM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
If you want to play the optimist and believe this dump might eventually accomplish something, then the right stopgap would be to hack the dumper so that it periodically regenerates the other files even while the big dump is still running. Such a thing, though definitely a hack, would not be hard to do.
-Robert Rohde
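A sketch of that stopgap: regenerate the cheap dump steps on a fixed schedule while the long full-history job keeps running elsewhere. run_step(), the step names, and the weekly cadence are hypothetical placeholders, not the actual dump tooling.

```python
# Periodically regenerate the lightweight dump files (sketch only).
import time

LIGHT_STEPS = ["stub-meta-history", "pages-meta-current", "logging"]
INTERVAL = 7 * 24 * 3600   # assumed weekly cadence

def run_step(name: str) -> None:
    # Placeholder for invoking whatever actually generates this dump file.
    print(f"regenerating {name} ...")

def main() -> None:
    while True:
        for step in LIGHT_STEPS:
            run_step(step)
        time.sleep(INTERVAL)

if __name__ == "__main__":
    main()
```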