On Mon, Oct 25, 2010 at 12:38 PM, Paul Houle <paul@ontology2.com> wrote:
I want Wikipedia converted into facts in a representation system that supports modal, temporal, and "microtheory" reasoning. You know, in the "real" world, :James_T_Kirk is a :Fictional_Character, but in the Star Trek universe, he's a :Person.
This sounds like it would take far more work to actually write the program in the first place than to parallelize it.
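Just to make the representation idea concrete: one common way to scope facts to a context like that is named graphs, where each "microtheory" is its own graph over a shared store. A minimal sketch using rdflib (all the URIs and context names below are made up for illustration, not any actual extraction schema):

# Context-scoped ("microtheory"-style) facts via rdflib named graphs.
# Every URI here is illustrative.
from rdflib import ConjunctiveGraph, Namespace, URIRef
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")

store = ConjunctiveGraph()

# In the "real world" context, Kirk is a fictional character...
real_world = store.get_context(URIRef("http://example.org/ctx/RealWorld"))
real_world.add((EX.James_T_Kirk, RDF.type, EX.Fictional_Character))

# ...but within the Star Trek microtheory, he is a person.
star_trek = store.get_context(URIRef("http://example.org/ctx/StarTrekUniverse"))
star_trek.add((EX.James_T_Kirk, RDF.type, EX.Person))

# Queries can be scoped to one context or run across the whole store.
print((EX.James_T_Kirk, RDF.type, EX.Person) in star_trek)   # True
print((EX.James_T_Kirk, RDF.type, EX.Person) in real_world)  # False

Of course, that only covers the storage side; the modal and temporal reasoning on top is where the real work would be.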
On Mon, Oct 25, 2010 at 4:47 PM, Platonides <Platonides@gmail.com> wrote:
Make the best dump compressor ever? :)
The page http://www.mediawiki.org/wiki/Dbzip2 is worth looking at just for the available options. Continuing dbzip2 is the first idea, but not the only one; I'm sure many things can be dug up from there. Also worth noting: Ariel has been producing the recent en dumps in page batches.
Possible. It looks like dbzip2 has already had a lot of optimization put into it, so I don't know that there would be much low-hanging fruit for me to pick. I'm not sure that would be acceptable as a final project ("do lots of benchmarking and probably not end up improving it much"). Whereas rewriting it to use the GPU sounds like a suspiciously large project... plus I'm not sure GPUs would even be suited to it.
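For anyone following along: as I understand it, the core trick behind parallel bzip2 tools like pbzip2 (and dbzip2, which distributes the same idea over the network) is block-level parallelism. You compress fixed-size chunks independently on separate cores and concatenate the resulting streams, since bunzip2 accepts multi-stream files. A toy sketch of just that idea (chunk size, worker count, and the CLI are arbitrary choices, not anything dbzip2 actually does):

# Block-parallel bzip2: compress chunks on separate cores, then
# concatenate the streams. bunzip2 decompresses multi-stream files,
# so the output is a valid .bz2.
import bz2
import sys
from multiprocessing import Pool

CHUNK_SIZE = 900 * 1024  # matches bzip2's maximum block size

def read_chunks(f):
    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:
            return
        yield chunk

def main(infile, outfile):
    with open(infile, "rb") as fin, open(outfile, "wb") as fout, Pool() as pool:
        # imap preserves chunk order, so the concatenated streams
        # decompress back into the original byte sequence.
        for compressed in pool.imap(bz2.compress, read_chunks(fin)):
            fout.write(compressed)

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])

The hard parts that something like dbzip2 layers on top (scheduling work across machines, keeping the pipeline fed from a single input stream, handling stragglers) are exactly where the benchmarking effort would go.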