Introducing the Hutter Prize for Lossless Compression of Human Knowledge
Artificial intelligence researchers finally have an objective and rigorously validated measure of the intelligence of their machines. Furthermore, the higher the measured intelligence of their machines, the more money they can win via the Hutter Prize.
The purse for the Hutter Prize was initially underwritten with a 50,000 Euro commitment to the prize fund by Marcus Hutter of the Swiss Dalle Molle Institute for Artificial Intelligence, affiliated with the University of Lugano and The University of Applied Sciences of Southern Switzerland.
The theoretic basis of the Hutter Prize is related to an insight by the 14th century philosopher William of Ockham, called "Ockham's Razor", sometimes quoted as: "It is vain to do with more what can be done with less." But it was not until the year 2000 that this was mathematically proven*, by Marcus Hutter, to be a founding principle of intelligence. Indeed, Hutter's Razor** might be phrased, "It is truer to explain with less that which can be explained with more."
There have been previous tests and related prizes for artificial intelligence, such as the Turing Test and the Loebner Prize. However, these tests suffered from subjective definitions of intelligence. Hutter's recent theoretic breakthrough creates a mathematics of artificial intelligence which accurately measures the degree of intelligence possessed by an artificial agent. It does so by measuring how succinctly it represents knowledge of the world. As Hutter has now proven, the most succinct computer model of the world isn't just the most aesthetic or memory-efficient out of all models of the known observations -- it also most accurately predicts new observations. In short, it is the most intelligent.
Artificial intelligence has thereby entered the realm of engineering: Lossless compression of human knowledge.
This is momentous because, by optimizing for rigorous metrics, the field of artificial intelligence may finally clear the murky waters of inadequate definition in which it has been haphazardly swimming for the last 50 years, and become both a hard science and a tractable engineering discipline.
Named for the discoverer of the proof and the initial 50,000 Euro donor, the Hutter Prize currently targets the compression of a 100 megabyte sample of human knowledge drawn from the broadly based Wikipedia online encyclopedia. As Moore's Law increases the capacity of machines, and as additional donations to the prize fund increase the incentives of contestants, the intent is to increase the amount of knowledge targeted for compression. It is reasonable to expect that the 100 megabyte sample will produce, at the very least, advances in linguistic modeling. As the targeted depth and breadth of knowledge increases, conceptual frameworks will come into play, eventually covering the range of disciplines from political science to physics by applying theories that prove optimal in compressing the target.
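To make the idea of "measuring by compressed size" concrete, here is a minimal sketch that compares how small three general-purpose compressors from the Python standard library make the same text. It is only an illustration under stated assumptions: the sample string and the helper names compressed_sizes and is_lossless are hypothetical, and the sketch counts only the compressed payload, whereas the prize measures the size of the whole program that outputs the uncompressed knowledge.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data):
    """Return the compressed size in bytes for each stdlib compressor."""
    return {
        "zlib": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }


def is_lossless(data):
    """Check that every compressor reproduces the input exactly."""
    return (
        zlib.decompress(zlib.compress(data, 9)) == data
        and bz2.decompress(bz2.compress(data, 9)) == data
        and lzma.decompress(lzma.compress(data)) == data
    )


if __name__ == "__main__":
    # Hypothetical stand-in for the 100 megabyte Wikipedia sample.
    sample = ("It is vain to do with more what can be done with less. " * 200).encode("utf-8")
    print("original:", len(sample), "bytes")
    for name, size in sorted(compressed_sizes(sample).items(), key=lambda kv: kv[1]):
        print(name, size, "bytes")
    print("lossless:", is_lossless(sample))
```

Under the prize's premise, the compressor producing the smallest lossless output embodies the best model of the text; the general-purpose compressors above only capture shallow statistical regularities.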
A common objection to this approach to artificial intelligence is that it offers little that is new -- that the computational difficulty of searching for patterns in data remains what it has always been. This objection misses two important points:
1) Hutter's proof provides a new mathematics of intelligence allowing for "top down" theoretic advances which may render many problems tractable that otherwise appear intractable.
2) There is a large overlap between succinctly codified knowledge and an intelligent compression program. Indeed, a reasonable definition of "knowledge" is that it optimizes the compression of new observations as instances of old patterns. This means that even if a compressor does nothing but apply codified human knowledge, generating no new knowledge of its own, it can still demonstrate greater intelligence than competing programs. It thereby makes measurable progress not only toward artificial intelligence but also -- and this is key -- toward progressively more intelligent bodies of human-generated knowledge, by pitting those bodies of knowledge against each other in what might be called an epistemological tournament.
The formula for winnings is modeled after the M-Prize or Methuselah Mouse Prize, which awards money to longevity researchers for progress in keeping mice alive the longest. Here, modified for compression ratios, is the formula:
S = size of program outputting the uncompressed knowledge
Snew = new record
Sprev = previous record
P = [Sprev - Snew] / Sprev = percent improvement
Award monies:
Fund contains: Z at noon GMT on day of new record
Winner receives: Z * P
Initially Z is 50,000 Euro with a minimum payout of 500 Euro (or minimum improvement of 1% over the prior winner).
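The award formula above can be sketched in a few lines of Python. This is an illustration, not the official rules: the function names improvement and payout are hypothetical, and treating any improvement below 1% as earning no award is an assumption drawn from the stated minimum.

```python
def improvement(s_prev, s_new):
    """P = (Sprev - Snew) / Sprev, as a fraction (0.03 means 3%)."""
    if s_prev <= 0:
        raise ValueError("previous record size must be positive")
    if s_new < 0:
        raise ValueError("new record size cannot be negative")
    return (s_prev - s_new) / s_prev


def payout(fund, s_prev, s_new, min_improvement=0.01):
    """Winner receives Z * P, where Z is the fund on the day of the record.

    Assumption: a submission improving by less than min_improvement
    (1% by default) receives nothing.
    """
    if improvement(s_prev, s_new) < min_improvement:
        return 0.0
    # Multiply before dividing to keep round-number cases exact.
    return fund * (s_prev - s_new) / s_prev


if __name__ == "__main__":
    # Hypothetical sizes in bytes; 50,000 Euro is the initial fund.
    print(payout(50_000, 1_000, 970))
    print(payout(50_000, 1_000, 995))
```

With the initial 50,000 Euro fund, a 1% improvement corresponds to the stated 500 Euro minimum payout.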
Donations are welcome. The history of improvement in The Calgary Corpus Compression Challenge*** is about 3% per year. The larger the commitment from donors to the fund, the greater the rate of progress toward a high quality body of human knowledge and, quite possibly, the long-held promise of artificial intelligence.
For further details of the Hutter Prize see:
For discussion of the Hutter Prize see:
http://groups.google.com/group/Hutter-Prize
-- Jim Bowery
* http://www.hutter1.net/ai/uaibook.htm
** Hutter's Razor has some caveats relating to the nature of the universe and computability, but those conditions must be met for any computer-based intelligence.
James A. Bowery wrote:
Named for the discoverer of the proof and the initial 50,000 Euro donor, the Hutter Prize currently targets the compression of a 100 megabyte sample of human knowledge drawn from the broadly based Wikipedia online encyclopedia.
Although I doubt we have many AI researchers on this list who would be interested in competing for this prize, some readers here might be interested in the related statistical analysis of Wikipedia's text, performed by AI researcher Matt Mahoney:
http://cs.fit.edu/~mmahoney/compression/textdata.html
-- Tim Starling
wikitech-l@lists.wikimedia.org