Hi there, I've just joined the list. I've been contributing to the wiki for a while, but I've got some technical interest now. To start off with, here is a list of "conjectures" that I think are interesting about wikipedia. I'm crossposting from my blog (http://simonwoodside.com/weblog/2005/11/20).
Conjecture 1. That the distance between any two wikipedia pages, randomly chosen, as measured by wikilinks, is on average 6.
Conjecture 2. That wikipedia is sufficiently formal and complete that you could build a useful general purpose AI knowledge base using it.
Conjecture 3. That wikipedia has low information entropy.
Conjecture 4. That the development of a wikipedia article over time occurs in a manner consistent with the biological evolution of a species.
Conjecture 5. That the relationship between the amount of material in wikipedia and the number of article views is exponential.
Conjecture 6. That wikipedia is, on average, factually accurate.
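For what it's worth, Conjecture 1 is the kind of claim that could be checked mechanically. Here is a minimal sketch, assuming the wikilinks have already been extracted into a dictionary mapping each page to the pages it links to. The function names and the toy graph are made up for illustration; nothing here reads real Wikipedia data, and a run over the full link graph would need sampling rather than all pairs.

```python
from collections import deque


def bfs_distances(links, start):
    """Return the shortest wikilink distance from start to every reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def average_distance(links):
    """Mean shortest-path length over all ordered pairs of distinct, connected pages."""
    total = 0
    pairs = 0
    for start in links:
        for target, d in bfs_distances(links, start).items():
            if target != start:
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")


# Hypothetical toy graph (assumption): page -> pages it links to.
toy_links = {
    "A": ["B"],
    "B": ["C"],
    "C": ["A", "D"],
    "D": ["A"],
}
print(average_distance(toy_links))
```

Links are treated as directed here, since a wikilink from one page to another does not imply a link back. Pairs with no path between them are simply left out of the average, which is a design choice that would flatter the result on a sparse graph.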
The one that I'm most interested in today, actually, is #5 (known as Reed's law) ... because today I went to do some editing and found that the system was very slow... I'm actually a little worried that wikipedia is going to be overwhelmed by its own popularity. I saw something like this happen before, when I joined a MUD called MicroMUSE back in er... 1993??? Wired issue #3 wrote about it and then they were inundated with users... everything was really slow... well, back then I didn't have the technical chops to do something about it, but now I do. I have some ideas about the architectural design of WP's servers in mind that I discussed with some people on #wikimedia-tech today... I'll probably write them up in a few days.
Anyway, the motivational reason is that IF conjecture #5 is TRUE, then there are probably some severe architectural implications for WP's technical design.
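One way conjecture #5 could be tested, given real measurements, is to regress the logarithm of article views on the amount of material and see whether a straight line fits. The sketch below assumes that approach. The function names are hypothetical and the numbers are synthetic, generated from an exact exponential purely to show the mechanics; they are not Wikipedia statistics.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r_squared


def fit_exponential(sizes, views):
    """Fit views ~= c * exp(k * size) by regressing log(views) on size.

    Returns (k, c, r_squared), where r_squared is measured on the log scale.
    """
    slope, intercept, r2 = fit_line(sizes, [math.log(v) for v in views])
    return slope, math.exp(intercept), r2


# Synthetic data (assumption): amount of material in arbitrary units,
# views generated from an exact exponential for illustration only.
sizes = [1, 2, 3, 4, 5]
views = [100 * math.exp(0.5 * s) for s in sizes]

k, c, r2 = fit_exponential(sizes, views)
linear_r2 = fit_line(sizes, views)[2]
print(k, c, r2, linear_r2)
```

If the relationship really were exponential, the log-scale fit should explain the data clearly better than the plain linear fit. With real traffic numbers neither fit is guaranteed to be good, so this would only be a first sanity check.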
--simon (founder of semacode, contributor to mozilla, etc.)
On 11/22/05, S. Woodside sbwoodside@yahoo.com wrote:
Conjecture 1. That the distance between any two wikipedia pages, randomly chosen, as measured by wikilinks, is on average 6.
Conjecture 2. That wikipedia is sufficiently formal and complete that you could build a useful general purpose AI knowledge base using it.
Conjecture 3. That wikipedia has low information entropy.
Conjecture 4. That the development of a wikipedia article over time occurs in a manner consistent with the biological evolution of a species.
Conjecture 5. That the relationship between the amount of material in wikipedia and the number of article views is exponential.
Conjecture 6. That wikipedia is, on average, factually accurate.
Ohoh ! can I produce one?
Conjecture 7. The amount of wanky conjectures produced by bloggers is bound only by the number of sites that will allow them to spam their URLs.
Do I get a prize? ;)
On a more constructive note, if you're interested in AI taught using wikipedia... you should take a look at the state of the art in English parsers (such as [[Link Grammar]]) and how they fare on Wikipedia. Impressive that it works at all, but I don't think we need to worry about a hostile takeover of the planet by a self-aware encyclopedia any time soon.
What about:
Conjecture 8. Regardless of the final result, the inner workings of Wikipedia are so chaotic and at times unpleasant that most readers would rip all of their hair out if they watched a documentary about the creative process behind the articles.
Fair?
Mark
On 22/11/05, Gregory Maxwell gmaxwell@gmail.com wrote:
On 11/22/05, S. Woodside sbwoodside@yahoo.com wrote:
Conjecture 1. That the distance between any two wikipedia pages, randomly chosen, as measured by wikilinks, is on average 6.
Conjecture 2. That wikipedia is sufficiently formal and complete that you could build a useful general purpose AI knowledge base using it.
Conjecture 3. That wikipedia has low information entropy.
Conjecture 4. That the development of a wikipedia article over time occurs in a manner consistent with the biological evolution of a species.
Conjecture 5. That the relationship between the amount of material in wikipedia and the number of article views is exponential.
Conjecture 6. That wikipedia is, on average, factually accurate.
Ohoh ! can I produce one?
Conjecture 7. The amount of wanky conjectures produced by bloggers is bound only by the number of sites that will allow them to spam their URLs.
Do I get a prize? ;)
On a more constructive note, if you're interested in AI taught using wikipedia... you should take a look at the state of the art in English parsers (such as [[Link Grammar]]) and how they fare on Wikipedia. Impressive that it works at all, but I don't think we need to worry about a hostile takeover of the planet by a self-aware encyclopedia any time soon.
_______________________________________________
Wikitech-l mailing list
Wikitech-l@wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/wikitech-l
-- "Take away their language, destroy their souls." -- Joseph Stalin
Mark Williamson wrote:
What about:
Conjecture 8. Regardless of the final result, the inner workings of Wikipedia are so chaotic and at times unpleasant that most readers would rip all of their hair out if they watched a documentary about the creative process behind the articles.
Fair?
Mark
"Politics^W Wikipedia is like sausage making, you don't want to see how it's done."
-- Neil
On Nov 22, 2005, at 2:50 AM, Mark Williamson wrote:
What about:
Conjecture 8. Regardless of the final result, the inner workings of Wikipedia are so chaotic and at times unpleasant that most readers would rip all of their hair out if they watched a documentary about the creative process behind the articles.
Fair?
but hard to prove :-)
simon
Mark
On Friday 25 November 2005 07:04, S. Woodside wrote:
On Nov 22, 2005, at 2:50 AM, Mark Williamson wrote:
Conjecture 8. Regardless of the final result, the inner workings of Wikipedia are so chaotic and at times unpleasant that most readers would rip all of their hair out if they watched a documentary about the creative process behind the articles.
Fair?
but hard to prove :-)
Why? Simply make a documentary about it and force some lab rats^H^H^H^H^H^H^H^Hreaders to watch it.
On Nov 22, 2005, at 2:35 AM, Gregory Maxwell wrote:
On 11/22/05, S. Woodside sbwoodside@yahoo.com wrote:
Conjecture 2. That wikipedia is sufficiently formal and complete that you could build a useful general purpose AI knowledge base using it.
On a more constructive note, if you're interested in AI taught using wikipedia... you should take a look at the state of the art in English parsers (such as [[Link Grammar]]) and how they fare on Wikipedia. Impressive that it works at all, but I don't think we need to worry about a hostile takeover of the planet by a self-aware encyclopedia any time soon.
I'm googling a little bit now for wiki-related AI projects... What stuck in my mind for conjecture 2 was the Loebner Prize, where you create these chatbots to try to win a "Turing Test" style competition. One of the biggest challenges they have is to come up with a good knowledge base, and since Cyc isn't free, well, that's a challenge. Could you build a good Loebner competitor using wiki? I have no clue. :-) But if someone put a gun to my head that's probably the track I'd go down.
It's not like I think I'm some kind of scientific genius but I did run these conjectures by a friend of mine who I think is a genius and he seemed sufficiently interested that I thought I'd spam the world with them ;-)
--simon
"Andrew" == Andrew Venier avenier@venier.net writes:
S. Woodside wrote:
Conjecture 4. That the development of a wikipedia article over time occurs in a manner consistent with the biological evolution of a species.
I would be rather surprised to hear that our articles are reproducing.
The original list of "Wikipedians who <foo>" has cloned itself numerous times. You may argue that it's nothing more than a parasite feeding on the host organism's desire to be part of as many groups as possible, but viruses are also a result of evolution.
On Nov 22, 2005, at 3:44 PM, Anders Wegge Jakobsen wrote:
"Andrew" == Andrew Venier avenier@venier.net writes:
S. Woodside wrote:
Conjecture 4. That the development of a wikipedia article over time occurs in a manner consistent with the biological evolution of a species.
I would be rather surprised to hear that our articles are reproducing.
The original list of "Wikipedians who <foo>" has cloned itself numerous times. You may argue that it's nothing more than a parasite feeding on the host organism's desire to be part of as many groups as possible, but viruses are also a result of evolution.
Exactly. There's a bunch of ways to approach it. One is the concept of speciation. This is the process whereby a single species turns into multiple species. Perhaps a new uncrossable river shows up between two populations of the species, they diverge over time, and eventually they become two species. The same thing might be seen to happen with a wikipedia article, where the article splits into two or more to cover variations on the topic in greater detail.
--simon