On Thu, Sep 29, 2005 at 11:37:08AM -0700, Kevin Hamilton wrote:
> I am new to the list, so I apologize if this has been answered before, but I couldn't find it in any of the FAQs, documentation, or existing projects. I would like to know if there is a way to track the number of words contributed by each user to the current version of Wikipedia. I would only want to count words that have not been struck by another user. Thank you in advance!
There's no 100% accurate way of doing that. If one person writes "There was a mouse", and someone else replaces it with "The mouse that was there", who originated the word "mouse"? Who originated the word "there"?

If you want to do this heuristically, your best bet would probably be to diff each article backwards through its revision history, one revision at a time, until you've found the source of each word. It'll require a bit of work and certainly won't be a simple SQL query, though.
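
To give a rough idea of what the backwards diffing could look like, here is a minimal sketch in Python. It is only an illustration, not anything MediaWiki provides: the function name attribute_words and the (author, text) input format are made up for the example, and it assumes you have already pulled an article's revisions out of the database, oldest first. It splits on whitespace and leans on difflib from the standard library, so punctuation, case, and moved text are all handled naively.

```python
from collections import Counter
from difflib import SequenceMatcher


def attribute_words(revisions):
    """Credit each word of the newest revision to the author who introduced it.

    `revisions` is a hypothetical input format: a list of (author, text)
    tuples for one article, ordered oldest first.
    """
    counts = Counter()
    if not revisions:
        return counts

    words = revisions[-1][1].split()
    # Positions in `words` that still correspond to a word of the newest text.
    tracked = set(range(len(words)))

    # Walk backwards: compare each revision with the one before it.
    for i in range(len(revisions) - 1, 0, -1):
        author = revisions[i][0]
        prev_words = revisions[i - 1][1].split()
        matcher = SequenceMatcher(None, prev_words, words, autojunk=False)

        # Map positions in this revision to positions in the previous one
        # for every word the diff considers unchanged.
        carried = {}
        for tag, a1, a2, b1, b2 in matcher.get_opcodes():
            if tag == "equal":
                for offset in range(b2 - b1):
                    carried[b1 + offset] = a1 + offset

        still_tracked = set()
        for pos in tracked:
            if pos in carried:
                # The word already existed; keep following it backwards.
                still_tracked.add(carried[pos])
            else:
                # The word first appears in this revision.
                counts[author] += 1

        tracked = still_tracked
        words = prev_words
        if not tracked:
            break

    # Anything that survives all the way back belongs to the first author.
    if tracked:
        counts[revisions[0][0]] += len(tracked)
    return counts


history = [
    ("alice", "the quick brown fox"),
    ("bob", "the quick red fox"),
    ("carol", "the quick red fox jumps"),
]
result = attribute_words(history)
```

For that history, alice is credited with "the", "quick" and "fox", bob with "red", and carol with "jumps". Run it on the mouse example and the answer depends entirely on how the diff happens to line the two sentences up, which is exactly the ambiguity described above.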