At 03:23 PM 9/14/02 +0200, Andre Engels wrote:
It would be a safe assumption that the words used most frequently are probably spelled correctly. True, "recieve" may be a very common misspelling, but there are probably a lot more occurrences of "receive". So the flip side of this is that words that are used rarely are probably spelled wrong. Now we don't want to go off blindly replacing these words (mostly because we wouldn't know what with) but they are good candidates to look at for replacing.
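To make the proposal concrete, here is a minimal sketch of the counting step in Python. It is only an illustration, not anything that exists in the software: the function name `rare_words`, the `max_count` threshold and the crude tokenizer (lowercase ASCII letters and apostrophes only) are all assumptions made for this example.

```python
import re
from collections import Counter


def rare_words(text, max_count=1):
    """Return the words occurring at most max_count times, sorted alphabetically.

    Hypothetical helper: the tokenizer is deliberately crude and ignores
    capitalization, accented letters and non-Latin scripts.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return sorted(w for w, n in counts.items() if n <= max_count)


sample = "I receive mail. You receive mail. They recieve mail."
print(rare_words(sample))  # ['i', 'recieve', 'they', 'you']
```

Even on this toy input the misspelling "recieve" comes back alongside perfectly good words that simply happen to occur once, so a list like this could only ever serve as a set of candidates for a human to review.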
I don't think this would work - I think there will be thousands, if not tens of thousands, of words that are used exactly once. The great majority of these will not be misspellings, but (parts of) proper names and geographical names that happen to occur exactly once, words from other languages, and reasonable neologisms.
As a test, here is the result from 20 pages got with 'Random page', looking at the count of those words I think might be unique:
Aratrum
aujtovguon - Greek word used for lack of English equivalent
phktovn - idem
Mounce - name, although not proper or geographical
Thanks for the pointer; I've tidied the page some (the transcriptions are just plain weird here).