It would be a safe assumption that the words used most frequently are probably spelled correctly. True, "recieve" may be a very common misspelling, but there are probably a lot more occurrences of "receive". The flip side is that words that are used rarely are probably spelled wrong. Now, we don't want to go off blindly replacing these words (mostly because we wouldn't know what to replace them with), but they are good candidates to check for replacement.
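To make the proposal concrete, here is a minimal sketch of the frequency heuristic, assuming the page texts are already available as plain strings. The function name `rare_words`, the `max_count` threshold, and the two sample "pages" are all made up for illustration; it only produces a list of candidates for a human to review, not automatic replacements.

```python
import re
from collections import Counter

def rare_words(pages, max_count=1):
    """Return the words whose total count over all pages is at most max_count.

    `pages` is an iterable of plain-text strings (an assumption; real wiki
    markup would need to be stripped first).
    """
    counts = Counter()
    for text in pages:
        # Crude tokenizer: runs of ASCII letters and apostrophes, lowercased.
        counts.update(w.lower() for w in re.findall(r"[A-Za-z']+", text))
    return sorted(w for w, n in counts.items() if n <= max_count)

# Hypothetical sample data, not taken from any real page.
pages = [
    "Please receive the parcel. We receive mail daily.",
    "Did you recieve the parcel? We receive the mail.",
]
candidates = rare_words(pages)
print(candidates)
```

On this toy input the misspelling "recieve" does show up among the candidates, but so do perfectly good words that merely happen to occur once, which is why the output is only a starting point for manual checking.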
I don't think this would work. I think there will be thousands, if not tens of thousands, of words that are used exactly once. The great majority of these will not be misspellings, but (parts of) proper names and geographical names that happen to occur exactly once, words from other languages, and reasonable neologisms.
As a test, here are the results from 20 pages obtained with 'Random page', where I looked up the count of each word I thought might be unique:
Aratrum
    aujtovguon - Greek word used for lack of English equivalent
    phktovn - idem
    Mounce - name, although not proper or geographical
Subtractive synthesis
    synthisizers - indeed a misspelling (*)
    highpass - jargon word
Shocking Blue
    no unique words
Conjugate base
    no unique words
Cenozoic
    Caenozoic - given as an alternative spelling of the subject
Aldo Moro
    Moro's - genitive form of a proper name (**)
Freedom of speech
    no unique words
Film genres
    no unique words
Alexandre Fleming
    actibacterial - actual typo (*)
Hydrogen cyanide
    no unique words
ISDN
    no unique words
20-GATE
    no unique words
Unspun
    unspun - name of a group (**)
    unspinning - logical neologism (**)
Apu Nahasapeempatilon
    octuplet - logical neologism or normal word
    punchcard - might actually be considered a misspelling (*)
Lua programming language
    no unique words
Eros
    no unique words
Scud
    Makeyev - name of an institute
    UDMH - jargon term (abbreviation)
    RFNA - jargon term (abbreviation)
Academy Award for Writing Adapted Screenplay
    Herczag - proper name
    Siliphant - proper name
    Hauben - proper name
    Peploe - proper name
    Zaillian - proper name
    Gaghan - proper name
Elie Ducommun
    no unique words
Masamune Shirow
    Deunan - proper name (fictional character)
(*): As I happened upon this misspelling, I corrected it, so you'll have to go to the previous version of the page to see it.
(**): Occurs several times, but just on one page.
Total: 16 proper singles, 3 misspellings, 3 cases (the ones with **) not counted.
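As a quick check on what that tally implies, the arithmetic can be sketched as follows. The variable names are only illustrative; the two counts are the ones from the total above, with the three (**) cases left out as stated.

```python
# Counts taken from the tally above; the (**) cases are excluded.
legitimate_singles = 16
misspellings = 3

total = legitimate_singles + misspellings
share = misspellings / total

print(f"{misspellings} of {total} single-occurrence words are misspellings ({share:.0%})")
print(f"roughly 1 misspelling per {legitimate_singles / misspellings:.1f} legitimate words")
```

This is a sample of only 20 random pages, so the figures are a rough indication rather than a measurement.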
On the other hand, I came across the following mis-spellings which DID occur more than once (also corrected): missles (5 times)
I have found that humans are much more accurate when the mundane, repetitive portions of tasks like this are automated. If you have to go through 10 of the same motions for every 1 that requires thought, then you are more likely to not put any thought into that 1. But if it is only a 2:1 or, even better, a 1:1 ratio, then you will put much more thought into it.
If my attempt above is any guide, the actual ratio will be more like 1:5 or 1:6.
Andre Engels