Andre Engels wrote:
It would be a safe assumption that the words that are used the most
frequently are probably spelled correctly. True, "recieve" may be a very
common misspelling, but there are probably a lot more occurrences of
"receive". So the flip side of this is that words that are used rarely are
probably spelled wrong. Now we don't want to go off blindly replacing
these words (mostly because we wouldn't know what with) but they are good
words to look at as candidates for replacement.
I don't think this would work - I think there will be thousands, if not tens
of thousands, of words that are used exactly once. The great majority of these
will not be mis-spellings, but (parts of) proper names and geographical
names that happen to occur exactly once, words from other languages and
reasonable neologisms.
I agree with Andre. I suspect that the distribution of mis-spellings
and typos would be Zipfian, as would the finding and correction of
such words. "Antibacterial" occurs 4 times in Wikipedia. No amount of
direct searching would have found "actibacterial".
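
To make the objection concrete, here is a minimal sketch of the "rare words" approach. It is only an illustration: the function name rare_words, the max_count parameter and the sample text are all made up for this example, and nothing like it is claimed to exist in the Wikipedia software.

```python
import re
from collections import Counter

def rare_words(text, max_count=1):
    """Return the words that occur at most max_count times (hypothetical helper)."""
    # Crude tokenizer, assumed for illustration: lowercase runs of ASCII letters.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return sorted(w for w, n in counts.items() if n <= max_count)

# Invented sample text; "actibacterial" is the only real typo in it.
sample = (
    "Antibacterial soap is antibacterial. Fleming studied actibacterial "
    "agents. An octuplet and a punchcard appear once each."
)

candidates = rare_words(sample)
print(candidates)
```

Even on this tiny sample the one real typo comes back mixed in with "octuplet", "punchcard", "fleming" and a dozen ordinary words, so the list still has to be checked by hand.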
One feature that would help for finding some common errors would be a
search feature that allows optional searching for parts of words. Thus
far we have spoken of "recieve" and "recieved", but being able to
search for the partial word "reciev" would also catch "recieves",
"reciever", "recievers", "recieving" and maybe other less common words
with this root.
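
A part-word search of that kind could be sketched roughly as follows. This assumes plain article text and uses a regular expression; find_by_stem and the sample sentence are hypothetical names and data for illustration, not an existing search feature.

```python
import re
from collections import Counter

def find_by_stem(text, stem):
    """Count every word that starts with the given stem (hypothetical helper)."""
    # \b anchors the match at the start of a word; \w* takes the rest of the word.
    pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
    return Counter(m.group(0).lower() for m in pattern.finditer(text))

# Invented sample text for illustration.
sample = (
    "Did you recieve it? The reciever recieves and keeps recieving. "
    "I receive mine."
)

hits = find_by_stem(sample, "reciev")
for word, count in sorted(hits.items()):
    print(word, count)
```

Searching on the stem picks up all the variants in one pass, while the correctly spelled "receive" is left alone because it does not begin with "reciev".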
Alexandre Fleming
actibacterial - actual typo (*)
Perhaps the article should be redirected to "Alexander Fleming" ;-) .
Apu Nahasapeempatilon
octuplet - logical neologism or normal word
punchcard - might actually be considered a misspelling (*)
Octuplet is a perfectly normal word.
"Punchcard" is an acceptable (and IMHO preferable) variant of "punch
card". It appears that way in the Oxford Dictionary. If I were to use
the term in an article and someone else "corrected" it to two words, I
would change it right back.
Who will ever have the enthusiasm to check the spelling of Apu's surname?
Eclecticology