I would be more than willing to help, keyword help, compile a common typo list. Just let me know and I could start up a spreadsheet or something.
On Nov 7, 2017, at 1:26 PM, Chad innocentkiller@gmail.com wrote:
On Tue, Nov 7, 2017 at 10:45 AM Faidon Liambotis faidon@wikimedia.org wrote:
On Tue, Nov 07, 2017 at 11:19:57AM -0700, Bryan Davis wrote:
We could probably add checks for some common ones if someone compiled a list.
Running a full spell check would be difficult because of the number of false positives there would be based on a "normal" dictionary. Commit messages often contain technical jargon (maybe something to try and avoid) and snippets of code (e.g. class names like TemplatesOnThisPageFormatter) that would not be in any traditional dictionary that we could count on being on the local host.
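To make the "checks for some common ones" idea concrete, here is a minimal sketch of matching a commit message against a compiled typo list instead of a full dictionary. COMMON_TYPOS and find_typos are hypothetical names, and the entries are placeholders for whatever list someone ends up compiling; this is not an existing hook.

```python
import re

# Hypothetical, hand-compiled map of typo -> correction (illustrative entries only).
COMMON_TYPOS = {
    "teh": "the",
    "recieve": "receive",
    "seperate": "separate",
    "occured": "occurred",
}

# Match whole alphabetic words, so identifiers are looked up as a single token.
WORD_RE = re.compile(r"[A-Za-z]+")


def find_typos(message, typos=COMMON_TYPOS):
    """Return (line_number, word, suggestion) for each known typo in message."""
    hits = []
    for lineno, line in enumerate(message.splitlines(), start=1):
        for match in WORD_RE.finditer(line):
            word = match.group(0)
            suggestion = typos.get(word.lower())
            if suggestion is not None:
                hits.append((lineno, word, suggestion))
    return hits
```

Because only listed misspellings are flagged, a class name such as TemplatesOnThisPageFormatter passes untouched, which sidesteps the false positives a "normal" dictionary would produce.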
Debian's lintian (lint tool for packages) has a check for common typos/misspellings in its informational mode. The package ships with /usr/bin/spellintian which is a simple spellchecker that can run independently.
The benefit of using spellintian over e.g. aspell is that it addresses the issues you already identified: a) it only flags known typos and does not complain about words it doesn't know, b) it's been created from observing typos in source code and package descriptions in the wild, so it's tailored to technical jargon and its common misspellings. It could be a good fit for git commit messages.
That doesn't mean it's free of false positives though, so I wouldn't recommend using it as a voting check in a CI pipeline.
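As a rough sketch of what a non-voting use could look like, the wrapper below runs the checker on a commit message and only reports what it prints. It assumes spellintian accepts a file path as an argument and writes one finding per line to stdout; advisory_spellcheck and report are hypothetical names, not part of lintian.

```python
import os
import subprocess
import tempfile


def advisory_spellcheck(message, command=("spellintian",)):
    """Run a typo checker on message and return its non-empty output lines.

    Tool problems (for example, the checker is not installed) yield an
    empty list rather than an exception, so the check stays advisory.
    """
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write(message)
        path = tmp.name
    try:
        # Assumption: the checker takes the file to scan as its last argument.
        result = subprocess.run(
            [*command, path], capture_output=True, text=True, check=False
        )
    except OSError:
        return []
    finally:
        os.unlink(path)
    return [line for line in result.stdout.splitlines() if line.strip()]


def report(message):
    """Print possible typos and always return 0, so nothing is ever blocked."""
    for line in advisory_spellcheck(message):
        print("possible typo:", line)
    return 0
```

Returning 0 unconditionally is the point of the design: findings are shown to the author, but a false positive can never fail the job.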
Plus, you know, intentional misspellings:
"Fix misspelling of wikinedia -> wikimedia"
-Chad
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l