Neil Harris wrote:
Regarding dashes and hyphens, I've now found my original data set, and a quick inspection gives this set of various similar-looking Latin hyphens, dashes and minus signs:
U+002D HYPHEN-MINUS
U+2010 HYPHEN
U+2011 NON-BREAKING HYPHEN
U+2012 FIGURE DASH
U+2013 EN DASH
and at this point I missed out U+2014 EM DASH, which was hiding in the world of transitive closure mentioned below...
U+2212 MINUS SIGN
U+FE58 SMALL EM DASH
U+FF0D FULLWIDTH HYPHEN-MINUS
I can send the full data set of lookalikes to anyone who is interested: it can be quite easily extended by regarding the relation "looks like" as transitive, to include more distant and linguistically dubious visual confusables such as (just for example) U+2015 HORIZONTAL BAR, U+1173 HANGUL JUNGSEONG EU and U+2F00 KANGXI RADICAL ONE.
-- Neil
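
For anyone who wants to experiment with the transitive extension described above, here is a minimal sketch in Python. It is not Neil's data set or his method: the function name confusable_classes and the SAMPLE_PAIRS list are made up for illustration, and the pairs are simply assumed to be direct "looks like" judgements between code points. The sketch uses a union-find structure to merge every chain of pairwise lookalikes into one class, which is how a character such as U+2F00 can end up in the same class as U+002D without ever being compared with it directly.

```python
from collections import defaultdict


def confusable_classes(pairs):
    """Group code points into classes under the transitive closure of `pairs`.

    `pairs` is an iterable of (code_point, code_point) tuples, each read as
    "these two characters look alike". Returns a list of sorted lists.
    """
    parent = {}

    def find(x):
        # Register unseen code points as their own class representative.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    classes = defaultdict(set)
    for cp in list(parent):
        classes[find(cp)].add(cp)
    return sorted((sorted(c) for c in classes.values()), key=lambda c: c[0])


# Hypothetical direct "looks like" pairs, invented for this example only.
SAMPLE_PAIRS = [
    (0x002D, 0x2010),  # HYPHEN-MINUS ~ HYPHEN
    (0x2010, 0x2013),  # HYPHEN ~ EN DASH
    (0x2013, 0x2014),  # EN DASH ~ EM DASH
    (0x2014, 0x2015),  # EM DASH ~ HORIZONTAL BAR
    (0x2015, 0x2F00),  # HORIZONTAL BAR ~ KANGXI RADICAL ONE
    (0x0041, 0x0391),  # unrelated pair: LATIN A ~ GREEK ALPHA
]

if __name__ == "__main__":
    for cls in confusable_classes(SAMPLE_PAIRS):
        print(" ".join(f"U+{cp:04X}" for cp in cls))
```

Note that the closure only ever grows a class: one dubious pair is enough to pull a distant character in, so in practice the direct pairs probably need to be reviewed before the closure is taken.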