Neil Harris wrote:
Regarding dashes and hyphens, I've now found my original data set, and
a quick inspection gives this set of various similar-looking Latin
hyphens, dashes and minus signs:
U+002D HYPHEN-MINUS
U+2010 HYPHEN
U+2011 NON-BREAKING HYPHEN
U+2012 FIGURE DASH
U+2013 EN DASH
and at this point I missed out U+2014 EM DASH, which was hiding in the
world of transitive closure mentioned below...
U+2212 MINUS SIGN
U+FE58 SMALL EM DASH
U+FF0D FULLWIDTH HYPHEN-MINUS
I can send the full data set of lookalikes to anyone who is interested:
it can be quite easily extended by regarding the relation "looks like"
as transitive, to include more distant and linguistically dubious visual
confusables such as (just for example) U+2015 HORIZONTAL BAR, U+1173
HANGUL JUNGSEONG EU and U+2F00 KANGXI RADICAL ONE.
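
As a rough illustration of the extension described above, here is a minimal
Python sketch that treats "looks like" as symmetric and transitive and
collects everything reachable from one starting character. The pairs in
LOOKS_LIKE are made up for the example and are not the actual data set; the
names LOOKS_LIKE and confusable_class are likewise hypothetical.

```python
import unicodedata
from collections import defaultdict

# Hypothetical "looks like" pairs, invented for illustration only.
# They are NOT taken from the real lookalike data set.
LOOKS_LIKE = [
    (0x002D, 0x2010),  # HYPHEN-MINUS ~ HYPHEN
    (0x2010, 0x2011),  # HYPHEN ~ NON-BREAKING HYPHEN
    (0x002D, 0x2013),  # HYPHEN-MINUS ~ EN DASH
    (0x2013, 0x2012),  # EN DASH ~ FIGURE DASH
    (0x2013, 0x2014),  # EN DASH ~ EM DASH
    (0x2014, 0x2015),  # EM DASH ~ HORIZONTAL BAR
    (0x2015, 0x1173),  # HORIZONTAL BAR ~ HANGUL JUNGSEONG EU
    (0x2015, 0x2F00),  # HORIZONTAL BAR ~ KANGXI RADICAL ONE
]


def confusable_class(start, pairs):
    """Return every code point reachable from `start` when the
    pairwise relation is treated as symmetric and transitive."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = {start}
    stack = [start]
    while stack:
        cp = stack.pop()
        # Visit each neighbour that has not been reached yet.
        for other in neighbours[cp] - seen:
            seen.add(other)
            stack.append(other)
    return seen


for cp in sorted(confusable_class(0x002D, LOOKS_LIKE)):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
```

With these sample pairs, U+2F00 ends up in the same class as U+002D even
though the two are never paired directly, which is exactly how the more
dubious confusables creep in. A cap on the number of hops from the starting
character would be one way to keep the class closer to the direct lookalikes.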
-- Neil