On 01.07.2014 22:23, Markus Krötzsch wrote:
P.S. One weakness of my algorithm you can already see: it has trouble
estimating the relevance of very rare properties, such as "Minor
Planet Center observatory code" above. A single wrong annotation may
then lead to wrong suggestions. Also, it seems from my list under (2)
that some Grade I listed buildings are ships. This seems to be an
error that is amplified by the fact that property "masts" is used only
11 times in the dataset I evaluated (last week's data). I guess the
new property suggester rather errs on the other side, being tricked
into suggesting very frequent properties even in places that don't
need them.
However, it is obviously better if the algorithm performs well for
frequently used properties. Wouldn't it be possible to combine the two
systems so that they improve each other? One could check how often a
property is used and, depending on that, rely on either Markus' or the
students' algorithm.
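
As a rough sketch of what I mean (Python; the cutoff value, the function
names and the choice of which algorithm handles which side are only my
assumptions, neither system works like this today):

```python
from typing import Callable, Dict

# Hypothetical cutoff: a property used fewer times than this counts as "rare".
RARE_THRESHOLD = 100

# A scorer maps a property label to a relevance score for the current item.
Scorer = Callable[[str], float]


def combined_score(
    prop: str,
    usage_counts: Dict[str, int],
    score_frequent: Scorer,
    score_rare: Scorer,
    threshold: int = RARE_THRESHOLD,
) -> float:
    """Pick one of two scorers depending on how often `prop` is used.

    `score_frequent` is consulted for properties with at least `threshold`
    uses, `score_rare` for everything else (including unknown properties).
    """
    count = usage_counts.get(prop, 0)
    if count >= threshold:
        return score_frequent(prop)
    return score_rare(prop)


# Toy usage: "masts" is used only 11 times, so the second scorer decides.
usage = {"masts": 11, "some frequent property": 1_000_000}
score = combined_score(
    "masts",
    usage,
    score_frequent=lambda p: 0.9,  # stand-in for one algorithm
    score_rare=lambda p: 0.1,      # stand-in for the other
)
```

Which algorithm goes into which slot, and where the cutoff should sit,
would have to be found out by evaluating both on real data.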
Best regards,
Bene