Thanks both! This clarifies a lot. I think the primary issue that editors had raised, and that I had hoped to explore, was popularity/importance vs. obscurity.
Specifically, there have been concerns that the results tilt towards more popular articles (here and here), but it seems that page traffic is not a variable. Instead, what seems to be happening is that the raw number of similar terms is being used, rather than the percentage of similar terms. This means that longer articles are favored. Is that a fair assessment?
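
To make the distinction concrete, here is a toy sketch in Python. It is not the tool's actual code: the function names and the example term lists are made up for illustration, and I am assuming that "similar terms" means distinct terms shared between two articles.

```python
def shared_term_count(source_terms, candidate_terms):
    """Raw number of distinct terms the two articles have in common."""
    return len(set(source_terms) & set(candidate_terms))


def shared_term_fraction(source_terms, candidate_terms):
    """Shared terms as a fraction of the candidate's distinct terms."""
    candidate = set(candidate_terms)
    if not candidate:
        return 0.0
    return len(set(source_terms) & candidate) / len(candidate)


# Hypothetical data: one source article and two candidates.
source = ["river", "dam", "flood", "valley"]
short_article = ["river", "dam", "flood"]  # 3 distinct terms, 3 shared
long_article = source + [f"filler{i}" for i in range(36)]  # 40 distinct terms, 4 shared

raw_short = shared_term_count(source, short_article)
raw_long = shared_term_count(source, long_article)
frac_short = shared_term_fraction(source, short_article)
frac_long = shared_term_fraction(source, long_article)

print(raw_short, raw_long)    # raw counts for the short and long article
print(frac_short, frac_long)  # fractions for the short and long article
```

In this made-up case the long article wins on the raw count (4 vs. 3) even though only a tenth of its terms overlap, while the short article overlaps completely. If the ranking works roughly like the first function, that would explain the tilt towards longer articles.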