The comparison tool on https://tools.wmflabs.org/copyvios/ can look for repeated phrases.
You might be able to adapt it to compare two articles.
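As a rough illustration of the repeated-phrase idea, here is a minimal Python sketch that finds word n-grams shared by two texts and scores their overlap with Jaccard similarity. This is not how the copyvios tool is implemented. The function names, the choice of three-word phrases, and the Jaccard score are all assumptions made for the example.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(text_a, text_b, n=3):
    """Return the n-word phrases that occur in both texts."""
    return word_ngrams(text_a, n) & word_ngrams(text_b, n)


def jaccard_similarity(text_a, text_b, n=3):
    """Score overlap as |shared n-grams| / |all n-grams|, from 0.0 to 1.0."""
    grams_a = word_ngrams(text_a, n)
    grams_b = word_ngrams(text_b, n)
    if not grams_a and not grams_b:
        return 0.0  # neither text is long enough to contain an n-gram
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

Given the plain text of two articles, shared_phrases(text_a, text_b) lists the phrases they have in common and jaccard_similarity(text_a, text_b) gives a single score. Longer n-grams make the match stricter, so n is worth tuning.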
On Sat, 4 May 2019 at 12:48, Haifeng Zhang <haifeng1@andrew.cmu.edu> wrote:
Dear folks,
Is there a way to compute content similarity between two Wikipedia articles?
For example, I can think of representing each article as a vector of likelihoods over possible topics.
But I wonder whether there is other work that people have already explored in the past.
Thanks,
Haifeng
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l