The comparison tool at
https://tools.wmflabs.org/copyvios/ can look for repeated phrases.
You might be able to adapt it to your needs.
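
If you would rather roll your own, here is a minimal sketch of the repeated-phrase idea. I don't know how the copyvios tool works internally, so this is not its method: it simply assumes that a "phrase" is a run of n consecutive words and scores two texts by the Jaccard overlap of their phrase sets. The function names and the default n=3 are my own choices.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def phrase_overlap(text_a, text_b, n=3):
    """Jaccard similarity of the two texts' n-gram sets, from 0.0 to 1.0."""
    a = word_ngrams(text_a, n)
    b = word_ngrams(text_b, n)
    if not a or not b:
        # A text shorter than n words has no phrases to compare.
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    print(phrase_overlap("the quick brown fox", "the quick brown dog"))
```

You would feed it the plain text of the two articles. A larger n catches only longer shared passages, while a smaller n is more sensitive but noisier.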
On Sat, 4 May 2019 at 12:48, Haifeng Zhang <haifeng1(a)andrew.cmu.edu> wrote:
Dear folks,
Is there a way to compute content similarity between two Wikipedia
articles?
For example, I can think of representing each article as a vector of
likelihoods over possible topics.
But I wonder whether there is other work that people have already
explored in the past.
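
As a rough sketch of the topic-vector idea, assuming each article has already been reduced to a list of topic probabilities by some topic model (which is not shown here), the two vectors could be compared with cosine similarity. The function name and the example vectors are made up for illustration.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same number of topics")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        # An all-zero vector has no direction, so treat it as dissimilar.
        return 0.0
    return dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Hypothetical topic likelihoods for two articles over three topics.
    article_a = [0.7, 0.2, 0.1]
    article_b = [0.6, 0.3, 0.1]
    print(cosine_similarity(article_a, article_b))
```

Cosine is only one possible choice; measures designed for probability distributions could be swapped in without changing the rest.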
Thanks,
Haifeng
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l