Dear Haifeng,
Would you not be able to use ordinary information retrieval techniques, such as bag-of-words/phrases and TF-IDF? Explicit semantic analysis (ESA) takes this approach (though its primary focus is word-level semantic similarity).
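To make the suggestion concrete, here is a minimal sketch of bag-of-words TF-IDF with cosine similarity, using only the standard library. The example texts are hypothetical stand-ins for plain-text article content, and the whitespace tokenizer and raw-count TF are deliberate simplifications:

```python
# Minimal bag-of-words / TF-IDF cosine-similarity sketch.
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # document frequency: number of documents containing each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    norm = math.sqrt(sum(w * w for w in u.values())) * \
           math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

# Hypothetical article snippets, not real Wikipedia text.
docs = ["the cat sat on the mat",
        "the cat lay on the rug",
        "stock markets fell sharply today"]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # overlapping vocabulary: similar
print(cosine(vecs[0], vecs[2]))  # no shared terms: zero
```

In practice one would use a proper tokenizer, sublinear TF, and smoothed IDF, but the ranking behaviour is the same.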
There are a few papers on ESA: https://tools.wmflabs.org/scholia/topic/Q5421270
I have also used it in "Open semantic analysis: The case of word level semantics in Danish" http://www2.compute.dtu.dk/pubdb/views/edoc_download.php/7029/pdf/imm7029.pd...
Finn Årup Nielsen http://people.compute.dtu.dk/faan/
On 04/05/2019 13:47, Haifeng Zhang wrote:
Dear folks,
Is there a way to compute content similarity between two Wikipedia articles?
For example, I can think of representing each article as a vector of likelihoods over possible topics.
But I wonder whether there is other work people have already explored in the past.
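The topic-vector idea above can be sketched directly: treat each article as a probability distribution over topics and compare distributions with cosine similarity. The topic probabilities below are made-up numbers for illustration; in practice they might come from a topic model such as LDA.

```python
# Cosine similarity between hypothetical per-article topic distributions.
import math

def cosine(p, q):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

# Made-up likelihoods over four topics (each vector sums to 1).
article_a = [0.70, 0.20, 0.05, 0.05]
article_b = [0.60, 0.30, 0.05, 0.05]
article_c = [0.05, 0.05, 0.20, 0.70]

print(cosine(article_a, article_b))  # similar topic mix: high
print(cosine(article_a, article_c))  # different topic mix: low
```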
Thanks,
Haifeng

_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l