Hi Mara,

since you were asking about ontologies, let me point you to our work on computational fact checking from knowledge networks (PLoS ONE). We developed a measure of semantic similarity based on shortest paths between any two Wikipedia concepts, using the linked data from DBpedia. These are the links found in the infoboxes of Wikipedia articles, so they are a subset of the hyperlinks on the whole page.

In the article we use it to check simple relational statements, but it could be put to other uses too. There are also a couple of other approaches from the literature, cited in the paper, that could be relevant for what you are doing.

HTH!

Giovanni

On Sun, Feb 19, 2017 at 2:56 PM, Mara Sorella <sorella@dis.uniroma1.it> wrote:

Hi everybody,

I'm new to the list and was referred here by a comment from an SO user on my question [1], which I quote next:
I have successfully used the Wikipedia pagelinks SQL dump to obtain hyperlinks between Wikipedia pages for a specific revision time.
However, there are cases where multiple instances of such links exist, e.g. between the very same https://en.wikipedia.org/wiki/Wikipedia page and https://en.wikipedia.org/wiki/Wikimedia_Foundation . I'm interested in finding the number of links between pairs of pages for a specific revision.
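To make the counting concrete, here is a minimal sketch, assuming the wikitext of the source page at the revision of interest is already at hand (for example from a pages-articles dump or fetched through the MediaWiki API). The names `WIKILINK`, `normalize_title` and `count_links` are hypothetical, and the approach is only an approximation: it counts links written literally in the wikitext, so links generated by templates are missed and redirects are not resolved.

```python
import re
from collections import Counter

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]].
# Group 1 is the target title without section anchor or label.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def normalize_title(raw):
    """Approximate MediaWiki title normalization (an assumption, not the
    full rules): underscores become spaces, whitespace is collapsed and
    the first letter is upper-cased."""
    title = " ".join(raw.replace("_", " ").split())
    return title[:1].upper() + title[1:]


def count_links(wikitext):
    """Return a Counter mapping target title -> number of wikilinks
    pointing to it in the given wikitext."""
    counts = Counter()
    for match in WIKILINK.finditer(wikitext):
        target = normalize_title(match.group(1))
        # Crude NS0 filter: skip anything with a colon (File:, Category:,
        # interwiki links). This also drops article titles that contain
        # a colon, so a real filter should check known namespace prefixes.
        if not target or ":" in target:
            continue
        counts[target] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "The [[Wikimedia Foundation]] hosts [[Wikipedia]]. "
        "See [[wikimedia_Foundation|WMF]] and "
        "[[Wikimedia Foundation#History|its history]]. "
        "[[File:Logo.png|thumb|Logo]]"
    )
    for title, n in count_links(sample).most_common():
        print(title, n)
```

Running `count_links` over every NS0 page of the chosen revision would give, for each (source, target) pair, a multiplicity that could serve as the edge weight.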
Ideal solutions would involve dump files other than pagelinks (which I'm not aware of), or using the MediaWiki API.

To elaborate, I need this information to weight (almost) every hyperlink between article pages (that is, in NS0) that was present in a specific Wikipedia revision (end of 2015). I would therefore prefer not to follow the solution suggested by the SO user, which would be rather impractical.

Indeed, my final aim is to use this weight as a threshold to sparsify the Wikipedia graph (which, due to its short diameter, is more or less a giant connected component) in a way that reflects the "relatedness" of the linked pages (where relatedness is not intended as strictly semantic, but at a higher "concept" level, if I may say so).

For this reason, other suggestions on how to determine such weights (possibly using other data sources -- ontologies?) are more than welcome.

The graph will be used as a dataset to test an event tracking algorithm I am doing research on.

Thanks,
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l