Hoi,

Really, how? We have over 280 Wikipedias, we have Wikisources etc. How do you realistically think there would be something useful?

Thanks,
GerardM

On 5 April 2016 at 13:48, John Erling Blad <jeblad@gmail.com> wrote:

First you say that the heuristic isn't perfect, then you say that "As long as we don't have notability criteria in a machine readable format we can only work with heuristics." and then "And I really don't believe machine readable notability criteria is something we should strive for." If the heuristic isn't perfect then alternatives should be investigated. There are already machine readable notability criteria in there; the only thing missing is exposing them, probably by using the existing relations.

On Tue, Apr 5, 2016 at 11:32 AM, Lydia Pintscher <Lydia.Pintscher@wikimedia.de> wrote:

On Sun, Apr 3, 2016 at 4:28 PM John Erling Blad <jeblad@gmail.com> wrote:

Just read through the doc, and found some important points. I post each one in a separate mail.

Things that are directly noticeable, like an area enclosed in an area using the language, are somewhat easy to identify, but things that are noticeable by association with another noticeable thing are not. Take a Danish slave ship operated by a Norwegian firm: the ship is thus noticeable in nowiki. I would say that all things linked as an item from other noticeable things should be included. Some would perhaps say that "items with second order relevance should be included".
> Since it is hard to decide which content is actually notable, the items
> appearing in the search should be limited to the ones having at least one
> statement and two sitelinks to the same project (like Wikipedia or Wikivoyage).
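
The rule quoted above can be sketched in a few lines. This is only an illustration, not the code used by the search: it assumes an item shaped like the Wikidata entity JSON (a "claims" mapping and a "sitelinks" mapping keyed by site ID), and the helper names project_family and is_searchable, as well as the suffix table, are made up for this example. Guessing the project from the site ID suffix is a simplification.

```python
from collections import Counter

# Assumption: site IDs such as "nowiki" or "enwikivoyage" can be mapped to a
# project family by suffix. Longer suffixes are listed before plain "wiki".
SUFFIX_TO_FAMILY = (
    ("wikivoyage", "wikivoyage"),
    ("wikisource", "wikisource"),
    ("wikiquote", "wikiquote"),
    ("wikibooks", "wikibooks"),
    ("wikinews", "wikinews"),
    ("wikiversity", "wikiversity"),
    ("wiktionary", "wiktionary"),
    ("wiki", "wikipedia"),
)

# Assumption: these single-site projects are treated as their own family.
SPECIAL_SITES = {"commonswiki", "specieswiki", "metawiki", "wikidatawiki"}


def project_family(site_id):
    """Guess the project family of a sitelink from its site ID (hypothetical helper)."""
    if site_id in SPECIAL_SITES:
        return site_id
    for suffix, family in SUFFIX_TO_FAMILY:
        if site_id.endswith(suffix):
            return family
    return site_id


def is_searchable(item):
    """Apply the quoted baseline: >= 1 statement and >= 2 sitelinks to one project."""
    statement_count = sum(len(group) for group in item.get("claims", {}).values())
    if statement_count < 1:
        return False
    per_family = Counter(project_family(site) for site in item.get("sitelinks", {}))
    return any(count >= 2 for count in per_family.values())
```

An item linked from nowiki and nnwiki with one statement passes; an item linked only from enwiki and enwikivoyage does not, because those are two different projects.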
This is a good baseline, but figuring out what is notable locally is a bit more involved. A language is used in a local area, and within that area some items are more important just because they reside within the area. This is quite noticeable in the differences between nnwiki and nowiki, which both basically cover "Norway". Also, items that somehow relate to the local area or language are more noticeable than those outside those areas. By traversing upwards in the claims using the "part of" property it is possible to build a priority on the area involved. It is also possible to traverse "nationality" and a few other properties.

Yes, the heuristic we're using isn't perfect. However, I believe it is good enough for 99% of the cases while being really simple. This is what we need at the beginning. As we go along we can learn and see if other things make more sense.

We have taken the exact same approach to ranking for item suggestions on Wikidata. At first all we took into account was the number of sitelinks on the items. This definitely wasn't a perfect measure for how relevant an item is, but it was absolutely good enough while introducing very little complexity. As we learned more and as Wikidata grew, it was no longer good enough, so we switched the algorithm to also take into account the number of labels. This is still relatively low complexity while producing good results.

For the particular case of notability: As long as we don't have notability criteria in a machine readable format we can only work with heuristics. And I really don't believe machine readable notability criteria is something we should strive for.

Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
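
The "part of" traversal proposed in this thread could look roughly like the sketch below. It is a guess at the idea, not an existing Wikidata feature: the edge table is a plain in-memory dict, the function names hops_to_area and local_priority are invented for this example, and the 1 / (1 + hops) score is an arbitrary choice. The only real identifier is P361, the Wikidata property "part of"; other properties (for example one standing in for "nationality") could be passed in through the properties argument.

```python
from collections import deque

PART_OF = "P361"  # Wikidata property "part of"


def hops_to_area(item_id, area_id, edges, properties=(PART_OF,), max_hops=5):
    """Breadth-first walk upwards from item_id; return hops to area_id, or None.

    edges is an assumed structure: {item_id: {property_id: [target_item_id, ...]}}.
    """
    seen = {item_id}
    queue = deque([(item_id, 0)])
    while queue:
        current, hops = queue.popleft()
        if current == area_id:
            return hops
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for prop in properties:
            for parent in edges.get(current, {}).get(prop, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append((parent, hops + 1))
    return None


def local_priority(item_id, area_id, edges, properties=(PART_OF,), max_hops=5):
    """Score in (0, 1]: closer to the area means higher; 0.0 if unreachable."""
    hops = hops_to_area(item_id, area_id, edges, properties, max_hops)
    return 0.0 if hops is None else 1.0 / (1 + hops)
```

With a toy table where a town is part of a county and the county is part of the country, the town is two hops from the country and gets a priority of 1/3, while an item with no path to the country gets 0.0.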
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
http://wikimedia.de
Imagine a world in which every single human being can freely share in the sum of all knowledge.
That's our commitment.
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B.
Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin,
Steuernummer 27/029/42207.