I just read through the doc and found some important points. I will post each one in a separate mail.
Since it is hard to decide which content is actually notable, the items appearing in the search should be limited to the ones having at least one statement and two sitelinks to the same project (like Wikipedia or Wikivoyage).
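To make the baseline rule above concrete, here is a minimal sketch in Python. It assumes an item is a plain dict with "claims" and "sitelinks" keys, loosely modelled on the Wikidata JSON layout. The suffix table, the STANDALONE set and the function names are my own illustrative assumptions, not anything taken from the thesis or the extension.

```python
from collections import Counter

# Assumed project suffixes of sitelink ids such as "nowiki" or "enwikivoyage".
# Longer suffixes come first so "enwikivoyage" is not mistaken for a Wikipedia.
SUFFIXES = ("wikivoyage", "wikisource", "wikiquote", "wikibooks",
            "wikinews", "wikiversity", "wiktionary", "wiki")

# Assumed single-site projects that end in "wiki" but are not Wikipedias.
STANDALONE = {"commonswiki", "wikidatawiki", "specieswiki", "metawiki"}


def project_of(site_id):
    """Map a sitelink id to a coarse project name (illustrative only)."""
    if site_id in STANDALONE:
        return site_id
    for suffix in SUFFIXES:
        if site_id.endswith(suffix):
            return suffix
    return site_id


def passes_baseline(item, min_statements=1, min_sitelinks=2):
    """True if the item has enough statements and enough sitelinks
    pointing into one and the same project."""
    statements = sum(len(v) for v in item.get("claims", {}).values())
    if statements < min_statements:
        return False
    per_project = Counter(project_of(s) for s in item.get("sitelinks", {}))
    return any(n >= min_sitelinks for n in per_project.values())
```

With this sketch, an item linked from nowiki and nnwiki passes, while one linked from nowiki and enwikivoyage does not, since those are two different projects.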
This is a good baseline, but figuring out what is notable locally is a bit more involved. A language is used in a local area, and within that area some items are more important simply because they are located there. This is quite noticeable in the differences between nnwiki and nowiki, which both basically cover "Norway". Items that somehow relate to the local area or language are also more notable than those that do not. By traversing upwards in the claims using the "part of" property, it is possible to build a priority based on the area involved. It is also possible to traverse "nationality" and a few other properties.
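A rough sketch of the upward traversal, assuming the "part of" claims (and any other properties one chooses to follow) have already been flattened into a dict that maps each item to its parent items. The function names, the depth limit and the 1/(1 + distance) scoring are assumptions made purely for illustration.

```python
from collections import deque


def distance_to_area(item, local_areas, parents, max_depth=5):
    """Breadth-first walk upwards through the parent links.

    Returns the number of hops needed to reach any item in local_areas,
    or None if no local area is reachable within max_depth hops.
    """
    seen = {item}
    queue = deque([(item, 0)])
    while queue:
        current, depth = queue.popleft()
        if current in local_areas:
            return depth
        if depth == max_depth:
            continue
        for parent in parents.get(current, ()):
            if parent not in seen:  # guards against cycles in the claims
                seen.add(parent)
                queue.append((parent, depth + 1))
    return None


def priority(item, local_areas, parents):
    """Illustrative score: 1 for the local area itself, falling off with distance."""
    d = distance_to_area(item, local_areas, parents)
    return 0.0 if d is None else 1.0 / (1 + d)


# Toy data, labels used in place of real item ids.
parents = {
    "Bergen": ["Hordaland"],
    "Hordaland": ["Norway"],
    "Copenhagen": ["Denmark"],
}
print(priority("Bergen", {"Norway"}, parents))
print(priority("Copenhagen", {"Norway"}, parents))
```

For nowiki the set of local areas would contain Norway, so Bergen gets a non-zero priority while Copenhagen gets none. Following "nationality" as well would only mean adding more edges to the same dict.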
Things that are directly notable, like an area enclosed in an area where the language is used, are fairly easy to identify, but things that are notable by association with another notable thing are not. Take a Danish slave ship operated by a Norwegian firm: the ship is thereby notable in nowiki. I would say that all things linked as items from other notable things should be included. Some would perhaps say that "items with second-order relevance should be included".
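The second-order rule could be sketched like this, assuming we already have a set of locally notable items and a dict of outgoing item links. Both structures and the function name are hypothetical, and the direction of the link (firm to ship) is chosen only to match the example above.

```python
def with_second_order(notable, links):
    """Return the notable items plus everything they link to directly.

    Only one hop is followed, so items linked from the newly added
    items are not pulled in.
    """
    included = set(notable)
    for item in notable:
        included.update(links.get(item, ()))
    return included


# Toy data: the firm is notable locally, the ship is linked from it.
links = {
    "Norwegian firm": ["Danish slave ship"],
    "Danish slave ship": ["Home port"],
}
print(sorted(with_second_order({"Norwegian firm"}, links)))
```

Stopping after one hop keeps the set from growing to cover most of Wikidata, which is what would happen if the links were followed transitively.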
On Sat, Apr 2, 2016 at 11:09 PM, Luis Villa luis@lu.is wrote:
On Sat, Apr 2, 2016, 4:34 AM Lucie Kaffee lucie.kaffee@wikimedia.de wrote:
I wrote my Bachelor's thesis on "Generating Article Placeholders from Wikidata for Wikipedia: Increasing Access to Free and Open Knowledge". The thesis summarizes a lot of the work done on the ArticlePlaceholder extension ( https://www.mediawiki.org/wiki/Extension:ArticlePlaceholder )
I uploaded the thesis to Commons under a CC-BY-SA license; you can find it at https://commons.wikimedia.org/wiki/File:Generating_Article_Placeholders_from...
I am continuing to work on the extension and aim to deploy it to the first interested Wikipedias in the coming months.
I am happy to answer questions related to the extension!
Great work on something that I *believe* has a lot of promise - thanks! I really think this approach could help take back some readership from Google, and potentially drive more new editors in the long run as well. (I know that was part of the theory behind LSJbot, though I don't know if anyone has actually a/b tested that.)
I was somewhat surprised to not see data collection discussed in Section 8.10 - are there plans to do that? I would have expected to see a/b testing discussed as part of the deployment methodology, so that it could be compared both to the current baseline and also to similar approaches (like the ones you survey in Section 3).
Thanks again for the hard work here-
Luis
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata