Yaroslav, thanks for posting. I had no idea. Thanks for your work on this too.

On Sat, May 30, 2015 at 10:05 AM, Yaroslav M. Blanter <putevod@mccme.ru> wrote:
On 2015-05-29 17:42, Markus Krötzsch wrote:
Hi Jane, hi Romaine,

I think we agree that valuable information should be kept if at all
possible. My chief concern is that orphaned items do not have a clear
identity. It's not useful to know that "something" is at a certain
location. The first thing we must determine is what this "thing" is
that we are talking about. Links to Wikipedia are a good way of doing
this. Without them, we need to come up with other identity-providing
sources. We certainly have the right infrastructure for this (with all
the identifier properties that point to other databases and authority
files).

The first goal of anyone who wants to save an orphan should be to
connect it with the outside world so as to give it some grounding to
build on.

A weaker way to provide basic grounding is to make internal
connections. There are cases where this is strong (one can identify
items as "the author of War & Peace" or "the mother of Marie
Skłodowska-Curie"), but there are other cases where it is too weak
("the town in Germany" or "the part of Europe" do not identify
anything). One would need to give this more thought if one wanted to
determine automatically if an item receives its identity from the
incoming/outgoing links to other items.
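
As a rough illustration of the distinction above (external grounding
versus internal-only links), here is a minimal sketch. The Item class and
its fields are hypothetical simplifications, not the Wikidata data model,
and the check deliberately does not try to judge whether internal links
are strong enough to identify anything.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    # Hypothetical, simplified view of an item (assumption for illustration).
    sitelinks: list = field(default_factory=list)     # e.g. ["enwiki"]
    external_ids: dict = field(default_factory=dict)  # identifier property -> value
    item_links: int = 0                               # incoming + outgoing links to other items


def grounding(item: Item) -> str:
    """Classify where an item's identity could come from."""
    # Sitelinks or identifiers in other databases tie the item to the outside world.
    if item.sitelinks or item.external_ids:
        return "external"
    # Links to other items are the weaker form; whether they really
    # identify the item still needs human judgement.
    if item.item_links > 0:
        return "internal-only"
    return "none"
```

An item that comes back as "internal-only" or "none" would be a
candidate for a closer look, not an automatic decision.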

Cheers,

Markus



Actually, we already have tools designed by Pasleim to track such items:

https://www.wikidata.org/wiki/User:Pasleim/notability

https://www.wikidata.org/wiki/User:Pasleim/Items_for_deletion/Almost_empty

I usually check first that there are no backlinks. If there are none, I check the history. If it turns out the item is empty because of a non-automated merge, I merge it; if it is empty because the only interwiki link was deleted on the project, I delete it as non-notable.
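
The manual checks described above could be sketched roughly as follows. This is only an illustration: the EmptyItem fields are hypothetical stand-ins for what one looks up by hand on the item page and in its history, and sending items with backlinks to manual review is an assumption, since the procedure above only covers items without backlinks.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    MERGE = "merge"
    DELETE = "delete as non-notable"
    REVIEW = "review manually"


@dataclass
class EmptyItem:
    # Hypothetical fields, filled in from the item page and its history.
    backlink_count: int
    emptied_by_manual_merge: bool
    only_sitelink_deleted_on_project: bool


def triage(item: EmptyItem) -> Action:
    # Assumption: an item with backlinks is still in use somewhere,
    # so it is left for a human to look at.
    if item.backlink_count > 0:
        return Action.REVIEW
    # Emptied by a non-automated merge: finish the merge.
    if item.emptied_by_manual_merge:
        return Action.MERGE
    # The only interwiki link was deleted on the project: delete.
    if item.only_sitelink_deleted_on_project:
        return Action.DELETE
    # Anything else (for example items that never had links) needs judgement.
    return Action.REVIEW
```

For example, triage(EmptyItem(0, True, False)) gives Action.MERGE, while an item that never had any links falls through to manual review.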

The problem cases are often items which never had any links. Many of them are spam, but some of them serve structural needs and can be kept. In practice it is not always easy to tell which is which, especially if they are in non-Latin and non-Cyrillic alphabets.

Cheers
Yaroslav

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata