Responses inline...
On Mar 22, 2015, at 1:53 PM, Dmitry Brant dbrant@wikimedia.org wrote:
Hi Lydia,
Indeed, there are many more Wikidata items than Wikipedia articles. However, the users of our mobile apps only see Wikipedia articles in our search results (at least for now),
They are also used in "Recent" and "Nearby", and Vibha wants them in the "Saved Pages" list as well.
which means that they will only be able to contribute descriptions to Wikidata items for which a Wikipedia article exists.
No doubt, the description field is an important component of each Wikidata entry. But, when there is a corresponding Wikipedia article, why not query it to provide an automatic description?
This could be based on the first sentence of the article, or a subset of the first sentence, or some other kind of metadata within the article.
For example, take the enwiki "Fish" article.
The first couple sentences are these:
*A fish is any member of a paraphyletic group of organisms that consist of all gill-bearing aquatic craniate animals that lack limbs with digits. Included in this definition are the living hagfish, lampreys, and cartilaginous and bony fish, as well as various extinct related groups.*
So if we reduce the description to its first sentence we have:
*A fish is any member of a paraphyletic group of organisms that consist of all gill-bearing aquatic craniate animals that lack limbs with digits.*
Now, for the sake of argument, let's imagine the *bold* words below represent a best case scenario for a relevant *subset* of the first sentence:
*A fish is any member of a paraphyletic group of organisms that consist of all gill-bearing aquatic craniate animals that lack limbs with digits.*
"A fish is a gill-bearing aquatic animal".
That's nice and short and descriptive and reads like a little sentence. It's arguably the best reduction of the first sentence possible.
But reducing the first sentence in this way is deceptively complicated to do programmatically, precisely because of the word "arguably" in the preceding sentence. You have to know *what a fish is* to know what parts of the first sentence are *most* important.
In other words, the "best" description is much more qualitative than it is quantifiable.
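To make that concrete, here is a minimal Python sketch. It is not anything the apps actually do; the `first_sentence` helper and its regex are assumptions made up for illustration. Getting the first sentence is the easy half, and even that only works under a naive rule (a sentence ends at ".", "!" or "?" followed by whitespace and a capital letter):

```python
import re

# Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace
# and an uppercase letter. Illustrative heuristic only (an assumption,
# not an existing API).
_SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def first_sentence(text: str) -> str:
    """Return the text up to the first naive sentence break."""
    return _SENTENCE_BREAK.split(text.strip(), maxsplit=1)[0]


FISH_INTRO = (
    "A fish is any member of a paraphyletic group of organisms that consist "
    "of all gill-bearing aquatic craniate animals that lack limbs with digits. "
    "Included in this definition are the living hagfish, lampreys, and "
    "cartilaginous and bony fish, as well as various extinct related groups."
)

# Works for the "Fish" example: prints the first sentence only.
print(first_sentence(FISH_INTRO))

# An abbreviation is enough to fool the heuristic:
print(first_sentence("Dr. Smith studies fish. He lives in Berlin."))  # prints "Dr."
```

Nothing in this sketch can tell that "gill-bearing aquatic" matters more than "paraphyletic" or "craniate". That selection step is the part that needs to know what a fish is.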
type of living organism typified by living in water and having gills
The key is that the description would stay with the article, which would eliminate the need for duplication and synchronization.
So, in a sense, I would look at it the other way: descriptions within Wikipedia articles would be useful for Wikidata entries.
-Dmitry
On Sun, Mar 22, 2015 at 4:17 PM, Lydia Pintscher <lydia.pintscher@wikimedia.de> wrote:
On Sun, Mar 22, 2015 at 9:10 PM, Dmitry Brant dbrant@wikimedia.org wrote:
Hi Jane,
Perhaps my comments came off as more pessimistic than I intended. Of course I believe in the power of crowdsourcing, and I would never want to make anyone feel like their contributions are being marginalized.
I'll agree for now that the idea of "fully" automated descriptions leans more towards science fiction than reality. :)
However, my whole point has more to do with the apparent duplication of content that seems to be happening between the first sentence of Wikipedia articles and the corresponding Wikidata description. There's something about it that seems unnecessary. If we can figure out a way to automatically extract the description from the first sentence of the article, it would simplify things in two ways:
1) People wouldn't need to edit Wikidata descriptions, and would instead focus on improving the Wikipedia article.
2) People who monitor changes made to articles would need to monitor only the article, instead of the article plus its corresponding Wikidata description.
There are a lot more items on Wikidata than articles on Wikipedia. And not every language has a Wikipedia article for each item. Don't just look at descriptions on Wikidata as something useful for Wikipedia. They're much more than that.
Cheers Lydia
-- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata
Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l