Hi Monte!
Replies inline:
> Deeply hard, in fact, because it's complicated not only by language syntax and grammatical rules, but also by qualitative factors (readability, meaning, context, relevance etc).
> This already complicated situation then becomes many orders of magnitude more difficult because these qualitative factors can differ between languages.
Again, I agree that this is not an easy problem. However, in the case of language translations, automated descriptions have the potential to simplify things tremendously. The algorithm for the grammar and syntax of a given language needs to be written only once. And once it's written, it can be applied to every Wikidata item, past and future. Sure, there would likely be a different algorithm for each language, and maybe even different algorithms for various taxa of Wikidata items. But this kind of solution simply feels more scalable, and I'm surprised that researching methods of accomplishing this is of so little interest.
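To make the "write it once per language" idea concrete, here's a toy sketch in Python. The item structure, the property names, and the function names are all made up for illustration; this is not the real Wikidata data model, and real grammar rules would be far more involved:

```python
# Hypothetical, heavily simplified item: labels of two properties per language.
ITEM = {
    "instance_of": {"en": "city", "de": "Stadt"},
    "country": {"en": "Poland", "de": "Polen"},
}

def describe_en(item):
    # English rule: "<type> in <country>"
    return f"{item['instance_of']['en']} in {item['country']['en']}"

def describe_de(item):
    # German rule (happens to share the same word order for this simple case)
    return f"{item['instance_of']['de']} in {item['country']['de']}"

# One rule set per language, written once and reused for every item.
DESCRIBERS = {"en": describe_en, "de": describe_de}

def describe(item, lang):
    """Generate a short description for `item` in language `lang`."""
    return DESCRIBERS[lang](item)

print(describe(ITEM, "en"))
print(describe(ITEM, "de"))
```

Supporting a new language means adding one function to the table, and different taxa of items could get their own rule tables in the same way.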
> I predict this won't be any worse than what happened when we enabled section editing.
But when we enabled section editing, did we do it with a prominent call to action? I just feel a little hesitation about going full-on with something like this, without having a baseline level of administrative feedback in the apps (e.g. a notification for when a description is reverted, and the reason for it).
To be clear, of course I'm totally on board for experimenting with allowing users to contribute descriptions. Making bold moves is what makes our team so great. My goal is simply to point out various other solutions that, to me, make slightly more sense (and to welcome feedback on why they don't!).
> But reducing the first sentence in this way is deceptively complicated to do programmatically, precisely because of the word "arguably" in the preceding sentence - it's almost entirely a matter of qualitative judgement. You have to know what a fish is to know what parts of the first sentence are most important
That's almost convincing :) but still... why duplicate content when the essential information is already there?
Maybe I didn't convey my idea of "markup" for extracting a description properly. For example, the description for the [[Fish]] article can be marked up as follows:
A fish is any member of a paraphyletic group of organisms that consist of all <description>gill-bearing aquatic craniate animal</description>s that lack limbs with digits.
The above markup would be done by a human editor, with the knowledge that the text within the <description> tag will end up as the Wikidata description. I would wager that a similar scheme could be applied to any number of articles. Let's try it for a few random articles:
Poland (Polish: Polska; pronounced [ˈpɔlska]), officially the Republic of Poland (Polish: Rzeczpospolita Polska; pronounced [ʐɛt͡ʂpɔˈspɔʎit̪a ˈpɔlska]), is a
bordered by Germany to the west; the Czech Republic and Slovakia to the south...
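For what it's worth, pulling the marked-up text back out should be the easy part. Here's a rough sketch in Python, assuming the hypothetical <description> tag from the Fish example (as far as I know no such tag exists in MediaWiki today, and the function names are mine):

```python
import re

# Matches the hypothetical <description>...</description> tag.
DESCRIPTION_TAG = re.compile(r"<description>(.*?)</description>", re.DOTALL)

def extract_description(wikitext):
    """Return the text inside the first <description> tag, or None if absent."""
    match = DESCRIPTION_TAG.search(wikitext)
    return match.group(1).strip() if match else None

def strip_markup(wikitext):
    """Return the sentence as readers would see it, with the tags removed."""
    return DESCRIPTION_TAG.sub(r"\1", wikitext)

fish = (
    "A fish is any member of a paraphyletic group of organisms that consist "
    "of all <description>gill-bearing aquatic craniate animal</description>s "
    "that lack limbs with digits."
)

print(extract_description(fish))
print(strip_markup(fish))
```

The article text stays exactly as it reads now, and the description is simply a tagged slice of it, so nothing is duplicated.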