In case you missed it, there is a great post by Magnus about descriptions [1].
The case is often made that descriptions as they exist are evil. They are atrocious, and for whatever reason it makes no difference that a much better solution exists. This was discussed at Wikimania in London, and it seems that people hold a religious belief that people will do better.
Automated descriptions can easily be improved upon in two ways, all the time, every time, by
- improving the algorithm for automated descriptions, including by considering language-specific issues
- improving the result of the algorithm by adding statements where they are lacking on items
I have blogged about this issue in the past. The arguments against the current crop of descriptions are convincing. Why do we not get rid of all that rubbish? The only argument I know that has some merit is that people invested time in them. The sad thing is that it was a waste, because the results are not good, they are not convincing, and they will never cover all 280+ languages Wikidata supports.
Thanks,
GerardM
[1] http://magnusmanske.de/wordpress/?p=342
The case is made often that descriptions as they exist are evil. They are atrocious Why do we not get rid of all that rubbish. [and replace with] Automated descriptions … can easily be improved upon in two ways ..
I agree in general, except for items that don’t have much data, e.g. a person’s life years (or that have too much data to select from easily, e.g. 10 occupations of which only 1 is really notable). For people, I mostly copy the description from Getty ULAN: that’s very good, even if the life years are unknown (thus set too wide, or missing).
So my point is, there should also be an algorithm to decide whether to replace the manual description.
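As a rough illustration of what such a decision rule could look like, here is a minimal Python sketch. Nothing in it comes from an existing tool: the `Item` structure, the function name `should_replace_manual` and the occupation threshold are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """Hypothetical, simplified view of a person item."""
    manual_description: str = ""
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    occupations: List[str] = field(default_factory=list)


def should_replace_manual(item: Item, max_occupations: int = 3) -> bool:
    """Return True when an automated description is likely to be no worse."""
    # No manual description at all: anything generated is an improvement.
    if not item.manual_description.strip():
        return True
    # Too little data: the automated text would be nearly empty.
    if item.birth_year is None and item.death_year is None and not item.occupations:
        return False
    # Too much data: with many occupations we cannot tell which one is notable.
    if len(item.occupations) > max_occupations:
        return False
    return True
```

With these assumed rules, a manual description survives whenever the item is too thin or too crowded for the generator to pick sensible facts, and is replaced otherwise.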
Why people invest time in writing “rubbish”: because there’s no worse description than a missing description. Almost everything should have an EN description, to allow a user to understand what the item is, especially in an auto-complete list. Even a very bad description usually allows that.
Thanks for your work including ULAN descriptions! I agree they are great. As for Monte's earlier response to Magnus's comment about people vs other stuff, I think that Monte's sample effort proves how much "headway" we have achieved on person-items and this is excellent to read. I am a big fan of enabling the crowd, and have been having fun with Magnus's latest gadget that shows me the auto-description, which is of course most challenging when that is blank (no "instance of" property). I spent fifteen minutes trying on this one and couldn't think of anything better than "machine": https://en.wikipedia.org/wiki/Banknote_counter
I am just one Wikidatan but it would be great if others could also keep Wikidata in mind while browsing Wikipedia. Can we publish this gadget in all languages on Wikidata? Maybe we should create a project on Wikidata called "Wikipedia"?
On Thu, Aug 20, 2015 at 8:59 AM, Vladimir Alexiev < vladimir.alexiev@ontotext.com> wrote:
The case is made often that descriptions as they exist are evil. They are atrocious Why do we not get rid of all that rubbish. [and replace with] Automated descriptions … can easily be improved upon in two ways ..
I agree in general, except for items that don’t have much data, e.g. person’s life years, (Or have too much data that can’t be selected easily, e.g. 10 occupations but only 1 is really notable). For people: I mostly copy the description from Getty ULAN: that’s very good, even if the life years are unknown (thus set too wide, or missing).
So my point is, there should also be an algorithm to decide whether to replace the manual description.
Why people invest time in writing “rubbish”: because there’s no worse description than a missing description. Most everything should have an EN description, to allow a user to understand what that is, esp in an auto-complete list. Even a very bad description usually allows that.
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
I am just one Wikidatan but it would be great if others could also keep Wikidata in mind while browsing Wikipedia. Can we publish this gadget in all languages on Wikidata? Maybe we should create a project on Wikidata called "Wikipedia"?
I totally agree. People don't yet realize that Wikidata is not really another project but another aspect of the same project. An example is the data quality question, «is the data quality of Wikidata good enough for Wikipedia?» (I had to answer it once more today, for an enwiki chemistry bot owner). The question disappears when you realize that the data quality of both projects is essentially the same after the data migration step, and that a more coordinated effort with local communities means better quality for everything …
Of course, as Wikidata is not fully featured yet (chemists need units for their numbers), this can weaken the argument a lot, but it becomes more and more credible as development progresses.
2015-08-20 9:22 GMT+02:00 Jane Darnell jane023@gmail.com:
Thanks for your work including ULAN descriptions! I agree they are great. As for Monte's earlier response to Magnus's comment about people vs other stuff, I think that Monte's sample effort proves how much "headway" we have achieved on person-items and this is excellent to read. I am a big fan of enabling the crowd, and have been having fun with Magnus latest gadget that shows me the auto-description, which is of course most challenging when that is blank (no "instance of" property). I spent fifteen minutes trying on this one and couldn't think of anything better than "machine": https://en.wikipedia.org/wiki/Banknote_counter
I am just one Wikidatan but it would be great if others could also keep Wikidata in mind while browsing Wikipedia. Can we publish this gadget in all languages on Wikidata? Maybe we should create a project on Wikidata called "Wikipedia"?
Hoi, This gadget is in active use on many Wikipedias. It makes a big difference because it is part of the extended Wikidata search in those Wikipedias.
When I have to disambiguate between multiple items, I add statements so that I see the difference between items. I can then decide if I need another item or not because Reasonator has its automated descriptions always "up to date". Thanks, GerardM
I also started a Lua module on frwiki in the same spirit, for on-wiki description generation without gadgets: https://fr.wikipedia.org/wiki/Module:Description . It's used in the "Lien Wikidata" template, but it's unclear whether or not Wikipedians on frwiki will take the bait :)
I think from an ergonomic standpoint it would be helpful to treat "descriptions" as "comments". In the case of something from Wikipedia there is a link to Wikipedia and that helps.
For objects where curators and users need to know what this object is, what gotchas are associated with using it, etc, such a facility would be necessary.
Some standard should exist for "auto-generated descriptions" to be considered good enough, but for records like
https://www.wikidata.org/wiki/Q4876286
there ought to be some kind of red mark to say this record is thinner than we would like. If somebody has a problem with that situation they ought to add enough data to autogenerate a description better than
Exists(something): something has label "Beanie Babies 2.0" in the English Language
Hopefully the community can improve the database where its needs lie.
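To make the "red mark" idea concrete, here is a small Python sketch. It is only an illustration: the function name `review_record`, the `[THIN]` marker and the minimum-statement threshold are hypothetical, and the statement in the usage note is made up rather than taken from the actual item.

```python
from typing import Dict, List


def review_record(label: str, statements: Dict[str, List[str]],
                  min_statements: int = 1) -> str:
    """Return a description line, prefixed with a warning mark for thin records."""
    count = sum(len(values) for values in statements.values())
    if count < min_statements:
        # Nothing to generate from: all we can say is that the label exists.
        return f'[THIN] something has label "{label}" in the English language'
    # Otherwise list the available statements in a stable (sorted) order.
    parts = [f"{prop}: {', '.join(values)}"
             for prop, values in sorted(statements.items()) if values]
    return "; ".join(parts)
```

Calling `review_record("Beanie Babies 2.0", {})` yields the flagged fallback line, while adding even one statement, for example `{"instance of": ["toy line"]}`, removes the mark and produces `instance of: toy line`.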
On Thu, Aug 20, 2015 at 6:18 AM, Thomas Douillard < thomas.douillard@gmail.com> wrote:
I also started a lua module on frwiki in the same spirit for on wiki without gadgets description generation: https://fr.wikipedia.org/wiki/Module:Description . It's used in the "Lien Wikidata" template, but it's unclear wether or not Wikipedians in frwiki will catch the bait :)
Hoi, There are many, many items without ANY description. The most important statement for any item is an indication of what that item is about, i.e. "instance of" or "subclass of". With such a statement we can start automating all kinds of things for that item.
I do not really understand your point. When an item is known for what it is, it can already have some automated description. When specific types of item need special attention, we can write routines that describe them well in whatever language.
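A minimal Python sketch of that idea follows: a generic description derived from "instance of", plus a registry of special routines for item types that need more care. The item layout, the function names and the sample labels are assumptions for illustration only and do not reflect how Reasonator or any existing gadget is implemented.

```python
from typing import Callable, Dict, Optional

Labels = Dict[str, str]  # language code -> label


def label_in(labels: Labels, lang: str) -> str:
    """Pick the label for the requested language, falling back to English."""
    return labels.get(lang, labels.get("en", ""))


def describe_generic(item: dict, lang: str) -> str:
    """Default routine: the class label itself is the description."""
    return label_in(item["instance_of"], lang)


def describe_human(item: dict, lang: str) -> str:
    """Special routine for people: occupation plus life years."""
    occupation = item.get("occupation") or item["instance_of"]
    years = f"({item.get('born', '?')}-{item.get('died', '?')})"
    return f"{label_in(occupation, lang)} {years}"


# Hypothetical registry, keyed by the English label of the class.
SPECIAL_ROUTINES: Dict[str, Callable[[dict, str], str]] = {"human": describe_human}


def auto_describe(item: dict, lang: str) -> Optional[str]:
    kind = item.get("instance_of")
    if not kind:
        return None  # no "instance of": nothing to automate yet
    routine = SPECIAL_ROUTINES.get(kind.get("en", ""), describe_generic)
    return routine(item, lang)
```

Because every routine works from statements and labels rather than hand-written text, adding a label in one more language, or one more statement on the item, improves the description everywhere it is shown.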
The biggest problem with the standard descriptions is that nobody cares about them, and their quality is abysmal and not improving. They do not cover all our languages. It is imho a serious waste of effort. Thanks, GerardM
On 20 August 2015 at 18:03, Paul Houle ontology2@gmail.com wrote:
I think from an ergonomic standpoint it would be helpful to treat "descriptions" as "comments". In the case of something from Wikipedia there is a link to Wikipedia and that helps.
For objects where curators and users need to know what this object is, what gotchas are associated with using it, etc, such a facility would be necessary.
Some standard should exist for "auto-generated descriptions" to be considered good enough, but for records like
https://www.wikidata.org/wiki/Q4876286
there ought to be some kind of red mark to say this record is thinner then we like. If somebody has a problem with that situation they ought to add enough data to autogenerate a description better than
Exists(something): something has label "Beanie Babies 2.0" in the English Language
hopefully the community can improve the database in terms of where their needs are.
"Things not Strings", i.e., more Statements and less Descriptions.
Beanie Babies 2.0 needs Statement love, that's all...it already has a page long description with the Wikipedia linkout.
Thad +ThadGuidry https://www.google.com/+ThadGuidry
Exactly
When I find a page with a poor or no auto description I add statements, not descriptions.
Joe
On Thu, 20 Aug 2015 18:53 Thad Guidry thadguidry@gmail.com wrote:
"Things not Strings", i.e., more Statements and less Descriptions.
Beanie Babies 2.0 needs Statement love, that's all...it already has a page long description with the Wikipedia linkout.
Thad +ThadGuidry https://www.google.com/+ThadGuidry