[Wikidata-l] Announce: WikiProject Structured Data for Commons


James Heald

17 Aug 2014

3:54 a.m.

https://www.wikidata.org/wiki/Wikidata:WikiProject_Structured_Data_for_Commo...

The aim of the WikiProject Structured Data for Commons is:

* To develop templates that draw directly on Wikidata (and in future also on Commons Wikibase), that will act as drop-in replacements for templates currently in use on Commons.

* To develop new templates that can bring new functionality to Commons filepages (eg "topics" listings)

* To support the cataloguing of particularly idiosyncratic templates currently in use on Commons (eg institutional credit/backlink templates, and other source templates), and try to produce more generalised, standardised forms that can draw on Wikidata.

* To work with other WikiProjects on Wikidata to understand, document and develop the data models on Wikidata, and make sure that they are sufficient to accommodate the needs of GLAM organisations and others currently or in future uploading or maintaining metadata on Commons.

* To start porting existing data from Commons to Wikidata, where it can be represented in structured form and it is appropriate to do so.

* To examine the divide between what should be stored on Wikidata and what should be stored on the proposed Commons Wikibase.

* To support, as a user-space community, the work of the staffers developing Commons Wikibase and other aspects of the Foundation initiative for Structured Data for Commons in any way we can.

Talk-page comments on, or wholesale re-editing of, this essay at https://commons.wikimedia.org/wiki/Commons:Wikidata/How_GLAMs_can_help_the_S... are also very welcome.

-- J.


Gerard Meijssen

17 Aug

8:05 a.m.

Hoi, I may be stupid, but for me there is no reason in there that will help us in what we do.

For me reasons to wikidatify multi media files are:

- bring labels to Commons that are inherently multi lingual
  - this will enable search in multiple languages
  - it will make it easy to associate photos with the subject matter
- bringing structured data to licenses
  - this will make them intelligible in many languages
  - it will allow for easy categorisation of images that are considered problematic
  - it will allow for an easy "world wide" inclusion once a free license becomes available

Sorry but what you write is only technical, hard to understand and does not motivate at all because it lacks any reason why we should do this. Thanks, GerardM

On 17 August 2014 10:54, James Heald j.heald@ucl.ac.uk wrote:

https://www.wikidata.org/wiki/Wikidata:WikiProject_Structured_Data_for_Commons

The aim of the WikiProject Structured Data for Commons is:

* To develop templates that draw directly on Wikidata (and in future also on Commons Wikibase), that will act as drop-in replacements for templates currently in use on Commons.

* To develop new templates that can bring new functionality to Commons filepages (eg "topics" listings)

* To support the cataloguing of particularly idiosyncratic templates currently in use on Commons (eg institutional credit/backlink templates, and other source templates), and try to produce more generalised, standardised forms that can draw on Wikidata.

* To work with other WikiProjects on Wikidata to understand, document and develop the data models on Wikidata, and make sure that they are sufficient to accommodate the needs of GLAM organisations and others currently or in future uploading or maintaining metadata on Commons.

* To start to port existing such data that can be represented in structured form, and is appropriate to do so, from Commons to Wikidata

* To examine the divide between what should be stored on Wikidata and what should be stored on the proposed Commons Wikibase.

* To support, as a user-space community, the work of the staffers developing Commons Wikibase and other aspects of the Foundation initiative for Structured Data for Commons in any way we can.

Sign up now!

Talk-page comments, or wholescale re-editing, of this essay at https://commons.wikimedia.org/wiki/Commons:Wikidata/How_GLAMs_can_help_the_Structured_Data_for_Commons_initiative also very welcome.

-- J.

Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l

Luca Martinelli

9:45 a.m.

Actually, this is important - it is just that it could be extended to all projects, not limited to Commons. One of the main limitations, sometimes, is that people do not know how to actually import data from Wikidata into their templates. A bit of help or, G*d forgive me, some documentation would be interesting.
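As a rough illustration of what such documentation might start from, here is a minimal Python sketch of reading one value out of a Wikidata item. The `wbgetentities` API action and the JSON layout (labels, claims, mainsnak, datavalue) follow the public Wikidata API; the helper names `entity_url` and `first_string_value`, and the hand-written `sample` item, are assumptions made up for this example. Inside a wiki template the lookup would normally go through the Wikibase client on the wiki itself rather than an HTTP call, so treat this only as a way to see the shape of the data.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def entity_url(qid, languages=("en",)):
    """Build a wbgetentities request URL for a single item (helper name is hypothetical)."""
    params = {
        "action": "wbgetentities",
        "ids": qid,
        "props": "labels|claims",
        "languages": "|".join(languages),
        "format": "json",
    }
    return API + "?" + urlencode(params)


def first_string_value(entity, prop):
    """Return the first plain-string value stored for `prop`, or None if there is none."""
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            value = snak["datavalue"]["value"]
            if isinstance(value, str):
                return value
    return None


# Hand-written sample shaped like one entry of the API's "entities" map.
# The item ID, label and file name are illustrative, not fetched data.
sample = {
    "id": "Q726",
    "labels": {"en": {"language": "en", "value": "horse"}},
    "claims": {
        "P18": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P18",
                    "datavalue": {"value": "Example.jpg", "type": "string"},
                }
            }
        ]
    },
}

if __name__ == "__main__":
    print(entity_url("Q726", ("en", "it")))
    print(first_string_value(sample, "P18"))
```

The URL builder and the parser are kept separate so the parsing can be tried on saved JSON without any network access.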

L.

Gerard Meijssen

10 a.m.

Hoi, Importing data from Wikidata (where do you want it??) is just one application. There are so many potential applications for structured data and Wikidata implicitly covers the sum of all knowledge as we know it (in the Wikimedia projects) so there are opportunities galore.

For people "not to know how to" is a given. I do not care to know about Wikipedia templates because they are freaking impossibly hard.

For me the data in Wikidata has other applications. One important one is providing information when there is no Wikipedia article; another will be helping children find pictures of a *YOUR FAVOURITE ANIMAL HERE* in their own language. That is my vision, my motivation. Abstract language does not help motivate others. Saying things like "I want every eight year old to find pictures of a horse in their own language" may. It also jibes better with what the WMF aims to achieve. Thanks, GerardM
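To make the "horse in their own language" idea concrete, here is a small Python sketch of a label index that lets a search term in any language resolve to the same item, and from there to files tagged with that item. Everything in it is an assumption for illustration: the function names, the simplified `{language: label}` shape, the item ID and the file names are invented sample data, not how Commons search is implemented.

```python
def build_label_index(items):
    """Map (language, lower-cased label) -> set of item IDs."""
    index = {}
    for item_id, labels in items.items():
        for lang, text in labels.items():
            index.setdefault((lang, text.casefold()), set()).add(item_id)
    return index


def find_files(index, files_by_item, lang, term):
    """Return files whose depicted item has `term` as its label in `lang`."""
    found = []
    for item_id in sorted(index.get((lang, term.casefold()), set())):
        found.extend(files_by_item.get(item_id, []))
    return found


# Invented sample data: one item with labels in three languages.
items = {"Q726": {"en": "horse", "nl": "paard", "it": "cavallo"}}
files_by_item = {"Q726": ["Horse_in_field.jpg", "Two_horses.jpg"]}

if __name__ == "__main__":
    index = build_label_index(items)
    print(find_files(index, files_by_item, "nl", "Paard"))
```

The point of the sketch is only that the file is tagged once, with an item, and every language's label then leads to it.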


James Heald

10:21 a.m.

Gerard -- It's not just about getting data out of Wikidata, it's also about getting data into Wikidata. Being able to round-trip what you've got already is a minimum requirement for the data model being flexible enough.
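One way to picture the round-trip requirement is as a check that template fields can be turned into statements and back without losing anything. The Python sketch below assumes a hypothetical mapping from template field names to Wikidata properties; P170 (creator), P571 (inception) and P195 (collection) are real properties, but pairing them with these field names, and the function names themselves, are assumptions made for the example.

```python
# Hypothetical mapping from template fields to Wikidata property IDs.
FIELD_TO_PROPERTY = {"artist": "P170", "date": "P571", "institution": "P195"}
PROPERTY_TO_FIELD = {prop: field for field, prop in FIELD_TO_PROPERTY.items()}


def to_statements(template_fields):
    """Split template fields into mapped statements and fields the model cannot hold."""
    statements, unmapped = {}, {}
    for field, value in template_fields.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            unmapped[field] = value
        else:
            statements[prop] = value
    return statements, unmapped


def to_template(statements):
    """Rebuild template fields from statements."""
    return {PROPERTY_TO_FIELD[prop]: value for prop, value in statements.items()}


def round_trips(template_fields):
    """True only if nothing is left unmapped and the rebuilt fields match the input."""
    statements, unmapped = to_statements(template_fields)
    return not unmapped and to_template(statements) == template_fields


if __name__ == "__main__":
    print(round_trips({"artist": "Rembrandt", "date": "1642"}))
    print(round_trips({"artist": "Rembrandt", "credit line": "Gift of ..."}))
```

A field that ends up in `unmapped` is exactly the kind of quirky case that would show the data model is not yet flexible enough.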

Once we can do that, we can also think how to make *better* templates, that can show complex information (as reflected in the Wikidata data modelling) in a straightforward way. Thinking about how we want to present the data can also point up issues with the data modelling, so there is an important cycle. The data modelling is also helped the more we expose it to real cases, such as quirky cases found in the bulk uploads currently occurring to Commons, and also from the diversity of original metadata sources for those bulk uploads.

All this will mean more information about objects, stored in a more structured way, which should make it much more accessible for the internationalisation and sorting that you rightly raise, as well as enhancing WD's stores of knowledge about items associated with the objects -- creators, painters, places, subjects, etc.

So the aim is to directly improve what we can give people when they click for "More Details" about an image, as well as image searchability, and the whole store of knowledge in the database; and also to help our own and others' skills in working with Wikidata.

Not a bad set of aims for one little Wiki project!

As for the legal side, WMF Legal are very much involved with that at the Foundation level, trying to organise and simplify so one can search rapidly by the key things one might look for in a license. They obviously need to lead on that, but if there is anything we as volunteers can help the central team with, the project aims to be a resource to be called on.

- J.


Luca Martinelli

18 Aug

7:41 a.m.

2014-08-17 17:00 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:


Hoi, Importing data from Wikidata (where do you want it??) is just one application. There are so many potential applications for structured data and Wikidata implicitly covers the sum of all knowledge as we know it (in the Wikimedia projects) so there are opportunities galore.

For people "not to know how to" is a given. I do not care to know about Wikipedia templates because they are freaking impossibly hard.

Yet, if we don't use the Wikidata data in the "freaking impossibly hard" Wikipedia templates, what is the point of Wikidata as a Wikimedia project?

I remember that this project had among its first goals to help disseminate structured data on all Wikimedia projects, in order to relieve the less-crowded WMF projects of their burden in managing such data and to let their few users focus on writing/translating/expanding their articles. Now, if we don't show to people on the WMF project - even the bigger ones - that Wikidata IS useful by helping them in retrieving these data, what is the point of this project?

"There are so many potential application", I know, yet THIS IS ONE OF THEM -- and in my personal and humble opinion a damn important one.

-- Luca "Sannita" Martinelli http://it.wikipedia.org/wiki/Utente:Sannita

Gerard Meijssen

19 Aug

12:56 a.m.

Hoi, What is the point of Wiktionary, Wikipedia, Wikispecies et al as a WMF project? Like Wikidata they all help us share in the sum of all knowledge. Wikidata already provides an application in being the vehicle for interlanguage links.

The low hanging fruit of Wikidata is not sharing info in templates, it is in providing search results where a Wikipedia does NOT have an article. It is used for this and it does have a measurable impact. It is "nice" to have the ambition to share data in templates but be realistic. The quality of the data in Wikidata does not merit this at this time. The "community" insists on sources and frankly it is asinine to expect that in the first few years it will be available near the level that some "demand". This is only based on the data that is there. That is the next problem: we do not have enough data. We are still at the stage where we are harvesting data for the first time. Harvesting big amounts, not one item at a time.

It is important to have goals, and it is nice that at the start providing data to templates was seen as an initial goal. However it will not be like with Pallas Athena when she came from the head of Zeus in full armour. This goal is achievable and we are making big strides in that direction BUT we need smaller goals, small applications that grow our content in both quality and quantity. As I wrote on my blog, we need to think in terms of confidence in our data and not so much in sources. Amir is finishing a tool that will allow us to compare data for "humans" in the English, German and Italian Wikipedia. That will be a massive step in the right direction.
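Thinking in terms of confidence rather than sources can be sketched as an agreement score: if several Wikipedias independently give the same value for a fact, confidence in it goes up. The Python below is only a guess at that idea and is not the tool mentioned above; the function name, the scoring rule and the sample values are all assumptions.

```python
from collections import Counter


def agreement(values_by_wiki):
    """Return (most common value, share of wikis that state it).

    Wikis with no value (None) are ignored; with no values at all
    the result is (None, 0.0).
    """
    values = [v for v in values_by_wiki.values() if v is not None]
    if not values:
        return None, 0.0
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)


if __name__ == "__main__":
    # Invented sample: a birth year as stated on three Wikipedias.
    print(agreement({"en": "1606", "de": "1606", "it": "1607"}))
```

Items where the share is low are the ones worth a human look, which is a smaller and more useful worklist than "everything without a source".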

I care about Wikidata and I know that at this time those freakingly hard templates are the least of our worries. More problematic is that people think of Wikidata as a service product for Wikipedia and limit their thinking to templates. The existing search extension with WDQ is there. It works really well. It is dismissed probably because it demonstrates that ALL Wikipedias cover less than 50% of the subjects known to us. We know all of them because of Wikidata.

So yeah, by all means blow the horn about our aspiration of servicing templates in those projects that can handle this. It is fine. It is not realistic and even counterproductive as an aspiration when we do not appreciate the reality as we have it at this time. Thanks, GerardM


Luca Martinelli

3:27 a.m.

Ok, I got the point. What you probably need to consider is that focusing on one goal does not mean at all that we have to dismiss all the others. At least, *I* do not think so.

You want to focus on research? Fine, do it. I'd like to focus on templates. That's fine too, I guess. We're both working to let Wikidata be appreciated - by separate audiences.

L.

Gerard Meijssen

3:39 a.m.

Hoi, I am not in research. I am into making Wikidata into a reasonable resource. To achieve this I add gazillions of statements and I am happy when people focus on templates. However, without the data, templates that make use of Wikidata are niche applications. Without some mature understanding of what Wikidata is at this time, concentrating on templates is an exercise in building enabling technology.

My point is for the "community" to have reasonable expectations. So many people consider Wikidata to be useless. That is fine but imho it helps when the baseline of where Wikidata is at this time is understood. Thanks, GerardM

The statistics that explain this best can be found here: https://tools.wmflabs.org/wikidata-todo/stats.php?reverse


David Cuenca

4:19 a.m.

Thanks for the stats, Gerard. Two thoughts:

- With so many items without description I wonder why we don't have the automatic descriptions gadget enabled by default.
- There are many items without statements, but not that many articles without a category --> would it be possible to have a game that suggests instance of/subclass of based on the statements of the items in the same or in the upper category?

Thanks Micru
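The suggestion game in the second thought could work roughly like a majority vote within a category. The Python sketch below is one possible reading, under stated assumptions: items are grouped by category, P31 (instance of) is the property being suggested, and a suggestion is only made when at least half of the items that already have a value agree. The function name, the threshold and the placeholder item IDs are hypothetical.

```python
from collections import Counter


def suggest_instance_of(category_members, statements, min_share=0.5):
    """Suggest a P31 value for members that lack one, based on their category peers.

    `statements` maps item ID -> {property ID: value}. Returns a dict of
    item ID -> suggested value, or an empty dict if the peers do not agree enough.
    """
    known = [
        statements[item]["P31"]
        for item in category_members
        if "P31" in statements.get(item, {})
    ]
    if not known:
        return {}
    value, count = Counter(known).most_common(1)[0]
    if count / len(known) < min_share:
        return {}
    return {
        item: value
        for item in category_members
        if "P31" not in statements.get(item, {})
    }


if __name__ == "__main__":
    # Placeholder item IDs; "Q5" (human) is used as the shared value.
    members = ["A", "B", "C", "D"]
    statements = {"A": {"P31": "Q5"}, "B": {"P31": "Q5"}, "C": {}}
    print(suggest_instance_of(members, statements))
```

Each suggestion would still be confirmed or rejected by a player, so the vote only decides what gets offered, not what gets saved.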

On Tue, Aug 19, 2014 at 10:39 AM, Gerard Meijssen <gerard.meijssen@gmail.com

...

wrote:

...

Hoi, I am not in research. I am into making Wikidata into a reasonable resource. To achieve this I add gazillions of statements and I am happy when people focus on templates. However without the data, templates that make use of Wikidata are niche applications. Without some mature understanding of what Wikidata is at this time concentrating on templates is an exercise in building enabling technology.

My point is for the "community" to have reasonable expectations. So many people consider Wikidata to be useless. That is fine but imho it helps when the baseline of where Wikidata is at this time is understood. Thanks, GerardM

The statistics that explain this best can be found here .. https://tools.wmflabs.org/wikidata-todo/stats.php?reverse

On 19 August 2014 10:27, Luca Martinelli martinelliluca@gmail.com wrote:

...
Ok, I got the point. What you probably need to consider is that focusing on one goal does not mean at all that we have to dismiss all the others. At least, *I* do not think so.

You want to focus on research? Fine, do it. I'd like to focus on templates. That's fine too, I guess. We're both working to let Wikidata be appreciated - by separate audiences.

L. Il 19/ago/2014 07:57 "Gerard Meijssen" gerard.meijssen@gmail.com ha scritto:

Hoi,

...
What is the point of Wiktionary, WIkipedia, Wikispecies et al as a WMF project? Like Wikidata they all help us share in the sum of all knowledge. Wikidata already provides an application in being the vehicle for interlanguage links.

The low hanging fruit of Wikidata is not sharing info in templates, it is in providing search results where a Wikipedia does NOT have an article. It is used for this and it does have a measurable impact. It is "nice" to have the ambition to share data in templates but be realistic. The quality of the data in Wikidata does not merit this at this time. The "community" insists on sources and frankly it is assassine to expect that in the first few years it will be available near the level that some "demand". This is only based on the data that is there. That is the next problem we do not have enough data. We are still at the stage where we are harvesting data for the first time. Harvesting big amounts, not one item at a time.

It is important to have goals, and it is nice that at the start providing data to templates was seen as an initial goal. However it will not be like with Pallas Athena when she came from the head of Zeus in full armour. This goal is achievable and we are making big strides in that direction BUT we need smaller goals, small applications that grow our content in both quality and quantity. As I wrote on my blog, we need to think in terms of confidence in our data and not so much in sources. Amir is finishing a tool that will allow us to compare data for "humans" in the English, German and Italian Wikipedia. That will be a massive step in the right direction.

I care about Wikidata and I know that at this time those freakingly hard templates are the least of our worries. More problematic is that people think of Wikidata as a service product for Wikipedia and limit their thinking to templates. The existing search extension with WDQ is there. It works really well. It is dismissed probably because it demonstrates that ALL Wikipedias cover less than 50% of the subjects known to us. We know all of them because of Wikidata.

So yeah, by all means blow the horn about our aspiration of servicing templates in those projects that can handle this. It is fine. It is not realistic and even counterproductive as an aspiration when we do not appreciate the reality as we have it at this time. Thanks, GerardM

On 18 August 2014 14:41, Luca Martinelli martinelliluca@gmail.com wrote:

...
2014-08-17 17:00 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:

Hoi, Importing data from Wikidata (where do you want it??) is just one application. There are so many potential applications for structured data and Wikidata implicitly covers the sum of all knowledge as we know it (in the Wikimedia projects) so there are opportunities galore.

For people "not to know how to" is a given. I do not care to know about Wikipedia templates because they are freaking impossibly hard.

Yet, if we don't use the Wikidata data in the "freaking impossibly hard" Wikipedia templates, what is the point of Wikidata as a Wikimedia project?

I remember that this project had among its first goals to help disseminate structured data on all Wikimedia projects, in order to relieve the less-crowded WMF projects of their burden in managing such data and to let their few users focus on writing/translating/expanding their articles. Now, if we don't show to people on the WMF project - even the bigger ones - that Wikidata IS useful by helping them in retrieving these data, what is the point of this project?

"There are so many potential applications", I know, yet THIS IS ONE OF THEM -- and in my personal and humble opinion a damn important one.

-- Luca "Sannita" Martinelli http://it.wikipedia.org/wiki/Utente:Sannita

Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l


-- Etiamsi omnes, ego non

Lydia Pintscher

9:13 a.m.

On Tue, Aug 19, 2014 at 11:19 AM, David Cuenca dacuetu@gmail.com wrote:

...

Thanks for the stats, Gerard. Two thoughts:

With so many items without description I wonder why we don't have the automatic descriptions gadget enabled by default.

I am a bit worried about enabling this by default for everyone as a gadget. We need the descriptions in a lot of places where people search for items. The next big one will be Commons. But _a lot_ more will come in the future. Think for example of tagging your blog post on Wordpress with Wikidata concepts. You'll need the descriptions. If we enable automatic descriptions on Wikidata now we will actively discourage people from entering more descriptions. That would be bad as 3rd parties then don't get the benefit of them. I am also hesitant to build this into Wikibase directly as it'd need quite some domain-knowledge for all I can tell at this point. That's something we need to avoid. Anyone got ideas how to get out of this?

Cheers Lydia

-- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.

Thomas Douillard

9:34 a.m.

One answer would be deeper integration of autodescription into Wikidata. I don't know how to make this flexible and generic enough that it depends only on Wikibase model concepts, though.

One rough and not thought-through proposition would be: associate a (MediaWiki?) template with a query or a class for each language, which generates a description if an item matches the query, according to the values of the properties.

Then we could add an API call that uses the template.
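
A minimal sketch of the proposal above, not an existing Wikibase feature: each (class, language) pair maps to a template string, and a description is generated only when the item carries every value the template needs. The names `TEMPLATES` and `describe`, and the plain-label keys such as `nationality`, are hypothetical; the item's values are assumed to be already localized for the requested language.

```python
from string import Formatter

# Hypothetical registry: (class label, language code) -> description template.
TEMPLATES = {
    ("human", "en"): "{nationality} {occupation}, {birth_year}-{death_year}",
}


def describe(item, language):
    """Return a generated description, or None if no template applies."""
    template = TEMPLATES.get((item.get("instance_of"), language))
    if template is None:
        return None  # no template registered for this class and language
    # Collect the field names the template refers to.
    needed = [name for _, name, _, _ in Formatter().parse(template) if name]
    if any(name not in item for name in needed):
        return None  # the item lacks data the template requires
    return template.format(**item)


item = {
    "instance_of": "human",
    "nationality": "German",
    "occupation": "mechanical engineer",
    "birth_year": 1885,
    "death_year": 1952,
}
print(describe(item, "en"))
```

Returning None when data is missing, rather than a partial string, keeps the caller free to fall back to a human-written description or to show nothing.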


David Cuenca

10:32 a.m.

It is possible to represent linguistic patterns as items but it needs some work to conceptualize a structure that can be used in several languages. Then you could implement a complex query that selects the one that displays the most information.

The biggest hurdle I see is that we are not storing enough linguistic data. We could already do it without any further development, but I think that we need more time to understand thoroughly the structure that is emerging.

Maybe it could be thought of as a research project to explore the possibilities.


-- Etiamsi omnes, ego non

Andrew Gray

2:46 p.m.

New subject: [Wikidata-l] [Spam] Re: Announce: WikiProject Structured Data for Commons

Can we try displaying them but marking them in some way?

E.g. "[German person 1885-1952]" for an automatic description, "German mechanical engineer, 1885-1952" for a human-written one. Depending on how and where they're shown in the interface, we could also try italicising them, using a lighter shade of text, or something else to mark them out a bit.

Using square brackets is a relatively accepted marking for hypothetical or interpolated information, and would serve to indicate that this is automatically generated and could be wrong. If we want to be more aggressive about it, we could add question marks - "[German person 1885-1952?]" would make it even clearer it's provisional.
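
As a rough illustration of the marking scheme (the function name `display_description` and its parameters are made up for this sketch and are not part of any Wikibase interface), a display layer could prefer the human-written text and fall back to a bracketed automatic one:

```python
def display_description(manual, automatic, provisional=False):
    """Prefer the human-written description; otherwise bracket the automatic one."""
    if manual:
        return manual
    if not automatic:
        return ""  # nothing to show at all
    # A trailing question mark flags the text as even more tentative.
    suffix = "?" if provisional else ""
    return f"[{automatic}{suffix}]"


print(display_description(None, "German person 1885-1952"))
print(display_description(None, "German person 1885-1952", provisional=True))
print(display_description("German mechanical engineer, 1885-1952",
                          "German person 1885-1952"))
```

Italics or a lighter text shade would be applied by the interface around the returned string; the brackets alone survive in plain-text contexts such as dropdown menus.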

Andrew.


-- Andrew Gray andrew.gray@dunelm.org.uk

Joe Filceolaire

2:52 p.m.

New subject: [Wikidata-l] [Spam] Re: Announce: WikiProject Structured Data for Commons

Better to make this a set of statements:

Instance of: human
Sex: male
Nationality: Germany
Date of birth: 1885
Date of death: 1952
Occupation: engineer

All of these are Wikidata properties.
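
For illustration only, the statements above can be written as property/value pairs. The property IDs are the Wikidata properties these labels most plausibly correspond to ("Nationality" is assumed to mean country of citizenship, P27); the values are kept as plain labels here, whereas Wikidata itself would store item IDs or typed values. `STATEMENTS` and `values_of` are hypothetical names.

```python
# Each statement is a (property ID, value) pair.
STATEMENTS = [
    ("P31", "human"),      # instance of
    ("P21", "male"),       # sex or gender
    ("P27", "Germany"),    # country of citizenship (assumed for "Nationality")
    ("P569", 1885),        # date of birth (year only in this sketch)
    ("P570", 1952),        # date of death (year only in this sketch)
    ("P106", "engineer"),  # occupation
]


def values_of(statements, prop):
    """Return every value recorded for one property, in order."""
    return [value for p, value in statements if p == prop]


print(values_of(STATEMENTS, "P106"))
```

A list of pairs rather than a dictionary is used because an item may carry several values for the same property, for example more than one occupation.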

Joe

Markus Krötzsch

20 Aug 20 Aug

3:14 a.m.

On 19.08.2014 16:13, Lydia Pintscher wrote:

...

On Tue, Aug 19, 2014 at 11:19 AM, David Cuenca dacuetu@gmail.com wrote:

...
Thanks for the stats, Gerard. Two thoughts:

With so many items without description I wonder why we don't have the automatic descriptions gadget enabled by default.

I am a bit worried about enabling this by default for everyone as a gadget. We need the descriptions in a lot of places where people search for items. The next big one will be Commons. But _a lot_ more will come in the future. Think for example of tagging your blog post on Wordpress with Wikidata concepts. You'll need the descriptions. If we enable automatic descriptions on Wikidata now we will actively discourage people from entering more descriptions. That would be bad as 3rd parties then don't get the benefit of them. I am also hesitant to build this into Wikibase directly as it'd need quite some domain-knowledge for all I can tell at this point. That's something we need to avoid. Anyone got ideas how to get out of this?

It had been suggested recently on this list to store the autodescriptions in the data, e.g., using a robot. The question there was whether this would make future automated updates too troublesome (since one would have to check whether the description has been overwritten by a human in the meantime). I think this can be solved (see below). For thousands of non-described items this would be a large improvement.

An important point is that there are really two kinds of "descriptions" that we should keep separate, since they have two different purposes:

(1) to provide a clue for distinguishing items with the same label
(2) to give a human-readable, informative summary of the data

The descriptions that we want to have stored on Wikidata are there for the first purpose ("type-1 descriptions" :-). Their main virtue is to be as much to the point as possible, so you can read them quickly in a small dropdown menu etc. (short and accurate, but just enough information to clarify what we are talking about).

Descriptions of the second kind are a completely different issue. They should not be stored on Wikidata (or anywhere), since they will continuously evolve. The more data you have, the better your type-2 description will get. For new kinds of data you will even have to extend the code that generates the texts. Also, these descriptions could be much longer, and indeed their "optimal" length would vary from application to application.

This is why I think that it is fairly safe to import (some) type-1 descriptions without this reducing in any way the importance of type-2 descriptions. Of course even type-1 descriptions will change over time (esp. for living persons or ongoing events), but most of them should be fairly stable (cities, species, many people, ...). Since we are only interested in a very concise description for this purpose, it might be possible to see if an item has enough data yet to create an "ultimate" description (in the sense that more data would not have an effect on the concise description anyway). Magnus will know more about this.

A simple, low-tech way of enabling future updates would be to store the list of imported descriptions somewhere so that future update robots can check if the description has been changed by someone in the meantime without having to consult the page history.
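
A minimal sketch of that check, under the assumption that the stored list is a simple mapping from item to the description the robot last imported. The function name `plan_update` and its return values are hypothetical, not part of any existing bot framework.

```python
def plan_update(current, imported, new):
    """Decide what an update robot should do with one item's description.

    current  -- description on the item right now (None if empty)
    imported -- description the robot stored at its last import (None if never)
    new      -- freshly generated description
    """
    if current is None:
        return "write"  # nothing there yet, safe to add
    if imported is not None and current == imported:
        # Still the robot's own text: refresh it only if it would change.
        return "write" if new != current else "skip"
    return "skip"  # a human wrote or edited it, so leave it alone


# Hypothetical stored list of imported descriptions, keyed by item.
imported_log = {"item-1": "German engineer"}

print(plan_update("German engineer", imported_log.get("item-1"),
                  "German mechanical engineer"))
print(plan_update("inventor of the widget", imported_log.get("item-1"),
                  "German mechanical engineer"))
```

Comparing against the stored text avoids consulting the page history: any difference between the live description and the logged one is treated as a human edit.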

Cheers,

Markus

wikidata@lists.wikimedia.org

15 comments

9 participants

participants (9)

Andrew Gray
David Cuenca
Gerard Meijssen
James Heald
Joe Filceolaire
Luca Martinelli
Lydia Pintscher
Markus Krötzsch
Thomas Douillard