Hey folks :)
The student team working on data quality is making good progress. For some reason their emails don't go through to this list, so I am forwarding their message below.
Cheers Lydia
Hey :)
We are a team of students from the Hasso-Plattner-Institute in Potsdam, Germany. For our bachelor project we're working together with the Wikidata team to ensure their data quality. On our wiki page (https://www.mediawiki.org/wiki/WikidataQuality) we introduce our projects in more detail. One of them deals with constraints, for which we need your input, since we want to work on constraints as statements on properties and the final form of those still needs to be specified. We hope you like our projects so far, and we appreciate your input!
Cheers, the Wikidata Quality Team
Lydia, Thanks. Is this the team responsible for the suggested properties? As a Wikidata user, how do I see their work exactly? Jane
On Fri, Feb 20, 2015 at 11:40 AM, Lydia Pintscher < lydia.pintscher@wikimedia.de> wrote:
-- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata
Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg (district court) under number 23855 Nz. Recognized as a non-profit organization by the Finanzamt für Körperschaften I Berlin (tax office for corporations), tax number 27/681/51985.
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
On Fri, Feb 20, 2015 at 11:54 AM, Jane Darnell jane023@gmail.com wrote:
Lydia, Thanks. Is this the team responsible for the suggested properties? As a Wikidata user, how do I see their work exactly?
Yes they are. https://www.mediawiki.org/wiki/WikidataQuality is the best overview of their work currently. Demos and so on will follow.
Cheers Lydia
Lydia, Thanks - I have used those suggestions tons of times and I have noticed the suggestions getting better. Great and useful work definitely! Jane
On Fri, Feb 20, 2015 at 11:57 AM, Lydia Pintscher < lydia.pintscher@wikimedia.de> wrote:
On Fri, Feb 20, 2015 at 11:54 AM, Jane Darnell jane023@gmail.com wrote:
Lydia, Thanks. Is this the team responsible for the suggested properties? As a Wikidata user, how do I see their work exactly?
Yes they are. https://www.mediawiki.org/wiki/WikidataQuality is the best overview of their work currently. Demos and so on will follow.
Cheers Lydia
On Fri, Feb 20, 2015 at 11:59 AM, Jane Darnell jane023@gmail.com wrote:
Lydia, Thanks - I have used those suggestions tons of times and I have noticed the suggestions getting better. Great and useful work definitely!
Ahhhhh sorry. Misunderstanding on my side ;-) I thought you meant the properties suggested for this project on the property proposal page, not the suggestions when entering a new statement. The teams are from the same institution and professor, but they are not the same students. And yay, you like the previous project's result. Hope you'll like this one as much.
Cheers Lydia
OK thanks for the clarification. I believe both projects are related to the constraints, no? Anyway I think this is good work, though some of the exception reports are so huge that I wonder how useful some constraints are.
Hoi, I like where you are going a lot... GND is given as an example, and for any source, including the GND, we know things can be wrong as well. They regularly are, and there is a process in place for indicating where GND and WP-DE differ. This works well because, in time, the GND does look at the differences and either remedies them or comments. So when a source differs, we need a report. Such a report is GOLD, because it enables us to verify what is right and what differs. The people who care so much about sources will have a field day with this BECAUSE it is actually worthwhile to spend time on such differences.
When GND has data and Wikidata does not, we should import from sources. I prefer data over no data any day, particularly when we strive to have comparisons between multiple sources. Nonsense will weed itself out in this way.
What we need is a process of reconciliation covering what is there in the first place, where sources differ, and how we deal with eventual changes in the sources (they do include the various Wikipedias, Wikisources, Wikibooks, Wikinews, Wikitravels...). Thanks, GerardM
Hoi, One thought: the majority of statements are not made at Wikidata itself but are added using Widar or a bot. How will this affect these processes? Thanks, GerardM
On 20 February 2015 at 12:44, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, I like where you are going a lot... GND is given as an example, and for any source, including the GND, we know things can be wrong as well. They regularly are, and there is a process in place for indicating where GND and WP-DE differ. This works well because, in time, the GND does look at the differences and either remedies them or comments. So when a source differs, we need a report. Such a report is GOLD, because it enables us to verify what is right and what differs. The people who care so much about sources will have a field day with this BECAUSE it is actually worthwhile to spend time on such differences.
When GND has data and Wikidata does not, we should import from sources. I prefer data over no data any day, particularly when we strive to have comparisons between multiple sources. Nonsense will weed itself out in this way.
What we need is a process of reconciliation covering what is there in the first place, where sources differ, and how we deal with eventual changes in the sources (they do include the various Wikipedias, Wikisources, Wikibooks, Wikinews, Wikitravels...). Thanks, GerardM
Hi GerardM,
About “importing from sources”: To be on the safe side regarding data licenses and usage, for now we just want to compare against the data already in Wikidata instead of importing additional data. But we have thought about including a list of additional information in the report, so that it can be added manually.
About “changes in the sources”: The plan is to update the data dumps regularly and to run the cross-check on the latest data each time.
About imported data from Widar or bots: As far as we understand it, the cross-check does not really affect those processes. It is performed on all verifiable properties (those that map to the information of an external source) of verifiable items (those that possess an external identifier to an implemented external source), regardless of how these items were created.
Cheers,
the Wikidata Quality Team
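A minimal sketch of the cross-check described in the reply above, under stated assumptions: items and the external dump are plain dictionaries, an item counts as "verifiable" when it carries an external identifier that is found in the dump, and a property counts as "verifiable" when it is mapped to a field of the external source. The names `cross_check`, `Mismatch`, `P_EXT_ID`, `P_BIRTH` and the sample identifiers are hypothetical placeholders for illustration, not the team's actual code or real Wikidata IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mismatch:
    """One difference between Wikidata and the external source (hypothetical type)."""
    item_id: str
    property_id: str
    wikidata_value: str
    external_value: str


def cross_check(items, external_dump, id_property, property_map):
    """Report values that differ between Wikidata items and an external dump.

    items:         {item_id: {property_id: value}}
    external_dump: {external_id: {field: value}}
    id_property:   property that holds the external identifier
    property_map:  {property_id: field in the external dump}
    """
    mismatches = []
    for item_id, statements in items.items():
        external_id = statements.get(id_property)
        record = external_dump.get(external_id) if external_id is not None else None
        if record is None:
            continue  # not a verifiable item: no usable external identifier
        for property_id, field in property_map.items():
            if property_id not in statements or field not in record:
                continue  # nothing to compare; missing data is never imported
            if statements[property_id] != record[field]:
                mismatches.append(
                    Mismatch(item_id, property_id, statements[property_id], record[field])
                )
    return mismatches


# Made-up sample data; how an item was created (by hand, Widar, bot) plays no role.
ITEMS = {
    "Q_A": {"P_EXT_ID": "ext-1", "P_BIRTH": "1900-01-01"},  # differs from the dump
    "Q_B": {"P_BIRTH": "1950-05-05"},                       # no external identifier
    "Q_C": {"P_EXT_ID": "ext-3", "P_BIRTH": "1800-01-01"},  # agrees with the dump
}
DUMP = {
    "ext-1": {"birth": "1900-01-02"},
    "ext-3": {"birth": "1800-01-01"},
}

report = cross_check(ITEMS, DUMP, "P_EXT_ID", {"P_BIRTH": "birth"})
```

In this sketch only differences are reported and nothing is written back, which matches the stated plan to compare existing data rather than import it.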
On Fri, 20 Feb 2015 at 13:04, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, One thought: the majority of statements are not made at Wikidata itself but are added using Widar or a bot. How will this affect these processes? Thanks, GerardM
Hey folks :)
I wanted to give you a short update on what the students are working on. Attached are two screenshots of the current state of the project. They show two special pages running on test data (so there are more violations than in the real data). This is working pretty smoothly now. The next step is integration into the items themselves, so that you can see a little indicator next to problematic statements.
Cheers Lydia