Hi,
regarding a current topic in Germany, namely the publication of the timetable data of Deutsche Bahn (the German national railway company) and their willingness to discuss it with other open-data supporters: it may be a good idea to provide expiration dates for Wikidata records.
In their open letter to Mr. Kreil [1] they stated that providing the timetable data openly may cause problems if, e.g., somebody uses outdated data.
Marco
[1] http://www.db-vertrieb.com/db_vertrieb/view/service/open_plan_b.shtml
Our approach to this is to mark individual statements as "historic", while the most current data is marked as "preferred". This is already necessary for basic things like population numbers. However, I don't think we need to expire data automatically - it should just be superseded by newer information.
-- daniel
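A minimal sketch of the rank-based approach described above, in JavaScript. The statement shape, the field names `value` and `rank`, the helper `currentValues`, and the population figures are all made up for illustration; this is not Wikidata's actual data model or API.

```javascript
// Hypothetical statements for one property of one item.
// The figures are placeholders, not real population numbers.
const population = [
  { value: 1000, rank: "historic" },
  { value: 1200, rank: "preferred" },
];

// Return only the values marked "preferred". Older values are not
// deleted or expired; they simply stay in place with rank "historic".
function currentValues(statements) {
  return statements
    .filter((s) => s.rank === "preferred")
    .map((s) => s.value);
}

console.log(currentValues(population));
```

With this layout, superseding a value means adding a new "preferred" statement and demoting the old one, so no automatic expiry is involved.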
Hi,
A possible solution, but what will you do if a railway company no longer provides current data? Then the old data is still the most current.
Another use case would be a new law. There you could determine the current one by comparing such dates with the server's date. The new law would be in the repository beforehand, but would only become visible once it comes into force.
So a launch field would also be great.
With launch and expiration dates it would be possible to return no data at all when there is no valid data in the database, which is very important for some applications, as Deutsche Bahn says.
Marco
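The launch/expiration idea above could be sketched roughly as follows. The record shape, the field names `validFrom` and `validTo`, the function `validAt`, and the sample timetables are assumptions for illustration only; the end date is treated as exclusive, which is also just a choice made here.

```javascript
// Hypothetical record shape: validFrom/validTo are Date objects,
// null means "no bound on this side".
function validAt(records, now) {
  return records.filter(
    (r) =>
      (r.validFrom === null || r.validFrom <= now) &&
      (r.validTo === null || now < r.validTo)
  );
}

// Placeholder data: two consecutive yearly timetables.
const timetables = [
  { name: "timetable A", validFrom: new Date(2012, 0, 1), validTo: new Date(2013, 0, 1) },
  { name: "timetable B", validFrom: new Date(2013, 0, 1), validTo: new Date(2014, 0, 1) },
];

// Records loaded ahead of time stay invisible until their launch date,
// and nothing at all is returned once the newest record has expired.
console.log(validAt(timetables, new Date(2012, 5, 1)).map((r) => r.name));
console.log(validAt(timetables, new Date(2013, 5, 1)).map((r) => r.name));
console.log(validAt(timetables, new Date(2014, 5, 1)).map((r) => r.name));
```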
Hi, I think valid_from and valid_to fields would be a great idea, especially for queries on the DB. But I think it is a fundamental design decision, and I'm not sure whether it's possible to integrate it now...
LB
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Seconded.
This would, for example, allow next year's train timetables to be loaded into the database prior to their period of validity, and for the cutover between last year's and this year's timetables to then happen automatically at the appointed date.
-- N.
Sorry if I get back on this issue, but I don't think that the train timetables will be THAT important in the future - except for the ones who have to use Deutsche Bahn's services.
It may be, by the way, that this example is just the wrong one to illustrate something that Wikidata WILL need, namely "historical data".
There's plenty of data that is "valid" only from $DateA to $DateB (e.g. the affiliation to a particular federation, the use of a particular flag, the designation of a particular capital city...). Is this something that you guys have already dealt with or not?
Plus, there's also another thing. Let's say that I want to add to [[Item:Football Federation of Israel]] a particular property "Member". Now, the Israelis were:
- full members of AFC from 1952 to 1974,
- associated members of OFC from 1974 to 1979, and from 1984 to 1991;
- associated members of UEFA from 1979 to 1984;
- full members of UEFA from 1991 on.
How do we deal with the second statement ("valid from $DateA to $DateB and from $DateC to $DateD")? Is that something already resolved?
Thanks in advance for the answers.
Hi,
On 10.10.2012 15:51, Luca Martinelli wrote:
> Sorry if I get back on this issue, but I don't think that the train timetables will be THAT important in the future - except for the ones who have to use Deutsche Bahn's services.
> It may be, by the way, that this example is just the wrong one to illustrate something that Wikidata WILL need, namely "historical data".
The idea was just the result of a recent public discussion in Germany.
If you read the open letter from Deutsche Bahn, you may understand what was meant. Unfortunately it is only in German. I have translated it into English, but do not want to publish it anywhere before it has been approved by the original author.
> There's plenty of data that is "valid" only from $DateA to $DateB (e.g. the affiliation to a particular federation, the use of a particular flag, the designation of a particular capital city...). Is this something that you guys have already dealt with or not?
E.g. laws, incumbencies, memberships, employments, jobs, periods in history, seasons, crises, special offers in shops, etc.
Some of those examples are relevant for Wikipedia, but Wikidata may be used in other cases as well.
> Plus, there's also another thing. Let's say that I want to add to [[Item:Football Federation of Israel]] a particular property "Member". Now, the Israelis were:
> - full members of AFC from 1952 to 1974,
> - associated members of OFC from 1974 to 1979, and from 1984 to 1991;
> - associated members of UEFA from 1979 to 1984;
> - full members of UEFA from 1991 on.
> How do we deal with the second statement ("valid from $DateA to $DateB and from $DateC to $DateD")? Is that something already resolved?
Sounds like a datatype like this:
// Constructor for a validity period with a start and an end date.
function period (start, end) {
  // [...]
  var _start = start;
  var _end = end;
  this.set_start = function (start) { _start = start; };
  this.get_start = function () { return _start; };
  this.get_end = function () { return _end; };
  this.set_end = function (end) { _end = end; };
  // Duration in milliseconds (difference of two Date objects).
  this.get_duration = function () { return _end - _start; };
}
// One membership can hold several separate periods.
var Israel = {
  memberships: {
    OFC: [
      new period(new Date(1974, 0, 1), new Date(1979, 11, 31)),
      new period(new Date(1984, 0, 1), new Date(1991, 11, 31))
    ]
  }
};
or anything similar.
Marco
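To show how a list of periods like the one above might be queried, here is a small self-contained sketch. The helper `isMemberAt` and the plain `{ start, end }` objects are hypothetical; the 1 January and 31 December boundaries are the same assumption the example above makes, since the thread only gives years.

```javascript
// Hypothetical helper: true if the date falls inside any of the periods
// (both boundaries inclusive).
function isMemberAt(periods, date) {
  return periods.some((p) => p.start <= date && date <= p.end);
}

// The two OFC membership periods, with assumed day-level boundaries.
const ofcPeriods = [
  { start: new Date(1974, 0, 1), end: new Date(1979, 11, 31) },
  { start: new Date(1984, 0, 1), end: new Date(1991, 11, 31) },
];

console.log(isMemberAt(ofcPeriods, new Date(1976, 5, 1))); // inside the first period
console.log(isMemberAt(ofcPeriods, new Date(1981, 5, 1))); // in the gap between periods
console.log(isMemberAt(ofcPeriods, new Date(1990, 5, 1))); // inside the second period
```

Keeping one period object per interval avoids having to squeeze "from A to B and from C to D" into a single value.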
Hi,
there is lots of other data that is not valid today but will be in the future... think about a law that is changed today while the old version remains valid until the end of the year...
Is there something like VALID_FROM and VALID_TO in your database?
LB
This is basically what the qualifiers do. http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer has more details.
Cheers Lydia
Hi,
Hm, sorry, I didn't remember this. Thank you for the reminder!
Marco
Has there been any progress on time-based qualifiers since this thread? If so, can someone point me to relevant discussions/proposals?
Thanks Dario
Such an old topic, and it sat unread in my mailbox until now. :-)
Although having validity intervals in the data would be great, I think that now that we have Lua, there is a client-side approach, at least for data as specific as DB's timetable, which is of interest mostly to dewiki.
So is the question whether there should be strong types for qualifiers, as opposed to just a string with custom logic for each type of template box that will be displayed in articles? My opinion is that it is too difficult to know in advance how many qualifiers the data will have, and of what types, to justify adding much specificity to the database right now. Maybe date ranges will be very frequent, maybe they won't. Maybe precision and uncertainty ranges will be frequent, maybe they won't.
Date: Thu, 14 Mar 2013 08:00:46 +0100
From: wikiposta@gmail.com
To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Expiration date for data
> Such an old topic, and it sat unread in my mailbox until now. :-)
> Although having validity intervals in the data would be great, I think that now that we have Lua, there is a client-side approach, at least for data as specific as DB's timetable, which is of interest mostly to dewiki.
Hi Dario,
two or three features are still missing to enable that (sorted in the order in which we will probably deploy them):
- qualifiers
- the time datatype
- statement ranks
As soon as they are available, this can be modeled in a way that is useful for projects accessing the data.
So, progress yes, but it's not there yet :)
Cheers, Denny
Hi, would the qualifiers be something like ... if the language is English, the string can be a noun, a verb, an adjective ...?
When the string is a Dutch noun, it can be masculine, feminine or neuter?
When a qualifier allows for such constructs, we are halfway to implementing a structure that allows for importing OmegaWiki data.
Thanks, Gerard
Thanks Denny for the update and everybody else for the feedback.
The cases I am particularly interested in are those of qualifiers to express that "Elizabeth I was Queen of England between 1558 and 1603", or that "the city of Vibo Valentia was in the Province of Catanzaro up to 1996, in the Province of Vibo Valentia until 2014 and in the Province of Catanzaro-Crotone-Vibo Valentia after 2014".
Until these qualifiers become available, the only way to represent that a region has changed its governor is to overwrite the old value of "head of local government" with the current one.
Dario
Yes, I think once qualifiers are enabled you would just have something like:
...
Property(head of local government)
...
Value(Elizabeth I) - Qualifier("1558-1603") - Sources()
Value(James VI and I) - Qualifier("1603-1625") - Sources()
...
...
There was a discussion about whether qualifiers should have specific datatypes other than just string, but I think we should only do that if needed.
Clearly the example that you gave is one where non-string datatypes are critically important. If you don't know that they're dates, you have no way of telling when they were in those roles.
Tom
For most of the scenarios I can think of, parsing the dates out of strings that follow a standard format by convention will be much easier. The number of ways people will want to use qualifiers will grow like the number of properties and items. So the way I see it, we have to support string-based qualifiers at the minimum. Then I think we should only support strongly typed qualifiers if performance requires it. By setting an update polling frequency on templates that use the information, I don't think we'll run into performance issues for most scenarios. Even in this example the qualifier type is a date range, not just a date. So do we want people to have to choose from a large, fixed list of qualifier types, or to just look at a similar example, set a string to something similar, and have us gradually enforce types on the most popular uses that we see? I think this type of organic growth, as opposed to trying to guess the qualifier types in advance, is exactly in the spirit of Wikipedia.
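As a rough illustration of what parsing by convention would involve on the client side, here is a sketch that reads a "YYYY-YYYY" string such as the "1558-1603" qualifier from the earlier example. The function name `parseYearRange` and the exact accepted format are assumptions made here, not an agreed convention.

```javascript
// Parse a "YYYY-YYYY" qualifier string. Returns null when the string
// does not follow the assumed convention or the range is reversed.
function parseYearRange(text) {
  const match = /^(\d{1,4})-(\d{1,4})$/.exec(text.trim());
  if (!match) return null;
  const start = Number(match[1]);
  const end = Number(match[2]);
  return start <= end ? { start, end } : null;
}

console.log(parseYearRange("1558-1603"));       // follows the convention
console.log(parseYearRange("c. 1558 to 1603")); // does not, so null
```

Any string that deviates from the convention is silently unusable to such a client, which is the trade-off this part of the thread is arguing about.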
I disagree, and fully concur with Tom: a generic string type for a datetime qualifier defeats the purpose of making Wikidata statements well-formed and machine-readable. I don't think we should enforce typing for *all* qualifiers and I second the general "organic growth" approach, but datetime qualifiers strike me as a fundamental exception. Would you represent geocoordinates as a generic string and wait for "organic growth" to determine the appropriate datatype? I appreciate the overhead of adding datatype support, but this decision will have a major impact on the shape of collaborative work on Wikidata.
Denny – on a related note, I wanted to ask you what is the priority of qualifier support relative to the other items you mentioned in your list. As I noted in my previous post, the only way for an editor to correct an outdated statement is to remove information (e.g. Lombardy: head of local government: -Roberto Formigoni +Roberto Maroni ): this information will then be lost forever in an item's revision history. The sooner we introduce basic support for qualifiers, the sooner we can avoid removing valuable information from wikidata entries just for the sake of keeping them up-to-date.
Dario
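A sketch of the difference this makes for the Lombardy example: instead of overwriting the old value, the old statement is closed and a new one is appended. The statement shape `{ value, start, end }`, the function `supersede`, and the year passed in are assumptions for illustration; the thread itself gives no dates for this change.

```javascript
// Hypothetical statement shape: end === null means "still current",
// start === null means "start not recorded".
// Returns a new list; the input is left untouched.
function supersede(statements, newValue, year) {
  const closed = statements.map((s) =>
    s.end === null ? { ...s, end: year } : s
  );
  return [...closed, { value: newValue, start: year, end: null }];
}

const headBefore = [{ value: "Roberto Formigoni", start: null, end: null }];
// The year is an illustrative parameter, not taken from the thread.
const headAfter = supersede(headBefore, "Roberto Maroni", 2013);

console.log(headAfter);
```

Both names remain on the item, so the earlier value does not have to be recovered from the revision history.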
I think it will be about the same amount of work on the client side for templates either way. When would I use a coordinate as a qualifier? I can think of plenty of places where coordinates would be used as property value datatypes, but not as qualifiers. What happens if someone was head of government twice? Do they have four datetime qualifiers, two starts and two ends? Would clients be able to assume they are listed in order, or would they need to sort them anyway just to be safe? Clients would then have to check for both string qualifiers and datetime qualifiers, so I don't think it would make using Wikidata any easier.
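One possible answer to the ordering question, sketched under the assumption that each term of office is stored as its own statement with one start and one end rather than as four loose qualifiers on a single value. The names, years, and the helper `termsByStart` are placeholders invented for this sketch.

```javascript
// Placeholder data: "Person A" held the office twice, and the
// statements arrive in no particular order.
const heads = [
  { value: "Person A", start: 2005, end: 2009 },
  { value: "Person B", start: 2001, end: 2005 },
  { value: "Person A", start: 1997, end: 2001 },
];

// A client that does not trust the stored order can sort a copy
// by start year before rendering.
function termsByStart(statements) {
  return [...statements].sort((a, b) => a.start - b.start);
}

console.log(termsByStart(heads).map((s) => `${s.value} ${s.start}-${s.end}`));
```

Numeric start values make the sort a one-liner; with free-form strings the client would first have to parse each qualifier.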
From: dtaraborelli@wikimedia.org Date: Wed, 20 Mar 2013 10:12:05 -0700 To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data
I disagree, and fully concur with Tom: a generic string type for a datetime qualifier defies the purpose of making wikidata statements well-formed and machine-readable.I don't think we should enforce typing for *all* qualifiers and I second the general "organic growth" approach, but datetime qualifiers strike me as a fundamental exception. Would you represent geocoordinates as a generic string and wait for "organic growth" to determine the appropriate datatype? I appreciate the overheads of adding datatype support, but this decision will have a major impact on the shape of collaborative work on wikidata.
Denny – on a related note, I wanted to ask you what is the priority of qualifier support relative to the other items you mentioned in your list. As I noted in my previous post, the only way for an editor to correct an outdated statement is to remove information (e.g. Lombardy: head of local government: -Roberto Formigoni +Roberto Maroni ): this information will then be lost forever in an item's revision history. The sooner we introduce basic support for qualifiers, the sooner we can avoid removing valuable information from wikidata entries just for the sake of keeping them up-to-date. Dario On Mar 15, 2013, at 10:09 AM, Michael Hale hale.michael.jr@live.com wrote:For most of the scenarios I can think of, parsing the dates out of strings that are in a standard format by convention will be much easier. The number of ways people will want to use qualifiers will increase like the number of properties and items. So the way I see it, we have to support string-based qualifiers at the minimum. Then I think we should only support strongly typed qualifiers if performance requires it. By setting an update polling frequency on templates that use the information I don't think we'll run into performance issues for most scenarios. Even with this example the qualifier type is a date range, not just a date. So do we want them to have to choose from a large, fixed list of qualifier types or just look at a similar example and set a string to something similar and then gradually enforce types on the most popular uses that we see. I think this type of organic growth as opposed to trying to guess the qualifier types in advance is exactly in the spirit of Wikipedia.
Date: Fri, 15 Mar 2013 09:58:38 -0400
From: tfmorris@gmail.com
To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Expiration date for data
On Fri, Mar 15, 2013 at 1:49 AM, Michael Hale hale.michael.jr@live.com wrote:
Yes, I think once qualifiers are enabled you would just have something like:
...
Property(head of local government)
...
Value(Elizabeth I) - Qualifier("1558-1603") - Sources()
Value(James VI and I) - Qualifier("1603-1625") - Sources()
...
...
There was a discussion about whether qualifiers should have specific datatypes other than just string, but I think we should only do that if needed.
Clearly the example that you gave is one where non-string datatypes are critically important. If you don't know that they're dates, you have no way of telling when they were in those roles.
Tom
We will have a time datatype, and every property is strongly typed. This is also true for properties used as qualifiers.
Regarding the priority of qualifiers: very high. They are the next major UI feature to be deployed, and as far as I can tell from the progress of the team it looks like they will be deployed in April.
Cheers, Denny
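To make the difference between the two positions concrete, here is a minimal sketch. It is not Wikidata's actual data model or API: the `Statement` class, its `start_year`/`end_year` fields, and both helper functions are hypothetical names chosen for illustration, and plain integer years stand in for a real time datatype. The string convention ("1558-1603") has to be parsed before anything can be asked of it, while the typed fields can be queried directly.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Statement:
    """One value of a property with typed (hypothetical) time qualifiers."""
    value: str
    start_year: int  # assumption: a year stands in for a full time value
    end_year: int


def parse_year_range(text: str) -> Tuple[int, int]:
    """Parse the string convention 'YYYY-YYYY'; raises ValueError otherwise."""
    start, end = text.split("-")
    return int(start), int(end)


def holders_in(statements: List[Statement], year: int) -> List[str]:
    """Return the values whose period includes the given year (inclusive)."""
    return [s.value for s in statements if s.start_year <= year <= s.end_year]


# The example values quoted in this thread, built from the string form.
HEADS = [
    Statement("Elizabeth I", *parse_year_range("1558-1603")),
    Statement("James VI and I", *parse_year_range("1603-1625")),
]

print(holders_in(HEADS, 1600))
```

With a free-form string qualifier, every consumer needs its own copy of something like `parse_year_range` and has to agree on the convention; with typed qualifiers that step is done once, when the data is entered.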
2013/3/20 Dario Taraborelli dtaraborelli@wikimedia.org
I disagree, and fully concur with Tom: a generic string type for a datetime qualifier defies the purpose of making wikidata statements well-formed and machine-readable. I don't think we should enforce typing for *all* qualifiers and I second the general "organic growth" approach, but datetime qualifiers strike me as a fundamental exception. Would you represent geocoordinates as a generic string and wait for "organic growth" to determine the appropriate datatype? I appreciate the overheads of adding datatype support, but this decision will have a major impact on the shape of collaborative work on wikidata.
Denny – on a related note, I wanted to ask you what is the priority of qualifier support relative to the other items you mentioned in your list. As I noted in my previous post, the only way for an editor to correct an outdated statement is to remove information (e.g. Lombardy: head of local government: -Roberto Formigoni +Roberto Maroni ): this information will then be lost forever in an item's revision history. The sooner we introduce basic support for qualifiers, the sooner we can avoid removing valuable information from wikidata entries just for the sake of keeping them up-to-date.
Dario
That seems better, constraining the overall type of a qualifier to that of any property. It still doesn't feel exactly right, but I'm not sure what would. Now that I think about it more, for the case of heads of government it doesn't seem appropriate to me to use a qualifier at all. It would just be a list of items, which are presumably people. Each of those items would then have a single date or list of dates for the start and end of their time as head of government. The qualifier would be redundant.
It seems the downside to having everything be strongly typed, as in Freebase, is that you end up with really weird and specific entity types like "government leadership timespan" to try to capture all of the details that you want, and the downside to semi-weakly typed items in Wikidata is that you might end up with different items representing the same information with different properties or qualifiers. But I have faith that Wikidata will ultimately work and achieve stability and convergence for the most common types, just as template boxes naturally emerged on Wikipedia. And I think the key advantage of Wikidata is that it will achieve growth, stability, and convergence without suffocating from having too many weird and specific item types to try to bridge and glue different types of information together.
Date: Thu, 21 Mar 2013 15:40:39 +0100 From: denny.vrandecic@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data
We will have a time datatype, and every property is strongly typed. This is also true for properties used as qualifiers. Regarding the priority of qualifiers: very high. They are the next major UI feature to be deployed, and as far as I can tell from the progress of the team it looks like they will be deployed in April.
Cheers, Denny
We do have strong types, but only a few of them: item, commons media, string, time, geo, URL. "Government leader" would not be a supported type.
The exact list and details are here: < http://meta.wikimedia.org/wiki/Wikidata/Data_model#Datatypes_and_their_Value...
Cheers, Denny
2013/3/21 Michael Hale hale.michael.jr@live.com
That seems better to constrain the overall type of a qualifier to any property. It still doesn't feel exactly right, but I'm not sure what would. Now that I think about it more, for the case of heads of government it doesn't seem appropriate to use a qualifier at all to me. It would just be a list of items which are presumably people. Each of those items would then have a single date or list of dates for start of head of government and end of head of government. The qualifier would be redundant. It seems the downside to having everything be strongly typed like in Freebase is that you end up with really weird and specific entity types like "government leadership timespan" to try to capture all of the details that you want, and the downside to semi-weakly typed items in Wikidata is that you might end up with different items representing the same information with different properties or qualifiers. But I have faith that Wikidata will ultimately work and achieve stability and convergence for the most common types just like how template boxes naturally emerged on Wikipedia. And I think the key advantage of Wikidata is that it will achieve growth, stability, and convergence without suffocating from having too many weird and specific item types to try to bridge and glue different types of information together.
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Yes, I just meant that items aren't forced to have a specific set of properties by the software, so they are essentially weakly typed, right?
Date: Thu, 21 Mar 2013 16:09:58 +0100 From: denny.vrandecic@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data
We do have strong types, but only a few of them: item, commons media, string, time, geo, URL. "Government leader" would not be a supported type.
The exact list and details are here: http://meta.wikimedia.org/wiki/Wikidata/Data_model#Datatypes_and_their_Values
Cheers, Denny
It really depends on your definitions :)
Items are strongly typed as items. Any item can have any property, and only items can have properties. Time values or geocoordinates, for example, cannot have properties.
But yes, no properties are forced onto any item, and there is no restriction on the usage of any property. See also here:
http://blog.wikimedia.de/2013/02/22/restricting-the-world/
Cheers, denny
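The typing rules described in this message can be sketched roughly as follows. This is an illustration only, not the Wikibase implementation: the class names and the `add` method are assumptions, and the datatype list is simply the one given earlier in the thread. Each property carries one datatype, any item may use any property, and only items hold statements.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Datatype(Enum):
    # The datatypes listed in this thread; names here are illustrative.
    ITEM = "item"
    COMMONS_MEDIA = "commons media"
    STRING = "string"
    TIME = "time"
    GEO = "geo"
    URL = "URL"


@dataclass(frozen=True)
class Property:
    label: str
    datatype: Datatype  # every property is strongly typed


@dataclass(frozen=True)
class Value:
    datatype: Datatype
    content: object  # no statements here: plain values cannot have properties


@dataclass
class Item:
    label: str
    statements: List[Tuple[Property, Value]] = field(default_factory=list)

    def add(self, prop: Property, value: Value) -> None:
        """Any item accepts any property, but the value must match its datatype."""
        if value.datatype is not prop.datatype:
            raise TypeError(
                f"{prop.label!r} expects {prop.datatype.value}, "
                f"got {value.datatype.value}"
            )
        self.statements.append((prop, value))


lombardy = Item("Lombardy")
maroni = Item("Roberto Maroni")
head = Property("head of local government", Datatype.ITEM)
lombardy.add(head, Value(Datatype.ITEM, maroni))
```

Nothing in the sketch says which properties an item must have, which is the "weak" part of the typing; the only check is that a value fits the datatype of the property it is attached to.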
2013/3/21 Michael Hale hale.michael.jr@live.com
Yes, I just meant that items aren't forced to have a specific set of properties by the software, so they are essentially weakly typed, right?
Great post. Regarding the suggestions feature, it seems the easiest way to get that rolling would be to have a bot that would periodically tally all of the properties for all of the items that have an "is a" property with the same value. Then we could say most of the items with "is a _" have a _. Then anytime you set an "is a" property you could automatically get suggestions for what properties to add. Guided growth without suffocation from constraints.
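A rough sketch of the tally described above, under stated assumptions: items are given as plain dictionaries from property name to value, the function names and the 50% threshold for "most" are hypothetical, and the sample items are placeholders rather than real Wikidata entries.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Items = Dict[str, Dict[str, str]]


def tally_properties(items: Items) -> Tuple[Counter, Dict[str, Counter]]:
    """Count, per "is a" value, how many items exist and which properties they use."""
    class_sizes: Counter = Counter()
    usage: Dict[str, Counter] = defaultdict(Counter)
    for props in items.values():
        kind = props.get("is a")
        if kind is None:
            continue  # items without an "is a" statement are not tallied
        class_sizes[kind] += 1
        for name in props:
            if name != "is a":
                usage[kind][name] += 1
    return class_sizes, usage


def suggest(kind: str, present: Set[str], class_sizes: Counter,
            usage: Dict[str, Counter], threshold: float = 0.5) -> List[str]:
    """Suggest properties that most items of this kind have and this item lacks."""
    total = class_sizes[kind]
    if total == 0:
        return []
    return sorted(
        name
        for name, count in usage[kind].items()
        if count / total > threshold and name not in present
    )


# Placeholder data for illustration only.
SAMPLE = {
    "item A": {"is a": "region", "head of local government": "person X", "capital": "city P"},
    "item B": {"is a": "region", "head of local government": "person Y"},
    "item C": {"is a": "region", "capital": "city Q", "anthem": "song S"},
    "item D": {"is a": "person", "date of birth": "some date"},
}

sizes, usage = tally_properties(SAMPLE)
print(suggest("region", {"capital"}, sizes, usage))
```

A bot could rerun the tally periodically and store the result, so that the lookup done when someone sets an "is a" value stays cheap.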
Date: Thu, 21 Mar 2013 16:17:55 +0100 From: denny.vrandecic@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data
It really depends on your definitions :) Items are strongly typed as items. Any item can have any property. And only items can have properties. Time or geocoordinates, e.g., can not have properties.
But yes, there is no forcing of properties onto any item, nor any restriction of usage of every property. See also here: http://blog.wikimedia.de/2013/02/22/restricting-the-world/
Cheers, denny
2013/3/21 Michael Hale hale.michael.jr@live.com
Yes, I just meant that items aren't forced to have a specific set of properties by the software, so they are essentially weakly typed, right?
Date: Thu, 21 Mar 2013 16:09:58 +0100
From: denny.vrandecic@wikimedia.de To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Expiration date for data
We do have strong types, but only a few of them: item, commons media, string, time, geo, URL. "Government leader" would not be a supported type.
The exact list and details are here: http://meta.wikimedia.org/wiki/Wikidata/Data_model#Datatypes_and_their_Values
Cheers, Denny
2013/3/21 Michael Hale hale.michael.jr@live.com
It seems better to constrain the overall type of a qualifier to any property. It still doesn't feel exactly right, but I'm not sure what would. Now that I think about it more, for the case of heads of government it doesn't seem appropriate to me to use a qualifier at all. It would just be a list of items, which are presumably people. Each of those items would then have a single date or list of dates for the start and end of their time as head of government. The qualifier would be redundant. It seems the downside to having everything be strongly typed, as in Freebase, is that you end up with really weird and specific entity types like "government leadership timespan" to try to capture all of the details that you want, and the downside to semi-weakly typed items in Wikidata is that you might end up with different items representing the same information with different properties or qualifiers. But I have faith that Wikidata will ultimately work and achieve stability and convergence for the most common types, just as template boxes naturally emerged on Wikipedia. And I think the key advantage of Wikidata is that it will achieve growth, stability, and convergence without suffocating from having too many weird and specific item types to try to bridge and glue different types of information together.
Date: Thu, 21 Mar 2013 15:40:39 +0100 From: denny.vrandecic@wikimedia.de To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Expiration date for data
We will have a time datatype, and every property is strongly typed. This is also true for properties used as qualifiers. Regarding the priority of qualifiers: very high. They are the next major UI feature to be deployed, and as far as I can tell from the progress of the team it looks like they will be deployed in April.
Cheers, Denny
2013/3/20 Dario Taraborelli dtaraborelli@wikimedia.org
I disagree, and fully concur with Tom: a generic string type for a datetime qualifier defeats the purpose of making wikidata statements well-formed and machine-readable.
I don't think we should enforce typing for *all* qualifiers and I second the general "organic growth" approach, but datetime qualifiers strike me as a fundamental exception. Would you represent geocoordinates as a generic string and wait for "organic growth" to determine the appropriate datatype? I appreciate the overheads of adding datatype support, but this decision will have a major impact on the shape of collaborative work on wikidata.
Denny – on a related note, I wanted to ask you what is the priority of qualifier support relative to the other items you mentioned in your list. As I noted in my previous post, the only way for an editor to correct an outdated statement is to remove information (e.g. Lombardy: head of local government: -Roberto Formigoni +Roberto Maroni ): this information will then be lost forever in an item's revision history. The sooner we introduce basic support for qualifiers, the sooner we can avoid removing valuable information from wikidata entries just for the sake of keeping them up-to-date.
Dario
On Mar 15, 2013, at 10:09 AM, Michael Hale hale.michael.jr@live.com wrote:
For most of the scenarios I can think of, parsing the dates out of strings that are in a standard format by convention will be much easier. The number of ways people will want to use qualifiers will increase like the number of properties and items. So the way I see it, we have to support string-based qualifiers at the minimum. Then I think we should only support strongly typed qualifiers if performance requires it. By setting an update polling frequency on templates that use the information I don't think we'll run into performance issues for most scenarios. Even with this example the qualifier type is a date range, not just a date. So do we want them to have to choose from a large, fixed list of qualifier types, or just look at a similar example, set a string to something similar, and then gradually enforce types on the most popular uses that we see? I think this type of organic growth, as opposed to trying to guess the qualifier types in advance, is exactly in the spirit of Wikipedia.
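To make the trade-off concrete, here is a minimal sketch of what "parsing the dates out of strings by convention" could look like. The `YearRange` type, the `parse_year_range` helper, and the "START-END" convention itself are assumptions for illustration; nothing here is part of Wikidata.

```python
import re
from typing import NamedTuple, Optional


class YearRange(NamedTuple):
    start: int
    end: int

    def contains(self, year: int) -> bool:
        return self.start <= year <= self.end


# Assumed convention: two years separated by a hyphen, e.g. "1558-1603".
_RANGE = re.compile(r"^\s*(\d{1,4})\s*-\s*(\d{1,4})\s*$")


def parse_year_range(text: str) -> Optional[YearRange]:
    """Parse "1558-1603" into a YearRange; return None if the convention is not followed."""
    match = _RANGE.match(text)
    if match is None:
        return None
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        return None  # reversed ranges are treated as malformed
    return YearRange(start, end)


print(parse_year_range("1558-1603"))          # YearRange(start=1558, end=1603)
print(parse_year_range("late 16th century"))  # None
```

The second call shows the cost of the string approach: any value that does not follow the convention is only detected when a consumer tries to parse it, not when it is entered.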
Date: Fri, 15 Mar 2013 09:58:38 -0400 From: tfmorris@gmail.com To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Expiration date for data
On Fri, Mar 15, 2013 at 1:49 AM, Michael Hale hale.michael.jr@live.com wrote:
Yes, I think once qualifiers are enabled you would just have something like:
...
Property(head of local government)
...
Value(Elizabeth I) - Qualifier("1558-1603") - Sources()
Value(James VI and I) - Qualifier("1603-1625") - Sources()
...
There was a discussion about whether qualifiers should have specific datatypes other than just string, but I think we should only do that if needed.
Clearly the example that you gave is one where non-string datatypes are critically important. If you don't know that they're dates, you have no way of telling when they were in those roles.
Tom
On 21 March 2013 15:39, Michael Hale hale.michael.jr@live.com wrote:
Great post. Regarding the suggestions feature, it seems the easiest way to get that rolling would be to have a bot that would periodically tally all of the properties for all of the items that have an "is a" property with the same value. Then we could say most of the items with "is a _" have a _. Then anytime you set an "is a" property you could automatically get suggestions for what properties to add. Guided growth without suffocation from constraints.
And conversely, advice on unusual combinations: "Only 1 of the 54,687 items with "population" values also have a value for "flavour". Did you mean to include this?"
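The converse check could be sketched along these lines. The `items` mapping, the `flag_unusual` name, and the 1% cut-off (`max_share`) are all assumptions chosen for the example.

```python
def flag_unusual(items, base, extra, max_share=0.01):
    """Return a warning if very few items with `base` also have `extra`.

    `items` is a hypothetical mapping: item id -> {property name: value}.
    Returns None when the combination is absent or common enough.
    """
    with_base = [item_id for item_id, props in items.items() if base in props]
    with_both = [item_id for item_id in with_base if extra in items[item_id]]
    if not with_base or not with_both:
        return None
    if len(with_both) / len(with_base) > max_share:
        return None
    return (
        f'Only {len(with_both)} of the {len(with_base):,} items with '
        f'"{base}" values also have a value for "{extra}". '
        "Did you mean to include this?"
    )


# Made-up data: 200 items with a population, one of which also has a flavour.
items = {f"Q{i}": {"population": i} for i in range(200)}
items["Q0"]["flavour"] = "sweet"
print(flag_unusual(items, "population", "flavour"))
```

Such a check would only ever produce a hint for the editor; it would not block the edit.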
Sure, there are lots of types of suggestions one can imagine. We could draw attention to potential mistakes, suggest common qualifiers for specific properties, etc. I have narrowed my confusion about the current proposal for implementing qualifiers. I do think making qualifiers be a list of properties is the correct way to go. So for our common example of historic population numbers we will use a property like "publication date" of a work? Because the report is a work that has a publication date, and that date is the qualifier for the population property of the city item. However, I do think the data model primer currently gives an example of a qualifier that should just be data included in properties on an item. It refers to the election date and party of Angela Merkel as qualifiers, but that information is actual knowledge with references in the encyclopedia, so it should be included as properties of the Angela Merkel item, not as qualifiers for a value of the head of state property of the Germany item. Then things like including "as of" or "since" for population dates will be handled by the inclusion syntax in the articles, correct?
From: andrew.gray@dunelm.org.uk Date: Thu, 21 Mar 2013 17:34:05 +0000 To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data
On 21 March 2013 15:39, Michael Hale hale.michael.jr@live.com wrote:
Great post. Regarding the suggestions feature, it seems the easiest way to get that rolling would be to have a bot that would periodically tally all of the properties for all of the items that have an "is a" property with the same value. Then we could say most of the items with "is a _" have a _. Then anytime you set an "is a" property you could automatically get suggestions for what properties to add. Guided growth without suffocation from constraints.
And conversely, advice on unusual combinations: "Only 1 of the 54,687 items with "population" values also have a value for "flavour". Did you mean to include this?"
--
- Andrew Gray andrew.gray@dunelm.org.uk
-----Original Message----- From: denny.vrandecic@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data
We will have a time datatype, and every property is strongly typed. This is also true for properties used as qualifiers.
Regarding the priority of qualifiers: very high. They are the next major UI feature to be deployed, and as far as I can tell from the progress of the team it looks like they will be deployed in April.
Cheers, Denny
"every property is strongly typed" is clear, as you refer to datatypes, but not to "types" in the rdf:type sense, which of course are owl:Class things. When talking about owl:Class things it's nice to reference an OWL ontology, which maybe I've missed, from Wikidata. So I've learned from sniffing around that Wikidata's ontology is going in the direction of http://d-nb.info/standards/elementset/gnd. And that P107 is semantically identical to rdf:type.
Is this correct? If so, will you add "rdf:type" as an alternate label?
Although I disagree with some of the text below, which I ran across at https://news.ycombinator.com/item?id=5328472, I do note that P107 on Wikidata is entitled "main type (GND)" among other indications. How does adopting a specific ontology accord with the view in your blog strenuously promoting folksonomies over ontologies?
That is, a folksonomy in the sense that owl:Class things are implicitly defined, with "instances" associated as a "class" by virtue of possessing certain properties and/or property values in common. IOW, your blog implies little need to define "classes" at all. You face a challenge though, because soon people will want to attach a name to the bundle of properties and/or property values that comprise a "class" of things, to refer to them as a collection.
Any light you can shed about ontology plans for Wikidata would be appreciated! thanks - john
--------------------------------------------------------------------------------
emw 15 days ago | link
Property P107 (http://www.wikidata.org/wiki/Property:P107) has emerged as Wikidata's de facto upper ontology. It currently consists of six main types: person, organization, event, creative work, term, and geographical feature. It's essentially a clean port of the high-level entities from the GND Ontology -- a controlled vocabulary developed by the German National Library and released last summer (http://d-nb.info/standards/elementset/gnd). There's a fair amount of debate over that property. Are those current high level types (person, place, work, event, organization, term) a good fit for a knowledgebase that aims to structure all knowledge and not just library holdings? Does classifying subjects like inertia, DNA, Alzheimer's disease, dog, etc. as simply "terms" make sense?
More reading related to Wikidata, ontology and types: https://blog.wikimedia.de/2013/02/22/restricting-the-world/.
This has also been aired in other discussions. Outdated entries can be something that is only valid within a set timeframe, but they can also depend on something else. One special case is when an external source no longer supports a specific statement.
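For the "valid within a set timeframe" case, a rough sketch of how superseded values could be kept and filtered by date rather than deleted. The `Statement` class, its fields, and the sample values are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Statement:
    value: str
    start: date
    end: Optional[date] = None  # None means the statement is still current

    def valid_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


def current_values(statements: List[Statement], day: date) -> List[str]:
    """Return the values of all statements valid on `day`."""
    return [s.value for s in statements if s.valid_on(day)]


# Placeholder values and dates, not real data.
head_of_government = [
    Statement("leader A", date(2000, 1, 1), date(2009, 12, 31)),
    Statement("leader B", date(2010, 1, 1)),
]
print(current_values(head_of_government, date(2005, 6, 1)))  # ['leader A']
print(current_values(head_of_government, date(2012, 1, 1)))  # ['leader B']
```

The other cases mentioned above, such as a source that no longer supports a statement, are not time-based and would need a different marker than a date range.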
On Sun, Sep 30, 2012 at 11:59 AM, Marco Fleckinger marco.fleckinger@gmail.com wrote:
Hi,
regarding an actual topic in Germany about publication of the timetable-data of Deutsche Bahn (German national railway company) and their willingness of a discussion with other Open-Data-Supporters it may be a good idea of providing an expiration dates for Wikidata-records.
In their open letter to Mr. Kreil [1] they announced that it may cause problems providing the timetable-data in an open way if e.g. anybody uses old data.
Marco
[1] http://www.db-vertrieb.com/db_vertrieb/view/service/open_plan_b.shtml
A topic I've been involved in recently regards statistics for gun violence in the US. The government publishes a big report every year, but it takes them most of the year to collect the information from all of the local police agencies and compile the results. Several English Wikipedia articles use this information, and it would be awesome if the tables in the articles could be generated automatically from data in Wikidata. It seems like ideally I would have some code I would run whenever they release the new report that would automatically import all of the data into Wikidata and add the appropriate references. I suppose the information would go in the item for each city. Say for the Atlanta item, there would be a statement for murders and the value would be a number and the qualifier for these statements would just be "2011" or whatever. Then I would want to be able to have a template that automatically makes a table to show the 5 most recent years somewhere in the Atlanta article for example.
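A rough sketch of the "table of the 5 most recent years" step, assuming the statements for one item have already been fetched as (year qualifier, value) pairs. The function names and the sample numbers are placeholders, not real statistics or an existing template API.

```python
def recent_rows(statements, count=5):
    """Return the `count` most recent (year, value) pairs, newest first."""
    by_year = {}
    for year, value in statements:
        by_year[int(year)] = value  # a later entry for the same year wins
    return sorted(by_year.items(), reverse=True)[:count]


def render_table(rows, header=("Year", "Murders")):
    """Render rows as a wikitext table."""
    lines = ['{| class="wikitable"', f"! {header[0]} !! {header[1]}"]
    for year, value in rows:
        lines.append("|-")
        lines.append(f"| {year} || {value}")
    lines.append("|}")
    return "\n".join(lines)


# Placeholder values keyed by a string year qualifier such as "2011".
statements = [
    ("2005", 1), ("2006", 2), ("2007", 3), ("2008", 4),
    ("2009", 5), ("2010", 6), ("2011", 7),
]
print(render_table(recent_rows(statements)))
```

Because the years are sorted at render time, importing a new annual report would only require adding one more statement per city; the table would pick it up without further edits.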
Date: Thu, 14 Mar 2013 16:37:01 +0100 From: jeblad@gmail.com To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data
This has also been aired in other discussions. Outdated entries can be something that is only valid within a set timeframe, but they can also depend on something else. One special case is when an external source no longer supports a specific statement.
On Sun, Sep 30, 2012 at 11:59 AM, Marco Fleckinger marco.fleckinger@gmail.com wrote:
Hi,
regarding an actual topic in Germany about publication of the timetable-data of Deutsche Bahn (German national railway company) and their willingness of a discussion with other Open-Data-Supporters it may be a good idea of providing an expiration dates for Wikidata-records.
In their open letter to Mr. Kreil [1] they announced that it may cause problems providing the timetable-data in an open way if e.g. anybody uses old data.
Marco
[1] http://www.db-vertrieb.com/db_vertrieb/view/service/open_plan_b.shtml
On 2013-03-14 19:38, Michael Hale wrote:
A topic I've been involved in recently regards statistics for gun violence in the US. The government publishes a big report every year, but it takes them most of the year to collect the information from all of the local police agencies and compile the results. Several English Wikipedia articles use this information, and it would be awesome if the tables in the articles could be generated automatically from data in Wikidata. It seems like ideally I would have some code I would run whenever they release the new report that would automatically import all of the data into Wikidata and add the appropriate references. I suppose the information would go in the item for each city. Say for the Atlanta item, there would be a statement for murders and the value would be a number and the qualifier for these statements would just be "2011" or whatever. Then I would want to be able to have a template that automatically makes a table to show the 5 most recent years somewhere in the Atlanta article for example.
That's great. Now we should also provide, with each piece of statistically generated information, an explanation of how it was interpreted. Numbers aren't as objective as one may believe, so we should take care to explain the methodologies we use.
kind regards, mathieu
Date: Thu, 14 Mar 2013 16:37:01 +0100 From: jeblad@gmail.com To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data
This has also been aired in other discussions. Outdated entries can be something that is only valid within a set timeframe, but they can also depend on something else. One special case is when an external source no longer supports a specific statement.
On Sun, Sep 30, 2012 at 11:59 AM, Marco Fleckinger marco.fleckinger@gmail.com wrote:
Hi,
regarding an actual topic in Germany about publication of the timetable-data of Deutsche Bahn (German national railway company) and their willingness of a discussion with other Open-Data-Supporters it may be a good idea of providing an expiration dates for Wikidata-records.
In their open letter to Mr. Kreil [1] they announced that it may cause problems providing the timetable-data in an open way if e.g. anybody uses old data.
Marco
[1] http://www.db-vertrieb.com/db_vertrieb/view/service/open_plan_b.shtml
Many organizations that collect statistics on a regular basis document their actual datatype, their methods for collecting the statistics, how they process the statistics, how they present them, and how they validate and invalidate them. It's often just called the "metadata", but the different parts are quite distinct and not really metadata. It's also worth noting that specific entries can change role, and in some cases data used in some role becomes metadata. At Statistics Norway the information about how the data was collected was called the "vaskeseddel" ("laundry note" http://en.wikipedia.org/wiki/Laundry_symbol). I think we should find some simplified method to collect this information and make it available.
John
On Fri, Mar 15, 2013 at 8:49 AM, Mathieu Stumpf psychoslave@culture-libre.org wrote:
On 2013-03-14 19:38, Michael Hale wrote:
A topic I've been involved in recently regards statistics for gun violence in the US. The government publishes a big report every year, but it takes them most of the year to collect the information from all of the local police agencies and compile the results. Several English Wikipedia articles use this information, and it would be awesome if the tables in the articles could be generated automatically from data in Wikidata. It seems like ideally I would have some code I would run whenever they release the new report that would automatically import all of the data into Wikidata and add the appropriate references. I suppose the information would go in the item for each city. Say for the Atlanta item, there would be a statement for murders and the value would be a number and the qualifier for these statements would just be "2011" or whatever. Then I would want to be able to have a template that automatically makes a table to show the 5 most recent years somewhere in the Atlanta article for example.
That's great. Now we should also provide, with each piece of statistically generated information, an explanation of how it was interpreted. Numbers aren't as objective as one may believe, so we should take care to explain the methodologies we use.
kind regards, mathieu
Date: Thu, 14 Mar 2013 16:37:01 +0100 From: jeblad@gmail.com To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data
This has also been aired in other discussions. Outdated entries can be something that is only valid within a set timeframe, but they can also depend on something else. One special case is when an external source no longer supports a specific statement.
On Sun, Sep 30, 2012 at 11:59 AM, Marco Fleckinger marco.fleckinger@gmail.com wrote:
Hi,
regarding an actual topic in Germany about publication of the timetable-data of Deutsche Bahn (German national railway company) and their willingness of a discussion with other Open-Data-Supporters it may be a good idea of providing an expiration dates for Wikidata-records.
In their open letter to Mr. Kreil [1] they announced that it may cause problems providing the timetable-data in an open way if e.g. anybody uses old data.
Marco
[1] http://www.db-vertrieb.com/db_vertrieb/view/service/open_plan_b.shtml
-- Association Culture-Libre http://www.culture-libre.org/
2013/3/15 Mathieu Stumpf psychoslave@culture-libre.org
On 2013-03-14 19:38, Michael Hale wrote:
A topic I've been involved in recently regards statistics for gun violence in the US. The government publishes a big report every year, but it takes them most of the year to collect the information from all of the local police agencies and compile the results. Several English Wikipedia articles use this information, and it would be awesome if the tables in the articles could be generated automatically from data in Wikidata. It seems like ideally I would have some code I would run whenever they release the new report that would automatically import all of the data into Wikidata and add the appropriate references. I suppose the information would go in the item for each city. Say for the Atlanta item, there would be a statement for murders and the value would be a number and the qualifier for these statements would just be "2011" or whatever. Then I would want to be able to have a template that automatically makes a table to show the 5 most recent years somewhere in the Atlanta article for example.
That's great. Now we should also provide, with each piece of statistically generated information, an explanation of how it was interpreted. Numbers aren't as objective as one may believe, so we should take care to explain the methodologies we use.
kind regards, mathieu
Good point. Is there any thought yet on where to put this information? Wikidata does not really seem to be the right place for it, and neither does Wikipedia in the general case. Maybe Commons?
Until now I thought that the "source" of a claim was supposed to point to whoever collected the data in the first place. Could it be something else, like a URL pointing to a file describing how the data was collected and interpreted?
Thomas
On 2013-03-15 13:16, Thomas Douillard wrote:
2013/3/15 Mathieu Stumpf psychoslave@culture-libre.org
On 2013-03-14 19:38, Michael Hale wrote:
A topic I've been involved in recently regards statistics for gun violence in the US. The government publishes a big report every year, but it takes them most of the year to collect the information from all of the local police agencies and compile the results. Several English Wikipedia articles use this information, and it would be awesome if the tables in the articles could be generated automatically from data in Wikidata. It seems like ideally I would have some code I would run whenever they release the new report that would automatically import all of the data into Wikidata and add the appropriate references. I suppose the information would go in the item for each city. Say for the Atlanta item, there would be a statement for murders and the value would be a number and the qualifier for these statements would just be "2011" or whatever. Then I would want to be able to have a template that automatically makes a table to show the 5 most recent years somewhere in the Atlanta article for example.
That's great. Now we should also provide, with each piece of statistically generated information, an explanation of how it was interpreted. Numbers aren't as objective as one may believe, so we should take care to explain the methodologies we use.
kind regards, mathieu
Good point. Is there any thought yet on where to put this information? Wikidata does not really seem to be the right place for it, and neither does Wikipedia in the general case. Maybe Commons?
I think that's yet another case where a project like Wikikultur [1] would be interesting. As interpretation methodologies could possibly be novel works, both the description and critiques of them could be published as separate essays on such a project.
[1] https://meta.wikimedia.org/wiki/Wikikultur
Until now I thought that the "source" of a claim was supposed to point to whoever collected the data in the first place. Could it be something else, like a URL pointing to a file describing how the data was collected and interpreted?
Thomas