Hi!
I have lately been looking into the use of novalue in Wikidata, specifically in qualifiers and references. While the use of novalue in property values is pretty clear to me, I am not sure it is as useful in qualifiers and refs.
Example:
https://www.wikidata.org/wiki/Q62#P6
As we can see, Edwin Mah Lee is the mayor of San Francisco, with end date set to "novalue". I wonder how useful this is - most entries like this just omit the end date, and if we query this in SPARQL, for example, we would do something like "FILTER NOT EXISTS (?statement q:P582 ?enddate)". Inconsistently having novalues there makes it harder to process both visually (instead of just looking for a statement having no end date, we need to look for either no end date or an end date with the specific "novalue") and automatically. And in the overwhelming majority of cases I feel "novalue" and absence of a value model exactly the same fact - it is a current event, etc. Is there any useful case for using novalue there?
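To make the processing cost concrete, here is a minimal Python sketch. It assumes statements in the Wikibase JSON layout, where each qualifier snak carries a `snaktype` of `value`, `somevalue`, or `novalue`. The helper names `end_date_state` and `is_current` and the three sample statements are made up for illustration and are not part of any Wikidata tool.

```python
from typing import Any, Dict

END_TIME = "P582"  # the "end time" qualifier discussed above


def end_date_state(statement: Dict[str, Any]) -> str:
    """Return 'absent', 'novalue', 'somevalue', or 'value' for the end date."""
    snaks = statement.get("qualifiers", {}).get(END_TIME, [])
    if not snaks:
        return "absent"
    # Assumption: only the first end-date snak matters for this illustration.
    return snaks[0]["snaktype"]


def is_current(statement: Dict[str, Any]) -> bool:
    """Treat a missing end date and an explicit novalue as the same fact."""
    return end_date_state(statement) in ("absent", "novalue")


# Hypothetical statements, reduced to the fields used above.
omitted = {"qualifiers": {"P580": [{"snaktype": "value"}]}}
explicit = {"qualifiers": {"P582": [{"snaktype": "novalue"}]}}
ended = {"qualifiers": {"P582": [{"snaktype": "value"}]}}

print(is_current(omitted), is_current(explicit), is_current(ended))
```

A consumer that only checks for the absence of P582 would classify `explicit` as ended, which is the inconsistency described above.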
Another example: https://www.wikidata.org/wiki/Q2866#P569
Here we have a reference with "stated in":"no value". I don't think I understand what it means - not stated anywhere? How would we know to make such a claim? Is it a lie? Why would we keep confirmed lies in the data? Does it not have a confirmed source that we know of? Many things do not, so why would we have "stated in" in this particular case? In summary, it is unclear to me that novalue in references is ever useful.
To quantify this, we do not have a lot of such things: on the partial dump I'm working with for WDQS (which contains at least half of the DB) there are 14 novalue refs and 13 properties using novalue as qualifier, leader being P582 with 200+ uses, and overall 422 uses. So volume-wise it's not a big deal but I'd like to figure out what's the right thing to do here and establish some guidelines.
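A count like the one above could be reproduced roughly as follows. This is only a sketch over entities in Wikibase JSON form; the figures quoted above came from the dump loaded for WDQS, not from this code. The function name `count_novalue_qualifiers` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_novalue_qualifiers(entities: Iterable[Dict[str, Any]]) -> Counter:
    """Count qualifier snaks with snaktype 'novalue', keyed by property ID."""
    counts: Counter = Counter()
    for entity in entities:
        for statements in entity.get("claims", {}).values():
            for statement in statements:
                for prop, snaks in statement.get("qualifiers", {}).items():
                    n = sum(1 for s in snaks if s.get("snaktype") == "novalue")
                    if n:  # avoid creating zero-count entries
                        counts[prop] += n
    return counts


# Made-up sample entities, reduced to the fields the function reads.
sample = [
    {"claims": {"P6": [
        {"qualifiers": {"P580": [{"snaktype": "value"}],
                        "P582": [{"snaktype": "novalue"}]}},
        {"qualifiers": {"P582": [{"snaktype": "value"}]}},
    ]}},
    {"claims": {"P35": [
        {"qualifiers": {"P582": [{"snaktype": "novalue"}]}},
        {},  # statement without qualifiers
    ]}},
]

for prop, n in count_novalue_qualifiers(sample).most_common():
    print(prop, n)
```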
Thanks,
Actually I think that having "no value" for the end date qualifier probably means that it has not ended yet. There is no other way to express whether this information is currently merely incomplete (i.e. it has ended, but no one bothered to fill it in) or not (i.e. it has not ended yet). This is pretty much the same use case as for normal claims.
Other qualifiers I could imagine where an explicit "no value" would make sense is P678, I guess.
In references it might make sense to state explicitly that the source does not have an issue number or an ISSN, etc., in order for example to allow cleanup of references and to mark the cases where a reference does not have a given value from those cases where it is merely incomplete.
I don't have superstrong arguments as you see (I would have much stronger arguments for "unknown value"), but I would prefer not to forbid "no value" in those cases explicitly, because it might be useful and it is already there.
[1] https://www.wikidata.org/wiki/Special:WhatLinksHere/Q18615010
On Thu, Apr 23, 2015 at 1:27 PM, Stas Malyshev smalyshev@wikimedia.org wrote:
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Hoi, When someone bothers to add a qualifier, it is most likely that it will be done completely and well. It is NOT something that is automated. So a date missing means exactly that. There is no known end date.
This notion that "no value" has value is fine, except that it breaks tools that rely on the absence of a date. One example is the tool to find people who are known to be dead in Wikipedia but who do not have a date of death in Wikidata....
So REALLY unconvincing. Thanks, GerardM
On 26 April 2015 at 00:52, Denny Vrandečić vrandecic@gmail.com wrote:
Hi!
Actually I think that having "no value" for the end date qualifier probably means that it has not ended yet. There is no other way to
But that's what a missing end date also means, in 99% of cases where there's a start date and no end date. Take https://www.wikidata.org/wiki/Q30#P35 - does it say that we have no idea whether Barack Obama is still the US president (same for P6) and that nobody bothered to check? I don't think so. Maybe that was the original idea, but are we going to go and fix all start/end pairs now and add novalues to them? Are people editing Wikidata even aware this is what they should be doing - if it is what they should be doing? I think the common usage and the editor's intent would, in 99% of cases, be that a start date with no end date means a current event, not "we have no idea if it's still current or not". At least for something like P582. I admit that for some others the meaning may be different - e.g., if there's neither P580 nor P582, the above reasoning does not apply. But then we by default assume it's current (unless it has P585), so the outcome is essentially the same.
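The default reading described in this paragraph can be written down as a small rule set. The sketch below is one possible encoding, not an agreed guideline. It takes the set of qualifier properties that carry an actual value on a statement (so a novalue end date is treated like a missing one), and the function name `assumed_current` is hypothetical.

```python
from typing import Set

START, END, POINT_IN_TIME = "P580", "P582", "P585"


def assumed_current(valued_qualifiers: Set[str]) -> bool:
    """Guess whether a statement describes a current fact.

    `valued_qualifiers` holds the qualifier property IDs that have a real
    value; novalue qualifiers are assumed to be left out by the caller.
    """
    if END in valued_qualifiers:
        return False  # an end date is recorded, so the fact has ended
    if START in valued_qualifiers:
        return True   # start date without end date: read as still current
    if POINT_IN_TIME in valued_qualifiers:
        return False  # a dated snapshot, not assumed to hold now
    return True       # no temporal qualifiers at all: current by default


print(assumed_current({"P580"}), assumed_current({"P580", "P582"}))
```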
Other qualifiers I could imagine where an explicit "no value" would make sense is P678, I guess.
OK, here I don't know much about what it means, so I just accept your point.
In references it might make sense to state explicitly that the source does not have an issue number or an ISSN, etc., in order for example to allow cleanup of references and to mark the cases where a reference does not have a given value from those cases where it is merely incomplete.
Here, though, it's the same as above - usually when you add something that is expected to have an issue number but it's not there, it's either somevalue (meaning we don't know what the issue is, but there was an issue) or it's somehow the exception and has no issue. The only actual usage of novalue I have found in refs so far was a confused usage of refs instead of qualifiers (pretty soon - in a couple of weeks or so - we'll have a full recent dump loaded on the lab machine and could answer this with real certainty; right now it's about 80% certainty :).
I don't have superstrong arguments as you see (I would have much stronger arguments for "unknown value"), but I would prefer not to forbid "no value" in those cases explicitly, because it might be useful and it is already there.
For qualifiers, I can now see there might be cases where it is useful, but still not for references. But I think that instead of forbidding it as such, we could have a guideline on what is considered the Right Thing, and then if there's an exception the editors can use their own judgement.
What about people who were born in the 18th century? We know they are dead, but their death is not recorded and we only know when they were last active. How do you set that end date?
On Sun, Apr 26, 2015 at 8:36 AM, Stas Malyshev smalyshev@wikimedia.org wrote:
Hoi, There are two ways of doing that. You can infer, from the date of birth and an average lifespan, in what century someone died, and specify that; or you can state the date of death as unknown. Now that IS a valid way of doing this. However, it does not mean that 17th century people did not die. It is therefore relatively useless. Thanks, GerardM
On 26 April 2015 at 08:42, Jane Darnell jane023@gmail.com wrote:
Hoi, It would make sense to have a bot run and add dates of novalue for dob dod where we know that people must be dead. Thanks, GerardM
On 26 April 2015 at 08:54, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hi!
It would make sense to have a bot run and add dates of novalue for dob dod where we know that people must be dead.
That would actually be the opposite of what we want, since novalue would mean they were not born and are not dead. I think you meant "unknown" for date of death, in which case it does make sense.
For the unknown date case, I have also used imprecise dates in the past: if you set a date with century precision around the last time the person was known to be active, for example, you get something semantically correct that is probably easier to handle in queries (although how the query engine should handle imprecise or overlapping date intervals in date comparisons is probably not known yet :) I'm curious to know).
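One way such comparisons might work is to expand each imprecise date into the interval of years it covers and compare intervals. This is a sketch only, not how any query engine is known to behave. It uses the Wikibase precision codes 9 (year), 8 (decade) and 7 (century), handles positive years only, and assumes centuries run from year 1 to year 100 (e.g. 1901 to 2000), which is itself a convention open to debate.

```python
from typing import Tuple

YEAR, DECADE, CENTURY = 9, 8, 7  # Wikibase time precision codes


def year_interval(year: int, precision: int) -> Tuple[int, int]:
    """Inclusive range of years covered by a (year, precision) pair."""
    if precision == YEAR:
        return (year, year)
    if precision == DECADE:
        start = year - year % 10
        return (start, start + 9)
    if precision == CENTURY:
        start = ((year - 1) // 100) * 100 + 1  # assumed: 1901..2000 style
        return (start, start + 99)
    raise ValueError(f"unsupported precision: {precision}")


def definitely_before(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True only if every year in `a` precedes every year in `b`."""
    return a[1] < b[0]


def may_overlap(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if the two intervals share at least one year."""
    return a[0] <= b[1] and b[0] <= a[1]


last_active = year_interval(1750, CENTURY)
print(last_active, definitely_before(last_active, year_interval(1969, YEAR)))
```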
2015-04-26 9:29 GMT+02:00 Stas Malyshev smalyshev@wikimedia.org:
Could you not add the last active date as a qualifier to the somevalue death date?
In general, uncertainty in dates is not so easily entered. Born 1969 or 1970 cannot be entered as 1969 with decade precision, since that becomes the 1960s (at least that is what is shown to readers), so the only legitimate way of entering it is 20th century (bringing the uncertainty from 2 to 100 years).
In general being able to model dates as between X and Y (as for numbers) would be nice.
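The 1969/1970 case can be checked mechanically. The sketch below picks the finest single precision whose bucket contains both ends of a year range, using the Wikibase precision codes 9 (year), 8 (decade), 7 (century) and 6 (millennium). The function name `finest_common_precision` is hypothetical, and the century and millennium boundaries (1901 to 2000, 1001 to 2000) are an assumed convention.

```python
YEAR, DECADE, CENTURY, MILLENNIUM = 9, 8, 7, 6  # Wikibase precision codes


def finest_common_precision(first_year: int, last_year: int) -> int:
    """Finest precision whose single bucket covers both years (CE only)."""
    if first_year == last_year:
        return YEAR
    if first_year // 10 == last_year // 10:
        return DECADE
    if (first_year - 1) // 100 == (last_year - 1) // 100:
        return CENTURY
    if (first_year - 1) // 1000 == (last_year - 1) // 1000:
        return MILLENNIUM
    raise ValueError("range spans more than one millennium")


# 1969 and 1970 fall in different decades, so century is the finest fit.
print(finest_common_precision(1969, 1970))
```

A two-year range thus widens to a hundred-year bucket, which is the loss that a "between X and Y" model would avoid.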
Um.. sorry for the sidetrack from somevalue which sidetracked from the novalue discussion.
/André
André Costa, GLAM-tekniker, Wikimedia Sverige
On 26 Apr 2015 10:23, "Thomas Douillard" thomas.douillard@gmail.com wrote:
Quick reply to Denny and Gerard:
@Denny: I think it makes sense to treat qualifiers under a closed-world semantics. That is: what is not there can safely be assumed to be false. In this I agree with Gerard. OTOH, I don't think it hurts very much to add them anyway.
@Gerard: Please note that the use of novalue in qualifiers does not have any negative effects on tools that rely on the value not being there. We do not encode "novalue" as a special value, so tools that search for some arbitrary value will never find novalues (on any level: statement, qualifier, reference). So, overall, it is not such a big deal if people add novalue qualifiers in some places. Only tool developers who create their own query services (not based on our RDF exports) must be aware that it would not be a good idea to treat "novalue" like a value internally. But that's a very small and rather competent group :-)
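For tool developers working outside the RDF exports, the same distinction can be sketched against Wikibase JSON, where a statement's `mainsnak` has a `snaktype` and only `value` snaks carry a `datavalue`. The helper names and the sample entity below are made up for illustration.

```python
from typing import Any, Dict, List


def has_statement(entity: Dict[str, Any], prop: str) -> bool:
    """Naive check: true for any statement, including novalue ones."""
    return bool(entity.get("claims", {}).get(prop))


def values_of(entity: Dict[str, Any], prop: str) -> List[Any]:
    """Return only real values, skipping novalue and somevalue snaks."""
    return [
        st["mainsnak"]["datavalue"]["value"]
        for st in entity.get("claims", {}).get(prop, [])
        if st["mainsnak"]["snaktype"] == "value"
    ]


# Hypothetical entity with an explicit "no value" date of death (P570).
entity = {"claims": {"P570": [{"mainsnak": {"snaktype": "novalue",
                                            "property": "P570"}}]}}

print(has_statement(entity, "P570"), values_of(entity, "P570"))
```

A tool built on the first check would count the novalue statement as a hit, while one built on the second would never see it.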
Anyway, even if we generally agree that "not stated" means "not true" on the level of qualifiers, there could be cases where the explicit "novalue" could be valuable as documentation for other human users.
Regards,
Markus
Hoi, I regularly query for, for instance, claim[31], i.e. any instance of whatever... I would also query for the existence of a date of death in a similar way. For me a claim with a "whatever it is that says that there is no value" would be a positive result and I would not consider it for any processing.
It is fine that RDF does whatever; it is not what we use on a day to day basis with Wikidata. Consequently, what RDF does has no practical implication for me. Thanks, GerardM
On 26 April 2015 at 19:54, Markus Krötzsch markus@semantic-mediawiki.org wrote:
Quick reply to Denny and Gerard:
@Denny: I think it makes sense to treat qualifiers under a closed-world semantics. That is: what is not there can safely be assumed to be false. In this I agree with Gerard. OTOH, I don't think it hurts very much to add them anyway.
@Gerard: Please note that the use of novalue in qualifiers does not have any negative effects on tools that rely on the value not being there. We do not encode "novalue" as a special value, so tools that search for some arbitrary value will never find novalues (on any level: statement, qualifier, reference). So, overall, it is not such a big deal if people add novalue qualifiers in some places. Only tool developers who create own query services (not based on our RDF exports) must be aware that it would not be a good idea to treat "novalue" like a value internally. But that's a very small and rather competent group :-)
Anyway, even if we generally agree that "not stated" means "not true" on the level of qualifiers, there could be cases where the explicit "novalue" could be valuable as documentation for other human users.
Regards,
Markus
On 26.04.2015 00:52, Denny Vrandečić wrote:
Actually I think that having "no value" for the end date qualifier probably means that it has not ended yet. There is no other way to express whether this information is currently merely incomplete (i.e. it has ended, but no one bothered to fill it in) or not (i.e. it has not ended yet). This is pretty much the same use case as for normal claims.
Other qualifiers I could imagine where an explicit "no value" would make sense is P678, I guess.
In references it might make sense to state explicitly that the source does not have an issue number or an ISSN, etc., in order for example to allow cleanup of references and to mark the cases where a reference does not have a given value from those cases where it is merely incomplete.
I don't have superstrong arguments as you see (I would have much stronger arguments for "unknown value"), but I would prefer not to forbid "no value" in those cases explicitly, because it might be useful and it is already there.
[1] https://www.wikidata.org/wiki/Special:WhatLinksHere/Q18615010
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
On 26.04.2015 22:16, Gerard Meijssen wrote:
Hoi, I regularly query for, for instance, claim[31], i.e. any instance of whatever... I would also query for the existence of a date of death in a similar way. For me a claim with a "whatever it is that says that there is no value" would be a positive result and I would not consider it for any processing.
Not sure what you are trying to say here. Do you think there is a problem in how Wikidata Query is treating novalue? I am sure this can be fixed if it is the case. But I am not the right person to answer this.
Markus
Hoi, It is a matter of perspective. From my perspective a value exists or not. Depending on that I may want to process. When you state novalue there is a value of novalue and that is not the same as there not being a value in the first place. Thanks, GerardM
On 26.04.2015 22:28, Gerard Meijssen wrote:
Hoi, It is a matter of perspective. From my perspective a value exists or not. Depending on that I may want to process. When you state novalue there is a value of novalue and that is not the same as there not being a value in the first place.
Ah, I see. I think any query interface should allow you to find both: things with a novalue-claim and things with no claim at all. You can then pick your perspective on these two things as you like.
However, it would be an error to treat "novalue" as a kind of "some value", and it would be an even bigger error to treat "novalue" as a specific value (that can be equal to other such values). For example, a WDQ tree query should never go through "novalue" (and not even through a "somevalue" a.k.a. "unknownvalue"), as I am sure you will agree.
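The traversal rule can be sketched as follows. This is only an illustration over a toy graph, not how WDQ is implemented: the item IDs "A" to "D", the helper `item_snak` and the `edges` layout are hypothetical, with snaks shaped like Wikibase JSON.

```python
from collections import deque

# Sketch only: a toy link graph with snaks shaped like Wikibase JSON.
# Item IDs "A".."D" and all names here are hypothetical.


def item_snak(target_id):
    """Build a 'value' snak pointing at another item."""
    return {
        "snaktype": "value",
        "datavalue": {"type": "wikibase-entityid", "value": {"id": target_id}},
    }


def reachable(start, edges):
    """Breadth-first walk over item links, following only concrete values."""
    seen = {start}
    queue = deque([start])
    while queue:
        item = queue.popleft()
        for snak in edges.get(item, []):
            if snak["snaktype"] != "value":
                # novalue and somevalue are not nodes: never traverse them.
                continue
            target = snak["datavalue"]["value"]["id"]
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


edges = {
    "A": [item_snak("B"), {"snaktype": "novalue"}],
    "B": [{"snaktype": "somevalue"}],
    "C": [{"snaktype": "novalue"}],
    "D": [{"snaktype": "novalue"}],
}

found = sorted(reachable("A", edges))
print(found)
```

If "novalue" were stored as one shared node, A, C and D would all appear linked to each other through it, which is exactly the error described above.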
Cheers,
Markus
Hoi, As you know, I am not a fan of these special values at all. I can follow the logic but do not need to agree.
If "novalue" is not to be seen as a value, what is the point? The point is to state that there is no value, right? And that makes it of value. Right, ahem, I admit it is confusing, but is that not the point?
It is similar to a lot of referenced statements that, when you check them, are NOT what is stated at all. Thanks, GerardM