Hehe, maybe some kinds of inference could lead to a good heuristic for suggesting properties and values in the entity suggester. Since inferred facts naturally become "softer" and "softer" as uncertainties combine, this could also give us a limit for inferences: we fix a probability below which we don't add a fuzzy fact to the set of facts.

Maybe we could fix a heuristic starting fuzziness or probability score for each claim: "1 sourced claim" -> big score; one disputed claim -> smaller score; and likewise based on ranks and so on.
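
A minimal sketch of that idea, just to make it concrete. Everything here is an assumption for illustration: the starting scores, the 0.5 cut-off, the names (BASE_SCORE, FuzzyFact, infer), and the choice to multiply scores as if the premises were independent probabilities.

```python
from dataclasses import dataclass

# Hypothetical starting scores per kind of claim; no numbers are fixed anywhere.
BASE_SCORE = {"sourced": 0.9, "unsourced": 0.6, "disputed": 0.3}

# Assumed cut-off below which an inferred fact is not added to the set of facts.
THRESHOLD = 0.5


@dataclass(frozen=True)
class FuzzyFact:
    statement: str
    score: float


def combine(premises, conclusion):
    """Score a conclusion as the product of its premise scores.

    This treats the scores as independent probabilities, which is an
    assumption; other combination rules (e.g. min) are equally possible.
    """
    score = 1.0
    for premise in premises:
        score *= premise.score
    return FuzzyFact(conclusion, score)


def infer(premises, conclusion, threshold=THRESHOLD):
    """Return the inferred fact, or None if it has become too "soft"."""
    fact = combine(premises, conclusion)
    return fact if fact.score >= threshold else None


sourced_a = FuzzyFact("A", BASE_SCORE["sourced"])
sourced_b = FuzzyFact("B", BASE_SCORE["sourced"])
unsourced_c = FuzzyFact("C", BASE_SCORE["unsourced"])
unsourced_d = FuzzyFact("D", BASE_SCORE["unsourced"])

kept = infer([sourced_a, sourced_b], "A and B")         # two sourced premises
dropped = infer([unsourced_c, unsourced_d], "C and D")  # two unsourced premises
```

With these made-up numbers, a conclusion drawn from two sourced claims stays above the cut-off, while one drawn from two unsourced claims falls below it and is discarded, so chains of inference stop on their own once they get too uncertain.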


2014-05-29 13:43 GMT+02:00 Markus Krötzsch <markus@semantic-mediawiki.org>:
On 29/05/14 12:41, Thomas Douillard wrote:
@David:
I think you should have a look at fuzzy logic
<https://www.wikidata.org/wiki/Q224821> :)

Or at probabilistic logic, possibilistic logic, epistemic logic, ... it's endless. Let's first complete the data we are sure of before we start to discuss whether Pluto is a planet with fuzzy degree 0.6 or 0.7 ;-)

(The problem with quantitative logics is that there is usually no reference for the numbers you need there, so they are not well suited for a secondary data collection like Wikidata that relies on other sources. The closest concept that still might work is probabilistic logic, since you can really get some probabilities from published data; but even there it is hard to use the probability as a raw value without specifying very clearly what the experiment looked like.)

Markus



2014-05-29 1:48 GMT+02:00 David Cuenca <dacuetu@gmail.com>:


    Markus,


    On Thu, May 29, 2014 at 12:53 AM, Markus Krötzsch
    <markus@semantic-mediawiki.org> wrote:

        This is an easy question once you have been clear about what
        "human behaviour" is. According to enwiki, it is a "range of
        behaviours *exhibited by* humans".


    Settled :) Let's leave it at <defined as a trait of>

        What would anybody do with this data? In what application could
        it be of interest?


    Well, our goal is to gather the whole of human knowledge, not to use
    it. I can think of several applications, but let's leave that open.
    Never underestimate human creativity ;-)


        Moreover, as a great Icelandic ontologist once said: "There is
        definitely, definitely, definitely no logic, to human behaviour" ;-)


    Definitely, that is why we spend so much time in front of flickering
    squares making them flicker even more. It makes total sense :P

        I think "constraints" are already understood in this way. The
        name comes from databases, where a "constraint violation" is
        indeed a rather hard error. On the other hand, ironically,
        constraints (as a technical term) are often considered to be a
        softer form of modelling than (onto)logical axioms: a constraint
        can be violated while a logical axiom (as the name suggests) is
        always true -- if it is not backed by the given data, new data
        will be inferred. So as a technical term, "constraint" is quite
        appropriate for the mechanism we have, although it may not be
        the best term to clarify the intention.


    Ok, I will not fight traditional labels or conventions. I was
    interested in pointing out the inappropriateness of using a word
    inside our community with a definition that doesn't match its use,
    when there is another word that matches perfectly and conveys its
    meaning better to users.

        Some important ideas like classification (instance of/subclass
        of) belong completely to the analytical realm. We don't observe
        classes, we define them. A planet is what we call a "planet",
        and this can change even if the actual lumps in space are pretty
        much the same.


    Agreed. Better labels could be <defined as instance of>/<defined as
    subclass of>

        Now inferences are slightly different. If we know that X implies
        Y, then if "A says X" we can infer that (implicitly) "A says Y".
        That is a logical relationship (or rule) on the level of what is
        claimed, rather than on the level of statements. Note that we
        still need to have a way to find out that "X implies Y", which
        is a content-level claim that should have its own reference
        somewhere. We mainly use inference in this sense with "subclass
        of" in reasonator or when checking constraints. In this case,
        the implications are encoded as subclass-of statements ("If X is
        a piano, then X is an instrument"). This allows us to have
        references on the implications.
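
A minimal sketch of the subclass-of inference described above, under stated assumptions: the SUBCLASS_OF table and the function names are hypothetical, the intermediate class "keyboard instrument" is made up for the example, and each class is assumed to have at most one direct superclass.

```python
# Hypothetical subclass-of statements; in practice each one could carry
# its own reference, as the paragraph above suggests.
SUBCLASS_OF = {
    "piano": "keyboard instrument",
    "keyboard instrument": "instrument",
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Follow subclass-of links upwards, stopping if a cycle is reached."""
    seen = []
    while cls in subclass_of and subclass_of[cls] not in seen:
        cls = subclass_of[cls]
        seen.append(cls)
    return seen


def implied_claims(source, item, cls):
    """If `source` says `item` is a `cls`, list what it implicitly says too."""
    return [(source, item, c) for c in [cls] + superclasses(cls)]


claims = implied_claims("A", "X", "piano")
```

Here "A says X is a piano" yields the implicit claims that A also says X is a keyboard instrument and an instrument. The inference stays on the level of what A claims, not what is true.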


    Nope, nope, nope. I was not referring to "hard" implications, but to
    heuristic ones.

    Consider these properties in the item namespace:
    <defined as a trait of>
    <defined as having>
    <defined as instance of>

    They would translate into these constraints in the property namespace:
    <likely to be a trait of>
    <likely to have>
    <likely to be an instance of>


        In general, an interesting question here is what the status of
        "subclass of" really is. Do we gather this information from
        external sources (surely there must be a book that tells us that
        pianos are instruments) or do we as a community define this for
        Wikidata (surely, the overall hierarchy we get is hardly the
        "universal class hierarchy of the world" but a very specific
        classification that is different from other classifications that
        may exist elsewhere)? Best not to think about it too much and to
        gather sources whenever we have them ;-)


    I think it is good to think about it and to consider options to deal
    with it. Like for instance:
    <defined as instance of> "corresponds with item" <Wikimedia
    community concept>
    We already have items that refer to concepts that only make sense
    for us, so no change in that regard.

        At the moment, hard constraints (from definitions) and soft
        constraints (expectations) are simply mixed, and maybe this is
        fine since we handle them in a similar fashion (humans need to
        look at how to fix the situation). Most constraints, even those
        that refer to definitions, are rather soft anyway since we apply
        them to statements, not to hard facts. Hard constraints can only
        occur in cases where the *encoding* of a statement in Wikidata
        is wrong (not the intended statement as such, but how it was
        translated to data).


    As explained above, expectations inferred from definitions should
    not be treated as hard constraints, but as soft ones.

    Micru

    _______________________________________________
    Wikidata-l mailing list
    Wikidata-l@lists.wikimedia.org
    https://lists.wikimedia.org/mailman/listinfo/wikidata-l




