Wikidata April 2012

wikidata@lists.wikimedia.org

73 participants
111 discussions

by John McClure

Denny said: "I think the assumption everything has exactly one type is oversimplifying"

The assumption that everything is of multiple types is over-complicating. Usually you can tell from the first sentence in the Wikipedia page.

"Tuesday is a day of the week"
"Love is an emotion"
"(Roman) Catholicism is a faith"
"Gollum is a fictional character"
"HAL-9000 is a character"
"Noah is a Patriarch"
"Enos was the first chimpanzee"

So consensus certainly is being achieved among thousands of authors about the fundamental type of thing each of these pages represents. Disambiguation pages very commonly reference these types of things, as in "Enos (chimpanzee)".

Let's take Gollum. I can imagine a topic map has these subjects:

1. Character
1A. Fictional character
1A1. Fictional person
1A2. Fictional animal
1A3. Fictional ghost
1A4. Fictional god

Another equally valid assertion is that Gollum is a Character that is typed as a Fictional and Human thing (both of these adjectives being instances of owl:Class) -- so that a comprehensive system sometime in the future would reinterpret that Gollum is actually a Fictional person.

As you say yourself, it's not useful to create a "perfect" system to handle every imaginable edge case **to the extent that they exist**. Personally I don't believe such edge cases can be found - I challenge anyone to provide me such an example.

But more to the point of Wikidata. I don't believe for a second that WP will be reorganized into thousands of namespaces. Rather, I believe first, SUBOBJECT names must include the idea of 'namespace' for the efficiencies gained, and second, WP pages should be associated with the same set of nouns (noun-phrases) available for subobject names. IOW, it's an implementation issue whether a wiki's pages are named using these namespaces, so that the wiki as a whole can gain the same inherent efficiencies I've sketched for subobjects.

Best - john

12 years, 2 months

Re: [Wikidata-l] SNAK -> assertion?

by John McClure

Hi Denny -

Correct - URI opaqueness is required at the level of exchange but there's no such requirement internal to an application. During exchange, sure, you can add a triple to assert that Density is the type of the object France#Density:2012_pop_estimate_Bilan_2010. Outside of exchange, type can be reported by an API or template which parses the namespace from the object's name. Obviously this would save one triple per object and per page in the triples database - a big win in large datasets such as WP.

A similar situation is found today in SMW applications. People create namespaces and then add a category of the *same name* to the pages and objects in that namespace; it's highly inefficient and makes me yelp !!

Denny asks: How do I know the relationship between France#Density:2012_pop_estimate_Bilan_2010 and Germany#Density:2009_pop_estimate_CIA_2010?

If you're asking how you know they're both Density objects, that's easy - they're subobjects in the same 'namespace'. So in SMW speak, I'd like to use a predicate 'like' operator.

To list all Density subobjects is just
[[~#Density:+]]
To list all for a page is just
[[~France#Density:+]]
To list all for both is
[[~France#Density:+||~Germany#Density:+]]

With regard to rdf:property - I think you may have meant rdf:predicate as there's no rdf:property I could find. Its domain is an rdf:Statement not rdf:Resource. Is a Snak a subtype of rdf:Statement? Why not name it as such, e.g. "WMFStatement", to clearly assert its lineage?

In any event, I'm concerned about using rdf:predicate in this manner. In fact I suspect there's some confusion about reification, because I used dc:Subject / dc:subject to associate the 'reifier' with the object, I think in a quite traditional, uncontroversial manner. Using rdf mechanisms for reification in this context, I'm worried, could be particularly problematic.

john

PS yes I was giving triples!!!!
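The namespace-parsing idea in this message can be sketched roughly as follows. This is only an illustration, assuming subobject names always have the shape Page#Namespace:LocalName; the names `parse_subobject` and `select` are hypothetical helpers invented for this sketch and are not part of SMW or Wikidata.

```python
from typing import Iterable, List, NamedTuple, Optional


class SubobjectName(NamedTuple):
    page: str        # e.g. "France"
    namespace: str   # e.g. "Density", read here as the type
    local_name: str  # e.g. "2012_pop_estimate_Bilan_2010"


def parse_subobject(name: str) -> SubobjectName:
    """Split 'Page#Namespace:LocalName' into its three parts (assumed format)."""
    page, _, rest = name.partition("#")
    namespace, colon, local_name = rest.partition(":")
    if not page or not namespace or not colon:
        raise ValueError(f"not a namespaced subobject name: {name!r}")
    return SubobjectName(page, namespace, local_name)


def select(names: Iterable[str], namespace: str,
           page: Optional[str] = None) -> List[str]:
    """Loose analogue of a 'like' query: filter by namespace, optionally by page."""
    result = []
    for name in names:
        parsed = parse_subobject(name)
        if parsed.namespace == namespace and page in (None, parsed.page):
            result.append(name)
    return result


names = [
    "France#Density:2012_pop_estimate_Bilan_2010",
    "Germany#Density:2009_pop_estimate_CIA_2010",
]
all_density = select(names, "Density")               # both subobjects
france_only = select(names, "Density", page="France")  # only the French one
```

In this reading, the type is recovered from the name itself, so no separate type triple needs to be stored inside the application; whether that is acceptable at exchange time is the point under debate in the thread.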

12 years, 2 months

Re: [Wikidata-l] (no subject)

by John McClure

12 years, 2 months

Re: [Wikidata-l] Type namespace

by John McClure

Denny said: "if I understand topic maps correctly it should be trivial to write a transformer that takes the export that Wikidata will offer and translates it into topic maps, if you are so inclined. This way the topic maps community can be served through that transformer easily, be it a web service or a parser-front-end"

Sure, at a technical level, but let's focus on concepts & requirements because your semantics may force mapping SNAKs to other protocols, bad karma for lossless exchange. And allow me to note that the "topic map" community is one & the same here as the "semantic" community (if not the greatest part of that community btw), so please, let's not create a we-they paradigm, okay?

I guess you know that Drupal is incorporating ISO Topic Maps and that ISO Topic Maps is a superset of the W3 RDF. It's definitely heartening to see your own evolution on this matter, creating a conceptual design so duplicative of this international standard (you can reach farther when standing on the shoulders of giants...).

Last point, on the "information sources must be free" dictum, which you stated in your reply. ISO has a simple business model that is fair, honest, etc., to support its own non-profit operations. You can use their info for free. That said, can you provide a link to your stated policy, stated as a WMF policy, or is this your own policy -- I'd like to know more about the thinking behind it.

Thanks - john

12 years, 2 months

Re: [Wikidata-l] SNAK -> assertion?

by John McClure

Denny said: you forgot to add something like
France#Density:2012_pop_estimate_Bilan_2010 property Density .

No I did not forget anything, given the Density 'namespace' in the subobject name. IOW your triple merely restates what is discernible from the subobject name. Maybe you should tell me what a "property" property is supposed to represent.

At most I made a misstatement that
"Estimated is an adjective treated as a subclass of owl:Class"
It should say
"as an instance of owl:Class"

12 years, 2 months

Re: [Wikidata-l] SNAK -> assertion?

by John McClure

Denny said: But if you find a simpler, and more RDFish way to express the (below) statement, please feel free to enlighten me. I would be indeed very interested.

"The population density of France, as of an 2012 estimate, is 116 per square kilometer, according to the "Bilan demographique 2010"."

A wiki namespace-based approach is:

France has_subobject France#Density:2012_pop_estimate_Bilan_2010
France#Density:2012_pop_estimate_Bilan_2010 ^source "Bilan ...2010"
France#Density:2012_pop_estimate_Bilan_2010 ^npkm2 116
France#Density:2012_pop_estimate_Bilan_2010 Type Estimated
France#Density:2012_pop_estimate_Bilan_2010 ^date 2012

Key:
France#Density:2012_pop_estimate_Bilan_2010 is a named subobject
This subobject is so named to prevent subobject name collisions
Subobject names follow pagename naming conventions
has_subobject is a reserved property name
Density is an instance in a Type (or, Noun) namespace
All properties prefixed by ^ are text properties
^source and ^date are Dublin Core properties (both text properties)
Type is a Dublin Core property (an object property)
Estimated is an adjective treated as a subclass of owl:Class
Estimated is an abbreviation of "Estimated Things"
^npkm2 is an SI unit (amount per sq km)
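For readers who want to see the five statements above as data, here is a minimal sketch that generates them as plain (subject, property, value) tuples. The function `density_triples` and its parameter names are hypothetical, chosen only for this illustration; the property names and the subobject naming pattern are taken from the message.

```python
from typing import List, Tuple

# (subject, property, value); values may be text or numbers
Triple = Tuple[str, str, object]


def density_triples(page: str, year: int, source_key: str,
                    source_title: str, per_km2: int) -> List[Triple]:
    """Build the namespace-based triples for one estimated density figure."""
    # The subobject name carries the 'Density' namespace, so no
    # separate triple is emitted to state that it is a Density.
    subobject = f"{page}#Density:{year}_pop_estimate_{source_key}"
    return [
        (page, "has_subobject", subobject),  # reserved property name
        (subobject, "^source", source_title),  # ^ marks a text property
        (subobject, "^npkm2", per_km2),        # amount per square km
        (subobject, "Type", "Estimated"),      # object property
        (subobject, "^date", year),
    ]


triples = density_triples(
    page="France",
    year=2012,
    source_key="Bilan_2010",
    source_title="Bilan demographique 2010",
    per_km2=116,
)
for subject, prop, value in triples:
    print(subject, prop, value)
```

The sketch stores the year and the density as integers for convenience; the message itself treats ^date as a text property, so a stricter rendering would keep those values as strings.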

12 years, 2 months

[Wikidata-l] Data_model: Metamodel: Wikipedialink

by Gregor Hagedorn

Apologies, should this already have been discussed.

http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model#The_Metamodel defines:

Wikipedialink = (Title, LanguageId, Badge?)

In my experience the scope or extent of entities in different Wikipedias sometimes differs. One Wikipedia considers an entity a valid lemma, whereas another Wikipedia subsumes it in a larger lemma. Would changing the model to:

Wikipedialink = (Title, LanguageId, Relation, Badge?)

where relation can be

broader match
close match
exact match
narrower match
etc.

help in expressing these situations? I believe this could prevent problems later on. The default (and initial import of interlanguage links) could easily be "close match" - to be refined only where required.

Gregor
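The proposed tuple can be pictured as a small record type. The sketch below is only one possible reading of the proposal: the class and field names are hypothetical, the relation list covers just the four values named in the message (the "etc." is left open), and the default of "close match" follows the suggestion for imported interlanguage links.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Relation(Enum):
    """Relation values named in the proposal; the original list is open-ended."""
    BROADER_MATCH = "broader match"
    CLOSE_MATCH = "close match"
    EXACT_MATCH = "exact match"
    NARROWER_MATCH = "narrower match"


@dataclass(frozen=True)
class WikipediaLink:
    """Wikipedialink = (Title, LanguageId, Relation, Badge?)"""
    title: str
    language_id: str
    # Suggested default for the initial import of interlanguage links.
    relation: Relation = Relation.CLOSE_MATCH
    # Badge is optional in the metamodel, hence Optional here.
    badge: Optional[str] = None


# An imported link starts as a close match ...
imported = WikipediaLink(title="Example lemma", language_id="de")
# ... and is refined only where one wiki folds the entity into a larger lemma.
refined = WikipediaLink(title="Larger lemma", language_id="fr",
                        relation=Relation.BROADER_MATCH)
```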

12 years, 2 months

[Wikidata-l] Namespace-based model

by John McClure

Wiki namespaces are currently so underused people may not realize their importance: they provide crucial semantic information. For instance, consider the example given in Wikidata's data model article[1]

"Obama was US Senator from Illinois from January 3, 2005 to November 16, 2008"

which yielded these observations:

a. mainSnak of type PropertyValueSnak with subject "Obama", property "US Senator from", and value "Illinois"
b. auxiliary Snak of type PropertyIntervalSnak with property "in office" and interval "January 3, 2005 to November 16, 2008" (the subject of the auxiliary Snak is always the statement itself).

An alternative lexical model might restate this as "The US Senator for the place Illinois is/was the person Obama from date January 3, 2005 until date November 16, 2008".

a. the prime resource being described is a US Senator page, not so much the Obama page
b. Person:Obama is the subject complement of this US Senator via the linking verb-property 'was' or "is"
c. "for" is a property of this US Senator whose value is "Place:Illinois"
d. "from" is a property of this US Senator with the value "Date:January 3, 2005"
e. "until" is a property of this US Senator with the value "Date:November 16, 2008"

A significant point is that that US Senator page is named Senator:Barack H Obama (or, Legislator:Barack H Obama or Public Employee:Barack H Obama, etc); it is of type US Senator, and it has these three properties, for, from, and until. In other words, if the content from this page is to be shown on the Person:Barack H Obama page, then that content should be transcluded from the Senator page; its semantic markup need not because software can interpret transcluded material as being a "subject" of, and/or organic to, the Person page.

Lastly I really don't know how developers will cognitively absorb made-up words like Snak. The need for the term does mystify me somewhat. I do think everyone seems to "get" namespaces, appreciating the clarity they provide.

I hope concepts like "namespace" can be equally as prominent at this stage as Snaks in the Wikidata model.

Regards, --Hypergrove (talk) 03:03, 5 April 2012 (UTC)

[1] http://meta.wikimedia.org/w/index.php?title=Talk:Wikidata/Data_model
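One way to picture the alternative lexical model is as a page record whose title and property values all carry a namespace prefix. The sketch below is an assumption-laden illustration, not a description of any existing software: `split_namespace`, `type_of`, and the dictionary layout are hypothetical, and the page type is read here directly from the title's namespace.

```python
from typing import Dict, Tuple


def split_namespace(title: str) -> Tuple[str, str]:
    """Split 'Namespace:Name' at the first colon; no colon means no namespace."""
    namespace, colon, name = title.partition(":")
    return (namespace, name) if colon else ("", title)


# The Senator page is the prime resource; the Person page is linked via 'is'.
senator_page: Dict[str, str] = {
    "title": "Senator:Barack H Obama",
    "is": "Person:Barack H Obama",
    "for": "Place:Illinois",
    "from": "Date:January 3, 2005",
    "until": "Date:November 16, 2008",
}


def type_of(page: Dict[str, str]) -> str:
    """In this sketch the page type is simply the namespace of its title."""
    return split_namespace(page["title"])[0]


print(type_of(senator_page))                   # namespace of the page title
print(split_namespace(senator_page["for"]))    # namespace and name of a value
```

Each value's namespace (Person, Place, Date) then tells a consumer what kind of thing the property points at, without any further typing statements.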

12 years, 2 months

[Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?

by Isabelle Ayel

Hi everybody,

I registered to the mailing-list because I am working with Semantic-Mediawiki in different languages. I am amazed by the great response given to the Wikidata project and the very active discussions I see.

BUT:
- How are you managing, backstage, all the bright ideas flowing from the mailing-list?
- How can people follow discussions and arguments through a mailing-list?
- How, at the end of the day, are you deciding what will be the next step of the project according to all the ideas coming from the mailing-list?

May I suggest the set-up of a WIKI to host the follow-up of the Wikidata Project. In companies we never ever use a mailing-list to set up a project.

Best regards
Isabelle Ayel
Tel/Fax: +34 976 088 566
Email: isabelle.ayel(a)gmail.com
My blogs: isayel.net | pouce-café <http://zfr.posterous.com> | zaragoza día día <http://zaragozadiadia.posterous.com>
Networking with: Facebook <http://www.facebook.com/isayel?ref=name> | Twitter <http://twitter.com/nanouk> | Google Plus <https://plus.google.com/111878932945338226153> | LinkedIn <http://es.linkedin.com/in/isabelleayel>
Contact me: Google Talk isabelle.ayel | Skype aspmaetb

12 years, 2 months

Re: [Wikidata-l] Type namespace

by John McClure

Hi Denny -

Thanks for your reply and I am relieved. The design seems in the process of walking towards looking quite a lot like ISO Topic Maps, I must say, because it designates no wall of separation between classes and topics. Today that wall exists in SMW in the dichotomy of Category vs all-other-namespaces, with the problems I've outlined. I'm reading into your document that there will be no wall - that the topics describing classes surely will exist in the same 'namespace' as the topics purported to be instances of these classes. Is this correct? If so, then there's less difference between ISO Topic Maps and your design than what I had originally thought.

If indeed the direction of the project (as I detect on this email list) is to associate pages with classification schemes such as LCSH or many others, then we're talking about even more of an ISO Topic Map orientation. Which brings me back to the many benefits of a *brutally honest* adoption of the ISO Topic Map technology. Extend & refine it for sure, but imho ISO Topic Map technology is an excellent fit with wiki implementations. It seems to be what you're incidentally doing anyway.

cheers - john

12 years, 2 months
