From: Nadja Kutz <nadja(a)daytar.de>
Date: 14 June 2012 11:21:28 CEST
To: "Discussion list for the Wikidata project." <wikidata-l(a)lists.wikimedia.org>
Subject: Re: Re: [[meta:wikitopics]] updated
Hello Evan Sherwin
We just had a discussion in this thread (see http://article.gmane.org/gmane.org.wikimedia.wikidata/618)
about companies who might be interested in a standardized way to represent meta-information.
In fact, for some time I have been trying to make people think about the importance of a mathematically sound structuring and presentation of RDF standards for science and engineering applications, especially with respect to environmental issues; please see the article:
http://www.azimuthproject.org/azimuth/show/Examples+of+semantic+web+applica…
It might be interesting for you to hear that so far the math and physics community has not reacted much to that -
possibly because it is rather tedious, unrewarding work
to implement such standards? But in general there is globally not so much support, especially for mathematics, so
it may just be that this is right now a bit too much of a burden for the math community, on top of the normal math tasks.
And as you can see from our discussion in the thread, the ISO seems to be equally hesitant.
The above article is currently only located at the Azimuth project www.azimuthproject.org, which is a circle of mostly scientists
who try to openly gather and structure environmental data.
In the article there is also a description of a student project in which visualization tools for RDF data were created.
I had hoped that the students would put their software on SourceForge or similar (and that I could announce this in the
article about the semantic web applications), but they seem to be very busy these days -
possibly they are even now in internships at rather well-known giant Californian software companies :)
So far the promotion of the article as a publication on the Azimuth project has not been very big. This may be because Azimuth's public outreach program may not yet be overly advanced :). So my current plan is to put the article on arxiv.org in the hope that some more mathematicians/scientists may get interested in these kinds of tasks.
1. As mentioned several times, a standard for us to be considered must be free. Free as in "Everyone can get it without having to pay or register for it. I can give it to anyone legally without any restrictions." Free of patents. Free as in W3C.
2. I have taken another look at your page, and after starting to read it you simply lose me. You use so many terms without defining them. To give just a few examples:
* "The NIF ontology is incorporated into the ontology for Wikitopics which shapes API designs." I do not know what the Wikitopics ontology is. The section beneath just lists a few keywords, but does not really explain it. I do not know what it means for ontologies to incorporate one another. I do not know what it means for an ontology to shape API designs.
* "Wikipage naming conventions are used to name subobjects in an equally meaningful manner". Equally meaningful? To what? What does this even mean? You completely lost me here.
* For the key wikipage transclusions, you do not explain what a "formatted topic presentation" is, a "formatted topic index", or a "formatted infobox". I think I understand the latter, but not the previous two. What are they? And if I indeed understand it right, are you saying that infoboxes have to be completely formatted in Wikidata, as Gregor has asked?
Hello Denny,
1. There are likely several ways to accommodate your process requirements. And btw, I asked last month but received no response to my request for a citation to relevant MWF policy on this issue, to determine whether your statement reflects the team's ELECTIVE policy or an MWF policy. Where's the benefit of imposing expenses magnitudes greater on everyone, to design, develop & socialize solutions already known? And please mention how the wikidata community can be assured that the wikidata team's designs themselves don't infringe someone else's patent or copyright, a reassurance that would directly follow from MWF's purchase of rights to use an ISO standard.
2a. Surely you appreciate that Wikidata involves
fielding ''some'' ontology, at least as suggested by your intention to
include the (SMW) Property namespace. I don't know when you plan to
publish wikidata's ontology, but certainly it must be done so overtly
and soon, agile or not. I agree the ontology I proposed needs much
fleshing out, but chief goals of the proposed ontology are pretty clear
-- to provide a wiki-topic index, to support NIF tools directly, to
capture provenance data, to reuse existing SMW tools and key
international standards, and to establish various best-practices for the
wider community.
2b. An ontology that 'shapes/controls API interfaces'
means that the APIs' information model must align with the information
model represented by the ontology. If the ontology includes an
expiration-date as a required property, for instance, then the API needs
to include an expiration-date as a required parameter in some fashion.
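To make the point in 2b concrete, here is a minimal sketch, with entirely hypothetical class and property names, of how a required-property constraint in an ontology could be enforced at the API layer so the two information models stay aligned:

```python
# Hypothetical ontology fragment: "Document" is an illustrative class,
# and its property specs mark which properties are required.
ONTOLOGY = {
    "Document": {
        "expiration-date": {"required": True},
        "author": {"required": False},
    }
}

def validate_api_params(class_name, params):
    """Reject an API call that omits a parameter the ontology marks required."""
    missing = [prop for prop, spec in ONTOLOGY[class_name].items()
               if spec["required"] and prop not in params]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    return True

# A call supplying the required expiration-date passes the check.
validate_api_params("Document", {"expiration-date": "2013-01-01"})
```

The design point is only that the API's parameter rules are derived from the ontology rather than maintained separately, so a change to the ontology propagates to the interface.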
2c. One ontology incorporating another is perhaps a clumsy way to describe the process of associating a class or property defined in one ontology with one in a different ontology, either through a subclass/subproperty relation or a documented or implemented transform.
2d. "Equally meaningful" as the wiki-page naming conventions are; eg interwiki:lang:ns:pgnm is quite meaningful. I am proposing SMW subobjects be named similarly: scope:lang:type:name is the proposed structure for SMW subobject names.
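A quick sketch of what consuming that convention might look like (the example name and field values are hypothetical; only the scope:lang:type:name shape comes from the proposal above):

```python
def parse_subobject_name(name):
    """Split a subobject name of the proposed form scope:lang:type:name
    into its four components; the name part may itself contain colons,
    so we split at most three times."""
    scope, lang, type_, local = name.split(":", 3)
    return {"scope": scope, "lang": lang, "type": type_, "name": local}

# Hypothetical example name following the proposed convention.
parsed = parse_subobject_name("wikidata:en:person:Thomas Jefferson")
```

Like the interwiki:lang:ns:pgnm convention for pages, the name itself then carries enough structure that tools can route or group subobjects without a separate lookup.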
2e. A 'formatted topic presentation' is the content displayed on a page for a topic. Wikidata
will have a page called (Main:)Thomas Jefferson that displays a
formatted topic presentation, showing information harvested from other
wikis plus any information developed by the wikidata community itself.
Using transclusion, anyone can embed (Main:)Thomas Jefferson into their
wiki. A 'formatted topic index' (which certainly can be one part of a
topic's formatted presentation) is a snippet that corresponds to the
"Thomas Jefferson" heading in a subject index under which are many
subtopics eg
Jefferson, Thomas [1] [2] [3]
-- Early years [4] [5] [6]
-- Birth [7] [8]
-- Formative influences [9] [10]
-- etc
2f. Perhaps you missed my immediate reply [1] to Gregor. Yes, all infoboxes
(among other non/formatted artifacts) are '''transcluded''' from
wikidata, without the nonsense of cross-wiki API calls for individual
data-items, as I understand the wikidata team is now gearing to provide.
Best regards - jmc
[1] http://lists.wikimedia.org/pipermail/wikidata-l/2012-May/000588.html , for instance
John McClure wrote:
"Of course, thanks for the pointer. Yes, I'd agree that 19788's ontology be closely reviewed for inclusion. 19788:2 standardizes the Dublin Core properties, the same I recommend for [[wikidata]] provenance data, the same slated for the [[wikidata]] ontology. But more to your point is that the entire ISO corpus would fit really well if it were viewed as a topic map whose topics and sub-topics can be referenced from [[wikidata]] artifacts such as property definitions."
Hello John
Frankly speaking I don't see why one would want to use topic maps.
That is, RDF triples become, after an identification (canonically: elements with the same URI are identified),
a labeled graph, here to be called "the" RDF graph. (I know that some people call the triples themselves
"the" RDF graph, but why use a second word (namely graph) for triples?
Triples on their own are a very trivial, highly disconnected graph.)
If I want to connect certain nodes of that graph to a topic,
I only need to supply these nodes with an extra triple which says
"this node belongs to this topic", i.e. something like (node, belongsTo, thisTopic), or modify
the canonical identification map, and the RDF graph will be a "topic map". Or one has the case that the triples are
already laid out in "topics"; that is, for example,
if I have a set of triples with the same resource URI, then upon canonical identification these form
a kind of "topic map" (with all "legs" pointing in one direction). Or am I missing something crucial?
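The point about attaching nodes to a topic with one extra triple per node can be sketched with plain tuples (URIs shortened to bare names for readability; the predicate belongsTo and the example data are illustrative, not from any real vocabulary):

```python
# Triples as (subject, predicate, object) tuples; nodes with the same
# name are already "identified" because equal strings compare equal.
triples = {
    ("Jefferson", "bornIn", "Shadwell"),
    ("Jefferson", "authorOf", "Declaration"),
    ("Shadwell", "locatedIn", "Virginia"),
}

# Connect chosen nodes to a topic by adding one extra triple per node.
topic = "EarlyYears"
for node in ("Jefferson", "Shadwell"):
    triples.add((node, "belongsTo", topic))

# Reading the graph back out "by topic" is then just a filter over triples.
members = {s for (s, p, o) in triples if p == "belongsTo" and o == topic}
```

The "topic map" view is thus recoverable from the flat triple set; no separate topic data structure is needed.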
However, if you start with topics, you have no canonical information about the "internal structure" of
a topic, and in some cases you would need to artificially impose this in retrospect onto
the data structure.
For example, if your topic is "members of a society", you have all the members, and if you need some internal structure like a hierarchy,
then you would need to supply each member with a hierarchy classification
(i.e. with extra data, which is usually different for each member). In the RDF case the person who gave you
the triples could have made a choice of order which could be given upon canonical identification. I.e. in principle the
internal structure depends on your identification map, but there is a canonical one.
You can of course mimic an RDF triple with a topic map by choosing the topic to be the resource,
one "leg" of your topic as a property, and the topic connected with this "leg" as the object, but the
choice of a leg is not canonical if there is more than one leg. Only if you made all "legs" of a topic map into triples would you have something like a canonical assignment. I find these differences important. But maybe I have overlooked or misunderstood something
about topic maps (I read what I found about this issue scattered around the internet, so this is not so improbable).
I had this kind of discussion with people from deepa mehta http://www.deepamehta.de/
because they used topic maps, but so far nobody there could convince me of the distinct advantages of topic maps.
But the discussion has so far been rather brief.
The discussion arose because we discussed to what extent it would be possible to merge a student project we
had at HTW Berlin (a collaboration platform for visualizing RDF data called Mimirix
http://www.daytar.de/art/MIMIRIX/) with deepa mehta; for example
one could use at least the backend, which already has a design for access control
(the deepa mehta people told me that they haven't yet really attacked the issue
of access control), or one could use at least the carefully designed client.
Maybe you have other arguments for topic maps; as said, I might have missed something.
I understand that there are other issues like the speed of addressability or direct-access issues,
but I find these are then rather an issue of the serialization.
So I didn't understand why, for example, the pregiven JSON structure of a JSONArray
http://www.json.org/javadoc/org/json/JSONArray.html
is not used in JSON-LD
http://json-ld.org/spec/latest/json-ld-syntax/#sets-and-lists
but that's another topic.
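For context on the linked JSON-LD section: JSON-LD does accept plain JSON arrays, but it treats them as unordered sets of values by default; preserving the array's order requires wrapping it in the explicit @list keyword, which may be the design choice the question is about. A minimal sketch (the term "ex:member" and the names are hypothetical):

```python
# Default JSON-LD reading: a plain JSON array is an unordered set,
# so the order Alice/Bob/Carol carries no meaning.
unordered = {"ex:member": ["Alice", "Bob", "Carol"]}

# Order-preserving JSON-LD: the array must be wrapped in an @list container.
ordered = {"ex:member": {"@list": ["Alice", "Bob", "Carol"]}}
```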
In the context of applications of ISO metadata you may want to read:
http://www.azimuthproject.org/azimuth/show/Examples+of+semantic+web+applica…
>> * The feedback-dialogue behaves strangely when clicking on "What is this?" (the width of the dialogue changes)
> Ok. To be honest I am not sure if we'll fix this, but we'll see. It's noted.
Actually, this is a Moodbar bug. I have filed a bug for this at https://bugzilla.wikimedia.org/show_bug.cgi?id=37624 .
Can't we just install the SpamBlacklist extension and add links to the blacklist - MediaWiki:Spam-blacklist http://is.wikipedia.org/wiki/Melding:Spam-blacklist ?
I think the best way to combat spam is to identify the unwanted behaviour and then block it. There is no need to limit editing on the whole wiki, as that would also limit the edits of users who have not done anything wrong.
John McClure wrote: "http://en.wikipedia.org/wiki/Topic_Maps gives links to iso/iec 13250"
???
There seems to be a misunderstanding. I rather had the meta descriptions of general ISO items in mind, not the description of a topic map per se. (And apart from that, I thought wikidata wanted to use RDF?)
The ISO site is rather cryptic and most of it is not accessible (see e.g. http://isotc.iso.org/livelink/livelink/open/jtc1sc36); however, I understood sentences (see http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnu…) like:
"ISO/IEC 19788-1:2011 provides principles, rules and structures for the specification of the description of a learning resource; it identifies and specifies the attributes of a data element as well as the rules governing their use. The key principles stated in ISO/IEC 19788-1:2011 are informed by a user requirements-driven context with the aim of supporting multilingual and cultural adaptability requirements from a global perspective.
ISO/IEC 19788-1:2011 is information-technology-neutral and defines a set of common approaches, i.e. methodologies and constructs, which apply to the development of the subsequent parts of ISO/IEC 19788."
as meaning that the ISO is in the process of turning parts of their database into a machine-readable standard format. So I assumed that
the "identification of a data element" for learning resources could be sort of planned to be extended to all of their standards (or already covers them?), which range from screw threads
http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnu…
over mathematical symbols
http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnu…
to copper alloys
http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_tc_browse.htm?c…
If that were the case then companies etc. could link and conform to standards (here for example an unlinked reference
to a DIN standard for
http://de.wikipedia.org/wiki/Wärmeleitzahl in a product for insulation: http://isofloc.de/index.php?technische-daten).
That is, companies especially could be interested in promoting parts of their technical data in an ISO-standardized format (which makes the comparison of technical data of products easier),
so that for example crawlers could collect products which meet certain technical specifications. Organizations could
more easily link to companies which conform e.g. to social standards etc.
So when I wrote that Wikidata could possibly base their data on the ISO standards, I meant that it
would make sense to have a structural correspondence
between ISO standards and definitions (or e.g. standards from DIN http://en.wikipedia.org/wiki/Deutsches_Institut_f%C3%BCr_Normung or other similar organizations) and the wikidata ontology, because wikidata would anyway have an entry for the corresponding standards for materials etc.
Friedrich Roehrs wrote:"1c. You're arguing over CHF 200 -- which extraordinarily-cheaply and
fundamentally PROTECTS the MWF from copyright infringement suits? Can
the SNAK architecture provide that reassurance to the MWF community?"
I don't know what you mean by that. As you can see in the ISO links:
http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnu…
each item alone costs something in that range.
Hey guys,
I think for a long time semantic projects have focused on
getting data out but haven't incorporated the 'voice of the consumer';
yet, if you think of data quality as 'suitable for customer
requirements' instead of 'process conforms to specification,' this is
the first step.
I'm looking forward to getting an early look at the data you
publish, and otherwise advocate for the point of view of people who want
to use Wikidata data.
I'd also like to advocate for "documentation driven
development" here, because my experience with
http://code.google.com/p/basekb-tools/
is that it helps a lot to
(1) document the behavior of the system by providing specific
query examples,
(2) construct test sets for everything in the documentation, and
(3) if you're not 100% proud of the story told in your
documentation, feed this back into the product.
I think a little attention towards producing "readerly" output
in conjunction with a "writerly" interface for Wikipedians could be key
to make Wikidata one of the foundations of the Linked Data Galaxy.
Hello John,
thanks for digging this out.
I see in the brochure that there is not only a postbox, but they even have an office where one could meet the ISO:
1, ch. de la Voie-Creuse
It would be interesting to hear what Wikimedia's opinion on this is.
Nadja