Wikidata April 2012

wikidata@lists.wikimedia.org

73 participants
111 discussions

[Wikidata-l] Data format for scientific data
by Alexander Täschner 05 Apr '12


Hi! I am a particle physicist, so I'm interested in using the Wikidata project to keep physical constants, like the mass or lifetime of the neutron, in sync between different articles. In this use case it will be important to be able to store and retrieve not only the value of the constant, together with the reference to the data source, but also its uncertainty. Would it be possible to include a special data type for such constants, where the value, the total uncertainty and the unit of measurement can be stored together with the reference? (Additional storage of the statistical and systematic uncertainties would be nice, but is not necessary.) Best regards, Alexander
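A minimal sketch of what such a data type might look like, written in Python purely for illustration. The class name `Measurement`, its field names, and the rule of combining statistical and systematic uncertainties in quadrature when no total is given are all assumptions made here, not anything proposed in the message or decided for Wikidata. The numbers in the usage note are placeholders, not real measurements.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record for a physical constant with its uncertainty."""

    value: float
    unit: str
    reference: str  # where the value was taken from
    total_uncertainty: Optional[float] = None
    stat_uncertainty: Optional[float] = None  # optional extra detail
    syst_uncertainty: Optional[float] = None  # optional extra detail

    def uncertainty(self) -> float:
        """Return the total uncertainty, deriving it if only parts are stored."""
        if self.total_uncertainty is not None:
            return self.total_uncertainty
        if self.stat_uncertainty is not None and self.syst_uncertainty is not None:
            # Assumption: the two contributions are independent, so they are
            # combined in quadrature.
            return hypot(self.stat_uncertainty, self.syst_uncertainty)
        raise ValueError("no uncertainty recorded for this measurement")

    def __str__(self) -> str:
        return f"{self.value} ± {self.uncertainty()} {self.unit} [{self.reference}]"
```

With placeholder numbers, `Measurement(value=100.0, unit="s", reference="placeholder source", stat_uncertainty=3.0, syst_uncertainty=4.0)` reports a total uncertainty of 5.0 s. Keeping the unit and the reference in the same record as the value is what would let several articles reuse one consistent entry.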

3 2

[Wikidata-l] Output formats
by Bináris 05 Apr '12


Hi,

this is about the output format of numerical, money-type and date-type data. This is also language dependent; for example, in Hungarian the decimal sign is a comma rather than a dot, and the thousands separator is a space, not a comma (in a computer environment, preferably a non-breaking space). Will the interface of Wikidata handle these national/local differences? I think this would be much more efficient than letting the recipient projects transform data to their own format.

Some Wikipedias use templates to translate miles to kilometers and vice versa. This translation often fails due to a format error and results in funny values. (Example: a train station that is 31 km away from the town. [1]) As Wikidata has controlled data, it would be useful to handle the conversion of measurement units within Wikidata and serve data in the desired format. (A mile expressed in kilometers is also a piece of data that can be stored, but it may need stronger protection, as it affects the output of many other data, just as a highly used template is editable by admins only.)

[1] http://en.wikipedia.org/w/index.php?title=Mez%C5%91keresztes&diff=next&oldi…

-- Bináris
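The language-dependent output asked about above could be sketched roughly as follows. This is an illustration only: the `SEPARATORS` table, the function names and the two language codes are assumptions made for this example and say nothing about how Wikidata itself formats values. The one fixed fact used is that the international mile is defined as exactly 1.609344 km.

```python
NBSP = "\u00a0"  # non-breaking space, as preferred for Hungarian grouping

# Hypothetical per-language conventions: (decimal sign, thousands separator).
SEPARATORS = {
    "en": (".", ","),
    "hu": (",", NBSP),
}

KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def format_number(value: float, lang: str, decimals: int = 2) -> str:
    """Render a number with the separators of the requested language."""
    decimal_sign, thousands_sep = SEPARATORS[lang]
    # Format once in a canonical form such as "1,234,567.89" ...
    canonical = f"{value:,.{decimals}f}"
    # ... then swap both separators in a single pass, so that a comma
    # produced by one substitution is never rewritten by the other.
    return canonical.translate(
        {ord(","): thousands_sep, ord("."): decimal_sign}
    )


def miles_to_km(miles: float) -> float:
    """Convert on the raw number, before any language-specific formatting."""
    return miles * KM_PER_MILE
```

For example, `format_number(miles_to_km(10), "hu", 1)` gives `16,1`. Converting on the stored number and formatting only at the very end avoids the failure described in the message, where a template tries to parse an already formatted value.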

3 2

[Wikidata-l] Type namespace
by John McClure 05 Apr '12


Hello- pardon me for any hint of ranting ....

Please consider a different namespace than the Category namespace for defining TYPEs of things; of course, I am proposing a standard MW "Type" namespace. There are benefits to this refinement.

1. Easier for users - today categories are a mixture of plurals and singulars, a complete mess. Users simply don't think that the -type- of a page is a plural concept. They just don't. I have seen time and again their confusion. They wonder why a type of page is "Persons" not "Person".

2. Easier for admins - today the category namespace is open to all users, as it should be to accommodate "folksonomies". Wikidata though will introduce types of pages and surely would like to have some control over the controlled vocabulary for the wiki. What does one do? I guess lock down individual category pages. I don't know if an admin can list pages that are 'controlled' without resorting to yet another category. A separate namespace for controlled vocabularies can have more efficient namespace-level security.

3. Easier for the system - today an sql index holds categories attached to pages; it's probably the most heavily referenced in the system. If every page in a wiki has a category, and there are say 4 standard/special attributes per page, then there's a minimum 5 triples per page. For 10K pages, there are 50K triples minimum before a single infobox factoid has been tagged. That's a burden that may not scale. Allocating a separate index for the Type namespace is smart because a global/wikifarm's Type namespace could be referenced rather than a local Category namespace.

4. Sem forms - today forms are displayed in part by referencing the MW namespace. As an admin I'd like to allow people to create their own forms. But the MW namespace -- being a utility namespace -- consequently needs to be open to the general public. Forms instead should be associated with page-types via the Type namespace, an unambiguous design that puts MW: back under admin control.

5. Ontological justification - are (owl) classes really just souped-up categories? A takeaway I have from a conversation with Ward Cunningham is that the Category namespace is simply for lists of things (it was even at first called "List" I believe). Yes, classes and lists have the common notion of a set of things, but really, when did rdfs:Class become a subclass of rdf:Bag? Never was. And in this regard note that rdf:type doesn't reference a resource whose rdf:type is "Type" as one (not in the club) might naively think. Strange to most when singular names designate a set of things (like owl:Class does, which should be called owl:Classes I guess). Typing category pages is pretty problematic too ... given the distinction between metaclasses and annotation properties versus classes and class properties.

6. Common sense -- categories are LISTS of things, they should not be used for types of things. Types of things are singular in nature (with exceptions) while categories pretty much ALWAYS have plural names so as to be consistent with the definition that a category is just a list of pages. IOW, categories should not be used for types of things nor for subject headings.

I am not seeking the perfect as another writer forewarns us all. I am seeking to learn from mistakes, not to burn them into the next generation of MW software. In short, I'd like to see separate namespaces for subjects, nouns, adjectives, adverbs, participles, etc - a complete dictionary of common words & phrases. Frankly I see it harmful to the whole community to throw all these into one namespace -- category -- as it results in an unmanageable design and wildly unpredictable contents.

In summary I'd like to see a LEXICAL SEMANTIC design for Wikidata. Again, this note does *not* seek perfection, it is seeking to identify and to learn from our experiences. My experience is that the Category namespace has been functionally overloaded to the detriment of 'good' system design.

John McClure

2 1

[Wikidata-l] A common, open access, human and computer readable data pool for scientist
by Leukippos Institute 05 Apr '12


Hi!

I am a synthetic biologist. I see big changes in the way we do science and how we will publish in the future. I see a huge need to publish all scientific data (especially raw data) in a common, freely accessible data pool. This data should be machine readable. In science we face huge amounts of data and a huge number of publications. Nobody is able to read all the literature any longer. We need a computer-assisted system to analyze these data and develop novel concepts from them. We need a structuring of these data on a higher abstraction level. We need to be able to go from abstraction to detail. Thus I see the Wikidata project as potentially of big value for scientists. I would like that this project could serve in that manner the scientific community and provide standards for submission of data for scientists. Any plans in this direction?

A bit more about the reasons why I find this very important. I would summarize the upcoming trend in science as this: From Hypothesis to Data-Driven Research, or the End of the Age of Science, and the Dawn of the Age of Systemics.

We can observe a paradigm change in science, and two computer developments are responsible. The first is the enormous storage capacity in the cloud. The second is that a huge number of computers have been connected and organized in social networks. These changes have resulted in huge quantities of data and complex systems, a problem normal science cannot solve. The traditional hypothesis method can deal with simple correlations between A and B. But the method fails if the problem becomes more complex. Science has been synonymous with a separating, reductionistic approach. Contemporary science has come to a point where we will change the perspective from reductionism to holism. We now move to a position that sees things together; in short: systemics.

The data-driven science approach changes the scientific method and results in a practice called "science 2.0" (named after web 2.0). "Science" will happen in the cloud, with new publishing formats such as direct publishing on blogs and direct publishing of our data in a human and computer readable database, new and fast ways of collaboration in social networks, and systems theory as the new "science" paradigm. Systems theory is already important in fields such as systems biology and its practical application, synthetic biology.

See NextGen VOICES, Science 6 January 2012: vol. 335 no. 6064 pp. 36-38, DOI: 10.1126/science.335.6064.36, http://www.sciencemag.org/content/335/6064/36/suppl/DC1

Best,
Gerd

7 15

Re: [Wikidata-l] A common, open access, human and computer readable data pool for scientist
by JFC Morfin 05 Apr '12


Lydia wrote:

> I would like that this project could serve in that manner the scientific community and provide standards for submission of data for scientists. Any plans in this direction?

The en.wikipedia page on "big data" should give the answers, if it were kept current with the "Big Data Research and Development Initiative" buzz around the world (http://www.whitehouse.gov/blog/2012/03/29/big-data-big-deal), in France (http://www.bigdataparis.com/fr-index.php), etc.

> Hi! It is up to the community to later decide what goes into Wikidata and what doesn't. So I can't give you a "yes this will be ok" or "no this will not happen". However if this is not going to happen in Wikidata itself there is probably demand for a separate instance where this would be possible.

Lydia, are you not afraid that an upside-down ("past to possible") rather than a "possible from present" approach will limit us? Once we have finalized Wikidata as a data-store, we will have decided on its output/input capacities.

jfc

1 0

Re: [Wikidata-l] Industry, JTC1/ISO, W3C, IUse, Wikimedia - where are we ?
by Denny Vrandečić 05 Apr '12


Dear JFC,

we do not have a charter, but we have a technical project proposal [1] and a list of fundamental requirements [2].

[1] https://meta.wikimedia.org/wiki/Wikidata/Technical_proposal
[2] https://meta.wikimedia.org/wiki/Wikidata/Notes/Requirements

I am not sure if this addresses your comments, as I am frankly having trouble following them. You frequently use neologisms and weird terms. Also, you seem to be discussing within a conceptual space that is on a different level than the one we are interested in.

In short, we have two pragmatic goals for Wikidata:

* Wikidata's first aim is to support the Wikipedias with their language links
* Wikidata's second aim is to support the Wikipedias with the infoboxes

Out of the support for these tasks, other interesting use cases might and are expected to arise. Until I manage to understand how your comments relate to one of these goals, I will personally take the liberty to ignore your comments.

Cheers,
Denny

2012/4/5 JFC Morfin <jefsey(a)jefsey.com>

> At 17:20 04/04/2012, Stracke, Christian wrote:
>
>> Dear all,
>> I'm sorry but "jfc" is mixing up different standards and committees (what is easy):
>
> Sorry for the confusion. The problem with ISO and ISO documents is that they are not documented on (i.e. interested in? i.e. interesting for?) Wikipedia.
>
> 1. 19788-1, can be found for free at: http://www.sis.se/PageFiles/2140/MLR-utkast%20arbetsmaterial.pdf
>
> 2. The Wikidata issue, as I see it, is that we do not start from an architectural framework based on a business plan, charter or TOR. It seems that the current target is to implement a W3C semantic web current-Wikimedia general data-store. This does not consider Wikidata as a major contributor to the future-Wikimedia development.
>
> Agreeing on the role Wikidata is to play in the Wikimedia adaptation to the Internet future should be our first point of consensus. Otherwise:
>
> 1. Wikidata is of no direct interest to Internet Users, only to Wikipedia contributors;
> 2. another Wikimedia project is to be considered as a Wikidata back-end.
>
> Certainly, at a time, there will be a need for a datamodel. But first we need to locate Wikidata in the Wikimedia strategy as a data-collector and as a data-disseminator, not only as a data-store, in a real world where there are five main conceptual channels: (1) Business World diversity (Search Engines, etc.), (2) JTC1, (3) W3C, (4) emergent IUsers [Intelligent lead users: FLOSS, IUCG], (5) Wikimedia. Possibly we have to consider Big Data, and the border with Big Data as it will emerge.
>
> As being on the IUse side, I know that we need a convergence of these channels if we want to attain a good degree of (meta/syllo) data interchange and not multiply costs, complication and lack of progress everywhere (not only in sciences). Wikidata is not only about millions of lonely contributors/authors as Wikipedia is, it also is about
>
> * ISO 11179 conformant very large sources contributions
> * diktyologies (from Greek diktyos, network), i.e. dynamically auto-maintained intelligent ontologic spaces.
>
> This is a part we have to explore and support with new concepts (such as syllodata: data between metadata), cloud architecture, programming languages. Otherwise Wikimedia will soon be a story of the past, its free commons being copied by many (like http://wikipedia.orange.fr) and distributed and automatically enhanced by powerful diktyologies with integrated big (scientific, etc.) data servers.
>
> jfc

--
Project director Wikidata
Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the Amtsgericht (district court) Berlin-Charlottenburg under number 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin (tax office for corporations), tax number 27/681/51985.

1 0

[Wikidata-l] Industry, JTC1/ISO, W3C, IUse, Wikimedia - where are we ?
by JFC Morfin 05 Apr '12


At 17:20 04/04/2012, Stracke, Christian wrote:

> Dear all,
> I'm sorry but "jfc" is mixing up different standards and committees (what is easy):

Sorry for the confusion. The problem with ISO and ISO documents is that they are not documented on (i.e. interested in? i.e. interesting for?) Wikipedia.

1. 19788-1, can be found for free at: http://www.sis.se/PageFiles/2140/MLR-utkast%20arbetsmaterial.pdf

2. The Wikidata issue, as I see it, is that we do not start from an architectural framework based on a business plan, charter or TOR. It seems that the current target is to implement a W3C semantic web current-Wikimedia general data-store. This does not consider Wikidata as a major contributor to the future-Wikimedia development.

Agreeing on the role Wikidata is to play in the Wikimedia adaptation to the Internet future should be our first point of consensus. Otherwise:

1. Wikidata is of no direct interest to Internet Users, only to Wikipedia contributors;
2. another Wikimedia project is to be considered as a Wikidata back-end.

Certainly, at a time, there will be a need for a datamodel. But first we need to locate Wikidata in the Wikimedia strategy as a data-collector and as a data-disseminator, not only as a data-store, in a real world where there are five main conceptual channels: (1) Business World diversity (Search Engines, etc.), (2) JTC1, (3) W3C, (4) emergent IUsers [Intelligent lead users: FLOSS, IUCG], (5) Wikimedia. Possibly we have to consider Big Data, and the border with Big Data as it will emerge.

As being on the IUse side, I know that we need a convergence of these channels if we want to attain a good degree of (meta/syllo) data interchange and not multiply costs, complication and lack of progress everywhere (not only in sciences). Wikidata is not only about millions of lonely contributors/authors as Wikipedia is, it also is about

* ISO 11179 conformant very large sources contributions
* diktyologies (from Greek diktyos, network), i.e. dynamically auto-maintained intelligent ontologic spaces.

This is a part we have to explore and support with new concepts (such as syllodata: data between metadata), cloud architecture, programming languages. Otherwise Wikimedia will soon be a story of the past, its free commons being copied by many (like http://wikipedia.orange.fr) and distributed and automatically enhanced by powerful diktyologies with integrated big (scientific, etc.) data servers.

jfc

1 0

Re: [Wikidata-l] Wikidata: New ISO Metadata standard MLR published last year
by Nadja Kutz 05 Apr '12


Yury Katkov: As far as I know, sometimes the drafts of ISO standards are available for free. Is there a free draft version somewhere?

One can get some text information within the ISO database; here is the documentation: http://isotc.iso.org/livelink/livelink?func=ll&objId=8421449&objAction=Open… but you don't get RDF specifications etc. That is, a search at http://www.iso.org/obp/ui/#search for, for example, "cutting and grinding" (there is so far no direct link to a search request) reveals a lot of unclickable items, with just ISO numbers and a little text. But then, the database browsing is still in beta.

1 0

Re: [Wikidata-l] Wikidata: New ISO Metadata standard MLR published last year
by Nadja Kutz 05 Apr '12


The ISO Standard sounds very interesting, but the price is really a problem. If I understand this correctly, the basic quantities in cutting and grinding, part 3, alone cost 66 CHF http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm… and that could be a rather small part of an infobox. Even if Wikidata currently seems to have some money (does it?) to buy documents, this can easily get very expensive. Moreover, Wikidata would need to make these standards readable for everyone in order to work on them, and this would probably be a copyright issue, because it would involve not only "reading only" after purchase but also publishing the documents in a public wiki.

2 1

[Wikidata-l] collecting list and infobox examples
by Lydia Pintscher 04 Apr '12


Heya folks :)

It'd be great if you could help collect examples of lists and infoboxes that already exist now and that could in the future be supported by Wikidata. I have started two pages at http://meta.wikimedia.org/wiki/Wikidata/Infoboxes and http://meta.wikimedia.org/wiki/Wikidata/Queries for this.

Cheers and thanks for the help
Lydia

--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Eisenacher Straße 2
10777 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Registered in the register of associations of the Amtsgericht (district court) Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin (tax office for corporations), tax number 27/681/51985.

1 0
