I am a particle physicist, so I'm interested in using the Wikidata
project to keep physical constants, like the mass or lifetime
of the neutron, in sync between different articles. In this use case it
will be important to be able to store and retrieve not only
the value of the constant, together with the reference to the data
source, but also its uncertainty. Would it be possible to include a
special data type for such constants, where the value, the total
uncertainty and the unit of measurement can be stored together with the
reference? (Additional storage of the statistical and systematic
uncertainties would be nice, but is not necessary.)
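To make the request above concrete, here is a minimal sketch in Python of what such a record could hold. The class name, the field names and the numbers are all hypothetical; this is not an existing Wikidata data type. When no total uncertainty is stored, the sketch combines the statistical and systematic parts in quadrature, which assumes that the two are independent.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional


@dataclass(frozen=True)
class MeasuredConstant:
    """Hypothetical record for one physical constant (not a real Wikidata type)."""
    name: str
    value: float
    unit: str
    reference: str
    total_uncertainty: Optional[float] = None
    stat_uncertainty: Optional[float] = None
    syst_uncertainty: Optional[float] = None

    def uncertainty(self) -> float:
        """Return the stored total uncertainty, or derive it from its parts."""
        if self.total_uncertainty is not None:
            return self.total_uncertainty
        if self.stat_uncertainty is None or self.syst_uncertainty is None:
            raise ValueError("no uncertainty stored for " + self.name)
        # Assumption: statistical and systematic parts are independent,
        # so they are added in quadrature.
        return hypot(self.stat_uncertainty, self.syst_uncertainty)

    def __str__(self) -> str:
        return (f"{self.name} = {self.value} ± {self.uncertainty()} "
                f"{self.unit} [{self.reference}]")


# Placeholder numbers for illustration only, not measured values.
example = MeasuredConstant(
    name="example constant",
    value=100.0,
    unit="s",
    reference="(placeholder reference)",
    stat_uncertainty=3.0,
    syst_uncertainty=4.0,
)
print(example)
```

Keeping the reference in the same record as the value means an article that retrieves the constant always gets the source along with it.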
This is about the output format of numerical, money-type and date-type
data. It is also language dependent; for example, in Hungarian the decimal
sign is a comma rather than a dot, and the thousands separator is a space, not
a comma. (In a computer environment, preferably a non-breaking space.) Will
the interface of Wikidata handle these national/local differences? I think
this would be much more efficient than letting the recipient projects transform
the data to their own format.
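As an illustration of the difference described above, here is a small Python sketch that formats the same number under two conventions. The function names and the CONVENTIONS table are hypothetical and cover only the English and Hungarian cases mentioned here; a real interface would more likely rely on a maintained locale database.

```python
def format_number(value: float, decimals: int = 2,
                  decimal_sep: str = ".", group_sep: str = ",") -> str:
    """Format a number with the given decimal and thousands separators."""
    # Start from the English-style form, e.g. "1,234,567.89".
    text = f"{value:,.{decimals}f}"
    # Swap the separators through a placeholder so they cannot collide.
    return (text.replace(",", "\x00")
                .replace(".", decimal_sep)
                .replace("\x00", group_sep))


# Hypothetical per-language table: (decimal separator, thousands separator).
CONVENTIONS = {
    "en": (".", ","),
    "hu": (",", "\u00a0"),  # comma and non-breaking space, as described above
}


def format_for(value: float, lang: str, decimals: int = 2) -> str:
    """Format a number according to the convention stored for a language."""
    decimal_sep, group_sep = CONVENTIONS[lang]
    return format_number(value, decimals, decimal_sep, group_sep)


print(format_for(1234567.891, "en"))
print(format_for(1234567.891, "hu"))
```

The stored value stays the same in both cases; only the presentation changes with the language of the requesting project.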
Some Wikipedias use templates to convert miles to kilometers and vice
versa. This conversion often fails due to a format error and results in
funny values. (Example: a train station that is 31 km away from the town.)
As Wikidata has controlled data, it would be useful to do the conversion
of measurement units within Wikidata itself and serve data in the desired
format. (A mile expressed in kilometers is also a piece of data that can be
stored, but it may need stronger protection, as it affects the output
of many other data -- just as a highly used template is editable for admins
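The central conversion suggested above could look roughly like the following Python sketch. The table and function names are hypothetical. The only factor taken as given is that the international mile is defined as exactly 1609.344 metres.

```python
# Conversion factors to metres. The international mile is exactly 1609.344 m.
METRES_PER_UNIT = {
    "m": 1.0,
    "km": 1000.0,
    "mi": 1609.344,
}


def convert_length(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a length between two known units via metres."""
    try:
        metres = value * METRES_PER_UNIT[from_unit]
        return metres / METRES_PER_UNIT[to_unit]
    except KeyError as exc:
        # Fail loudly instead of producing a "funny value".
        raise ValueError(f"unknown unit: {exc.args[0]}") from None


print(convert_length(1, "mi", "km"))
```

Because the factors live in one protected table, a formatting error in a single article cannot corrupt the conversion for the others.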
Pardon me for any hint of ranting ....
Please consider a different namespace than the Category namespace for
defining TYPEs of things; of course, I am proposing a standard MW "Type"
namespace. There are benefits to this refinement.
1. Easier for users - today categories are a mixture of plurals and
singulars, a complete mess. Users simply don't think that the -type- of a
page is a plural concept. They just don't. I have seen time and again their
confusion. They wonder why a type of page is "Persons" not "Person".
2. Easier for admins - today the category namespace is open to all users, as
it should be to accommodate "folksonomies". Wikidata though will introduce
types of pages and surely would like to have some control over the
controlled vocabulary for the wiki. What does one do? I guess lock down
individual category pages. I don't know if an admin can list pages that are
'controlled' without resorting to yet another category. A separate namespace
for controlled vocabularies can have more efficient namespace-level
3. Easier for the system - today an SQL index holds categories attached to
pages; it's probably the most heavily referenced in the system. If every
page in a wiki has a category, and there are say 4 standard/special
attributes per page, then there's a minimum of 5 triples per page. For 10K
pages, there are 50K triples minimum before a single infobox factoid has
been tagged. That's a burden that may not scale. Allocating a separate index
for the Type namespace is smart because a global/wikifarm's Type namespace
could be referenced rather than a local Category namespace.
4. Sem forms - today forms are displayed in part by referencing the MW
namespace. As an admin I'd like to allow people to create their own forms.
But the MW namespace -- being a utility namespace -- consequently needs to
be open to the general public. Forms instead should be associated with
page-types via the Type namespace, an unambiguous design that puts MW: back
under admin control.
5. Ontological justification - are (OWL) classes really just souped-up
categories? A takeaway I have from a conversation with Ward Cunningham is
that the Category namespace is simply for lists of things (it was even at
first called "List", I believe). Yes, classes and lists have the common
notion of a set of things, but really, when did rdfs:Class become a subclass
of rdf:Bag? It never was. And in this regard note that rdf:type doesn't
reference a resource whose rdf:type is "Type", as one (not in the club) might
naively think. It seems strange to most when singular names designate a set
of things (like owl:Class does, which should be called owl:Classes, I guess).
Typing category pages is pretty problematic too ... given the distinction
between metaclasses and annotation properties versus classes and class
properties.
6. Common sense -- categories are LISTS of things, they should not be used
for types of things. Types of things are singular in nature (with
exceptions) while categories pretty much ALWAYS have plural names so as to
be consistent with the definition that a category is just a list of pages.
IOW, categories should not be used for types of things nor for subject
headings. I am not seeking the perfect as another writer forewarns us all. I
am seeking to learn from mistakes, not to burn them into the next generation
of MW software. In short, I'd like to see separate namespaces for subjects,
nouns, adjectives, adverbs, participles, etc - a complete dictionary of
common words & phrases. Frankly I see it as harmful to the whole community to
throw all these into one namespace -- category -- as it results in an
unmanageable design and wildly unpredictable contents.
In summary I'd like to see a LEXICAL SEMANTIC design for Wikidata. Again,
this note does *not* seek perfection, it is seeking to identify and to learn
from our experiences. My experience is that the Category namespace has been
functionally overloaded to the detriment of 'good' system design.
I am a synthetic biologist. I see big changes in the way we do science
and how we will publish in the future.
I see a huge need to publish all scientific data (especially raw data)
in a common, freely accessible data pool. This data should be machine
readable. In science we face huge amounts of data and a huge number of
publications. Nobody is able to read all the literature any longer. We
need a computer-assisted system to analyze these data and develop
novel concepts from them. We need a structuring of these data on a
higher abstraction level. We need to be able to go from abstraction to
Thus I see the Wikidata project as potentially of great value for
scientists. I would like this project to serve the scientific community
in that manner and to provide standards for the submission of data
for scientists. Any plans in this direction?
A bit more about the reasons, why I find this very important:
I would summarize the upcoming trend in science as this: From
Hypothesis to Data-Driven Research, or the End of the Age of Science,
and the Dawn of the Age of Systemics. We can observe a paradigm change
in science, and two computer developments are responsible. The first
is the enormous storage capacity in the cloud. The second is that a
huge number of computers have been connected and organized in social
networks. These changes have resulted in huge quantities of data and
complex systems, a problem normal science cannot solve. The
traditional hypothesis method can deal with simple correlations
between A and B. But the method fails if the problem becomes more
complex. Science has been synonymous with a separating, reductionistic
approach. Contemporary science has come to a point where we will
change the perspective from reductionism to holism. We now move to a
position that sees things together: in short, systemics. The data-driven
science approach changes the scientific method and results in a
practice called "science 2.0" (named after web 2.0). "Science" will
happen in the cloud, with new publishing formats such as direct
publishing on blogs and direct publishing of our data in a human and
computer readable database, new and fast ways of collaboration in
social networks, and systems theory as the new "science" paradigm.
Systems theory is already important in fields such as systems biology
and its practical application, synthetic biology. See NextGen VOICES,
Science 6 January 2012: vol. 335 no. 6064 pp. 36-38 DOI:
>I would like that this project could serve in that manner
>the scientific community and provide standards for submission of data
>for scientist. Any plans in this direction?
The en.wikipedia page on "big data" should give the answers, if it
was kept current with the current "Big Data Research and Development
Initiative" buzz of the world
France (http://www.bigdataparis.com/fr-index.php), etc.
>Hi! It is up to the community to later decide what goes into Wikidata
>and what doesn't. So I can't give you a "yes this will be ok" or "no
>this will not happen". However if this is not going to happen in Wikidata
>itself there is probably demand for a separate instance where this would
Are you not afraid that an upside-down ("past to possible") rather
than a "possible from present" approach will limit us? Once we
have finalized Wikidata as a data-store, we will have decided of its
We do not have a charter, but we have a technical project proposal and
a list of fundamental requirements.
I am not sure if this addresses your comments, as I am frankly having
trouble following them. You frequently use neologisms and weird
terms. Also, you seem to be discussing within a conceptual space that is on
a different level than the one we are interested in.
In short, we have for Wikidata two pragmatic goals:
* Wikidata's first aim is to support the Wikipedias with their language
* Wikidata's second aim is to support the Wikipedias with the infoboxes
Out of the support for these tasks, other interesting use cases might and
are expected to arise.
Until I manage to understand how your comments relate to one of these
goals, I will personally take the liberty to ignore your comments.
2012/4/5 JFC Morfin <jefsey(a)jefsey.com>
> At 17:20 04/04/2012, Stracke, Christian wrote:
>> Dear all,
>> I'm sorry bit "jfc" is mixing up different standards and committees (what
>> is easy):
> Sorry for the confusion. The problem with ISO and ISO documents is that
> they are not documented on (i.e. interested in? i.e. interesting for?)
> 1. 19788-1, can be found for free at: http://www.sis.se/PageFiles/**
> 2. The Wikidata issue, as I see it, is that we do not start from an
> architectural framework based on a business plan, charter or TOR. It seems
> that the current target is to implement a W3C semantic web
> current-Wikimedia general data-store. This does not consider Wikidata as a
> major contributor to the future-Wikimedia development.
> Agreeing on the role Wikidata is to play in the Wikimedia adaptation to
> the Internet future should be our first point of consensus. Otherwise :
> 1. Wikidata is of no direct interest to Internet Users, only to Wikipedia
> 2. another Wikimedia project is to be considered as a Wikidata back-end.
> Certainly, at a time, there will be a need for a datamodel. But first we
> need to locate Wikidata in the Wikimedia strategy as a data-collector and
> as a data-desseminator, not only as a data-store, in a real world where
> there are five main conceptual channels : (1) Business World diversity
> (Search Engines, etc.), (2) JTC1, (3) W3C, (4) emergent IUsers [Intelligent
> lead users: FLOSS, IUCG], (5) Wikimedia. Possibly we have to consider Big
> Data, and the border with Big Data as it will emerge.
> As being on the IUse side, I know that we need a convergence of these
> channels if we want to attain a good degree of (meta/syllo) data
> interchange and not multiply costs, complication and lack of progress
> everywhere (not only in sciences). Wikidata is not only about millions of
> lonely contributors/authors as Wikipedia is, it also is about
> * ISO 11179 conformant very large sources contributions
> * diktyologies (from greek diktyos, network), i.e. dynamically
> auto-maintained intelligent ontologic spaces. This is a part we have to
> explore and support with new concepts (such as syllodata: data between
> metadata), cloud architecture, programming languages. Otherwise Wikimedia
> will soon be a story of the past, its free commons being copied by many
> (like http://wikipedia.orange.fr) and distributed and automatically
> enhanced by powerfull diktyologies with integrated big (scientific, etc.)
> data servers.
Project director Wikidata
Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/681/51985.
At 17:20 04/04/2012, Stracke, Christian wrote:
>I'm sorry bit "jfc" is mixing up different standards and committees
>(what is easy):
Sorry for the confusion. The problem with ISO and ISO documents is
that they are not documented on (i.e. interested in? i.e. interesting
1. 19788-1, can be found for free at:
2. The Wikidata issue, as I see it, is that we do not start from an
architectural framework based on a business plan, charter or TOR. It
seems that the current target is to implement a W3C semantic web
current-Wikimedia general data-store. This does not consider Wikidata
as a major contributor to the future-Wikimedia development.
Agreeing on the role Wikidata is to play in the Wikimedia adaptation
to the Internet future should be our first point of consensus. Otherwise:
1. Wikidata is of no direct interest to Internet Users, only to
2. another Wikimedia project is to be considered as a Wikidata back-end.
Certainly, at some point, there will be a need for a data model. But first
we need to locate Wikidata in the Wikimedia strategy as a
data-collector and as a data-disseminator, not only as a data-store,
in a real world where there are five main conceptual channels: (1)
Business World diversity (Search Engines, etc.), (2) JTC1, (3) W3C,
(4) emergent IUsers [Intelligent lead users: FLOSS, IUCG], (5)
Wikimedia. Possibly we have to consider Big Data, and the border with
Big Data as it will emerge.
Being on the IUse side, I know that we need a convergence of these
channels if we want to attain a good degree of (meta/syllo) data
interchange and not multiply costs, complication and lack of progress
everywhere (not only in sciences). Wikidata is not only about
millions of lonely contributors/authors as Wikipedia is; it is also about
* ISO 11179 conformant very large sources contributions
* diktyologies (from Greek diktyos, network), i.e. dynamically
auto-maintained intelligent ontologic spaces. This is a part we have
to explore and support with new concepts (such as syllodata: data
between metadata), cloud architecture, programming languages.
Otherwise Wikimedia will soon be a story of the past, its free
commons being copied by many (like http://wikipedia.orange.fr) and
distributed and automatically enhanced by powerful diktyologies with
integrated big (scientific, etc.) data servers.
Yury Katkov: As far as I know, drafts of ISO standards are sometimes available for free. Is there a free draft version somewhere?
One can get some text information within the ISO database; here is the documentation:
but you don't get RDF specifications etc.
That is, a search at:
for, for example: cutting and grinding
(there is so far no direct link to a search request)
reveals a lot of unclickable items, just ISO numbers and a little text. But then, the database browsing is still in beta.
The ISO standard sounds very interesting, but the price is really a problem.
If I understand this correctly, the basic quantities in cutting and grinding, part 3, alone cost 66 CHF,
which could be a rather small part of an infobox.
Also, even if Wikidata currently seems to have some money (does it?) to buy documents, this can easily get very expensive.
Moreover, Wikidata would need to make these standards readable for everyone in order to work on them, and this would probably
be a copyright issue, because it would involve not only reading the documents after purchase but also publishing them
in a public wiki.
Heya folks :)
It'd be great if you could help collect input for lists and infoboxes
that already exist now and that could in the future be supported by
I have started two pages at
http://meta.wikimedia.org/wiki/Wikidata/Queries for this.
Cheers and thanks for the help
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Eisenacher Straße 2