I suspect what Martynas is driving at is that XMLS defines **FACETS** for its datatypes - accepting those as a baseline, and then extending them to your requirements, is a reasonable, community-oriented process. However, wrapping oneself in the flag of "open development" is, to me, unresponsive to a simple plea to stand on the shoulders of the giants who have gone before, and to act in a responsible manner cognizant of the interests of the broader community.
And personally I have to say I don't like the word "clinging" -- clearly a red flag meant to inflame if not insult. This is no place for that!
On 19.12.2012 09:47, Sven Manguard wrote:
My philosophy is this: We should do whatever works best for Wikidata and Wikidata's needs. If people want to reuse our content, and the choices we've made make existing tools unworkable, they can build new tools themselves. We should not be clinging to "what's been done already" if it gets in the way of "what will make Wikidata better". Everything that we make and do is open, including the software we're going to operate the database on. Every WMF project has done things differently from the standards of the time, and people have developed tools to use our content before. Wikidata will be no different in that regard.
Sven
On Wed, Dec 19, 2012 at 12:27 PM, Martynas Jusevičius martynas@graphity.org wrote:
Denny,
you're sidestepping the main issue here -- every sensible architecture should build on as many previous standards as possible, and build its own custom solution only if a *very* compelling reason is found to do so instead of finding a compromise between the requirements and the standard. Wikidata seems to be constantly doing the opposite -- building a custom solution for whatever reason, or even without one. This drives compatibility and reuse towards zero.
This thread originally discussed datatypes for values such as numbers, dates and their intervals -- semantics for all of those are defined in XML Schema Datatypes: http://www.w3.org/TR/xmlschema-2/ [1]
All the XML and RDF tools are compatible with XSD, however I don't think there is even a single mention of it in this thread? What makes Wikidata so special that its datatypes cannot build on XSD? And this is only one of the issues, I've pointed out others earlier.
Martynas
graphity.org [2]
On Wed, Dec 19, 2012 at 5:58 PM, Denny Vrandečić denny.vrandecic@wikimedia.de wrote:
Martynas,
could you please let me know where RDF or any of the W3C standards cover topics like units, uncertainty, and their conversion? I would be very much interested in that.
Cheers,
Denny
2012/12/19 Martynas Jusevičius martynas@graphity.org
Hey wikidatians,
occasionally checking threads in this list like the current one, I get a mixed feeling: on one hand, it is sad to see the efforts and resources wasted as Wikidata tries to reinvent RDF, and now also triplestore design as well as XSD datatypes. What's next, WikiQL instead of SPARQL?
On the other hand, it feels reassuring as I was right to predict this:
http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00056.html [3]
http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00750.html [4]
Best,
Martynas graphity.org [2]
On Wed, Dec 19, 2012 at 4:11 PM, Daniel Kinzler daniel.kinzler@wikimedia.de wrote:
On 19.12.2012 14:34, Friedrich Röhrs wrote:
Hi,
Sorry for my ignorance, if this is common knowledge: What is the use case for sorting millions of different measures from different objects?
Finding all cities with more than 100000 inhabitants requires the database to look through all values for the property "population" (or even all properties with countable values, depending on implementation and query planning), compare each value with "100000" and return those with a greater value. To speed this up, an index sorted by this value would be needed.
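A minimal sketch of the point above, using Python's built-in sqlite3 module. The table name `claim`, its columns, the index name, and the city names and numbers are all made up for illustration; they are not Wikidata's actual schema or data. With an index on (property, value), the database can answer the "greater than" condition from the sorted index rather than comparing every stored value.

```python
import sqlite3

# Hypothetical schema: one row per (item, property, numeric value).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claim (item TEXT, property TEXT, value REAL)")
conn.executemany(
    "INSERT INTO claim VALUES (?, ?, ?)",
    [
        ("CityA", "population", 3500000),  # illustrative numbers only
        ("CityB", "population", 120000),
        ("CityC", "population", 50000),
        ("CityA", "area_km2", 891.0),
    ],
)

# Index sorted by value within each property; this is what makes
# range conditions such as "value > 100000" cheap to evaluate.
conn.execute("CREATE INDEX claim_by_value ON claim (property, value)")


def items_above(conn, prop, threshold):
    """Return items whose value for `prop` exceeds `threshold`, smallest first."""
    rows = conn.execute(
        "SELECT item FROM claim WHERE property = ? AND value > ? ORDER BY value",
        (prop, threshold),
    )
    return [row[0] for row in rows]


big_cities = items_above(conn, "population", 100000)
print(big_cities)
```

Whether a given database engine actually uses such an index depends on its query planner, so this should be read as an illustration of the idea, not a statement about any particular deployment.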
For cars there could be entries by the manufacturer, by some car-testing magazine, etc. I don't see how this could be adequately represented/sorted by a database-only query.
If this cannot be done adequately on the database level, then it cannot be done efficiently, which means we will not allow it. So our task is to come up with an architecture that does allow this.
(One way to allow "scripted" queries like this to run efficiently is to do this in a massively parallel way, using a map/reduce framework. But that's also not trivial, and would require a whole new server infrastructure.)
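To illustrate the shape of a map/reduce query, here is a small sketch under stated assumptions: the shards, record names, and threshold are hypothetical, and the map step runs sequentially in one process here, whereas a real framework would distribute it across many machines.

```python
from functools import reduce

# Hypothetical data, split into shards as a distributed store might hold it.
shards = [
    [("car_a", 180.0), ("car_b", 240.0)],
    [("car_c", 210.0)],
    [("car_d", 150.0), ("car_e", 260.0)],
]


def map_shard(shard, threshold):
    """Map step: each shard filters its own records independently."""
    return [name for name, top_speed in shard if top_speed > threshold]


def reduce_results(left, right):
    """Reduce step: merge the partial results from two shards."""
    return left + right


# Sequential here; the map calls share no state, so they could run in parallel.
partials = map(lambda shard: map_shard(shard, 200.0), shards)
fast_cars = reduce(reduce_results, partials, [])
print(fast_cars)
```

Because every record is still inspected, this trades index maintenance for raw parallel hardware, which is the infrastructure cost mentioned above.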
If however this is necessary, I still don't understand why it must affect the datavalue structure. If an index is necessary it could be done over a serialized representation of the value.
"Serialized" can mean a lot of things, but an index on some data blob is only useful for exact matches; it cannot be used for greater/lesser queries. We need to map our values to scalar data types the database can understand directly, and use for indexing.
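A short sketch of why this matters, assuming the "serialized" form is simply the decimal string of each number (other serializations, such as JSON blobs, behave similarly because they are compared character by character). The sample values are made up.

```python
values = [99, 100000, 2500]

# Ordering the serialized form compares characters, not magnitudes.
serialized_order = sorted(str(v) for v in values)

# Ordering the scalar form compares magnitudes, as a range query needs.
numeric_order = sorted(values)

# A "greater than 1000" filter gives wrong answers on the serialized form:
# "99" > "1000" is true for strings because "9" sorts after "1".
wrong = [s for s in serialized_order if s > "1000"]
right = [v for v in numeric_order if v > 1000]

print(serialized_order)
print(numeric_order)
print(wrong)
print(right)
```

Exact-match lookups still work on the serialized form, which is why such an index is only useful for equality.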
This needs to be done anyway, since the values are saved at a specific unit (which is just a Wikidata item). To compare them on a database level they must all be saved at the same unit, or some sort of procedure must be used to compare them (or am I missing something again?).
If they measure the same dimension, they should be saved using the same unit (probably the SI base unit for that dimension). Saving values using different units would make it impossible to run efficient queries against these values, thereby defeating one of the major reasons for Wikidata's existence. I don't see a way around this.
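As a rough sketch of this normalization, assuming a hypothetical conversion table for the length dimension only (the table, function name, and sample measurements are illustrative, not part of any Wikidata design): each value is converted to metres before it is stored or compared, so a single sorted column can serve all range queries.

```python
# Hypothetical conversion factors from a unit to the SI base unit (metre).
TO_METRE = {"m": 1.0, "km": 1000.0, "cm": 0.01, "mi": 1609.344}


def to_metres(amount, unit):
    """Convert a length given in `unit` to metres."""
    return amount * TO_METRE[unit]


# Made-up measurements entered in different units.
measurements = [("bridge", 1.2, "km"), ("trail", 2.0, "mi"), ("cable", 90000.0, "cm")]

# After normalization, all values are directly comparable and sortable.
normalized = sorted((to_metres(amount, unit), name) for name, amount, unit in measurements)
longer_than_1km = [name for metres, name in normalized if metres > 1000.0]
print(longer_than_1km)
```

The originally entered unit would have to be kept alongside the normalized value if it needs to be displayed again; this sketch does not cover that, nor uncertainty.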
-- daniel
-- Daniel Kinzler, Softwarearchitekt
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l [5]
-- Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 |
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Links:
------
[1] http://www.w3.org/TR/xmlschema-2/
[2] http://graphity.org
[3] http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00056.html
[4] http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00750.html
[5] https://lists.wikimedia.org/mailman/listinfo/wikidata-l
[6] http://wikimedia.de