I read that Wikidata supports 358 languages. Is this still true? For example, I tried to add a label in the language coded "nan" (defined in ISO 639-3) and it worked. However, it didn't work for e.g. "arb", which is also part of the ISO 639-3 standard. So how many languages are actually supported?
VRANDEČIĆ, Denny; KRÖTZSCH, Markus. Wikidata: A Free Collaborative Knowledgebase. *Communications of the ACM*. October 2014, Vol. 57, No. 10, pp. 78–85. DOI 10.1145/2629489. http://cacm.acm.org/magazines/2014/10/178785-wikidata
I have a SPARQL query that returns French labels of people with the family
name (P734) Labrousse (Q25273100), sorting them by label:
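(The query itself seems to have been dropped from this message; a minimal reconstruction of what it presumably looked like, with variable names guessed:)

```sparql
SELECT ?person ?personLabel WHERE {
  ?person wdt:P734 wd:Q25273100 .
  ?person rdfs:label ?personLabel .
  FILTER(LANG(?personLabel) = "fr")
}
ORDER BY ?personLabel
```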
The problem is that French rules for sorting are not applied: Élisabeth
Labrousse and Émile Labrousse should be between Audran Labrousse and Ernest
Labrousse, and not at the end of the results.
This seems to conform to the SPARQL specification (ordering is undefined for literals with language tags):
Some SPARQL engines like Dydra use language tags to sort strings:
It seems that Blazegraph should be able to do the same thing (using ICU
library), but the documentation is old (yep, 2013 is old ! :p) and I don't
know how WDQS is configured:
Is there a solution to use French (or other languages) sorting in WDQS?
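Not an answer inside WDQS itself, but as a client-side workaround one can strip the diacritics before sorting the downloaded results; a minimal sketch in Python (the names are just illustrations; `locale.strxfrm` or PyICU would give a proper French collation, but they depend on locales/libraries being installed):

```python
import unicodedata

def accent_insensitive_key(label):
    # Decompose accented characters (e.g. 'É' -> 'E' + combining accent)
    # and drop the combining marks, so 'Élisabeth' sorts next to 'E'.
    decomposed = unicodedata.normalize("NFD", label)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()

names = ["Audran Labrousse", "Ernest Labrousse",
         "Élisabeth Labrousse", "Émile Labrousse"]
print(sorted(names, key=accent_insensitive_key))
# → ['Audran Labrousse', 'Élisabeth Labrousse', 'Émile Labrousse', 'Ernest Labrousse']
```

This is only an approximation of the French rules (it ignores the secondary ordering of accents), but it puts Élisabeth and Émile where one expects them.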
First of all, let me say that we all love the SPARQL endpoint, it's a
great service and it has become essential to how we interact with
Wikidata and run our bots. Great job by Stas and others!
I am also aware that it is still in beta mode. There is just one issue which plagues us. I filed a bug report about it in September 2015 (https://phabricator.wikimedia.org/T112397); the issue was alleviated, but it turned out that it did not get fully resolved:
- Occasionally, data written to an item in Wikidata via the API does not make it into the triple store. (The frequency of the issue is hard to determine.)
- It is a crucial issue because it can lead to data inconsistency by creating duplicate items or incorrect properties/values on items.
- It seems to happen while the SPARQL endpoint is under high load.
How data is affected:
- New data does not make it into the triple store.
- Updates to and merges of items do not make it into the triple store, so 'ghost items' are returned which have actually been merged, or queries show/miss results/items incorrectly because freshly added/deleted data has not been completely serialized.
Example: item https://www.wikidata.org/wiki/Q416356, a protein,
recently got added protein domains via the 'has part' property. This
did not show up in SPARQL queries and a DESCRIBE query for that item
confirmed that these triples were indeed not there. (The item has since been
modified, so it is fine now.)
A solution seems to be to modify the item as this seems to trigger
re-serialization. But this is certainly not practical for larger
imports. Furthermore, as long as such an item does not get modified,
data could be missing/ghosting from/in the triple store for weeks or
even months. And it turns out to be quite difficult to determine how
much of a certain import effort finally made it into the triple store,
if you do not want to iterate through all items modified and check if
everything is in the triple store, which would take significant
amounts of time.
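For spot-checking whether individual statements made it into the triple store, one could issue SPARQL ASK queries against WDQS; a rough sketch (the property and value IDs below are placeholders, not taken from the actual import):

```python
import json
import urllib.parse
import urllib.request

WDQS = "https://query.wikidata.org/sparql"

def ask_query(item, prop, value):
    # Build a SPARQL ASK that checks whether one statement is present.
    return f"ASK {{ wd:{item} wdt:{prop} wd:{value} }}"

def statement_in_store(item, prop, value):
    # One HTTP request per statement: fine for spot checks after an
    # import, far too slow for verifying millions of triples.
    url = WDQS + "?format=json&query=" + urllib.parse.quote(
        ask_query(item, prop, value))
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["boolean"]
```

For the protein example above, something like `statement_in_store("Q416356", "P527", ...)` with the specific 'has part' value would report whether that triple is visible; it doesn't solve the underlying lag, but it makes verifying a sample of an import cheap.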
Could you maybe give us more info on the status of this issue and whether we could do something to help alleviate it?
Sebastian Burgstaller-Muehlbacher, PhD
Andrew Su Lab
MEM-216, Department of Molecular and Experimental Medicine
The Scripps Research Institute
10550 North Torrey Pines Road
La Jolla, CA 92037
There is a new *Drag'n'drop* gadget available in Wikidata > Preferences > *Drag'n'drop*: Add statements and references from Wikidata or Wikipedia by dragging and dropping them.
Please note that there are issues with the gadget.
I would like to evangelize this gadget, but for my purposes it is not working: i.e., I drag the reference and get a shadow image of text, and even after waiting 5 minutes the reference does not apply; a refresh clears the shadow image.
It might be a browser issue, but I have tried it on both Mac and PC, as
well as on Chrome, Firefox, Safari, and SeaMonkey with exactly the same
unsuccessful results. I have talked with Magnus about this (he is not
having the problems I have had). It might also be a Wikidata response
issue, but I think that issue was resolved.
Currently, if you use wiki markup and want to use one of the four Cite templates via the RefToolbar, I don't think these templates will transfer:
- Cite book: not configured; you will get an error message.
- I have not tested Cite web, Cite journal, Cite news, but assume none of
these are configured to be captured by this tool.
Obviously, configuring the tool to work with these RefToolbar citation templates would require a significant amount of time and effort.
But if this tool were fully functional and robust -- and were able to transfer ALL of the piped data -- it would allow for great interoperability of citations between Wikipedia and Wikidata.
I don't build citations without using templates, as my assumption is that
templates are more machine readable and more useful -- and are more
consistent -- but obviously others are using different approaches. I assume bare URLs are probably the most transferable. But those didn't work for me.
I really appreciate the fact that this gadget is available and the hard
work it took to create it. Magnus has been very patient and kind offlist
trying to problem-solve the issues I have had.
I just wanted to follow up and provide this information, as it seems an
important tool for us citation-focused editors.
Wikipedia *User:BrillLyle <https://en.wikipedia.org/wiki/User:BrillLyle>*
Something very weird is going on with Serbian language on Wikidata, so I wanted
to draw more attention to it. As always, this is probably applicable to Chinese
etc. as well.
It used to be that, if someone visited Wikidata from Serbia, Serbian did not appear in the list of languages for adding labels and descriptions. This is described in https://phabricator.wikimedia.org/T121747
However, as of right now, if someone visits Wikidata from Serbia, they will get Serbian twice: in the Cyrillic (српски) and Latin (srpski) variants. To my knowledge, it has never been discussed to conclusion whether there should be independent labels for the variants. Either way, one of the consequences of this is that all the Serbian labels that have been entered so far are invisible.
It gets even funnier if you go to
https://www.wikidata.org/wiki/Q3711?uselang=sr-el since now you get Serbian
three times: as "srpski (latinica)", "Serbian (Cyrillic script)", and a third variant.
It appears that the first is sr-el, the second is sr-ec, and the third is the plain sr code.
I didn't want to play with editing, since this is a mess already. It appears
that this is caused by an attempt to fix T121747 while simultaneously changing
Serbian-language codes (https://phabricator.wikimedia.org/T117845). Either way,
I believe it warrants more attention.
Hey folks :)
Based on requests here Denny and I have worked on getting the Primary
Sources Tool code moved from the Google to the Wikidata organisation.
This has now happened and it is available at
https://github.com/Wikidata/primarysources from now on. I hope this
will lead to more contributions from more people as I believe it is an
important part of Wikidata's data flow.
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
Wikimedia Deutschland - Society for the Promotion of Free Knowledge (registered association).
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by the
Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.