Heya folks :)
Data inclusion from Wikidata is now possible on the first 11 Wikipedias.
http://blog.wikimedia.de/2013/03/27/you-can-have-all-the-data/ has more details.
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable
by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
We have a first write up of how we plan to support queries in Wikidata.
Comments on our errors and requests for clarifications are more than welcome.
P.S.: unfortunately, no easter eggs inside.
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Forwarding, because apparently the news doesn't reach Wikidata translators.
In short: update translations at
-------- Original Message --------
Subject: Help update invalidated translations on Meta
Date: Tue, 12 Mar 2013 16:40:43 +0100
From: Federico Leva (Nemo)
To: Wikimedia Translators <translators-l(a)lists.wikimedia.org>
As a result of a recent software update, a change requested by some
Wikidata users has been enabled on Meta.
Now the outdated translations, marked by the !!FUZZY!! markers in
translation unit pages, are hidden from the translation pages: where
you used to see pink-coloured paragraphs, there's now the English text.
We have about 700 such translations on Meta, because some hard-working
users are migrating several pages to Translate. Updating them is often
very quick, and you'll restore sometimes-crucial pages to readers in
their language; just remember to remove the !!FUZZY!! tag when you're done.
You can see the pages that need updating in the LanguageStats for your language.
I was going to just email Denny and Lydia about this question, but I am
like most editors and I'm lazy, so I preferred to post on my home project
Village Pump. ;-)
I have some concerns surrounding the user experience implications of Phase
2, especially if property syntax starts getting used in article content
outside infoboxes. Specifically, I mean what it's going to be like for the
vast majority of people who will end up interacting with this syntax,
namely local Wikipedia editors.
The thread is at:
To be clear, this is a big question, and I don't think it's something that
can be answered or dealt with before Phase 2 is launched on enwiki or other wikis.
This is the first Wikidata phase 2 edit on the Hungarian Wikipedia:
I tried to replace the official language first, but it is linked in the
infobox as [[magyar nyelv|magyar]]. How can I use properties here?
The second thought was .hu, the TLD, but its value in the infobox is
currently hu which is translated to [[.hu]] by the infobox template, so
after inclusion from WD it became [[..hu]] which is of course red. How can
I use that? Shall we rebuild all the infoboxes?
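The double-dot problem boils down to the template adding the prefix itself. Here is a minimal Python sketch of that logic (an editor's illustration, not the actual infobox wikitext):

```python
# Toy model of an infobox template that prepends "." to the stored TLD
# value before linking it; illustration only, not actual template code.
def infobox_tld_link(value):
    return "[[." + value + "]]"

# The local parameter value "hu" works as intended:
assert infobox_tld_link("hu") == "[[.hu]]"

# A Wikidata value that already carries the dot gets it twice, producing
# the red link described above:
assert infobox_tld_link(".hu") == "[[..hu]]"
```

Either the template has to strip a leading dot, or the value has to be stored without it; this is exactly the kind of mismatch that makes wiring Wikidata into existing infoboxes non-trivial.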
Other items are either missing from the infobox or from WD, so the reason
to choose the president whom I dislike was that his name appears both in WD
and the infobox and is linked in a simple form.
This is Asher's writeup of the jobqueue disruption that happened
yesterday afternoon Pacific time.
He's not on this list, so please keep him in the cc: if you want him to
see your message.
----- Forwarded message from Asher Feldman <afeldman(a)wikimedia.org> -----
> Date: Fri, 29 Mar 2013 11:27:13 -0700
> From: Asher Feldman <afeldman(a)wikimedia.org>
> To: Operations Engineers <ops(a)lists.wikimedia.org>
> Subject: [Ops] site issues yesterday - jobqueue and wikidata
> We had two brief site disruptions yesterday, one in the afternoon that was
> fairly major but brief (12:40-12:43pm PST) and another that was less severe
> around 11pm. Both were jobqueue related; the first incident was suspected
> to be triggered by the wikidata change publisher and the second incident
> points more strongly in that direction.
> As far as what happened - the current mysql jobqueue implementation is way
> too costly. In the last 24 hours, 75% of all queries that take over 450ms
> to run on the enwiki master are related to the jobqueue and all major
> actions result in replicated writes. It's 58% of all query execution time
> when looking at all queries, not just those over the slow threshold. If 1
> million refreshlinks
> jobs are queued as quickly as possible without paying attention to
> replication lag, say hello to replication lag. Mediawiki depends on
> reading from slaves to scale and avoids lagged ones. If all slaves are
> lagged, the master is used for everything, and if this happens to enwiki,
> the site falls over.
> The wikidata change propagator inserts ChangeNotification jobs into local
> wiki queues in batches of 1000. The execution of one change job can result
> in many additional refreshLinks jobs being enqueued. Just prior to the
> meltdown, the wikidata propagator inserted around 7000 jobs into enwiki.
> That resulted in around 200k refreshlinks jobs getting inserted in a single
> minute, and around 1.2 million over a slightly longer time. It turns out
> that trying to reparse 1/4 of enwiki as quickly as possible is a problem :)
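A quick back-of-the-envelope check of those numbers (the enwiki article count below is an approximate figure I'm adding; it is not from the email):

```python
# Sanity check on the job fan-out described above.
change_jobs = 7_000             # ChangeNotification jobs inserted into enwiki
refreshlinks_total = 1_200_000  # refreshLinks jobs they eventually produced

# Each change job fanned out into well over a hundred refreshLinks jobs:
amplification = refreshlinks_total / change_jobs
assert round(amplification) == 171

# With roughly 4.2 million enwiki articles at the time (approximate),
# 1.2M refreshes is indeed on the order of a quarter of the wiki:
enwiki_articles = 4_200_000
assert 0.2 < refreshlinks_total / enwiki_articles < 0.35
```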
> Aaron deployed a change last night (
> https://gerrit.wikimedia.org/r/#/c/56572/1) that should throttle the
> insertion of new refreshLinks jobs if the queue is large, but we're not yet
> sure if that's enough. We may also turn down the wikidata dispatcher batch
> size, shut down one of its two dispatchers, or again limit how many
> wikiadmin users can connect to the database to force a concurrency limit on
> all things job queue related.
> The good thing is - the mysql jobqueue was identified as a scaling
> bottleneck a while ago, and we will be switching to redis very soon. It's
> currently targeted with the release of wmf13, but we may be able to
> backport to wmf12 and get this done sooner.
> In the interim, please do not release anything that will place new demands
> on the jobqueue, such as echo, or any ramping up of wikidata.
> Ops mailing list
----- End forwarded message -----
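The queue-size throttle described in the forwarded message can be sketched as backpressure on batch insertion. The class, method names, and threshold below are invented for illustration; this is not the actual MediaWiki implementation:

```python
# Minimal sketch of queue-size-based backpressure for job insertion.
# All names and the threshold are hypothetical, not actual MediaWiki code.
class ThrottledJobQueue:
    def __init__(self, soft_limit):
        self.jobs = []
        self.soft_limit = soft_limit

    def push_batch(self, batch):
        """Insert a batch, deferring it entirely if the queue is already
        large, so a flood of change jobs can't snowball into replication
        lag. Returns the jobs that were NOT accepted (caller retries later)."""
        if len(self.jobs) >= self.soft_limit:
            return list(batch)
        self.jobs.extend(batch)
        return []

q = ThrottledJobQueue(soft_limit=3)
assert q.push_batch(["job1", "job2"]) == []   # accepted
assert q.push_batch(["job3", "job4"]) == []   # still under the limit
assert q.push_batch(["job5"]) == ["job5"]     # queue full: deferred
```

The deployed change throttles only the insertion of new refreshLinks jobs when the queue is large, but the principle is the same: shed load at enqueue time instead of letting all the replicas lag.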
| Greg Grossmeier GPG: B2FA 27B1 F7EB D327 6B8E |
| identi.ca: @greg A18D 1138 8E47 FAC8 1C7D |
I recently attended the "Wikidata meets archeology" symposium, where I came
across the question of how a well-structured and comprehensive data model
of Roman forts (castra) could be integrated into Wikidata and then used
on Wikipedia. For this I developed a raw XSD model based on the data
model of Wikipedia's Infobox castrum. I put this model on the talk page
of Q88205 (https://www.wikidata.org/wiki/Talk:Q88205) adding also a
simple example. I would like to know how this data model could be integrated
into Wikidata. Is it feasible? Is it too complex?
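One way to think about the feasibility question is to flatten the infobox-style record into statement-like key/value pairs first. Everything in this sketch, including the field names and values, is a hypothetical placeholder rather than real Wikidata properties or items:

```python
# Hypothetical flattening of an "Infobox castrum" record into
# statement-like pairs; names and values are placeholders only.
castrum = {
    "labels": {"en": "Example Roman fort"},
    "statements": {
        "instance of": "castrum",
        "part of": "a Roman limes section",
        "construction period": "2nd century CE",
    },
}

# Deeply nested XSD structures would have to be decomposed into flat
# statements (plus qualifiers and references) to fit Wikidata's model.
assert castrum["statements"]["instance of"] == "castrum"
```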
When I give workshops about Wikipedia to researchers I've been making
a point of mentioning Wikidata, and there's been a lot of interest in
the potential of it. We'll be having a session at GLAM-Wiki, but of
course not everyone is willing or able to attend a Saturday afternoon
conference! To get around this, I've arranged another afternoon
workshop/seminar, pitched mainly at non-Wikimedia people, especially
from a library/metadata/digital humanities background:
All welcome - if you're in the area and you'd like to come along,
please do let me know.
- Andrew Gray
As you may have noticed, the propagation of changes from Wikidata to the
Wikipedias has been slower than it should be. Because of this, changes on
Wikidata have been showing up on Wikipedia watchlists very late, or not at all.
Katie and I have investigated the causes for this and what we can do about it.
To keep you in the loop, here is what we found:
* A dispatcher needs about 3 seconds to dispatch 1000 changes to a client wiki.
* Considering we have ~300 client wikis, this means one dispatcher can handle
about 4000 changes per hour.
* We currently have two dispatchers running in parallel (on a single box, hume),
that makes a capacity of 8000 changes/hour.
* We are seeing roughly 17000 changes per hour on wikidata.org - more than twice
our dispatch capacity.
* I want to try running 6 dispatcher processes; that would give us the capacity
to handle 24000 changes per hour (assuming linear scaling).
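The capacity arithmetic in the bullets checks out (all figures are from the list above; linear scaling across processes is the stated assumption):

```python
# Verifying the dispatcher throughput figures quoted above.
seconds_per_1000_changes_per_wiki = 3
client_wikis = 300

# One dispatcher pushing 1000 changes to all ~300 client wikis:
seconds_per_1000_changes = seconds_per_1000_changes_per_wiki * client_wikis
assert seconds_per_1000_changes == 900  # 15 minutes per 1000 changes

changes_per_hour_per_dispatcher = 1000 * 3600 // seconds_per_1000_changes
assert changes_per_hour_per_dispatcher == 4000

# Two dispatchers today vs. the six proposed:
assert 2 * changes_per_hour_per_dispatcher == 8000    # below ~17000/hour seen
assert 6 * changes_per_hour_per_dispatcher == 24000   # above ~17000/hour seen
```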
Katie has prepared a patch for that: https://gerrit.wikimedia.org/r/#/c/55904/
Getting this patch in is currently the quickest way for us to make change
propagation work. I hope running all the processes on the same box is not a
problem; a second box for cron jobs will be set up "soon".
* Making the dispatcher a "real daemon" would probably help with getting it
deployed to more boxes.
* If the Job Queue gets support for delayed (and maybe also recurring) jobs, we
could use the existing JQ infrastructure, and wouldn't need any processes of
our own. I'm a bit unsure though how well we could control scaling in such a setup.
Daniel Kinzler, Software Architect