Hi all,
I recently attended the "Wikidata meets archeology" symposium, where I
came across the question of how a well-structured and comprehensive
data model of Roman forts (castra) could be integrated into Wikidata
and then used on Wikipedia. For this I developed a rough XSD model
based on the data model of Wikipedia's Infobox castrum. I put this
model on the talk page of Q88205
(https://www.wikidata.org/wiki/Talk:Q88205), also adding a simple
example. I would like to know whether this data model could be
integrated into Wikidata. Is it feasible? Is it too complex?
Thank you,
Saturnian
Hi all,
When giving workshops about Wikipedia to researchers, I've been making
a point of mentioning Wikidata, and there's been a lot of interest in
its potential. We'll be having a session at GLAM-Wiki, but of
course not everyone is willing or able to attend a Saturday afternoon
conference! To get around this, I've arranged another afternoon
workshop/seminar, pitched mainly at non-Wikimedia people, especially
from a library/metadata/digital humanities background:
http://en.wikipedia.org/wiki/Wikipedia:GLAM/BL/Wikidata
All welcome - if you're in the area and you'd like to come along,
please do let me know.
--
- Andrew Gray
andrew.gray(a)dunelm.org.uk
Hi all!
As you may have noticed, the propagation of changes from Wikidata to the
Wikipedias has been slower than it should be. Because of this, changes on
Wikidata have been showing up on Wikipedia watchlists very late, or not at all.
Katie and I have investigated the causes for this and what we can do about it.
To keep you in the loop, here is what we found:
* A dispatcher needs about 3 seconds to dispatch 1000 changes to a client wiki.
* Considering we have ~300 client wikis, this means one dispatcher can handle
about 4000 changes per hour.
* We currently have two dispatchers running in parallel (on a single box, hume),
giving us a capacity of 8000 changes/hour.
* We are seeing roughly 17000 changes per hour on wikidata.org - more than twice
our dispatch capacity.
* I want to try running 6 dispatcher processes; that would give us the capacity
to handle 24000 changes per hour (assuming linear scaling).
Katie has prepared a patch for that: https://gerrit.wikimedia.org/r/#/c/55904/
Getting this patch in is currently the quickest way for us to make change
propagation work. I hope running all the processes on the same box is not a
problem; a second box for cron jobs will be set up "soon".
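The capacity arithmetic above can be sketched as follows, assuming
linear scaling just as the numbers in the mail do (the helper function
is illustrative, not actual Wikibase code):

```python
def capacity_per_hour(dispatchers, client_wikis=300,
                      seconds_per_batch=3, batch_size=1000):
    """Rough dispatch capacity estimate (illustrative only).

    One dispatcher needs seconds_per_batch seconds to push a batch of
    batch_size changes to one client wiki, and must serve every client
    wiki in turn before starting on the next batch.
    """
    seconds_per_round = client_wikis * seconds_per_batch  # 900 s for a full pass
    rounds_per_hour = 3600 / seconds_per_round            # 4 passes per hour
    return int(dispatchers * rounds_per_hour * batch_size)

print(capacity_per_hour(1))  # 4000 changes/hour for one dispatcher
print(capacity_per_hour(2))  # 8000 -- current setup, below the ~17000 observed
print(capacity_per_hour(6))  # 24000 -- the proposed six-process setup
```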
Future:
* Making the dispatcher a "real daemon" would probably help with getting it
deployed to more boxes.
* If the Job Queue gets support for delayed (and maybe also recurring) jobs, we
could use the existing JQ infrastructure, and wouldn't need any processes for
ourselves. I'm a bit unsure though how well we could control scaling in such a
setup.
-- daniel
--
Daniel Kinzler, Softwarearchitekt
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Heya folks!
We've deployed new code on wikidata.org. This includes a replacement
for the search box. I hope you like it better than the previous one :)
http://test2.wikipedia.org now also has the ability to include data
from Wikidata. It'd be awesome if you'd give it a try to make sure
there are no major issues before we roll this out on the first few
Wikipedias on Wednesday. There are two ways to include data:
* The first one is by adding a parser function like {{property:p123}}
to the Wikipedia article. This will then get the value that is
associated with property p123 on the connected Wikidata item. We're
working on making it possible to use the property's label (instead of
p123 in this example) and more. See
http://meta.wikimedia.org/wiki/Wikidata/Notes/Inclusion_syntax for
that.
* The second one is via Lua and supposed to be used for more
complicated things that can't be done with the parser function. The
documentation for that is at
http://www.mediawiki.org/wiki/Extension:WikibaseClient/Lua
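As a sketch of the first method, here is how the parser function might
appear in an article (the article text and the property ID p123 are
hypothetical placeholders; see the inclusion syntax page above for the
evolving details):

```
<!-- Hypothetical wikitext in a Wikipedia article whose connected
     Wikidata item has a value for property p123 -->
The population of Exampletown is {{property:p123}}.
```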
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Obentrautstr. 72
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Hi,
Last November, I started to clean up on the Glossary page on meta, as
an attempt to revive it and expand it to include many technical terms,
notably related to Wikimedia Engineering (see e-mail below).
There were (and are) already many glossaries spread around the wikis:
* one for MediaWiki: https://www.mediawiki.org/wiki/Manual:Glossary
* one for Wikidata: https://www.wikidata.org/wiki/Wikidata:Glossary
* one for Labs: https://wikitech.wikimedia.org/wiki/Help:Terminology
* two for the English Wikipedia:
https://en.wikipedia.org/wiki/Wikipedia:Glossary &
https://en.wikipedia.org/wiki/Wikipedia:WikiSpeak
* etc.
My thinking at the time was that it would be better to include tech
terms in meta's glossary, because fragmentation isn't a good thing for
glossaries: users probably don't want to search for a term across a
dozen glossaries (assuming they even know about them all), and it would
be easier if they could just search in one place.
The fact is, though, that we're not going to merge all the existing
glossaries into one anytime soon, so overlap and duplication will
remain anyway. Also, it feels weird to have tech content on meta, and
the glossary is getting very long (and possibly more difficult to
maintain). Therefore, I'm now reconsidering the decision of mixing
tech terms and general movement terms on meta.
Below are the current solutions I'm seeing to move forward; I'd love
to get some feedback as to what people think would be the best way to
proceed.
* Status quo: We keep the current glossaries as they are, even if they
overlap and duplicate work. We'll manage.
* Wikidata: If Wikidata could be used to host terms and definitions
(in various languages), and wikis could pull this data using
templates/Lua, it would be a sane way to reduce duplication, while
still allowing local wikis to complement it with their own terms. For
example, "administrator" is a generic term across Wikimedia sites
(even MediaWiki sites), so it would go into the general glossary
repository on Wikidata; but "DYK" could be local to the English
Wikipedia. With proper templates, the integration between remote and
local terms could be seamless. It seems to me, however, that this
would require significant development work.
* Google custom search: Waldir recently used Google Custom Search to
create a search tool that finds technical information across many pages
and sites where it is currently fragmented:
http://lists.wikimedia.org/pipermail/wikitech-l/2013-March/067450.html
. We could set up a similar tool (or a FLOSS alternative) that would
include all glossaries. By advertising the tool prominently on
existing glossary pages (so that users know it exists), this could
allow us to curate more specific glossaries, while keeping them all
searchable with one tool.
Right now, I'm inclined to go with the "custom search" solution,
because it looks like the easiest and fastest to implement, while
reducing maintenance costs and remaining flexible. That said, I'd love
to hear feedback and opinions about this before implementing anything.
Thanks,
guillaume
On Tue, Nov 20, 2012 at 7:55 PM, Guillaume Paumier
<gpaumier(a)wikimedia.org> wrote:
> Hi,
>
> The use of jargon, acronyms and other abbreviations throughout the
> Wikimedia movement is a major source of communication issues, and
> barriers to comprehension and involvement.
>
> The recent thread on this list about "What is Product?" is an example
> of this, as are initialisms that have long been known to be a barrier
> for Wikipedia newcomers.
>
> A way to bridge people and communities with different vocabularies is
> to write and maintain a glossary that explains jargon in plain English
> terms. We've been lacking a good and up-to-date glossary for Wikimedia
> "stuff" (Foundation, chapter, movement, technology, etc.).
>
> Therefore, I've started to clean up and expand the outdated Glossary
> on meta, but it's a lot of work, and I don't have all the answers
> myself either. I'll continue to work on it, but I'd love to get some
> help on this and to make it a collaborative effort.
>
> If you have a few minutes to spare, please consider helping your
> (current and future) fellow Wikimedians by writing a few definitions
> if there are terms that you can explain in plain English. Additions of
> new terms are very welcome as well:
>
> https://meta.wikimedia.org/wiki/Glossary
>
> Some caveats:
> * As part of my work, I'm mostly interested in a glossary from a
> technical perspective, so the list currently has a technical bias. I'm
> hoping that by sending this message to a wider audience, people from
> the whole movement will contribute to the glossary and balance it out.
> * Also, I've started to clean up the glossary, but it still contains
> dated terms and definitions from a few years ago (like the FundCom),
> so boldly edit/remove obsolete content.
--
Guillaume Paumier
Technical Communications Manager — Wikimedia Foundation
https://donate.wikimedia.org
Hi,
regarding a current topic in Germany, the publication of the timetable
data of Deutsche Bahn (the German national railway company) and their
willingness to discuss it with other open-data supporters, it may be a
good idea to provide expiration dates for Wikidata records.
In their open letter to Mr. Kreil [1], they announced that providing
the timetable data in an open way may cause problems if, for example,
anybody uses old data.
Marco
[1] http://www.db-vertrieb.com/db_vertrieb/view/service/open_plan_b.shtml
Heya folks :)
Here's your summary of what happened around Wikidata this week.
The wiki version is at
http://meta.wikimedia.org/wiki/Wikidata/Status_updates/2013_03_22
= Development =
* Rolled out new code on wikidata.org. The new stuff you probably care about is:
** Improved references. They can now have multiple lines. This should
make references much more useful. You can now have one reference with,
for example, values for each of the properties "book", "author" and
"page" to describe one source.
** Fixed the prev/next links in diff view
(https://bugzilla.wikimedia.org/show_bug.cgi?id=45821)
** http://www.wikidata.org/wiki/Special:EntitiesWithoutLabel now lets
you filter by language and entity type
* Widget to add language links on the Wikipedias directly: added
setting to enable/disable it per wiki and made it available for
logged-in users only
* Widget to add language links on the Wikipedias directly: improved
layout / size
* Made it so that the “edit links” link on Wikipedia is also shown
when the corresponding item only has a link to this one language and
no other languages
* Submitted improved Apache config patch to make wikidata.org always
redirect to www.wikidata.org, which is awaiting code review and
deployment.
* Improved the script that is responsible for taking Wikidata changes
to the Wikipedias
* Added a few ways to better debug the script responsible for taking
Wikidata changes to the Wikipedias. This should help with
investigating why some changes take way too long to show up on the
Wikipedias.
* Started work on automatically adding edited items to the user’s
watchlist (according to preferences)
* Finished script for rebuilding search keys, so we can finally get
case insensitive matches in a lot of places
* Support for multi-line references in diff view
* Selenium tests for inclusion syntax
* Improved parser function (that will be used to access Wikidata data
on the Wikipedias) to accept property ID or label
* Increased isolation of data model component to increase clarity and
visibility of bad dependencies
* Worked on schema access in the SQLStore (of the query component)
You can follow our commits at
https://gerrit.wikimedia.org/r/#/q/(status:open+project:mediawiki/extension…
and view the subset awaiting review at
https://gerrit.wikimedia.org/r/#/q/status:open+project:mediawiki/extensions…
You can see all open bugs related to Wikidata at
https://bugzilla.wikimedia.org/buglist.cgi?emailcc1=1&list_id=151540&resolu…
= Discussions/Press/Blogs =
* Denny had a look at some data about Wikidata:
http://blog.wikimedia.de/2013/03/19/some-data-on-wikidata/
= Events =
see http://meta.wikimedia.org/wiki/Wikidata/Events
* 3rd Media Web Symposium 2013
* Wikidata trifft Archäologie
* SMWCon Spring NYC
= Other Noteworthy Stuff =
* Join us for the Amsterdam and Wikimania hackathon and Wikimania:
http://www.wikidata.org/wiki/Wikidata:Project_chat#Amsterdam_.2B_Wikimania_…
* Too many items with the same label and no description?
http://toolserver.org/~magnus/wd_terminator.php helps you hunt them
all down and clean them up. Also check out the top 1000:
http://toolserver.org/~magnus/wd_terminator.php?list
* Interested in the distribution of books by genre on Wikidata and
similar things? WikidataStats can help:
http://toolserver.org/~magnus/ts2/wdstats/?q={%22r%22:[{%22p%22:31,%22i%22:…
* http://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot
needs input from more people
* Detailed list of bots: http://www.wikidata.org/wiki/Wikidata:Bots_by_function
* Collecting information about creating a bot:
http://www.wikidata.org/wiki/Wikidata:Creating_a_bot
* We’ve crossed Q8000000
= Did you know? =
* When you edit a statement there is a little wheel in front of the
text field. This lets you choose between “custom value”, “unknown
value” and “no value”. “No value” means that we know that the given
property has no value, e.g. Elizabeth I of England had no spouse.
“Unknown value” means that the property has a value, but it is unknown
which one -- e.g. Pope Linus most certainly had a year of birth, but
it is unknown to us.
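These three cases map onto the snak types in Wikibase's data model; a
minimal sketch of how a consumer might model them (the string values
follow the Wikibase "value"/"somevalue"/"novalue" convention, stated
here as an assumption rather than from this mail):

```python
from enum import Enum

class SnakType(Enum):
    """The three kinds of statement values described above (sketch)."""
    VALUE = "value"          # "custom value": a concrete value is given
    SOMEVALUE = "somevalue"  # "unknown value": a value exists, but we don't know it
    NOVALUE = "novalue"      # "no value": the property is known to have no value

# e.g. Elizabeth I's "spouse" statement would carry NOVALUE, while
# Pope Linus's "year of birth" would carry SOMEVALUE.
print(SnakType.NOVALUE.value)    # novalue
print(SnakType.SOMEVALUE.value)  # somevalue
```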
= Open Tasks for You =
* Hack on one of
https://bugzilla.wikimedia.org/buglist.cgi?keywords=need-volunteer%2C%20&ke…
* Still looking for the right way to contribute for you? Have a look
at https://www.wikidata.org/wiki/Wikidata:Contribute
Anything to add? Please share! :)
Cheers
Lydia
Heya folks :)
Some of the Wikidata team will be at the MediaWiki hackathon in
Amsterdam in late May
(http://www.mediawiki.org/wiki/Amsterdam_Hackathon_2013). It'd be
great to see some of you attend and join us in hacking on Wikidata. If
you're coming, please let me know in advance so we can prepare better.
We'll also attend the hackathon right before Wikimania in Hong Kong in
early August (http://wikimania2013.wikimedia.org/wiki/Hackathon). The
same applies as for the hackathon in Amsterdam.
For Wikimania itself: does anyone want to join me for a Wikidata
talk/workshop/...? Other ideas?
Cheers
Lydia
Heya folks :)
We have deployed new features and bugfixes on wikidata.org. Here are
the major ones you probably care about:
* Multi-line references. This should make references much more useful.
You can now have one reference with, for example, values for each of
the properties "book", "author" and "page" to describe one source.
* The search box was replaced. Give it a try!
* Fixed the prev/next links in diff view
(https://bugzilla.wikimedia.org/show_bug.cgi?id=45821)
In addition, we have added code to make it easier to debug why it
takes so long for changes to propagate to the Wikipedias. This will
hopefully give us the information needed to speed things up
considerably. We have also just added a second cron job on the server
for this, which might already improve things.
Please let me know if you have any questions or see any issues that
might be related to this deployment.
Cheers
Lydia