Hey everyone,
Wikidata’s birthday is still a few days away but since there are no
deployments on Sundays here is early birthday present number 2 ;-)
One of Wikidata’s most important but often under-appreciated areas is
the external identifiers. They link Wikidata with by now more than
2000 databases, knowledge bases, catalogs and more. External
identifiers allow people and machines access to more information about
a given topic and help identify the same concept in other databases.
The identifier can often be expanded to a full URI. (For example, LoC
ID n81114174 becomes http://id.loc.gov/authorities/names/n81114174.)
This full URI can then be used in the linked open data web to match
our data with other datasets and use both of them together easily.
From today on, Wikidata has full URIs for statements that represent
external identifiers in its RDF exports, and thereby becomes a proper
citizen of the linked open data web. To make this work the property
for the external ID needs to have a statement with property “URI used
in RDF” (https://www.wikidata.org/wiki/Property:P1921). I’m looking
forward to seeing what new things are going to be built with this and
how we will show up on http://lod-cloud.net.
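As a rough illustration of the expansion described above, here is a
minimal Python sketch. It assumes that the "URI used in RDF" value is a
template containing a "$1" placeholder for the identifier; the template
string below is inferred from the LoC example and the function name is
made up for this example.

```python
def expand_identifier(formatter_uri: str, identifier: str) -> str:
    """Substitute an external identifier into a "$1" URI template."""
    if "$1" not in formatter_uri:
        raise ValueError("formatter URI must contain the $1 placeholder")
    return formatter_uri.replace("$1", identifier)


# Assumed template for the LoC ID property, inferred from the example URI.
LOC_TEMPLATE = "http://id.loc.gov/authorities/names/$1"

loc_uri = expand_identifier(LOC_TEMPLATE, "n81114174")
print(loc_uri)  # http://id.loc.gov/authorities/names/n81114174
```

A property without such a template has nothing to substitute into, which
is why the statement on the property is required.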
Cheers
Lydia
PS: It will take a reload of the query service to also have the full
URIs included there for all items. This should happen in the first two
weeks of November. (Tracking is in
https://phabricator.wikimedia.org/T176593) Until then only newly added
or edited items will have the full URIs added to the query service.
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
Hello all,
As you know, this weekend we celebrated Wikidata's fifth birthday, during
the WikidataCon in Berlin, but also in Tokyo, Seoul and Munich.
We collected a lot of amazing *presents and stories*, which you can find
here: https://www.wikidata.org/wiki/Wikidata:Fifth_Birthday
During the WikidataCon, 200 members of the community met, shared their
knowledge, experiences and favorite tools, and discussed a lot of
topics. We're currently trying to document the outcomes of the conference
as well as possible. You can already find some content here:
* the video recordings of the sessions happening in room A
https://media.ccc.de/b/conferences/wikidatacon/2017
* the slides and notes of each session of day 1
https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Documentation/Proce…
* the Commons category & subcategories
https://commons.wikimedia.org/wiki/Category:WikidataCon_2017
We are also very happy to announce that the *WikidataCon will be back in
October 2019 in Berlin*, to host even more attendees and more exciting
content.
In 2018, we encourage all the local communities to organize their own
events around the sixth birthday, in order to build a *Wikidata birthday
around the world*. Meetups or trainings, conferences or parties, big or
small, all the ideas are welcome to celebrate Wikidata with your local
community and engage new editors. More information will come in the next
months, but you can already mention your projects on this page:
https://www.wikidata.org/wiki/Wikidata:Sixth_Birthday
Cheers,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Happy Birthday, Wikidata!
Today is Wikidata's 5th birthday. Some days I can't really believe
it's been 5 years already since we set up this wiki. I am proud of
what we are building together every single day and I am excited about
what is coming next. Here is to many more years together building the
best free knowledge base there is!
Head over to https://www.wikidata.org/wiki/Wikidata:Fifth_Birthday to
read stories, reflections, birthday wishes and most importantly
presents ;-)
What's your highlight of the past year? What are you looking forward
to the most over the coming year?
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Hello everyone,
The WikidataCon <https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017>
is about to start! If you're interested in its program
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Program/Saturday>
but unfortunately can't attend, I have good news: part of the content will
be livestreamed so you can follow remotely.
The link to access the livestream is:
http://streaming.media.ccc.de/wikidatacon2017
The conference will be broadcast from today at *11:00 (UTC+2)*, starting
with the talk "State of the project" by Lydia Pintscher. The content of
*rooms A, A1, A2 and A3* will be streamed, unless the speaker requests not
to be recorded.
The videos will also be available later under CC-BY-SA: I'll share the
links with you as soon as possible.
Thank you,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Hi,
I would like to join a thread I found in the archive:
https://lists.wikimedia.org/pipermail/wikidata//2017-October/011259.html
I worked in contextual research to facilitate knowledge transfer.
One of the domains I would like to address is the visualisation of economic
networks.
I am seeking an impact on the governance of innovation and on transparency
about the control of economic networks. I would also like to allow SMEs and
private citizens to build their own analytics and prevent cases of collusion.
Information about business profiles is currently a premium service provided
by specialised private corporations. Much of the information about
companies is public, but open data policies are lacking.
I would like to fill the gap and contribute to feeding Wikidata as a
repository, either in bulk or as a collective action. As a design thinker,
I could contribute by designing processes for filling in data, such as
applications that facilitate the process.
*Is there any guidance or clearance for such initiatives?*
I am happy to read about similar interest from Germany, Belgium and Italy,
and I would like to connect.
I read that feeding Wikidata with corporate information would significantly
increase its size. Still, I think the benefit of enabling inquiries for
public governance is that it would help distribute the governance of
economic data.
Aside from public services like:
https://www.gov.uk/government/organisations/companies-house
I would like to allow data-visualisation researchers (like myself) to
uncover results for the public like:
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0025995
That study relies on private partnerships to access corporate databases, so
its findings cannot be queried by the public.
*Is there a specific Wikidata policy to comply with when feeding in data
from website scrapers?*
As a start, the URIs of sites with a good reputation could act as
*identifiers*.
I believe that scraping would be legitimate because the property "facts"
(below) are public, and because the organisations that collate the data
provide services (such as professional communities, or services augmented
with private data) that would not compete with building a repository.
In a way, I see Wikidata as a way to index data that can be useful to
search engines and discovery engines, and indexing data is an activity that
such services run daily. I believe that enabling public transparency would
enhance open-data services.
The properties I would be interested in are:
- TEAM (founders)
- DESCRIPTION (corporate description over products and services)
- INVESTORS (corp. and private equity)
- EMPLOYEES / INCUBATORS / ADVISORS (personal information available as
public information over the web)
- PARTICIPATED COMPANIES
- DATE of acquisition of, or participation in, companies
- CAPITAL (if available, or in ranges)
- VAT NUMBER (or registry number)
- ADDRESS
Are there other ideas for fetching the business profiles of companies?
This information should be publicly available in some form, since every
company reports to the organisation registry, and there are already private
companies offering analytics on business profiles.
Luigi
Hi!
Wikidata’s birthday is still a few days away but since there are no
deployments on Sundays we’ll get started with an early present ;-)
The Wikidata and Search Platform teams are happy to announce that Wikidata
prefix search (aka wbsearchentities API aka the thing you use when you
type into that box on the top right or any time you edit an item or
property and use the selector widget) is now using a new and improved
Elasticsearch backend. You should not see any changes except for
relevancy and ranking improvements.
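For readers who have not called the API directly, here is a small Python
sketch that builds a wbsearchentities request URL. It only constructs the
URL and does not send it; the helper name and the default values are
assumptions made for this example, not something the announcement
prescribes.

```python
from urllib.parse import urlencode

# Public Wikidata API endpoint.
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def build_search_url(prefix: str, language: str = "en", limit: int = 7) -> str:
    """Build a prefix-search URL for items matching `prefix`."""
    params = {
        "action": "wbsearchentities",
        "search": prefix,
        "language": language,  # language to search in
        "type": "item",        # search items rather than properties
        "limit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_search_url("Berl")
print(url)
```

Fetching that URL with any HTTP client should return a JSON document with
the matching entities, ranked by the backend described here.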
Specifically improved are:
- better language support (matches along fallback chain and also can
match in any language, with lower score)
- flexibility - we can now use Elasticsearch rescore profiles, which can
be tuned to take advantage of any fields we index for both matching and
boosting, including link counts, statement counts, label counts, (some)
statement values, etc. More improvements are coming soon in this area,
e.g. scoring disambiguation pages lower, scoring units higher in the
proper context, etc.
- optimization - we do not need to store all search data in both DB
tables and Elastic indexes anymore; all the data that is needed for
search and retrieval of the results is stored in the Elastic index and
retrieved in a single query.
- maintainability - since it is now part of the general Wikimedia search
ecosystem, it can be maintained together with the rest of the search
mechanisms, using the same infrastructure, monitoring, etc.
Please tell us if you have any suggestions, comments or experience any
problems with it.
--
Stas Malyshev
smalyshev(a)wikimedia.org
Hello all!
As you might have seen / endured, we had a partial Wikidata Query Service
outage yesterday morning (Central European Time). The full
incident report is available [1] if you are interested in the details.
The short version:
* a single client started to run an unusually high number of queries on WDQS
* the overload was not prevented by our current throttling
* the failure was not detected and isolated automatically
To prevent this from happening again, we will review our throttling
rules. Those rules were previously tuned to prevent a single client
from overloading the service with a small number of expensive
requests: we started to log a client's activity only when the duration
of a request exceeded 10 seconds. This means that a client sending
tons of short requests would never be throttled.
We will correct that by lowering the threshold, probably to 25 ms. The
throttling rules are still the same:
* 60 seconds of processing time per minute (peaking at 120 seconds)
* 30 errors per minute (peaking at 60)
If you are using WDQS to make lots of small requests, and you are over
the throttling rates above, there is a chance that you will start
seeing throttling errors. We are not doing this to bother you, we're
just trying to keep another crash from happening...
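One way to picture the processing-time limit is as a token bucket that
refills at 60 seconds of processing time per minute and can hold at most
120 seconds. The Python sketch below illustrates that reading only; it is
not the actual WDQS implementation, and the class and method names are
invented for this example.

```python
class TimeBudget:
    """Token bucket whose tokens are seconds of processing time."""

    def __init__(self, refill_per_second: float = 1.0, capacity: float = 120.0):
        # 60 s of processing per minute is 1 s per second; 120 s is the peak.
        self.refill_per_second = refill_per_second
        self.capacity = capacity
        self.available = capacity
        self.last_update = 0.0

    def _refill(self, now: float) -> None:
        """Add the budget earned since the last update, capped at capacity."""
        elapsed = max(0.0, now - self.last_update)
        earned = elapsed * self.refill_per_second
        self.available = min(self.capacity, self.available + earned)
        self.last_update = now

    def is_throttled(self, now: float) -> bool:
        """Return True when the client has no processing time left."""
        self._refill(now)
        return self.available <= 0.0

    def record(self, now: float, processing_seconds: float) -> None:
        """Charge the time a finished request took against the budget."""
        self._refill(now)
        self.available -= processing_seconds


budget = TimeBudget()
budget.record(now=0.0, processing_seconds=120.0)  # burst uses the whole peak
print(budget.is_throttled(now=0.0))   # True
print(budget.is_throttled(now=30.0))  # False, 30 s of budget has refilled
```

Under this reading, many short requests drain the budget just as a few
long ones do, provided every request is counted.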
If you are throttled, you will receive an HTTP 429 error code [2]. This
response includes the "Retry-After" HTTP header, which specifies the number
of seconds you should wait before retrying.
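A client can honor that header with a small retry loop. The Python sketch
below is one possible shape, assuming the header value is a plain number
of seconds as described above. The `send` callable, the response tuple
layout and the default delay are placeholders for whatever HTTP client
you use.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def retry_after_seconds(headers: Mapping[str, str], default: float = 5.0) -> float:
    """Read Retry-After as a number of seconds, falling back to a default."""
    value = headers.get("Retry-After")  # exact-case lookup for simplicity
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default


def send_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it stops returning 429 or attempts run out."""
    response = send()
    for _ in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(retry_after_seconds(response[1]))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the loop easy to test without
actually waiting.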
Thanks for your patience!
And contact me if you want any clarification.
Guillaume
[1] https://wikitech.wikimedia.org/wiki/Incident_documentation/20171018-wdqs
[2] https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#429
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
UTC+2 / CEST
According to the revision history of "Alois Rainer" (Q2650401), the
item should not hold a claim for P570 (date of death):
https://www.wikidata.org/wiki/Q2650401
The claim was removed in 2014:
09:40, 20 July 2014 KrBot (talk | contribs) . . (3,531 bytes)
(-371) . . (Removed claim: date of death (P570): 14 May 2002)
Is there an explanation for why the claim is still attached to the Wikidata item?
Marco
--
---
Marco Neumann
KONA