Hey,
I spent the last few weeks working on this with the lights off [1], and
now it's ready to work!
Kian is a three-layered neural network with a flexible number of inputs
and outputs, so if we can parametrize a job, we can train him easily and
get the job done.
As a first example job, we want to add P31:Q5 (human) to Wikidata items
based on the categories of the corresponding Wikipedia articles. The only
things we need are a list of items with P31:Q5 and a list of items that
are not humans (P31 exists but without Q5). Then we get the list of
category links in any wiki we want [2], and finally we feed these files to
Kian and let him learn. Afterwards, if we give Kian other articles and
their categories, he classifies them as human, not human, or undetermined.
As a test I gave him categories from the ckb wiki (a small wiki) and it
worked pretty well. Now I'm creating the training set from the German
Wikipedia, and the next step will be the English Wikipedia. The number of
P31:Q5 statements will drastically increase this week.
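To give an idea of the data preparation, here is a minimal sketch of how
such a training set could be assembled; the file names, the tab-separated
input format, and the binary labels are illustrative assumptions, not
Kian's actual interface:

    from collections import defaultdict

    # A sketch, not Kian's real code: pair each labelled item with the
    # categories of its article.
    def build_training_set(humans_file, non_humans_file, catlinks_file):
        labels = {}
        with open(humans_file) as f:        # one Q-id per line (P31:Q5)
            for line in f:
                labels[line.strip()] = 1
        with open(non_humans_file) as f:    # items with P31 but without Q5
            for line in f:
                labels[line.strip()] = 0
        categories = defaultdict(set)
        with open(catlinks_file) as f:      # "Q-id<TAB>category" rows from [2]
            for line in f:
                qid, cat = line.rstrip("\n").split("\t")
                categories[qid].add(cat)
        # One training example per labelled item that has category data.
        return [(categories[qid], label)
                for qid, label in labels.items() if qid in categories]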
I would love comments or ideas for tasks that Kian can do.
[1]: Because I love surprises
[2]: "select pp_value, cl_to from page_props join categorylinks on pp_page
= cl_from where pp_propname = 'wikibase_item';"
Best
--
Amir
Hey folks :)
Pasleim has created a nice new patrolling tool to help with vandalism and
spam fighting. Here's their announcement text:
Hey. To fight against vandalism, I wrote an OAuth application which is
similar to Special:RecentChanges
<https://www.wikidata.org/wiki/Special:RecentChanges> but has some more
features:
- 1-click patrol button
- select edits by type (edited terms, sitelinks, merges...)
- mass patrolling
- clickable edit comments, i.e. identifiers of external databases and
URLs are linked
- hints showing you various additional information, e.g. if the format
of an identifier is violated
- 1-click translation button for labels and descriptions
- automated description <http://tools.wmflabs.org/autodesc> when
hovering over the item's label
You can find the tool at https://tools.wmflabs.org/pltools/rech. I hope
that with it some more users get motivated to patrol recent changes. --Pasleim
<https://www.wikidata.org/wiki/User:Pasleim> (talk
<https://www.wikidata.org/wiki/User_talk:Pasleim>) 21:46, 18 March 2015
(UTC)
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under the number 23855 Nz. Recognized as charitable
by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Hey folks :)
We've been working on completely rewriting the header section and
moving it towards the new design. We're not there yet but the next
step is ready for testing and then rolling out on Wikidata. Please go
ahead and test it thoroughly at
http://wikidata.beta.wmflabs.org/wiki/Q27946. This will, for example,
allow you to collapse the "In other languages" box, and it adds a hint
about how to configure the displayed languages. The remaining known issues
are tracked at https://phabricator.wikimedia.org/T75654.
In addition to the new header the next deployment will bring a lot of
under-the-hood changes and bug fixes. The most relevant changes for
you are:
* we made the diff for time values more meaningful
* we fixed a lot of bugs in the time datatype
* edit links are no longer cached incorrectly based on the user's
permissions (This led to users sometimes seeing edit buttons on pages
that they could not edit and no edit buttons on pages that they could
edit.)
* we fixed some issues with propagating page moves and deletions on
the clients (Wikipedia, etc) to Wikidata
* we corrected an issue where you would see new data in the old part
of a diff (This affected qualifiers mainly.)
* the sitetointerwiki gadget now also works on diff pages
* the precision is now detected correctly when entering a quantity in
scientific notation
* we added mailto as an accepted protocol for the URL datatype
Unless you find big issues during testing, we plan to deploy this on
Wikidata on the 24th of March.
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Hi all,
I'm a relative newcomer to Wikidata but a long-time OpenStreetMap
contributor.
Recently OpenStreetMap has had a situation where large numbers of
translated names have been added to OSM objects. When asked about the
origin of these names, I've been given a number of answers, one of which
is Wikidata.
What appears to be happening, from what I've seen, is that a small
number of users are copying data from other (not so reliable) sources,
putting that data into Wikidata as aliases for place names, and then,
when asked where the place names are from, saying "Wikidata".
The problem here is that many of these "translations" are simply made
up. They're either transliterations or word for word translations,
rather than being genuine names in another language.
Unfortunately, it's my understanding that Wikidata aliases can't be
sourced (i.e. they can't be validated or invalidated like other facts).
If this is the case, it's a problem for both our projects.
I'd planned my own implementation using Wikidata names for places in
OSM to create custom renderings, but we need to be able to know that
the place names are something we can trace back and source properly.
- Serge
Hi,
In my previous post, I mentioned that there are very few empty-map
issues remaining in the JSON dumps as well. I think there are only two
right now:
https://www.wikidata.org/wiki/Special:EntityData/Q2382164.json
https://www.wikidata.org/wiki/Special:EntityData/Q4383128.json
Both have the empty reference problem (a statement has a reference which
contains no data whatsoever; look for: "snaks":[]). Maybe somebody wants
to look into this (this should not really happen, even if the JSON were
serialized correctly).
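For anybody who wants to check other entities, here is a minimal sketch
of such a scan; the helper name is mine, and the traversal assumes the
public entity JSON layout served by Special:EntityData:

    import json
    import urllib.request

    # Sketch: fetch an entity's JSON and report statements carrying a
    # reference with no snaks at all.
    def report_empty_references(qid):
        url = "https://www.wikidata.org/wiki/Special:EntityData/%s.json" % qid
        with urllib.request.urlopen(url) as response:
            entity = json.load(response)["entities"][qid]
        for prop, statements in entity.get("claims", {}).items():
            for statement in statements:
                for ref in statement.get("references", []):
                    if not ref.get("snaks"):   # catches both [] and {}
                        print(qid, prop, statement.get("id"))

    report_empty_references("Q2382164")
    report_empty_references("Q4383128")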
Cheers,
Markus
--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/
Hey folks :)
Data quality and trust are what we're currently concentrating on in the
development around Wikidata. A big part of that is improving the
integration of Wikidata in the watchlist of Wikipedia and other sister
projects. I just opened a page to collect input on how we can improve
it.
Please help me spread this to the Wikipedias and other sister projects.
https://www.wikidata.org/wiki/Wikidata:Watchlist_integration_improvement_in…
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Hi,
Quick question about the wbgetentities API: are there any general rules
that clients should obey when making requests?
* Maximum request rate?
* How many entities can you actually get in one request? Is this
documented anywhere? Is it possible for a tool to find this number or is
it just hardcoded?
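For reference, the kind of client I have in mind batches ids and pauses
between requests; in this sketch the batch size of 50 and the one-second
pause are my guesses, which is exactly what I'd like to confirm:

    import json
    import time
    import urllib.parse
    import urllib.request

    API = "https://www.wikidata.org/w/api.php"

    # Sketch of a polite batched client; 50 ids per request and a
    # one-second pause are guesses, not documented limits.
    def get_entities(qids, batch_size=50, pause=1.0):
        entities = {}
        for i in range(0, len(qids), batch_size):
            params = urllib.parse.urlencode({
                "action": "wbgetentities",
                "ids": "|".join(qids[i:i + batch_size]),
                "format": "json",
            })
            with urllib.request.urlopen(API + "?" + params) as response:
                entities.update(json.load(response).get("entities", {}))
            time.sleep(pause)
        return entities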
Cheers,
Markus