I would like to implement a version of constraint violations reporting (
https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violatio…
) (example (
https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violatio…
)) using some pre-specified SPARQL queries, to run only on my Talk page.
Basically, I would like to have a page containing a handful of SPARQL
queries, which would get run on a schedule (every week), with the output
included in the page (as a table/list/etc). This seems really similar to
what the constraint violations reporting is doing, but I'm having trouble
figuring out how it works. It looks like there is a bot (
https://www.wikidata.org/wiki/User:KrBot) that is fetching the constraints
from each property's talk page and running the checks on a schedule. Is
that correct? I cannot seem to find any documentation or source code. Has
anyone else implemented something like this? Is there a way to execute a
SPARQL query from a module?
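To make it concrete, this is roughly what I have in mind: a minimal sketch
(assuming an external script run weekly from cron, using pywikibot and the
requests library; the query and the page title are placeholders) that runs
one SPARQL query against WDQS and writes the result into a page as a
wikitable:

import requests
import pywikibot

QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
} LIMIT 10
"""

def run_query(query):
    # Query the public WDQS SPARQL endpoint and ask for JSON results.
    r = requests.get("https://query.wikidata.org/sparql",
                     params={"query": query, "format": "json"},
                     headers={"User-Agent": "talk-page-report/0.1 (placeholder)"})
    r.raise_for_status()
    return r.json()["results"]["bindings"]

def as_wikitable(rows):
    # Turn the SPARQL JSON bindings into a simple two-column wikitable.
    lines = ['{| class="wikitable"', "! item !! label"]
    for row in rows:
        lines.append("|-")
        lines.append("| %s || %s" % (row["item"]["value"],
                                     row["itemLabel"]["value"]))
    lines.append("|}")
    return "\n".join(lines)

site = pywikibot.Site("wikidata", "wikidata")
page = pywikibot.Page(site, "User talk:Example/Report")  # placeholder title
page.text = as_wikitable(run_query(QUERY))
page.save(summary="Update weekly SPARQL report")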
Hi everybody,
(With apologies for cross-posting...)
You may have seen the recent communication [1
<https://www.mediawiki.org/wiki/Wikimedia_Engineering/June_2017_changes>]
about the product and tech tune-up which went live the week of June 5th,
2017. In that communication, we promised an update on the future of
Discovery projects and we will talk about those in this email.
The Discovery team structure has now changed, but the new teams will still
work together to complete the goals as listed in the draft annual plan.[2]
A summary of their anticipated work, as we finalize these changes, is
below. We plan on doing a check-in at the end of the calendar year to see
how our goals are progressing with the new smaller and separated team
structure.
Here is a list of the various projects under the Discovery umbrella, along
with the goals that they will be working on:
Search Backend
Improve search capabilities:
- Implement ‘learning to rank’ [3] and other advanced machine learning
methodologies
- Improve support for languages using new analyzers
- Maintain and expand power user search functionality
Search Frontend
Improve user interface of the search results page with new functionality:
- Implement explore similar [4]
<https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/Testin…>
- Update the completion suggester box [5]
<https://www.mediawiki.org/wiki/Extension:CirrusSearch/CompletionSuggester>
- Investigate the usage of a Wiktionary widget for English Wikipedia [6]
Wikidata Query Service
Expand and scale:
- Improve ability to support power features on-wiki for readers
- Improve full text search functionality
- Implement SPARQL federation support
Portal
Create and implement automated language statistics and translation updates
for Wikipedia.org
Analysis
Provide in-depth analytics support:
- Perform experimental design, data collection, and data analysis
- Perform ad-hoc analyses of Discovery-domain data
- Maintain and augment the Discovery Dashboards,[7] which allow the teams
to track their KPIs and other metrics
Maps
Map support:
- Implement new map style
- Increase frequency of OSM data replication
- As needed, assist with individual language Wikipedia’s implementation of
mapframe [8] <https://www.mediawiki.org/wiki/Maps/how_to:_embedded_maps>
Note: There is a possibility that we can do more with maps in the coming
year; we are currently evaluating strategic, partnership, and resourcing
options.
Structured Data on Commons
Extend structured data search on Commons, as part of the structured data
grant [9] via:
- Research and implement advanced search capabilities
- Implement new elements, filters, relationships
Graphs and Tabular Data on Commons
We will be re-evaluating this functionality against other Commons
initiatives such as the structured data grant. As with maps, we will
provide updates when we know more.
We are still working out all the details with the new team structure and
there might be some turbulence; let us know if there are any concerns and
we will do our best to answer them.
Best regards,
Deborah Tankersley, Product Manager, Discovery
Erika Bjune, Engineering Manager, Search Platform
Jon Katz, Reading Product Lead
Toby Negrin, Interim Vice President of Product
Victoria Coleman, Chief Technology Officer
[1] https://www.mediawiki.org/wiki/Wikimedia_Engineering/June_2017_changes
[2]
https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2017-2018/…
[3] https://en.wikipedia.org/wiki/Learning_to_rank
[4]
https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/Testin…
[5]
https://www.mediawiki.org/wiki/Extension:CirrusSearch/CompletionSuggester
[6]
https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/Testin…
[7] https://discovery.wmflabs.org/
[8] https://www.mediawiki.org/wiki/Maps/how_to:_embedded_maps
[9] https://commons.wikimedia.org/wiki/Commons:Structured_data
Hi!
We have deployed the first iteration of MediaWiki API[1] support for the
Wikidata Query Service. Please see the manual[2] for the full
documentation; the main highlights are outlined below.
The service allows calling out to some MediaWiki APIs from SPARQL in
order to obtain information not contained in the RDF data and the WDQS
database. See the list of supported APIs in the manual[2].
Currently a small subset of existing APIs is supported, and we expect
the community to nominate more services and contribute service templates
to extend the API. Please see the manual for a description of service
templates. Note that we do not plan to support any APIs that modify
data, edit wikis, etc.; only read-only querying APIs that do not require
any authorization can be supported.
Currently supported hosts are: *.wikipedia.org, commons.wikimedia.org,
www.mediawiki.org, www.wikidata.org, test.wikidata.org. If any other
wikis need to be supported, please leave a comment for the developers[3]
and we will enable them.
Example service query (more in the docs):
SELECT * WHERE {
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:api "EntitySearch" .
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam mwapi:search "cheese" .
    bd:serviceParam mwapi:language "en" .
    ?item wikibase:apiOutputItem mwapi:item .
  }
  ?item (wdt:P279|wdt:P31) ?type
}
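For convenience, here is a minimal sketch of running the example above from
a script and printing the results (assuming Python with the requests
library; the User-Agent string is a placeholder):

import requests

query = """
SELECT * WHERE {
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:api "EntitySearch" .
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam mwapi:search "cheese" .
    bd:serviceParam mwapi:language "en" .
    ?item wikibase:apiOutputItem mwapi:item .
  }
  ?item (wdt:P279|wdt:P31) ?type
}
"""

r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": query, "format": "json"},
                 headers={"User-Agent": "mwapi-example/0.1 (placeholder)"})
r.raise_for_status()
for row in r.json()["results"]["bindings"]:
    print(row["item"]["value"], row["type"]["value"])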
If there are any problems or questions, please contact the developers on
the list, #wikidata on IRC, or on wiki[3], or submit a Phabricator issue.
TODOs:
* Add more services (nominations welcome)
* Support services that accept multiple titles as input in one query
* Implement parameter types
[1] https://www.mediawiki.org/wiki/API:Main_page
[2] https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual/MWAPI
[3] https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team
--
Stas Malyshev
smalyshev(a)wikimedia.org
Hello there,
From now until July 31st, you can submit projects for the WikidataCon
program <https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Program>!
Our focus is *sharing your knowledge and experience within the community*.
We suggest a lot of different formats and topics, and we also have room for
unusual formats and new ideas.
Feel free to submit your projects
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Program/Submit>,
support other submissions
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Program/Submissions>,
or ask the program committee
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Volunteer/Program_c…>
any questions.
We hope that many of you will contribute to this program to make the
WikidataCon amazing, community-driven and diverse :)
Thanks,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by the
Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
Hoi,
There is a flood of incoming language codes for monolingual text in
Wikidata. The scenario that is unfolding is similar to what happened in the
beginning of Wikipedia: a few people requesting a flood of codes without
any real need.
The result has been a chaos that has still not been resolved. With the new
codes coming in without a reasonable case being made for them, I think we
are making a mistake, one with consequences that we will regret.
Thanks,
GerardM
Hey all and apologies for the cross-posting.
Given the amount of conversation within the last few months via WikiCite
about FRBR, MARC, and other bibliographic standards, I wanted to make sure
folks were aware that IFLA is examining these standards in some way (see
the forwarded message below).
Cheers,
Alex
---------- Forwarded message ----------
From: Joanne Yeomans <Joanne.Yeomans(a)ifla.org>
Date: Fri, Jun 9, 2017 at 7:47 AM
Subject: [IFLA-L] Call for nominations for the IFLA namespaces/linked data
committee
To: "ifla-l(a)infoserv.inist.fr" <ifla-l(a)infoserv.inist.fr>
Dear all,
Do you know something about namespaces? Or does someone in your
organization have an interest or knowledge in this technical area? Would
you like to help develop the namespaces for IFLA’s standards?
IFLA publishes FRBR, ISBD and UNIMARC; if you believe in supporting these
as free resources for the whole world to use, we need you to help us make
and keep their namespaces available.
The IFLA Committee on Standards is looking for enthusiastic people to work
together to develop the provision of IFLA namespaces and their management.
The Linked Data Technical Sub-Committee (LIDATEC) will be a group of up to
7 people working together – mainly online – to develop the procedures and
support for maintaining IFLA’s namespaces. The Sub-Committee reports to the
IFLA Committee on Standards.
Applications to work on the Sub-Committee should be sent to
elections(a)ifla.org by 30 June 2017. The format of the application is brief
– if you are interested in working with us on this topic, simply describe
your own expertise and knowledge in a short paragraph. A selection will be
made in July and we hope the selected participants will be able to meet
with us in Poland in August.
As IFLA is a not-for-profit organisation, funded by library membership, we
unfortunately do not have funding to support individual participation.
Please, however, consider contributing your (or your organization’s) time
and financial support and unite with us to improve the library world for
the benefit of all.
Members of the committee try to meet once a year at the IFLA World Library
and Information Congress; discounted registration is available for people
selected for the LIDATEC committee.
Applications to elections(a)ifla.org by 30 June 2017:
Name:
Institutional affiliation (if any):
Short (APPROXIMATELY 250 WORDS) explanation of your background and
experience *relevant to this committee* (particularly in the area of linked
data and RDF publishing):
Please share this message with anyone you think might have an interest in
helping.
Thank you,
Joanne Yeomans
Professional Support Officer
International Federation of Library Associations and Institutions
IFLA Headquarters - PO Box 95312 - 2509 CH The Hague - The Netherlands
Tel. +31 70 314 0884
Email. joanne.yeomans(a)ifla.org
Skype: joanne.yeomans
Website for IFLA Officers: http://www.ifla.org/en/officers-corner
(Please be aware that I am not in the office on Wednesdays.)
--
Alex Stinson
GLAM-Wiki Strategist
Wikimedia Foundation
Twitter:@glamwiki/@sadads
Learn more about how the communities behind Wikipedia, Wikidata and other
Wikimedia projects partner with cultural heritage organizations:
http://glamwiki.org
Hello,
This is important information regarding our database, for people running
tools and scripts.
*The change*
In the wb_terms table, a new column term_full_entity_id containing strings
with the full ID including the letter (e.g. “Q42”) will be created, and will
be used instead of the current column term_entity_id, which only stores the
numeric part of the ID (e.g. “42”).
This change is made, among other things, to support the new entity types to
come, and thus to store terms of entities that have non-numeric IDs, for
example Forms (“L42-F1”).
*Implications*
- This change only affects tools that directly access the database.
Other tools, for example those getting data through the API, pywikibot,
gadgets, etc. will not be affected and will work as before with no action
required from the maintainers.
   - In order to adapt tools to the new database structure, database
   queries using wb_terms should be changed so that they use the
   term_full_entity_id column instead of term_entity_id. This might
   even simplify queries: for example, instead of having a condition
   WHERE term_entity_type='item' AND term_entity_id=42, one could now do
   WHERE term_full_entity_id='Q42' (see the sketch after this list).
- term_entity_type is not affected by this change, and will still be
available as before.
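As an illustration, here is a minimal sketch of what the switch could look
like for a tool that queries the replicas directly (assuming Python with
pymysql; the host, database, and credential paths are placeholders, not
necessarily the values your tool uses):

import pymysql

# Placeholder connection parameters: on Labs/Toolforge you would use your
# own replica host and the credentials from your replica.my.cnf file.
conn = pymysql.connect(host="wikidatawiki.labsdb",
                       db="wikidatawiki_p",
                       read_default_file="~/replica.my.cnf",
                       charset="utf8")

# Old style: entity type plus numeric ID.
old_sql = ("SELECT term_text FROM wb_terms "
           "WHERE term_entity_type='item' AND term_entity_id=%s "
           "AND term_language='en' AND term_type='label'")

# New style: a single condition on the full ID, including the letter.
new_sql = ("SELECT term_text FROM wb_terms "
           "WHERE term_full_entity_id=%s "
           "AND term_language='en' AND term_type='label'")

with conn.cursor() as cur:
    cur.execute(old_sql, (42,))
    print(cur.fetchall())
    cur.execute(new_sql, ("Q42",))
    print(cur.fetchall())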
*The process*
Starting now, we will work through several steps to achieve this change.
Note that the dates indicated below may not be exact. We will announce each
of the steps separately. Here is a rough timeline:
   - June 22nd: the new column, term_full_entity_id, becomes visible on
Labs. It will be fully populated in the testwikidatawiki database replica
on Labs. Note that this column will remain incomplete in the wikidatawiki
database until later! Tools that use the wb_terms table can test new
code that uses the new term_full_entity_id column with the
testwikidatawiki database, but must keep using the old code with the
wikidatawiki database.
   - Some time later: the new column should also be fully populated in, and
   become usable on, the wikidatawiki database. Tools that use the wb_terms
table can now switch to using new code that no longer uses the old
term_entity_id column on all databases.
- Eventually (not before July 6th): the old term_entity_id column is
removed from the testwikidatawiki and wikidatawiki database replicas on
   Labs. Any code that still uses the old term_entity_id will break.
If you have any questions or problems regarding this change, feel free to
reply to this mail or write directly to me.
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by the
Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
Hello all,
Here is some new information about the WikidataCon event
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017>, which will take
place in Berlin on October 28th-29th.
We just nominated the members of the two committees that will have an
important role in the organization of the event. Thank you very much to all
the applicants!
- Program committee: andrawaag
<https://www.wikidata.org/wiki/User:Andrawaag>, Spinster
<https://www.wikidata.org/wiki/User:Spinster>, Masssly
<https://www.wikidata.org/wiki/User:Masssly>
- Scholarship committee: Micru <https://www.wikidata.org/wiki/User:Micru>,
Venuslui <https://www.wikidata.org/wiki/User:Venuslui>
For all the other topics, you can freely join *the volunteer groups
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Volunteer>*,
discuss on the talk page, and suggest ideas.
I'm also glad to give you an overview of the *next steps for the
WikidataCon*:
- Scholarship application process: June 8th-July 31st
- Call for submissions for the program: June 12th-July 31st
- Registration opening: June 19th (two other ticket releases are planned
in August and September)
- Selection of the scholarships recipients: August
- Review of the submissions for the program: August
   - Announcement of the program: September 1st
These dates can still evolve; I'll keep you informed. For more information,
check Wikidata:WikidataCon_2017
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017>.
Last but not least: in order to be able to provide scholarships, and help
people who can't afford the travel and accommodation to participate, we
requested a grant from the WMF. This grant will be reviewed soon, and you
can help by *endorsing the project*. For this, you can add a message at the
bottom of the grant submission
<https://meta.wikimedia.org/wiki/Grants:Conference/WMDE/WikidataCon>. This
would be very useful for the project!
Cheers,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by the
Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
I've built a tool for mappers to match things in OSM with Wikidata and add the
appropriate wikidata tag to OSM.
https://osm.wikidata.link/
Users can search for an administrative area or city, then pick an area to
analyse. It works best if mappers pick areas they're familiar with.
The matcher can take a few minutes to run. It grabs items from Wikidata and
figures out a target set of tags and keys to search for. Then it downloads OSM
data and looks for matches. The matching is based on tags and names;
currently the wikipedia tags aren't considered.
Map data is from the OSM Overpass API. Large (more than 1,000 sq km) or dense
areas might fail with a timeout error.
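To give a rough idea of what the matcher does, here is a much simplified
sketch of the Overpass download and name-based matching steps (assuming
Python with the requests library; the Overpass query, tags, bounding box,
and Wikidata labels are placeholder examples, not what the tool actually
uses):

import requests

# Simplified Overpass query: named places of worship in a small bounding box.
# The real matcher builds the tag/key list per Wikidata item.
overpass_query = """
[out:json][timeout:60];
node["amenity"="place_of_worship"]["name"](51.49,-0.15,51.52,-0.10);
out;
"""

r = requests.post("https://overpass-api.de/api/interpreter",
                  data={"data": overpass_query})
r.raise_for_status()
osm_names = {element["tags"]["name"] for element in r.json()["elements"]}

# Placeholder labels; the real matcher gets item labels (and aliases) for
# the chosen area from a SPARQL query against WDQS.
wikidata_labels = {"Q1234567": "St Example's Church"}

# Simplest possible match: exact name equality. The real matcher also
# compares aliases, other name-like OSM tags, and candidate tags/keys.
for qid, label in wikidata_labels.items():
    if label in osm_names:
        print(qid, "matches an OSM object named", label)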
The results are cached in the system and are available here:
https://osm.wikidata.link/places
Once the matching process is complete the mapper is given a tabbed page with
the results. The five tabs are:
Match candidates - things on OSM that might be considered for tagging
Already tagged - matches that are already tagged in OSM
No match - items from Wikidata with no match found in OSM
Wikidata query - the query used to find Wikidata items
Overpass query - the Overpass query to find OSM objects in this area
To start tagging, the mapper can log in to OSM via OAuth. Tick boxes will
appear next to the likely matches; the mapper can tick the box next to the
matches they want to upload, then add a change comment and upload them using
their own OSM account. Uploads within an area are combined into a single
changeset.
If the mapper sees an obviously incorrect match they can use the 'report bad
match' option to warn other mappers and provide feedback that I can use to
improve the algorithm.
This tool doesn't add any new objects to OSM. The only change it makes is
adding a wikidata tag to existing things.
My approach is to aim for a one-to-one mapping between Wikidata and OSM. If
there are two or more things in OSM that look like a Wikidata item then it
isn't a good match. This means for example that most road and rail bridges
won't be tagged because they are represented as two OSM ways. I might change
this at some point.
There are occasional duplicates in Wikidata; this tool should spot them and
refuse to add wikidata tags until the Wikidata duplicate is resolved.
The bug/todo list is here: https://github.com/EdwardBetts/osm-wikidata/issues
Any ideas or suggestions are welcome.
--
Edward.