Hello,
As you may know, Wikibase currently does not normalize pagenames/filenames
on save (e.g. underscores in the input for properties of datatype Commons
media are allowed). At the same time, Wikidata’s quality constraints
extension
<https://www.mediawiki.org/wiki/Extension:WikibaseQualityConstraints>
triggers a constraint violation after saving if underscores are used. This
is by design and follows long-established
<https://www.wikidata.org/wiki/Template:Constraint:Commons_link> community
practices. This inconsistency leaves users with unnecessary manual work.
We will update Wikibase so that when a new edit is saved via UI or API, and
a pagename/filename is added or changed in that edit, then this
pagename/filename will be normalized on save ("My file_name.jpg" -> "My
file name.jpg").
More generally, the breaking change is that a user of the Wikibase API may
send one data value when saving an edit, and get back a slightly different
(normalized) data value after the edit was made: it is no longer the case
that data values are either saved unmodified or totally rejected (e.g. if a
file doesn’t exist on Commons). Since this guarantee is being removed with
this breaking change announcement, we may introduce further normalizations
in the future and only announce them as significant changes, not breaking
changes.
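
For API clients that compare the value they sent with the value that was saved, a minimal sketch of a tolerant comparison might look like the following. The function names are hypothetical, and the underscore-to-space rule is only an approximation based on the example above; the actual server-side normalization may cover more cases.

```python
def approximate_normalize(title: str) -> str:
    """Approximate the normalization described above.

    Assumption: only underscores are replaced with spaces, as in the
    announcement's example. The real server-side rules may do more.
    """
    return title.replace("_", " ")


def value_was_changed(sent: str, saved: str) -> bool:
    """Return True if the saved value is not byte-identical to the sent one."""
    return sent != saved


def equivalent_after_normalization(sent: str, saved: str) -> bool:
    """Return True if both values agree once the approximation is applied."""
    return approximate_normalize(sent) == approximate_normalize(saved)


# Example values taken from the announcement.
sent_value = "My file_name.jpg"
saved_value = "My file name.jpg"

if value_was_changed(sent_value, saved_value):
    # A client should no longer treat this as an error by itself.
    print(equivalent_after_normalization(sent_value, saved_value))
```

A client that previously asserted strict equality between sent and saved values could switch to a check of this kind, or simply adopt the value returned by the API.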
The change is currently available on test.wikidata.org and
test-commons.wikimedia.org. It will be deployed on Wikidata on or shortly
after September 6th. If you have any questions or feedback, please feel
free to let us know in this ticket
<https://phabricator.wikimedia.org/T251480>.
Cheers,
Lucas Werkmeister
--
Lucas Werkmeister (he/er)
Full Stack Developer
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
https://wikimedia.de
Imagine a world in which every single human being can freely share in the
sum of all knowledge. Help us to achieve our vision!
https://spenden.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
Dear all,
[Apologies for cross-posting]
I'm posting here because there is an open DevOps position at the Open
Science Lab at TIB Hannover, where I work, and it might be of interest to
people on this list.
https://www.tib.eu/en/tib/careers-and-apprenticeships/vacancies/details/job-advertisement-no-62-2021
We are looking for someone who ideally has experience with OSS / MediaWiki /
Wikibase software, hence I'm posting here. Please feel free to spread
the word if you know anyone who might be interested, and feel free to
reach out to me directly at lozana.rossenova(a)tib.eu
<mailto:lozana.rossenova@tib.eu> if you have any questions or want to
learn more.
The position is in Germany, but remote work is also possible.
Cheers,
Lozana Rossenova
--
Research Associate
Open Science Lab
This breaking change is relevant for anyone who consumes Wikidata RDF data
through Special:EntityData (rather than the dumps) without using the “dump”
flavor.
When an Item references other entities (e.g. the statement P31:Q5), the
non-dump RDF output of that Item (i.e. without ?flavor=dump) currently
includes the labels and descriptions of the referenced entities (e.g. P31
and Q5) in all languages. That bloats the output drastically and causes
performance issues. See Special:EntityData/Q1337.rdf
<https://www.wikidata.org/wiki/Special:EntityData/Q1337.rdf> as an example.
We will change this so that for referenced entities, only labels and
descriptions in the request language (set e.g. via ?uselang=) and its
fallback languages are included in the response. For the main entity being
requested, labels, descriptions and aliases are still included in all
languages available, of course.
If you don’t actually need this “stub” data of referenced entities at all,
and are only interested in data about the main entity being requested, we
encourage you to use the “dump” flavor instead (include flavor=dump in the
URL parameters). In that case, this change will not affect you at all,
since the dump flavor includes no stub data, regardless of language.
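
As an illustration of the URL parameters mentioned above, here is a small sketch that builds Special:EntityData URLs with either flavor=dump or uselang set. The helper name and its signature are hypothetical; only the base URL, the .rdf suffix, and the two parameter names come from this message.

```python
from typing import Optional
from urllib.parse import urlencode

# Base URL as used in the example link above.
ENTITY_DATA_BASE = "https://www.wikidata.org/wiki/Special:EntityData"


def entity_data_url(
    entity_id: str,
    flavor: Optional[str] = None,
    uselang: Optional[str] = None,
) -> str:
    """Build an RDF URL for one entity (hypothetical helper).

    flavor="dump" requests output without stub data of referenced entities;
    uselang selects the request language for those stubs otherwise.
    """
    params = {}
    if flavor is not None:
        params["flavor"] = flavor
    if uselang is not None:
        params["uselang"] = uselang
    url = f"{ENTITY_DATA_BASE}/{entity_id}.rdf"
    return f"{url}?{urlencode(params)}" if params else url


# Only the main entity's data, no stubs for referenced entities:
print(entity_data_url("Q1337", flavor="dump"))
# Stubs limited to German and its fallback languages after the change:
print(entity_data_url("Q1337", uselang="de"))
```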
This change is currently available for testing at test.wikidata.org. It
will be deployed on Wikidata on August 23rd. You are welcome to give us
general feedback by leaving a comment in this ticket
<https://phabricator.wikimedia.org/T285795>.
If you have any questions please do not hesitate to ask.
Cheers,
--
Mohammed Sadat
*Community Communications Manager for Wikidata/Wikibase*
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
https://wikimedia.de
Keep up to date! Current news and exciting stories about Wikimedia,
Wikipedia and Free Knowledge in our newsletter (in German): Subscribe now
<https://www.wikimedia.de/newsletter/>.
I thought that the comma (",") was being added to the Elasticsearch token
filter as a stopword, so that it is now excluded from simple search?
Or did I miss something?
Or was it decided not to add the comma (U+002C), so that we must use
Advanced Search on Wikidata or the API?
I noticed that searching for "foot locker inc" does not show the entity in
the dropdown; only "foot locker, inc." does.
(I've since added the full legal name as an alias to improve
searchability, but I would still like to know the stopword decision.)
Thad
https://www.linkedin.com/in/thadguidry/
https://calendly.com/thadguidry/