(*apologies for cross-posting*)
Hello,
This is a breaking change announcement relevant to those working with Lexeme dumps.
In Lexeme dumps, non-empty "senses" and "forms" values are serialized as JSON arrays, but empty ones are currently serialized as objects. For example, a value with content appears as an array:
"senses":[{"id":"L4-S1",...]
whereas an empty value appears as an object:
"senses":{}
An empty list should be serialized as an array as well:
"senses":[]
With this change, empty lists of forms and senses will switch from objects to arrays. This makes the dumps more consistent, since empty values will be presented the same way as non-empty ones. We will roll this change out on February 8th.
We expect the impact of this change to be minimal and harmless for most use cases. For that reason, and because it would demand substantial resources and time, we have not generated a test dump. If you have any questions or concerns about this change, please don’t hesitate to reach out to us in this ticket (T305660 https://phabricator.wikimedia.org/T305660).
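For consumers who would rather accept both shapes during the transition, the following is a minimal sketch in Python, not an official recommendation. The helper names (as_list, normalize_lexeme) and the sample records are hypothetical and only illustrate the two serializations described above.

```python
import json


def as_list(value):
    """Return a forms/senses value as a list, whichever way it was serialized."""
    if isinstance(value, list):
        # Non-empty values, and empty values after the change, are arrays.
        return value
    if value is None or value == {}:
        # Before the change, an empty list is serialized as {}.
        # A missing key is also treated as empty here (an assumption).
        return []
    raise TypeError(f"unexpected forms/senses value: {value!r}")


def normalize_lexeme(lexeme):
    """Coerce "forms" and "senses" of a parsed lexeme record to lists, in place."""
    for key in ("forms", "senses"):
        lexeme[key] = as_list(lexeme.get(key))
    return lexeme


# Hypothetical sample records, one per serialization.
old_style = json.loads('{"id": "L4", "forms": {}, "senses": {}}')
new_style = json.loads('{"id": "L4", "forms": [], "senses": []}')

print(normalize_lexeme(old_style)["senses"])
print(normalize_lexeme(new_style)["senses"])
```

Code that iterates over the normalized values should behave the same on dumps produced before and after the change.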
Cheers,
Thank you for the announcement, Mohammed.
For the audience of the XML Data Dumps list, I wanted to clarify that this breaking change is for the Wikidata lexeme dumps https://dumps.wikimedia.org/wikidatawiki/entities/, and not for the XML Data Dumps https://dumps.wikimedia.org/backup-index.html.
--
Xabriel J. Collazo Mojica (he/him, pronunciation https://commons.wikimedia.org/wiki/File:Xabriel_Collazo_Mojica_-_pronunciation.ogg)
Sr Software Engineer
Wikimedia Foundation
On Wed, Jan 24, 2024 at 12:35 PM Mohammed Sadat Abdulai < mohammed.abdulai@wikimedia.de> wrote:
Mohammed S. Abdulai *Community Communications Manager, Wikidata*