Hello Wikitech-l,
The Data Engineering team will begin deploying the changes that support the Temp Accounts https://www.mediawiki.org/wiki/Trust_and_Safety_Product/Temporary_Accounts initiative in the Data Lake https://wikitech.wikimedia.org/wiki/Data_Platform/Data_Lake today, Wednesday, January 22nd, 2025. These changes do not activate the Temp Accounts feature on any wiki; they only enable support for Temp Accounts in the Hadoop Data Lake.
The changes are mostly backwards compatible, except for:
- The MediaWikiHistory dumps https://dumps.wikimedia.org/other/mediawiki_history/readme.html will have some new fields inserted, and the order of the existing fields will change. This might break code that reads those dumps, especially code that refers to fields by position. The changes should take effect with the next snapshot release, at the beginning of February. The MediaWikiHistory dumps Wikitech page has already been updated with the new schema: https://wikitech.wikimedia.org/wiki/Data_Platform/Data_Lake/Edits/MediaWiki_history_dumps#Technical_Documentation .
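For anyone whose code reads the dump fields by position, one way to make it less sensitive to this kind of change is to map each line onto field names first. The snippet below is only a minimal sketch: the field lists are shortened and illustrative (new_field is a made-up placeholder), and it assumes tab-separated lines without a header row. The authoritative field order is the one on the Wikitech page linked above.

```python
import csv
import io

# Illustrative, shortened field lists (assumptions, NOT the real schema).
# "new_field" is a hypothetical placeholder for an inserted column.
OLD_FIELDS = ["wiki_db", "event_entity", "event_type", "event_timestamp"]
NEW_FIELDS = ["wiki_db", "event_entity", "new_field", "event_timestamp", "event_type"]


def read_rows(lines, fields):
    """Yield each tab-separated line as a dict keyed by field name."""
    # QUOTE_NONE keeps quote characters inside text fields untouched.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Fail loudly if the line does not match the schema we expect.
        if len(row) != len(fields):
            raise ValueError(f"expected {len(fields)} fields, got {len(row)}")
        yield dict(zip(fields, row))


# Two made-up sample lines, one per layout.
old_line = "enwiki\trevision\tcreate\t2025-01-01 00:00:00.0\n"
new_line = "enwiki\trevision\t\t2025-01-01 00:00:00.0\tcreate\n"

old_row = next(read_rows(io.StringIO(old_line), OLD_FIELDS))
new_row = next(read_rows(io.StringIO(new_line), NEW_FIELDS))

# Downstream code uses names, so it is unaffected by the reordering.
print(old_row["event_type"], new_row["event_type"])
```

With this approach only the field list needs updating when the schema changes, and the length check makes a mismatch between the file and the expected schema visible immediately instead of silently shifting columns.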
Thank you,
Data Engineering team.
The deployment is now finished.
If you observe any issues that you think might be related, please file a Phabricator ticket using this link https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projectPHIDs=data-engineering .
Thank you!
Data Engineering team
On Wed, Jan 22, 2025 at 7:27 PM Marcel Ruiz Forns mforns@wikimedia.org wrote:
> Hello Wikitech-l,
> The Data Engineering team will start the deployment of the changes that will support the Temp Accounts https://www.mediawiki.org/wiki/Trust_and_Safety_Product/Temporary_Accounts initiative in the Data Lake https://wikitech.wikimedia.org/wiki/Data_Platform/Data_Lake starting today Wednesday January 22nd 2025. These changes are not activating the Temp Accounts feature in any of the wikis, but rather enabling support for Temp Accounts in the Hadoop Data Lake.
> The changes are mostly backwards compatible, except for:
> - The MediaWikiHistory dumps https://dumps.wikimedia.org/other/mediawiki_history/readme.html will have some new fields inserted, and the order of the existing fields will change. This might break code that is reading those dumps. The changes should take effect at the next snapshot release, at the beginning of February. The new schema has already been updated in the MediaWikiHistory dumps Wikitech page https://wikitech.wikimedia.org/wiki/Data_Platform/Data_Lake/Edits/MediaWiki_history_dumps#Technical_Documentation .
> Thank you,
> Data Engineering team.
-- Marcel Ruiz Forns (he/him), Senior Software Engineer
wikitech-l@lists.wikimedia.org