This is really a defective redesign. It reintroduces the numeric IDs that T114902 was meant to remove. See also T179928. We should reconsider this, or introduce a new table that links unprefixed and prefixed entity IDs.
I also oppose any "partial migration for first XXX items" process in T221765: it makes queries much more complicated. Please fill all the data into the new schema first, then discontinue the old table.
This is really a defective redesign. It reintroduces the numeric IDs that T114902 was meant to remove. See also T179928. We should reconsider this, or introduce a new table that links unprefixed and prefixed entity IDs.
The new schema has been optimized as much as possible for scalability, as it will contain a massive amount of data that we hope will double or even triple in size as soon as possible. The IDs were changed to integers because we have separate tables at the top level, so prefixes in those tables are redundant and only take up space that accumulates to a large amount very quickly.
Neither the old `wb_terms` table nor the new schema is actually designed for public use; query them directly only if your needs really require it. If your needs can be addressed via the available Wikidata APIs, you are very much encouraged to switch to those instead. In that case, you need not worry about migrations and schema changes at all.
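To illustrate the ID change described above, here is a minimal sketch of how a tool could convert between the prefixed form and an (entity type, numeric ID) pair. It assumes the usual Wikidata convention that item IDs are `Q` followed by a positive integer and property IDs are `P` followed by a positive integer. The function names and the `PREFIXES` mapping are hypothetical and not part of Wikibase.

```python
import re
from typing import Tuple

# Assumption: one prefix letter per entity type (Q = item, P = property).
PREFIXES = {"item": "Q", "property": "P"}
_TYPE_BY_PREFIX = {prefix: etype for etype, prefix in PREFIXES.items()}
_ID_PATTERN = re.compile(r"^([QP])([1-9][0-9]*)$")


def to_numeric(prefixed_id: str) -> Tuple[str, int]:
    """Split a prefixed ID such as 'Q42' into ('item', 42)."""
    match = _ID_PATTERN.match(prefixed_id.strip().upper())
    if match is None:
        raise ValueError(f"not a valid prefixed entity ID: {prefixed_id!r}")
    prefix, number = match.groups()
    return _TYPE_BY_PREFIX[prefix], int(number)


def to_prefixed(entity_type: str, numeric_id: int) -> str:
    """Build a prefixed ID such as 'Q42' from ('item', 42)."""
    if entity_type not in PREFIXES:
        raise ValueError(f"unknown entity type: {entity_type!r}")
    if numeric_id <= 0:
        raise ValueError("numeric ID must be a positive integer")
    return f"{PREFIXES[entity_type]}{numeric_id}"
```

Because each entity type has its own table, the numeric ID alone is unambiguous only together with the table (or entity type) it came from; the prefixed form carries that information in the ID itself.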
I also oppose any "partial migration for first XXX items" process in T221765: it makes queries much more complicated. Please fill all the data into the new schema first, then discontinue the old table.
Full migration is not possible unfortunately due to the current capacity of database master node. We tried to find a trade-off between the overhead we introduce in disk usage and the complexity of the application logic that will access those schemas.
This will also make our life at Wikidata a little less pleasant during the migration period. We understand that this is an unusual migration and that it won't be easy for a little while, which is why we want to help out with those queries and other inquiries as much as we can.
If you have queries that you run on `wb_terms`, it would be of great help if you added them to a new Phab task on the Tool Builders migration board, in the Backlog column: https://phabricator.wikimedia.org/project/view/4014/
If you have any concrete suggestions for making this migration easier, we would also love to hear them. Please add them to the same board as their own Phabricator tasks so that we can keep track of them more easily and follow up as soon as possible.
On Thu, 25 Apr 2019 at 15:25, data_querier data_querier@protonmail.com wrote:
This is really a defective redesign. It reintroduces the numeric IDs that T114902 was meant to remove. See also T179928. We should reconsider this, or introduce a new table that links unprefixed and prefixed entity IDs.
I also oppose any "partial migration for first XXX items" process in T221765: it makes queries much more complicated. Please fill all the data into the new schema first, then discontinue the old table.
Alaa Sarhan, 25/04/19 17:38:
Full migration is not possible unfortunately due to the current capacity of database master node.
Can you clarify whether it would also be too much load to write both to the new table and the old wb_terms table for a transition period (controlled by a configuration setting)?
(I'm not advocating for it, just asking because we did something of the sort in the past for other transitions.)
Federico
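The dual-write idea asked about above can be sketched as follows. This is only an illustration: `MigrationConfig`, `TermStore`, the flag names, and the in-memory lists standing in for the two tables are all hypothetical, and nothing here reflects how Wikibase actually implements the migration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MigrationConfig:
    # Hypothetical configuration flags for the transition period.
    write_old: bool = True   # keep writing to the old wb_terms-style table
    write_new: bool = False  # also write to the new normalized schema


@dataclass
class TermStore:
    config: MigrationConfig
    # In-memory stand-ins for the two storage backends.
    old_table: List[Tuple[str, str, str]] = field(default_factory=list)
    new_table: List[Tuple[int, str, str]] = field(default_factory=list)

    def save_label(self, item_id: str, language: str, text: str) -> None:
        """Store an item label in whichever backends the config enables."""
        if self.config.write_old:
            # Old layout: keeps the prefixed ID, e.g. 'Q42'.
            self.old_table.append((item_id, language, text))
        if self.config.write_new:
            # New layout: numeric ID only, e.g. 42.
            self.new_table.append((int(item_id[1:]), language, text))


store = TermStore(MigrationConfig(write_old=True, write_new=True))
store.save_label("Q42", "en", "Douglas Adams")
```

With both flags enabled, every term is stored twice for as long as the transition lasts.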
Hello all,
Since this email thread is shared across various mailing lists, and in order not to create noise for too many people, I kindly ask you to continue the discussion on Phabricator, where a task, https://phabricator.wikimedia.org/T221764, is dedicated to questions and issues.
Thanks for your understanding :)
Léa
On Thu, 25 Apr 2019 at 18:04, Federico Leva (Nemo) nemowiki@gmail.com wrote:
Alaa Sarhan, 25/04/19 17:38:
Full migration is not possible unfortunately due to the current capacity of database master node.
Can you clarify whether it would also be too much load to write both to the new table and the old wb_terms table for a transition period (controlled by a configuration setting)?
(I'm not advocating for it, just asking because we did something of the sort in the past for other transitions.)
Federico
On Thu, Apr 25, 2019 at 6:03 PM Federico Leva (Nemo) nemowiki@gmail.com wrote:
Alaa Sarhan, 25/04/19 17:38:
Full migration is not possible unfortunately due to the current capacity of database master node.
Can you clarify whether it would also be too much load to write both to the new table and the old wb_terms table for a transition period (controlled by a configuration setting)?
The problem is not so much load as disk space constraints on the master. Right now we do not have enough space to handle the duplication of data that will be needed until wb_terms can be killed and that space reclaimed. The migration is on hold mostly because of that. We already have the hardware at the DC (T211613); we are racking, installing, and setting it up. Once that is done, we will work on a master failover to promote a new host to master (which will have more disk space, more RAM, and faster disks).
Manuel.
Hi Alaa,
On 25-04-19 16:38, Alaa Sarhan wrote:
This is really a defective redesign. It reintroduces the numeric IDs that T114902 was meant to remove. See also T179928. We should reconsider this, or introduce a new table that links unprefixed and prefixed entity IDs.
The new schema has been optimized as much as possible for scalability, as it will contain a massive amount of data that we hope will double or even triple in size as soon as possible.
The new schema has been optimized for your use cases and completely breaks any tools combining page table data with Wikibase data. If you really cared about tool developers, you wouldn't trash the unprefixed ID.
Maarten