Hello - 

The tl;dr version of this post: On a blank Wikibase instance, I want to be able to do:

api.php?action=wbeditentity&new=item&id=Q42&data={"labels":{"en":{"language":"en","value":"Douglas Adams"}}}

I do not want to do this on wikidata.org - I understand why it makes no sense in that context. But I would like to be able to do this on my own Wikibase instance.
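
A minimal sketch of what that request could look like from a client, assuming the API accepted `id` together with `new` (today it rejects that combination). The function name and the token placeholder are illustrative only; `wbeditentity` is a write action, so a real call would be a POST with a valid CSRF token.

```python
import json
from urllib.parse import urlencode


def build_create_with_id_params(entity_id, label, token, language="en"):
    """Build POST parameters for the proposed 'new item with a chosen ID' call.

    Sending new= together with id= is the behaviour being asked for in this
    post; it is not something Wikibase currently accepts.
    """
    data = {"labels": {language: {"language": language, "value": label}}}
    return {
        "action": "wbeditentity",
        "new": "item",
        "id": entity_id,  # hypothetical: rejected alongside new= today
        "data": json.dumps(data, separators=(",", ":")),
        "token": token,  # placeholder; a real CSRF token is required
        "format": "json",
    }


params = build_create_with_id_params("Q42", "Douglas Adams", "CSRF_TOKEN_PLACEHOLDER")
body = urlencode(params)  # form-encoded body for a POST to api.php
```

Only the parameters are built here; nothing is sent, since the server-side support is the open question.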

Beyond the whimsy of ensuring Douglas Adams gets to be Q42, the main reasons for this are data portability and identifier stability. As more hosted Wikibase providers come online and start offering services, I want to know that I can move my data if I need to change to a different provider. Anyone who queries my Wikibase needs to know the identifiers it uses for instances and, more importantly, for classes. If I change providers, those identifiers cannot change without breaking those queries.

I do not think that MySQL backups are a reliable way to transition between providers. I am not confident that all providers will want to offer a service where they accept a MySQL backup to load into their Wikibase backend, and there are additional challenges moving between Wikibase versions. (Some may, though. I programmatically create the contents of my Wikibase, so I don't care about edit history, but for anyone who does care about that history and other things like wiki users, I imagine MySQL dumps would be the preferred way to migrate.)

One possible solution is to simply create blank items in a new Wikibase, from 1 to the maximum identifier used in my old Wikibase, and then repopulate each item with the claims from my old Wikibase instance. Unfortunately this is not a reliable solution: while Wikibase guarantees that item IDs will not be reused, it does not guarantee that every ID in the sequence will be created. In rare cases, for example, Wikibase may go from Q41 to Q43 and skip (never create) Q42.
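
To make the failure mode concrete, here is a small sketch of that workaround in plain Python. It plans one placeholder per numeric ID up to the old maximum, then checks the IDs the new Wikibase actually handed out. The function names and the simulated ID list are made up for illustration; no Wikibase API is called.

```python
def plan_placeholder_run(used_numeric_ids):
    """Return (numeric_id, is_real) for every ID from 1 to the old maximum.

    is_real is False for gaps in the old Wikibase, which would have to be
    created as empty filler items in the new one.
    """
    used = set(used_numeric_ids)
    if not used:
        return []
    return [(n, n in used) for n in range(1, max(used) + 1)]


def first_mismatch(plan, assigned_numeric_ids):
    """Return the first (expected, assigned) pair that differs, or None."""
    for (expected, _is_real), assigned in zip(plan, assigned_numeric_ids):
        if expected != assigned:
            return (expected, assigned)
    return None


# The old Wikibase used Q1, Q2 and Q4 (Q3 was never created there).
plan = plan_placeholder_run([1, 2, 4])

# Simulated IDs handed out by the new Wikibase, which happens to skip 3.
assigned = [1, 2, 4, 5]
mismatch = first_mismatch(plan, assigned)
```

Once a mismatch like this appears, every later item lands on the wrong ID, and the only recovery is to start over on a fresh instance.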

I think Wikibase is awesome, but it is an odd database that does not allow you to set the keys for the data you are managing :)

In reading through the Wikibase Repo code, it seems like this scenario was considered, though perhaps it isn't fully implemented (or has been disabled?). The code in EntitySavingHelper.php looks like there are (or were) ways to call it by providing an ID while still asking for a new entity. However, there is logic earlier in the ModifyEntity code that looks for and explicitly rejects the case where the API asks for 'new' and also provides an ID, so I'm not sure how this code path would ever get called. There is also code to ask the entity stores whether they 'canCreateWithCustomId', but those all appear to just return 'false'?

However, if that logic were skipped in the API handler and a bit of code reworked in ModifyEntity and EntitySavingHelper, along with ensuring that the next available ID is kept up to date in the wb_id_counters table so that it is always 1 beyond the maximum ID in use, it looks like it might not be that hard to enable creating entities with specific IDs?
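
The counter rule described above is simple to state in code. This is only a sketch of the invariant, not of how Wikibase actually stores or updates wb_id_counters; the function name is hypothetical.

```python
def next_counter_after_insert(current_next_id, custom_numeric_id):
    """Return the next free numeric ID after creating a custom-ID entity.

    The counter only ever moves forward, and always ends up at least
    1 beyond the largest ID that has been used.
    """
    return max(current_next_id, custom_numeric_id + 1)


# Start with an empty Wikibase whose next free ID is 1, then create
# Q42, Q7 and Q43 with user-provided IDs, in that order.
counter = 1
for numeric_id in (42, 7, 43):
    counter = next_counter_after_insert(counter, numeric_id)
```

Creating Q7 after Q42 leaves the counter alone, so automatically assigned IDs would never collide with one supplied by a user.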

So three questions:
Would the Wikibase development team ever be open to supporting something like this, behind a flag like $wgWBRepoSettings['allowUserProvidedIds'] that defaulted to false?

Are there more complicated implications of allowing a change like this that would need to be considered? I understand why the Wikidata.org repo needs this code path to be fast and can't allow users to provide IDs for new entities anyway, but are there other reasons this isn't supported beyond "Wikidata doesn't need it"?

Is this all moot with the eventual REST API? I see that there's a PUT envisioned. Could I use that to directly create an item or property and give it an ID, or does the entity have to already exist for PUT to replace it?

Thank you all for your work on Wikibase and have a nice end of 2020!

Thanks,

-Erik