Hello -
Sorry for the crosspost/repeat - I sent a version of this to the wikidata
mailing list, but it was right in the peak of the holidays. This list is
probably more appropriate for it and hopefully by now the wikibase
developers are back from their holidays and all caught up on email/year
end/new year tasks, and can help provide some guidance.
The tl;dr version of this post: On a blank Wikibase instance, I want to be
able to do:
api.php?action=wbeditentity&new=item&id=Q42&data={"labels":{"en":{"language":"en","value":"Douglas Adams"}}}
I do not want to do this on wikidata.org - I understand why it makes no
sense in that context. But I would like to be able to do this on my own
Wikibase instance.
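To make the request above concrete, here is a minimal Python sketch that only builds the query string; it does not contact any server. The helper name build_wbeditentity_params is made up for illustration. The combination of 'new' and 'id' is the behaviour I am asking for, not something the API accepts today, and a real call would also need to be a POST with an edit token.

```python
import json
from urllib.parse import urlencode


def build_wbeditentity_params(entity_id, label, language="en"):
    """Build the parameters for the proposed request (hypothetical helper)."""
    data = {"labels": {language: {"language": language, "value": label}}}
    return {
        "action": "wbeditentity",
        "new": "item",
        # Proposed behaviour: the API currently rejects 'id' together with 'new'.
        "id": entity_id,
        "data": json.dumps(data, separators=(",", ":")),
    }


params = build_wbeditentity_params("Q42", "Douglas Adams")
# urlencode percent-encodes the JSON so it is safe to put in a URL.
print("api.php?" + urlencode(params))
```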
Beyond the whimsical, like ensuring Doug Adams gets to be Q42, the main
reason for this is data portability and identifier stability. As more
hosted Wikibase providers come online and start offering services, I want
to know that I have data portability if I need to change to a different
provider. Anyone who queries my Wikibase needs to know the identifiers my
Wikibase uses for instances and more importantly for classes, and if I
change providers, those identifiers cannot change without breaking those
queries.
I do not think that MySQL backups are a reliable way to be able to
transition between providers. I am not confident that all providers will
want to offer a service where they accept a MySQL backup to load into their
Wikibase backend, and there are additional challenges moving between
Wikibase versions. (Though some may - I programmatically create the
contents of my Wikibase so I don't care about edit history, but if one were
to care about that history and other things like wikiusers I imagine the
MySQL dumps would be the preferred way to migrate?)
One possible solution is to simply create blank items in a new Wikibase,
from 1 to the maximum identifier used in my old Wikibase, and then
repopulate each item with the claims from my old Wikibase instance.
Unfortunately this is not a reliable solution because while Wikibase
guarantees that item IDs will not be reused, it does not guarantee that
every ID in the sequence will be created, e.g. in rare cases Wikibase may
go from Q41 to Q43 and skip/never create Q42.
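The "create blank items, then repopulate" plan can be sketched in a few lines of Python. This is only an illustration with a made-up helper name (plan_recreation), and it assumes the new instance hands out every ID from 1 upwards in order, which is exactly the guarantee Wikibase does not give.

```python
def plan_recreation(old_ids, prefix="Q"):
    """Split 1..max into IDs to repopulate and IDs to leave as blank placeholders.

    Hypothetical helper: it only plans the work and does not call any API.
    """
    used = {int(i[len(prefix):]) for i in old_ids if i.startswith(prefix)}
    if not used:
        return [], []
    populate, blank = [], []
    for n in range(1, max(used) + 1):
        # IDs missing from the old instance still have to be created (as
        # blanks) so that later items land on the right number.
        (populate if n in used else blank).append(f"{prefix}{n}")
    return populate, blank


populate, blank = plan_recreation(["Q1", "Q2", "Q4"])
print(populate, blank)
```

If the new instance skips even one number while the blanks are being created, every later item ends up on the wrong ID, which is why I do not consider this approach reliable.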
I don't mind that the identifier needs to be prefixed with a 'Q' or a 'P'
for a particular type, I just want to be able to set the same identifier if
I set up a new Wikibase instance.
I think Wikibase is awesome, but it is an odd database that does not allow
you to set the keys for the data you are managing :)
In reading through the Wikibase Repo code, it seems like this scenario was
considered though perhaps isn't fully implemented (or has been disabled?).
The code in EntitySavingHelper.php looks like there are/were ways to call
it by providing an ID while still asking for a new entity, though there is
logic earlier in the ModifyEntity code to look for and explicitly reject
the case where the API asks for 'new' and also provides an ID, so I'm not
sure how this code path would get called. There is also code to ask the
entityStores if they 'canCreateWithCustomId', but those all appear to just
return 'false'?
However, if that logic was skipped in the API handler and a bit of code
reworked in ModifyEntity and EntitySavingHelper, along with ensuring that
the next available ID is kept up to date in the wb_id_counters table
to always be 1 beyond the maximum ID in use, it looks like it might not be
that hard to enable creating entities with specific IDs?
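As a rough illustration of the counter bookkeeping, the following Python sketch computes, per entity type, the value that is 1 beyond the maximum ID in use. The function name next_counter_values is hypothetical, only the 'Q' and 'P' prefixes are handled, and I have not checked whether wb_id_counters stores the last used ID or the next free one, so treat the "+ 1" as an assumption taken from my description above.

```python
import re
from collections import defaultdict

# Only the two prefixes discussed in this post are recognised.
ID_PATTERN = re.compile(r"^([QP])([1-9][0-9]*)$")


def next_counter_values(entity_ids):
    """Return {prefix: max numeric ID + 1} for the given entity IDs."""
    highest = defaultdict(int)
    for entity_id in entity_ids:
        match = ID_PATTERN.match(entity_id)
        if match is None:
            raise ValueError(f"unrecognised entity ID: {entity_id!r}")
        prefix, number = match.group(1), int(match.group(2))
        highest[prefix] = max(highest[prefix], number)
    return {prefix: number + 1 for prefix, number in highest.items()}


print(next_counter_values(["Q1", "Q42", "P3", "Q7"]))
```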
So three questions:
1. Would the Wikibase development team ever be open to supporting something
   like this, behind a flag like $wgWBRepoSettings['allowUserProvidedIds']
   that defaulted to false?
2. Are there more complicated implications from allowing a change like this
   that would need to be considered? I understand why the Wikidata.org repo
   needs this codepath fast and can't allow users to provide IDs for new
   entities anyway, but are there other reasons this isn't supported beyond
   "Wikidata doesn't need it?"
3. Is this all moot with the eventual REST API? I see that there's a PUT
   envisioned, could I use that to directly create an item or property and
   give it an ID then, or does the ID have to already exist to replace it?
I am happy to try to tackle creating a patch for this, but I'd like to get
some feedback on whether there are any big lurking issues that I should know about
before starting on the work - I'd rather not get deep into it only to find
out it will never work or never be accepted. I'm also happy to shift this
to phabricator if that's more appropriate.
Thank you all for your work on Wikibase!
Thanks,
-Erik
Hi everyone,
The Wikibase Docker installation https://github.com/wmde/wikibase-docker
no longer starts without errors. Since the GitHub repository has no issue
tracker, I am trying my luck on the mailing list.
The problem is related to Elasticsearch and CirrusSearch:
wikibase_1 | Validating my_wiki_general alias...alias is free...corrected
wikibase_1 | Validating my_wiki alias...alias not already assigned to this index...corrected
wikibase_1 | Updating tracking indexes...
wikibase_1 | Unexpected Elasticsearch failure.
wikibase_1 | Elasticsearch failed in an unexpected way. This is always a bug in CirrusSearch.
wikibase_1 | Error type: Elastica\Exception\Bulk\ResponseException
wikibase_1 | Message: unknown: Error in one or more bulk request actions:
wikibase_1 |
wikibase_1 | index: /mw_cirrus_metastore_first/mw_cirrus_metastore/version-my_wiki_content caused blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];
wikibase_1 | index: /mw_cirrus_metastore_first/mw_cirrus_metastore/version-my_wiki_general caused blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];
wikibase_1 | index: /mw_cirrus_metastore_first/mw_cirrus_metastore/version-my_wiki_archive caused blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];
wikibase_1 |
wikibase_1 | Trace:
wikibase_1 | #0 /var/www/html/extensions/Elastica/vendor/ruflin/elastica/lib/Elastica/Bulk.php(359): Elastica\Bulk->_processResponse(Object(Elastica\Response))
wikibase_1 | #1 /var/www/html/extensions/Elastica/vendor/ruflin/elastica/lib/Elastica/Client.php(361): Elastica\Bulk->send()
wikibase_1 | #2 /var/www/html/extensions/Elastica/vendor/ruflin/elastica/lib/Elastica/Index.php(182): Elastica\Client->addDocuments(Array, Array)
wikibase_1 | #3 /var/www/html/extensions/Elastica/vendor/ruflin/elastica/lib/Elastica/Type.php(202): Elastica\Index->addDocuments(Array, Array)
wikibase_1 | #4 /var/www/html/extensions/CirrusSearch/includes/MetaStore/MetaVersionStore.php(70): Elastica\Type->addDocuments(Array)
wikibase_1 | #5 /var/www/html/extensions/CirrusSearch/maintenance/metastore.php(133): CirrusSearch\MetaStore\MetaVersionStore->updateAll('my_wiki')
wikibase_1 | #6 /var/www/html/extensions/CirrusSearch/maintenance/metastore.php(95): CirrusSearch\Maintenance\Metastore->updateIndexVersion('my_wiki')
wikibase_1 | #7 /var/www/html/extensions/CirrusSearch/maintenance/updateOneSearchIndexConfig.php(301): CirrusSearch\Maintenance\Metastore->execute()
wikibase_1 | #8 /var/www/html/extensions/CirrusSearch/maintenance/updateOneSearchIndexConfig.php(270): CirrusSearch\Maintenance\UpdateOneSearchIndexConfig->updateVersions()
wikibase_1 | #9 /var/www/html/extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php(61): CirrusSearch\Maintenance\UpdateOneSearchIndexConfig->execute()
wikibase_1 | #10 /var/www/html/maintenance/doMaintenance.php(99): CirrusSearch\Maintenance\UpdateSearchIndexConfig->execute()
wikibase_1 | #11 /var/www/html/extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php(70): require_once('/var/www/html/m...')
wikibase_1 | #12 {main}
I have tried version 1.34 as well as 1.35 - always the same problem.
The error occurs only the first time you start Wikibase. Unfortunately,
automatic tests always fail due to this error.
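For anyone who wants to detect this specific failure in captured container output, for example to report it clearly in a test run, here is a minimal Python sketch. The function name find_blocked_indices is made up for illustration, and the pattern is based only on the log lines shown above, so it may need adjusting for other versions.

```python
import re

# Matches "<index path> caused blocked by: [<reason>]". The \s+ also spans
# line breaks, so output that was wrapped by a mail client still matches.
BLOCK_PATTERN = re.compile(r"(/\S+)\s+caused blocked by: \[([^\]]+)\]")


def find_blocked_indices(log_text):
    """Return a list of (index_path, block_reason) tuples found in the log."""
    return BLOCK_PATTERN.findall(log_text)


sample = (
    "wikibase_1 | index:\n"
    "/mw_cirrus_metastore_first/mw_cirrus_metastore/version-my_wiki_content\n"
    "caused blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];\n"
)

for path, reason in find_blocked_indices(sample):
    print(path, "->", reason)
```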
Best wishes
Jesper