Oops, now I have 2 places to respond... I added the following to the ticket:

Hi Markus,
I created some Python code myself:

Lines 261 to 265 are what I was using before.

Then I use it in lines 387 to 397: it searches by label first, then checks whether the descriptions match. It's not the greatest code, but it did the trick.
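As a rough illustration of that "label first, then description" check, here is a minimal sketch. It is not the code from the linked file. The function name `find_exact_matches`, the sample IDs and the sample descriptions are all made up, and the assumption is that each candidate is a dict with 'id', 'label' and 'description' keys, roughly the shape of a search result entry.

```python
def find_exact_matches(candidates, label, description=None):
    """Keep only candidates whose label (and, if given, description) match exactly."""
    matches = []
    for candidate in candidates:
        # Step 1: the label has to be identical, not just similar.
        if candidate.get('label') != label:
            continue
        # Step 2: if a description was supplied, it has to be identical too.
        if description is not None and candidate.get('description') != description:
            continue
        matches.append(candidate)
    return matches


# Hypothetical sample data; the IDs and descriptions are invented for the example.
sample = [
    {'id': 'Q-EXAMPLE-1',
     'label': 'Kasega Church of Uganda Primary School',
     'description': 'primary school in Uganda'},
    {'id': 'Q-EXAMPLE-2',
     'label': 'Kasega Church of Uganda Primary School',
     'description': 'something else'},
    {'id': 'Q-EXAMPLE-3',
     'label': 'Kasega Primary School',
     'description': 'primary school in Uganda'},
]

exact = find_exact_matches(sample,
                           'Kasega Church of Uganda Primary School',
                           'primary school in Uganda')
```

If `exact` comes back empty, no item has both the same label and the same description, which is the condition I check before creating a new one.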

Why was I coding this in Python? Well, I'm creating a prototype in the JOSM editor, which is written in Java. Hopefully this can be incorporated in core at some point and then it will be better to use a Java Toolkit. I had started coding one myself, but that doesn't make much sense. Better to stand on the shoulders of giants and reach up from there.


2016-02-13 23:46 GMT+01:00 Markus Krötzsch <markus@semantic-mediawiki.org>:
[Moving to wikidata-tech; previous conversation inline below]

Hi Polyglot,

ah, now I see. The Wikidata Toolkit method you call is looking for items by Wikipedia page title, not for items by label. Labels and titles are not related in Wikidata. The search by title is supported by the wbgetentities API action for which we have a wrapper class, but this API action does not support the search by label.

In fact, I am not sure that there is any API action for doing what you want. There is only wbsearchentities, but this search will return near matches and also look for aliases. Maybe this is not a big issue for long strings as in your case, but for shorter strings you would get many results and you would still need to check if they really match.
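To make the difference between the two API actions concrete, here is a minimal sketch that only builds the request URLs and does not send them. The helper names `title_lookup_url` and `label_search_url` are hypothetical and are not part of Wikidata Toolkit. The parameters are the standard ones of wbgetentities and wbsearchentities as far as I know, so please check them against the API documentation before relying on them.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2 / Jython

API_URL = 'https://www.wikidata.org/w/api.php'


def title_lookup_url(site, title):
    """URL for wbgetentities: find the item linked to a wiki page title."""
    params = [
        ('action', 'wbgetentities'),
        ('sites', site),      # e.g. 'enwiki'
        ('titles', title),    # page title on that site, not a label
        ('format', 'json'),
    ]
    return API_URL + '?' + urlencode(params)


def label_search_url(text, language='en', limit=10):
    """URL for wbsearchentities: search labels and aliases in one language."""
    params = [
        ('action', 'wbsearchentities'),
        ('search', text),
        ('language', language),
        ('type', 'item'),
        ('limit', str(limit)),
        ('format', 'json'),
    ]
    return API_URL + '?' + urlencode(params)


by_title = title_lookup_url('enwiki', 'Douglas Adams')
by_label = label_search_url('Kasega Church of Uganda Primary School')
```

The first URL only finds items that have a sitelink to a page with that exact title. The second can return near matches and alias hits, so its results still need an exact comparison afterwards.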

Anyway, you are right that it would be nice if we implemented support for the label/alias search as well. For this, we need to make a wrapper class for wbsearchentities. I created an issue to track this:




On 13.02.2016 23:22, Jo wrote:
Hi Markus,

I'm searching for a wikidata item with that label. It would be even
better if it were possible to search for a label/description combination.

This is the item I'm looking for:

I mostly want to make sure that I'm not creating duplicate entries in
Wikidata. Most of those schools are not noteworthy enough to get an
article on Wikipedia, but since they have objects in OpenStreetMap, I
would think they are interesting enough for Wikidata.


2016-02-13 23:13 GMT+01:00 Markus Krötzsch <markus@semantic-mediawiki.org>:

    Hi Jo,

    You are searching for an item that is assigned to the article
    "Kasega Church of Uganda Primary School" on English Wikipedia.
    However, there is no article of this name on English Wikipedia.
    Maybe there is a typo? Can you tell me which Wikidata item should be
    returned here?



    P.S. If you agree, I would prefer to continue this discussion on
    wikidata-tech for the benefit of others who may have similar questions.

    On 13.02.2016 14:47, Jo wrote:

        Hi Markus,

        I had started to write my own implementation of a Wikidata bot in
        Jython, so I could use it in JOSM, but still get to code in
        Python. This worked well for a while, but now apparently something
        was changed in the login API.

        Anyway, I can't code in all possible things that can go wrong, so it
        makes more sense to reuse an existing framework.

        What I want to do is add items, but I want to check if they already
        exist first. Try as I may, I can't seem to retrieve the items I
        created myself, like:

           Kasega Church of Uganda Primary School

        Douglas Adams, on the other hand, doesn't pose a problem.

        I can't figure out why this is. Some things can be found, others
        can't. I tried with a few more entries from recent changes.

        In my own bot, I had more success with searchEntities than with
        getEntities. Was this implemented in WDTK?

        I hope you can help, I'm stuck, as it doesn't make a lot of sense to
        continue with the conversion, if I can't even get a trivial thing
        like this to work.

        from org.wikidata.wdtk.datamodel.helpers import Datamodel
        from org.wikidata.wdtk.datamodel.helpers import ItemDocumentBuilder
        from org.wikidata.wdtk.datamodel.helpers import ReferenceBuilder
        from org.wikidata.wdtk.datamodel.helpers import StatementBuilder
        from org.wikidata.wdtk.datamodel.interfaces import DatatypeIdValue
        from org.wikidata.wdtk.datamodel.interfaces import EntityDocument
        from org.wikidata.wdtk.datamodel.interfaces import ItemDocument
        from org.wikidata.wdtk.datamodel.interfaces import ItemIdValue
        from org.wikidata.wdtk.datamodel.interfaces import PropertyDocument
        from org.wikidata.wdtk.datamodel.interfaces import PropertyIdValue
        from org.wikidata.wdtk.datamodel.interfaces import Reference
        from org.wikidata.wdtk.datamodel.interfaces import Statement
        from org.wikidata.wdtk.datamodel.interfaces import StatementDocument
        from org.wikidata.wdtk.datamodel.interfaces import StatementGroup
        from org.wikidata.wdtk.wikibaseapi import ApiConnection
        from org.wikidata.wdtk.util import WebResourceFetcherImpl
        from org.wikidata.wdtk.wikibaseapi import LoginFailedException
        from org.wikidata.wdtk.wikibaseapi import WikibaseDataEditor
        from org.wikidata.wdtk.wikibaseapi import WikibaseDataFetcher
        from org.wikidata.wdtk.wikibaseapi.apierrors import
        # print dir(ItemDocument)
        # print dir(ApiConnection)

        dataFetcher = WikibaseDataFetcher(connection, siteIri)
        # print dir(dataFetcher)
        # itemDocuments = dataFetcher.getEntityDocumentsByTitle('enwiki', ['Kasega Church of Uganda Primary School'])
        # itemDocuments = dataFetcher.getEntityDocuments('Q22695926')
        itemDocuments =
        Church of Uganda Primary School')
        # print dir(itemDocuments)
        print str(len(itemDocuments)) + ' resulting items'
        print itemDocuments.toString()
        # for itemDocument in itemDocuments:
              # print '=========================='
              # print itemDocument.toString()