On Tue, Dec 15, 2020 at 10:51 AM Frankie Robertson <frankie@robertson.name> wrote:

Dear Wikimedia search platform team,

I'm cross posting this from StackOverflow since it's a bit niche: https://stackoverflow.com/questions/65303450/how-to-authenticate-to-wikimedia-commons-query-service-using-oauth-in-python . I hope this is okay.

I am trying to use the Wikimedia Commons Query Service[1] programmatically using Python, but am having trouble authenticating via OAuth 1. I understand the service is subject to change, but am mostly trying to prototype things knowing they will have to be reworked later.

Please find enclosed my self contained Python example which does not work as expected. The expected behaviour is that a result set is returned, but instead a HTML response of the login page is returned. You can get the dependencies with `pip install --user sparqlwrapper oauthlib certifi`. The script should then be given the path to a text file containing the pasted output given after applying for an owner only token[2]. e.g.

```
Consumer token
deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
Consumer secret
deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
Access token
deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
Access secret
deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
```

[1] https://wcqs-beta.wmflabs.org/ ; https://diff.wikimedia.org/2020/10/29/sparql-in-the-shadow-of-structured-data-on-commons/

[2] https://www.mediawiki.org/wiki/OAuth/Owner-only_consumers

```python
import sys
from SPARQLWrapper import JSON, SPARQLWrapper
import certifi
from SPARQLWrapper import Wrapper
from functools import partial
from oauthlib.oauth1 import Client

ENDPOINT = "https://wcqs-beta.wmflabs.org/sparql"
QUERY = """
SELECT ?file WHERE {
?file wdt:P180 wd:Q42 .
}
"""

def monkeypatch_sparqlwrapper():
# Deal with old system certificates
if not hasattr(Wrapper.urlopener, "monkeypatched"):
Wrapper.urlopener = partial(Wrapper.urlopener, cafile=certifi.where())
setattr(Wrapper.urlopener, "monkeypatched", True)

def oauth_client(auth_file):
# Read credential from file
creds = []
for idx, line in enumerate(auth_file):
if idx % 2 == 0:
continue
creds.append(line.strip())
return Client(*creds)

class OAuth1SPARQLWrapper(SPARQLWrapper):
# OAuth sign SPARQL requests

def __init__(self, *args, **kwargs):
self.client = kwargs.pop("client")
super().__init__(*args, **kwargs)

def _createRequest(self):
request = super()._createRequest()
uri = request.get_full_url()
method = request.get_method()
body = request.data
headers = request.headers
new_uri, new_headers, new_body = self.client.sign(uri, method, body, headers)
request.full_url = new_uri
request.headers = new_headers
request.data = new_body
print("Sending request")
print("Url", request.full_url)
print("Headers", request.headers)
print("Data", request.data)
return request

monkeypatch_sparqlwrapper()
client = oauth_client(open(sys.argv[1]))
sparql = OAuth1SPARQLWrapper(ENDPOINT, client=client)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

print("Results")
print(results)
```

Best regards,
Frankie
_______________________________________________
Discovery mailing list
Discovery@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/discovery