Trung Dinh wrote:
Hi all,
I have an issue while trying to parse data fetched from the Wikipedia API.
This is the piece of code that I am using:
api_url = 'http://en.wikipedia.org/w/api.php'
api_params = ('action=query&list=recentchanges&rclimit=5000&rctype=edit'
              '&rcnamespace=0&rcdir=newer&format=json&rcstart=20160504022715')
f = urllib2.Request(api_url, api_params)
print ('requesting ' + api_url + '?' + api_params)
source = urllib2.urlopen(f, None, 300).read()
source = json.loads(source)
json.loads(source) raised the following exception: "Expecting ,
delimiter: line 1 column 817105 (char 817104)"
I tried source.encode('utf-8') and some other encodings, but none of
them helped.
Is there any workaround for this issue? Thanks :)
Hi.
Weird, I can't reproduce this error. I had to import the "json" and
"urllib2" modules, but after doing so, executing the code you provided
here worked fine for me: <https://phabricator.wikimedia.org/P3009>.
You probably want to use 'https://en.wikipedia.org/w/api.php' as your
end-point (HTTPS, not HTTP).
As far as I know, JSON is always encoded as UTF-8, so you shouldn't need
to encode or decode the data explicitly.
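For illustration, here is a minimal sketch of the same request made over HTTPS, with the response bytes decoded as UTF-8 exactly once before parsing. It assumes Python 3 (urllib.request rather than urllib2) and uses GET rather than the POST in the original snippet. The names build_url, fetch_json and USER_AGENT are made up for this example, and the User-Agent string is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = 'https://en.wikipedia.org/w/api.php'  # HTTPS endpoint

# Hypothetical User-Agent string; replace it with one identifying your tool.
USER_AGENT = 'recentchanges-example/0.1 (contact: you@example.org)'


def build_url(params):
    """Return the full request URL for a dict of API parameters."""
    return API_URL + '?' + urlencode(params)


def fetch_json(params, timeout=300):
    """Fetch one API response and parse it as JSON."""
    request = Request(build_url(params), headers={'User-Agent': USER_AGENT})
    with urlopen(request, timeout=timeout) as response:
        raw = response.read()  # bytes as sent by the server
    # Decode once as UTF-8; no further encode/decode calls are needed.
    return json.loads(raw.decode('utf-8'))
```

Calling fetch_json({'action': 'query', 'list': 'recentchanges', 'format': 'json'}) should then return a plain dict, provided the network request succeeds.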
The error you're getting generally means that the JSON was malformed for
some reason. It seems unlikely that MediaWiki's api.php is outputting
invalid JSON, but I suppose it's possible.
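To see what the parser actually choked on, it can help to print the text around the reported character offset. The following is a rough sketch, assuming Python 3, where json.JSONDecodeError exposes the offset as .pos; under Python 2 the error is a plain ValueError and the offset would have to be read from the message. The helper name describe_json_error is made up for this example.

```python
import json


def describe_json_error(text, context=40):
    """Return None if text parses as JSON, else a short failure report."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # Slice out the characters on either side of the failing offset.
        start = max(err.pos - context, 0)
        end = err.pos + context
        return '%s at char %d: %r' % (err.msg, err.pos, text[start:end])
    return None
```

If the snippet shows the response ending abruptly, the download was probably cut short; if it shows something other than JSON (an HTML error page, for example), the problem lies with the request rather than with the parser.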
Since you're coding in Python, you may be interested in a framework such
as <https://github.com/alexz-enwp/wikitools>.
MZMcBride