On Sat, Jan 17, 2015 at 11:04 PM, Markus Krötzsch markus@semantic-mediawiki.org wrote:
It is easy to fix this (though I will not fix it tonight, but tomorrow) by just adjusting the HTML strings we parse for.
Sure! I have subscribed to the bug report.
As an intermediate workaround for me, what file name pattern is used in the local cache?
I had manually downloaded a file (and made it available as a torrent, because the download ran at only about 1 MB/s [0]) and put it in the folder, but it was not recognized... the file on the server is: http://dumps.wikimedia.org/other/wikidata/20150112.json.gz
But as 20150112.json.gz it is not detected... I noted the json-* pattern in the code, but json-20150112.json.gz didn't work either...
BTW, a second question: is there a way to list all local (JSON) dumps using the WDTK API?
We should also improve our error reporting for this case, obviously.
Yeah, that's an art no software I have ever worked with has mastered... it's hard! But it's important... I was looking in completely the wrong place... mind you, monitoring log messages can be hard too when WDTK is used in other environments, such as Bioclipse, and you cannot rely on those messages to show up :(
Thanks for immediately looking into it and looking forward to pointers for my two questions,
greetings,
Egon