Is there a particular reason that schema:contentUrl was chosen to point to URLs of the form https://upload.wikimedia.org/wikipedia/commons/q/q1/filename.jpg rather than http://commons.wikimedia.org/wiki/Special:FilePath/filename.jpg that WDQS uses? Is the specific location useful? (Is it even stable?) Or would it make more sense to use the established URL form for these links, as used in WDQS?
These are the URLs that the Wikimedia servers actually use to serve the files; Special:FilePath URLs redirect to them. I guess they are mostly stable: Wikipedia and Commons have been using them for at least 15 years. They were not used in the Wikidata RDF dumps because fetching them would have required additional database queries while generating the dumps. The SDOC dump generator, however, already fetches the file metadata in order to output the file type and dimensions, so it is cheap for it to output these URLs as well.

That said, it should be fairly easy to change them to point to Special:FilePath instead, which would indeed make interoperability with Wikidata easier. I could easily write a patch for it if the Search team is OK with it. Could you open a Phabricator ticket about it?
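To make the two URL forms concrete, here is a rough Python sketch that builds both from a file name and maps an upload URL back to its Special:FilePath form. The md5-based directory layout (first hex digit, then first two hex digits of the hash of the underscore-normalized name) is my assumption about how MediaWiki lays out upload paths; it is not something stated in this thread. The helper names are made up for illustration, and the percent-encoding may not match byte for byte what WDQS emits.

```python
import hashlib
from urllib.parse import quote, unquote, urlsplit

UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons"
FILEPATH_BASE = "http://commons.wikimedia.org/wiki/Special:FilePath"


def filepath_url(filename: str) -> str:
    """Build the Special:FilePath form (the one WDQS uses)."""
    return f"{FILEPATH_BASE}/{quote(filename)}"


def upload_url(filename: str) -> str:
    """Build the direct upload URL.

    Assumption: the two directory levels come from the md5 hash of the
    file name with spaces replaced by underscores. First-letter
    capitalization and other title normalization are not handled here.
    """
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return f"{UPLOAD_BASE}/{digest[0]}/{digest[:2]}/{quote(name)}"


def upload_to_filepath(url: str) -> str:
    """Map an upload URL back to a Special:FilePath URL via its last path segment."""
    name = unquote(urlsplit(url).path.rsplit("/", 1)[-1])
    return filepath_url(name)


if __name__ == "__main__":
    direct = upload_url("Example.jpg")
    print(direct)
    print(upload_to_filepath(direct))
```

Going from the upload URL to the Special:FilePath form only needs the last path segment, so that direction does not depend on the hashing assumption at all.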
Another option is to add the M-IDs directly to the Wikidata RDF representation. I have just opened a ticket about it: https://phabricator.wikimedia.org/T258776
Cheers,
Thomas
On Thu, 23 Jul 2020 at 23:50, James Heald jpm.heald@gmail.com wrote:
On 23/07/2020 22:26, Hay (Husky) wrote:
Awesome, I'm really happy we finally have at least a start of a functioning query service.
For now, the two things that I guess would be helpful for most query writers:
1) A way to make ImageGrid work without resorting to the clunky Special:FilePath hack
2) A nicer way to query Wikidata information without using federation.
I guess 2) might be a bit more difficult, but it might definitely be something to consider.
Kind regards, -- Hay
IMO, the best way to avoid the hack of having to construct the Special:FilePath URLs would be to have these simply available as triples, so the standard way to get an image with this sort of URL would just be
sdc:M1234567 ?relation commons:filename.jpg
for some ?relation to be determined.
Also, as User:Mfchris84 has noted on the talk page, another reason to want such triples is to make it possible to map from the values of Wikidata properties like P18 ("image") and its friends, which are of the form commons:filename.jpg, to the corresponding M-IDs, so that structured data about the files can be accessed.
At the moment going from commons:filename.jpg -> M-ID -> sdc data is not straightforward.
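Until such triples exist, one workaround outside SPARQL is to ask the MediaWiki API on Commons for the page ID of the file page and prefix it with "M". The sketch below is only an illustration. It assumes the M-ID is simply "M" plus the page ID and that the default JSON response of action=query keys pages by page ID. The helper names are made up, and the sample response in the usage part is hand-written using the placeholder ID from the example above, not real output.

```python
from typing import Optional
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"


def page_query_url(filename: str) -> str:
    """Build an action=query URL asking for the File: page of `filename`."""
    params = {"action": "query", "format": "json", "titles": f"File:{filename}"}
    return f"{API}?{urlencode(params)}"


def mid_from_response(response: dict) -> Optional[str]:
    """Extract an M-ID from a decoded action=query response.

    Assumption: M-ID = "M" + page ID. Pages that do not exist carry no
    "pageid" field, so None is returned for them.
    """
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        if "pageid" in page:
            return f"M{page['pageid']}"
    return None


if __name__ == "__main__":
    print(page_query_url("filename.jpg"))
    # Hand-written sample response, for illustration only.
    sample = {"query": {"pages": {"1234567": {
        "pageid": 1234567, "ns": 6, "title": "File:filename.jpg"}}}}
    print(mid_from_response(sample))
```

This costs one HTTP request per file (or per batch of titles), which is exactly the kind of detour that having the mapping available as triples would avoid.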
Is there a particular reason that schema:contentUrl was chosen to point to URLs of the form https://upload.wikimedia.org/wikipedia/commons/q/q1/filename.jpg rather than http://commons.wikimedia.org/wiki/Special:FilePath/filename.jpg that WDQS uses?
Is the specific location useful? (Is it even stable?) Or would it make more sense to use the established URL form for these links, as used in WDQS?
As to Hay's (2), being able to use the SERVICE for label lookup in WCQS without requiring explicit federation to WDQS would be a nice step forward.
But overall I am *hugely* impressed by what the team has rolled out. Thank you so much!
-- James.
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata