Is there a particular reason that schema:contentUrl
was chosen to point to URLs of the form
<https://upload.wikimedia.org/wikipedia/commons/q/q1/filename.jpg> rather than
<http://commons.wikimedia.org/wiki/Special:FilePath/filename.jpg> that WDQS uses?
Is the specific location useful? (Is it even stable?) Or would it make more sense to
use the established URL form for these links, as used in WDQS?
These URLs are the URLs that are used by Wikimedia servers to send
back the files. Special:FilePath URLs redirect to them.
I guess that they are mostly stable: they are the ones that have been
used by Wikipedia/Commons for at least 15 years.
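
To make the two URL forms concrete, here is a minimal Python sketch. It
assumes the standard MediaWiki hashed upload layout, where the two
directory levels are the first one and first two hex digits of the MD5
of the file name (with underscores for spaces). The function names are
made up for illustration, and the sketch assumes the name is already in
its canonical form (first letter capitalised, no "File:" prefix).

```python
import hashlib
from urllib.parse import quote, unquote

UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons"
FILEPATH_BASE = "http://commons.wikimedia.org/wiki/Special:FilePath"


def normalize(filename: str) -> str:
    """Replace spaces with underscores, as in stored MediaWiki file names."""
    return filename.strip().replace(" ", "_")


def upload_url(filename: str) -> str:
    """Build the hashed upload.wikimedia.org URL (assumed MD5 layout)."""
    name = normalize(filename)
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Directory levels: first hex digit, then first two hex digits.
    return f"{UPLOAD_BASE}/{digest[0]}/{digest[:2]}/{quote(name)}"


def filepath_url(filename: str) -> str:
    """Build the Special:FilePath URL, the form used in WDQS."""
    return f"{FILEPATH_BASE}/{quote(normalize(filename))}"


def upload_to_filepath(url: str) -> str:
    """Turn an upload URL into the Special:FilePath form via its last segment."""
    name = unquote(url.rsplit("/", 1)[-1])
    return filepath_url(name)
```

Going from the upload URL to Special:FilePath only needs the last path
segment, so the conversion is cheap in either direction.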
These URLs were not used in Wikidata RDF dumps because it would have
required additional database queries to fetch them while generating
the RDF dumps. But the SDOC dumps generator already fetches the file
metadata to output the file type and dimensions, so it is cheap for it
to also output these URLs.
Anyway, it should be fairly easy to change them to point to
Special:FilePath instead, which would indeed make interoperability
with Wikidata easier.
I could easily write a patch for it if the Search team is OK with it.
Could you open a Phabricator ticket about it?
Another option is to add the M-IDs directly to the Wikidata RDF
representation. I have just opened a ticket about it:
https://phabricator.wikimedia.org/T258776
Cheers,
Thomas
On Thu, 23 Jul 2020 at 23:50, James Heald <jpm.heald(a)gmail.com> wrote:
On 23/07/2020 22:26, Hay (Husky) wrote:
Awesome, i'm really happy we finally have at least a start of a
functioning query service.
For now, the two things that i guess would be helpful for most query writers:
1) A way to make ImageGrid work without resorting to the clunky
Special:FilePath hack
2) A nicer way to query Wikidata information without using federation.
I guess 2) might be a bit more difficult, but it might definitely be
something to consider.
Kind regards,
-- Hay
IMO, the best way to avoid the hack of having to construct the
Special:FilePath URLs would be to have these simply available as
triples, so the standard way to get an image with this sort of URL
would just be
sdc:M1234567 ?relation commons:filename.jpg
for some ?relation to be determined.
Also, as User:Mfchris84 has noted on the talk page, another reason to
desire such triples is to make it possible to map from the values of
Wikidata properties like P18 ("image") and its friends, which are of the
form commons:filename.jpg, to the corresponding M-IDs, so that
structured data about the files can be accessed.
At the moment going from commons:filename.jpg -> M-ID -> sdc data is
not straightforward.
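
For illustration, one workaround outside SPARQL is to ask the MediaWiki
API on Commons for the page ID of the file page and prefix it with "M".
This is a minimal Python sketch assuming, as far as I understand, that
the M-ID is just "M" plus that page ID. The helper names are
hypothetical, and the actual HTTP request is left out so that the
parsing step stands alone.

```python
import json
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"


def pageid_query_url(filename: str) -> str:
    """Build an action=query URL that returns the page ID of File:<filename>."""
    params = {
        "action": "query",
        "titles": f"File:{filename}",
        "format": "json",
        "formatversion": "2",  # 'pages' is returned as a list in this format
    }
    return f"{API}?{urlencode(params)}"


def mid_from_response(body: str):
    """Extract the M-ID from a JSON response body, or None if the page is missing."""
    pages = json.loads(body).get("query", {}).get("pages", [])
    for page in pages:
        if "pageid" in page:
            # Assumption: the MediaInfo ID is "M" followed by the page ID.
            return f"M{page['pageid']}"
    return None
```

This costs one API round trip per file (or per batch of titles), which
is the kind of detour that a direct triple would avoid.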
Is there a particular reason that schema:contentUrl was chosen to
point to URLs of the form
<https://upload.wikimedia.org/wikipedia/commons/q/q1/filename.jpg>
rather than
<http://commons.wikimedia.org/wiki/Special:FilePath/filename.jpg>
that WDQS uses?
Is the specific location useful? (Is it even stable?) Or would it make
more sense to use the established URL form for these links, as used in
WDQS?
As to Hay's (2), being able to use the SERVICE for label lookup in WCQS
without requiring explicit federation to WDQS would be a nice step forward.
But overall I am *hugely* impressed by what the team has rolled out.
Thank you so much!
-- James.
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata