Sorry for the email spam. Worked through it, I think. Not too familiar
with wiki internals. :-)
This particular page doesn't contain the content I'm looking for. It
references a template that is shared by a few other versions of the same
image, presumably so the data is stored once and shown consistently.
Not being familiar with wiki internals, I thought the API wasn't
returning the entire page content. But it is, so I'll have to recognize
this situation and pull the referenced templates when the information
I need isn't already in the page.
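
A rough sketch of that check, in Python. The function names are made up for illustration, and the regex is only an approximation of how wikitext names a transclusion: it will also pick up parser functions and magic words, which a real client would need to filter out.

```python
import re

# Matches the name part of a {{...}} call, up to the first "|" or "}}".
_TRANSCLUSION = re.compile(r"\{\{\s*([^{}|]+?)\s*(?:\||\}\})")


def referenced_templates(wikitext):
    """Return transcluded names in first-seen order, without duplicates."""
    seen = []
    for match in _TRANSCLUSION.finditer(wikitext):
        name = match.group(1).strip()
        if name and name not in seen:
            seen.append(name)
    return seen


def needs_template_lookup(wikitext, wanted_fields):
    """True when none of the wanted '| Field =' parameters is in the page itself."""
    for field in wanted_fields:
        if re.search(r"\|\s*" + re.escape(field) + r"\s*=", wikitext):
            return False
    return True
```

If needs_template_lookup() returns True for a File: page, each name from referenced_templates() is a candidate for a second request. A bare name such as XYZ would be looked up as "Template:XYZ", as in the quoted message below.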
On Tue, Jun 3, 2014 at 2:45 AM, james harvey <jamespharvey20(a)gmail.com>
wrote:
I may have stumbled upon it. If I change the API call from
"titles=File:XYZ.jpg" to "titles=Template:XYZ" (note: dropped the .jpg),
then it *appears* to get me what I need.
Is this correct, or did I run across a case where it appears to work but
isn't going to be the right way to go? (Like, I'm not sure if
"Template:XYZ" directly relates to the Summary information on the
"File:XYZ.jpg" page, or if it's duplicated data that in this case matches.
And, I'm confused why the .jpg gets dropped switching "File:" to
"Template:")
And will this always get me the full template information? Or, if someone
updates just the "Year" portion, would it return only that part? The
revisions seem to return only what changed from the previous revision,
rather than the full current content regardless of other revisions.
On Tue, Jun 3, 2014 at 1:59 AM, james harvey <jamespharvey20(a)gmail.com>
wrote:
Given a Wikimedia Commons description page URL, such as:
https://commons.wikimedia.org/wiki/File:Van_Gogh_-_Starry_Night_-_Google_Ar…
I would like to be able to programmatically retrieve the information in
the "Summary" header. (Values for "Artist", "Title", "Date", "Medium",
"Dimensions", "Current location", etc.)
I believe all this information is in "Template:Artwork". I can't figure
out how to get the wikitext/json-looking template data.
If I use the API and call:
https://commons.wikimedia.org/w/api.php?action=query&format=xml&titles=File:Van%20Gogh%20-%20Starry%20Night%20-%20Google%20Art%20Project.jpg&iilimit=max&iiprop=timestamp%7Cuser%7Ccomment%7Curl%7Csize%7Cmime&prop=imageinfo%7Crevisions&rvgeneratexml=&rvprop=ids%7Ctimestamp%7Cuser%7Ccomment%7Ccontent
Then I don't get the information I'm looking for. This shows the most
recent revision, and its changes. Unless the most recent revision changed
this data, it doesn't show up.
To see all the information I'm looking for, it seems I'd have to specify
rvlimit=max and go through all the past revisions to figure out which is
most current. For example, if I do so and I look at revid 79665032, that
includes: "{{Artwork | Artist = {{Creator:Vincent van Gogh}} | . . . | Year
= 1889 | Technique = {{Oil on canvas}} | . . ."
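
Once wikitext of that shape is in hand, the named parameters can be pulled out with a small splitter. This is only a sketch under assumptions: parse_template is a hypothetical helper, it expects exactly one outer {{...}} call, it drops positional (unnamed) parameters, and it does not handle triple braces or nowiki sections.

```python
def parse_template(wikitext):
    """Split '{{Name | key = value | ...}}' into (name, params).

    Pipes inside nested {{...}} or [[...]] are not treated as separators.
    """
    text = wikitext.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("expected a single {{...}} template call")
    inner = text[2:-2]

    parts, current, depth = [], [], 0
    i = 0
    while i < len(inner):
        pair = inner[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]") and depth > 0:
            depth -= 1
            current.append(pair)
            i += 2
        elif inner[i] == "|" and depth == 0:
            # Top-level pipe: close the current part and start a new one.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(inner[i])
            i += 1
    parts.append("".join(current))

    name = parts[0].strip()
    params = {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:  # keep only named parameters
            params[key.strip()] = value.strip()
    return name, params
```

Nested calls such as {{Creator:Vincent van Gogh}} come back as raw wikitext values, so resolving them to plain text would still need a further lookup.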
Isn't there a way to get the current version in whatever format you'd
call that - the wikitext/json looking format?
In my API call, I can specify rvexpandtemplates which even with only the
most recent revision gives me the information I need, but it's largely in
HTML tables/divs/etc format rather than wikitext/json/xml/etc.
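
For comparison, a minimal sketch of requesting the unexpanded wikitext of the current revision and reading it from a JSON reply. The helper names are hypothetical, and extract_wikitext assumes the older JSON layout in which the revision text sits under the "*" key; newer API versions may nest it differently.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def current_wikitext_url(title):
    """Build a query URL for the raw content of the latest revision of a page."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "rvprop": "content",  # no rvexpandtemplates, so templates stay as wikitext
        "titles": title,
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_wikitext(response):
    """Return the first revision's text from a decoded JSON reply, or None.

    Assumes the layout query -> pages -> <pageid> -> revisions[0]["*"].
    """
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        revisions = page.get("revisions")
        if revisions:
            return revisions[0].get("*")
    return None
```

The same two helpers would work for either title form discussed in this thread, for example current_wikitext_url("Template:XYZ"); fetching and decoding the reply is left to whatever HTTP client is in use.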
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org