Hi all,
Last week I went to a conference hosted by Creative Commons Australia http://creativecommons.org.au/australasiancommons. One of the talks was "Play at Powerhouse" by Sebastian Chan, the manager of web services at Powerhouse Museum. They are one of the institutions participating in Flickr's "The Commons" - http://www.flickr.com/photos/powerhouse_museum/.
A lot of their images have been transferred to Commons. I sat up in my chair fairly well when Seb raised this as an "issue". From his perspective, this is a problem, because once the images leave Flickr, they lose the ability to easily track and report usage. As an institution they use these stats to justify the effort of making their content digitally available (and geocoding it, and maintaining the Flickr community, etc).
There are some screenshots showing Flickr stats in this post: http://searchengineland.com/071213-161815.php
Making institutions feel more comfortable with their content appearing on Wikimedia Commons is obviously a good thing for us. We have pageview stats and checkusage and who knows what other exciting stuff.
I think we should develop an automated process for creating institutional stats reports and then contact the institutions whose works we use and offer them this report - starting with the Powerhouse Museum.
In fact, I don't see a reason not to make the reports public. They could be generated once a month (or day, whatever) and then sit on the toolserver for people to look at. Providing them as a public reference might actually serve as incentive for more institutions to actively engage with us.
Thoughts? Good idea or waste of time?
cheers, Brianna
On Wed, Jul 2, 2008 at 7:00 PM, Brianna Laugher brianna.laugher@gmail.com wrote:
I think we should develop an automated process for creating institutional stats reports and then contact the institutions whose works we use and offer them this report - starting with the Powerhouse Museum.
Sounds like a good idea. Combine CheckUsage + domas' Wikistats and you have some pretty awesome views per image stats. In fact I don't see a reason (except performance) to not make such stats available per image. And then stuff like total views for images in Category:X
Bryan
Bryan Tong Minh bryan.tongminh@gmail.com wrote on wed, 2 jul 2008 19:08:45 +0200:
On Wed, Jul 2, 2008 at 7:00 PM, Brianna Laugher brianna.laugher@gmail.com wrote:
I think we should develop an automated process for creating institutional stats reports and then contact the institutions whose works we use and offer them this report - starting with the Powerhouse Museum.
Sounds like a good idea. Combine CheckUsage + domas' Wikistats and you have some pretty awesome views per image stats. In fact I don't see a reason (except performance) to not make such stats available per image. And then stuff like total views for images in Category:X
... or stats per uploader or per Flickr user ... sounds great.
Regards,
Flo
2008/7/3 Bryan Tong Minh bryan.tongminh@gmail.com:
Sounds like a good idea. Combine CheckUsage + domas' Wikistats and you have some pretty awesome views per image stats. In fact I don't see a reason (except performance) to not make such stats available per image. And then stuff like total views for images in Category:X
Is Commons data in wikistats? I had a feeling it wasn't?
Brianna
On Thu, Jul 3, 2008 at 3:05 AM, Brianna Laugher brianna.laugher@gmail.com wrote:
2008/7/3 Bryan Tong Minh bryan.tongminh@gmail.com:
Sounds like a good idea. Combine CheckUsage + domas' Wikistats and you have some pretty awesome views per image stats. In fact I don't see a reason (except performance) to not make such stats available per image. And then stuff like total views for images in Category:X
Is Commons data in wikistats? I had a feeling it wasn't?
Most of the image views aren't on Commons; they are on Wikipedia. What is being suggested is to determine where images are used and add up the counters for those pages.
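A minimal sketch of that idea, assuming we already have two lookups in hand: a usage map (the kind of list CheckUsage returns) and a table of page view counters. The file names, page names, and numbers below are made up for illustration, and neither tool is actually called here.

```python
# Hypothetical input: image -> pages embedding it, as (wiki, title) pairs.
usage = {
    "Example_A.jpg": [("en", "Sydney"), ("de", "Sydney")],
    "Example_B.jpg": [("en", "Sydney")],
}

# Hypothetical input: (wiki, title) -> page views in the reporting period.
page_views = {
    ("en", "Sydney"): 1200,
    ("de", "Sydney"): 300,
}


def views_per_image(usage, page_views):
    """Sum the view counters of every page that embeds each image."""
    return {
        image: sum(page_views.get(page, 0) for page in set(pages))
        for image, pages in usage.items()
    }


totals = views_per_image(usage, page_views)
for image, total in sorted(totals.items()):
    print(image, total)
```

Pages with no counter simply contribute zero, so a missing entry in the view table does not break the report.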
2008/7/3 Gregory Maxwell gmaxwell@gmail.com:
On Thu, Jul 3, 2008 at 3:05 AM, Brianna Laugher brianna.laugher@gmail.com wrote:
2008/7/3 Bryan Tong Minh bryan.tongminh@gmail.com:
Sounds like a good idea. Combine CheckUsage + domas' Wikistats and you have some pretty awesome views per image stats. In fact I don't see a reason (except performance) to not make such stats available per image. And then stuff like total views for images in Category:X
Is Commons data in wikistats? I had a feeling it wasn't?
Most of the image views aren't on Commons; they are on Wikipedia. What is being suggested is to determine where images are used and add up the counters for those pages.
I get that, but why not include Commons data too?
I would be interested to see Commons page stats info in its own right. If only to see if anything beats [[category:sex]] for pageviews :)
Brianna
On Thu, Jul 3, 2008 at 3:34 AM, Brianna Laugher
I get that, but why not include Commons data too?
Indeed ... why not? Which is why the data is in the released stuff. :)
at http://dammit.lt/wikistats/
I would be interested to see Commons page stats info in its own right. If only to see if anything beats [[category:sex]] for pageviews :)
Yes, apparently "Category:Shaved_genitalia_(female)" does, :-/ ... among other things.
commons.m Special:Search 2181 15612408
commons.m Main_Page 1846 69875933
commons.m Special:AutoLogin 856 2891273
commons.m Image:Vagina-anatomy-labelled2.jpg 564 5811141
commons.m Image:Vagina-anatomy1.jpg 475 4480932
commons.m Special:Upload 470 3780969
commons.m Category:Vulva 273 1844972
commons.m Category:Female_nude_in_photography 264 8720259
commons.m Category:Sex_positions 262 787852
commons.m Category:Penis 248 2540564
commons.m Image:Vagina.JPG 246 1767413
commons.m Category:Erotic 238 1552372
commons.m Penis 235 1523123
commons.m Category:Shaved_genitalia_%28female%29 233 1038409
commons.m Category:Sex 212 1207521
commons.m Commons:Upload/es 205 1645392
commons.m Category:Ejaculation 205 1335772
commons.m Special:Watchlist 193 2654496
commons.m Category:Masturbation 189 727640
commons.m Image:Vagina,Anus,Pereneum-Detail.jpg 176 1831813
commons.m Hauptseite 152 2927218
commons.m Category:Nudity 150 600007
commons.m Image:Human_penis_erect.jpg 149 1255852
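The records quoted above look like four whitespace-separated fields: project, page title, request count, and bytes transferred. That reading is an assumption based on the sample alone, not on any format documentation. Under that assumption, a rough sketch for parsing and ranking such lines could be:

```python
from typing import NamedTuple
from urllib.parse import unquote


class PageCount(NamedTuple):
    project: str
    title: str
    views: int
    bytes_sent: int


def parse_line(line: str) -> PageCount:
    """Parse one record; assumes 'project title count bytes' order."""
    project, title, views, bytes_sent = line.split()
    # Titles in the sample are percent-encoded, e.g. %28 for "(".
    return PageCount(project, unquote(title), int(views), int(bytes_sent))


# A few of the records quoted in this thread.
sample = """\
commons.m Special:Upload 470 3780969
commons.m Special:Search 2181 15612408
commons.m Main_Page 1846 69875933
commons.m Commons:Upload/es 205 1645392
"""

records = [parse_line(line) for line in sample.splitlines() if line.strip()]
top = sorted(records, key=lambda r: r.views, reverse=True)
for record in top:
    print(record.title, record.views)
```

Filtering `records` on `project == "commons.m"` before sorting would restrict a larger dump to Commons pages only.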
On Jul 3, 2008, at 3:28 AM, Gregory Maxwell wrote:
On Thu, Jul 3, 2008 at 3:05 AM, Brianna Laugher brianna.laugher@gmail.com wrote:
2008/7/3 Bryan Tong Minh bryan.tongminh@gmail.com:
Sounds like a good idea. Combine CheckUsage + domas' Wikistats and you have some pretty awesome views per image stats. In fact I don't see a reason (except performance) to not make such stats available per image. And then stuff like total views for images in Category:X
Is Commons data in wikistats? I had a feeling it wasn't?
Most of the image views aren't on Commons; they are on Wikipedia. What is being suggested is to determine where images are used and add up the counters for those pages.
So how do you account for images being removed from and added to pages over time? For example [[Image:WScottHancock.jpg]] is used on over 100 pages on en.wp alone atm but that count will likely drop dramatically in less than a day. <http://toolserver.org/~daniel/WikiSense/CheckUsage.php?i=WScottHancock.jpg>
For that matter, what about really long pages where people tend to go to read only part of the page, like talk pages? Should extra weight be given to views of an image or its description page directly (rather than on an article or other page that references it)?
--Jeremy
[[user:jeremyb]]
On Thu, Jul 3, 2008 at 11:18 PM, Jeremy Baron jeremy@tuxmachine.com wrote:
On Jul 3, 2008, at 3:28 AM, Gregory Maxwell wrote:
On Thu, Jul 3, 2008 at 3:05 AM, Brianna Laugher brianna.laugher@gmail.com wrote:
2008/7/3 Bryan Tong Minh bryan.tongminh@gmail.com:
Sounds like a good idea. Combine CheckUsage + domas' Wikistats and you have some pretty awesome views per image stats. In fact I don't see a reason (except performance) to not make such stats available per image. And then stuff like total views for images in Category:X
Is Commons data in wikistats? I had a feeling it wasn't?
Most of the image views aren't on Commons; they are on Wikipedia. What is being suggested is to determine where images are used and add up the counters for those pages.
So how do you account for images being removed from and added to pages over time? For example [[Image:WScottHancock.jpg]] is used on over 100 pages on en.wp alone atm but that count will likely drop dramatically in less than a day. http://toolserver.org/~daniel/WikiSense/CheckUsage.php?i=WScottHancock.jpg
We could do daily CheckUsage for images and count on a day-by-day basis. Or just once a week. Or once a month. These will average out over time.
For that matter, what about really long pages where people tend to go to read only part of the page like talk pages?
It's hard to measure how often an image is looked at on a page. But then, this is not an exact metric in the first place.
Should extra weight be given to views of an image or its description page directly? (rather than on an article or other page that references it)
We could list that separately. Our "customers" might not be interested in the per-image views, but more in the overall collection views. Like "pages containing your images were looked at 10.000 times, and the image was selected and looked at 1.000 times".
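As a rough sketch of the day-by-day counting with the two figures listed separately: assume one usage snapshot and one set of page counters per day. Both data structures, the page names, and the numbers are hypothetical; the "Image:" prefix for description pages follows the titles quoted elsewhere in this thread.

```python
from datetime import date

# Hypothetical daily usage snapshots: day -> image -> pages using it that day.
usage_by_day = {
    date(2008, 7, 3): {"Example.jpg": {"Page_A", "Page_B"}},
    date(2008, 7, 4): {"Example.jpg": {"Page_A"}},  # removed from Page_B
}

# Hypothetical daily counters: day -> page title -> views.
views_by_day = {
    date(2008, 7, 3): {"Page_A": 100, "Page_B": 50, "Image:Example.jpg": 5},
    date(2008, 7, 4): {"Page_A": 120, "Page_B": 60, "Image:Example.jpg": 7},
}


def collection_report(images, usage_by_day, views_by_day):
    """Return (views of pages containing the images, direct image-page views)."""
    containing = 0
    direct = 0
    for day, views in views_by_day.items():
        usage = usage_by_day.get(day, {})
        pages = set()
        for image in images:
            pages |= usage.get(image, set())
            direct += views.get("Image:" + image, 0)
        # Each page counts once per day, even if it holds several images.
        containing += sum(views.get(page, 0) for page in pages)
    return containing, direct


containing, direct = collection_report(["Example.jpg"], usage_by_day, views_by_day)
print(f"pages containing your images: {containing} views")
print(f"image pages viewed directly: {direct} views")
```

Because usage is looked up per day, views of Page_B on the second day are not credited to the image after it was removed. A weekly or monthly snapshot would use the same loop with coarser keys.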
Magnus
We could do daily CheckUsage for images and count on a day-by-day basis. Or just once a week. Or once a month. These will average out over time.
Err... don't we have logs of the access to the image files themselves (including thumbnails)? It would be MUCH simpler to analyse those than to look at page views and then at image usage and then infer how much an image is loaded...
Basically, there are two interesting figures: 1) "On how many pages is a given image used?" (maybe also "on how many different wikis") 2) "How often is a given image loaded?"
What has been discussed so far is "how often are pages that use the image loaded", which seems to be beside the point...
-- Daniel
2008/7/4 Daniel Kinzler daniel@brightbyte.de:
We could do daily CheckUsage for images and count on a day-by-day basis. Or just once a week. Or once a month. These will average out over time.
Err... don't we have logs of the access to the image files themselves (including thumbnails)? It would be MUCH simpler to analyse those than to look at page views and then at image usage and then infer how much an image is loaded...
Maybe the royal 'we' has those logs, but are they accessible?
Brianna
Brianna Laugher schrieb:
2008/7/4 Daniel Kinzler daniel@brightbyte.de:
We could do daily CheckUsage for images and count on a day-by-day basis. Or just once a week. Or once a month. These will average out over time.
Err... don't we have logs of the access to the image files themselves (including thumbnails)? It would be MUCH simpler to analyse those than to look at page views and then at image usage and then infer how much an image is loaded...
Maybe the royal 'we' has those logs, but are they accessible?
I asked the royal domas, and he said we don't yet have those logs, but he "could" provide them. So, let's poke him to do it. It's probably less effort in the end than hacking something together. And a lot cleaner anyway.
-- Daniel
Magnus Manske wrote:
We could do daily CheckUsage for images and count on a day-by-day basis. Or just once a week. Or once a month. These will average out over time.
That would work if they were exact views. But given that they're average views, the fact that one person viewed the page today at a ratio of 1/1000 doesn't rule out that the other 999 people viewed it yesterday, before the image was added.
Daniel Kinzler wrote:
Err... don't we have logs of the access to the image files themselves (including thumbnails)? It would be MUCH simpler to analyse those than to look at page views and then at image usage and then infer how much an image is loaded...
I think only urls in the form http://foo/wiki/bar are being counted.
Daniel Kinzler wrote:
Err... don't we have logs of the access to the image files themselves (including thumbnails)? It would be MUCH simpler to analyse those than to look at page views and then at image usage and then infer how much an image is loaded...
I think only urls in the form http://foo/wiki/bar are being counted.
It seems to be that way currently, yes, but i see no reason not to change it. It would be simple enough, afaik.
-- daniel
On Mon, Jul 7, 2008 at 2:42 PM, Daniel Kinzler daniel@brightbyte.de wrote:
Daniel Kinzler wrote:
Err... don't we have logs of the access to the image files themselves (including thumbnails)? It would be MUCH simpler to analyse those than to look at page views and then at image usage and then infer how much an image is loaded...
I think only urls in the form http://foo/wiki/bar are being counted.
It seems to be that way currently, yes, but i see no reason not to change it. It would be simple enough, afaik.
Of course, for usage stats, we'd have to count the full-size image and all thumbnails as well.
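A hedged sketch of how full-size and thumbnail requests might be folded into one count per file. It assumes the request logs would contain upload URLs in the usual hashed layout (`/wikipedia/commons/a/ab/Name.jpg` and `/wikipedia/commons/thumb/a/ab/Name.jpg/120px-Name.jpg`); no such logs exist yet according to this thread, and the `a/ab` directories and file name below are placeholders.

```python
from collections import Counter
from urllib.parse import unquote, urlparse


def file_from_upload_path(url):
    """Return the original file name for a full-size or thumbnail URL, else None."""
    parts = urlparse(url).path.strip("/").split("/")
    # Assumed thumbnail layout: wikipedia/commons/thumb/a/ab/Name.jpg/120px-Name.jpg
    if len(parts) >= 3 and parts[2] == "thumb":
        return unquote(parts[5]) if len(parts) == 7 else None
    # Assumed full-size layout: wikipedia/commons/a/ab/Name.jpg
    if len(parts) == 5:
        return unquote(parts[4])
    return None


# Hypothetical log excerpt; the last entry is an ordinary page view.
requests = [
    "http://upload.wikimedia.org/wikipedia/commons/a/ab/Example.jpg",
    "http://upload.wikimedia.org/wikipedia/commons/thumb/a/ab/Example.jpg/120px-Example.jpg",
    "http://upload.wikimedia.org/wikipedia/commons/thumb/a/ab/Example.jpg/800px-Example.jpg",
    "http://commons.wikimedia.org/wiki/Main_Page",
]

counts = Counter(
    name for name in map(file_from_upload_path, requests) if name is not None
)
print(counts)
```

Keeping the thumbnail width as a second key would allow a split between full-size loads and thumbnail loads if that distinction turns out to matter.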
Magnus
Magnus Manske schrieb:
On Mon, Jul 7, 2008 at 2:42 PM, Daniel Kinzler daniel@brightbyte.de wrote:
Daniel Kinzler wrote:
Err... don't we have logs of the access to the image files themselves (including thumbnails)? It would be MUCH simpler to analyse those than to look at page views and then at image usage and then infer how much an image is loaded...
I think only urls in the form http://foo/wiki/bar are being counted.
It seems to be that way currently, yes, but i see no reason not to change it. It would be simple enough, afaik.
Of course, for usage stats, we'd have to count the full-size image and all thumbnails as well.
Magnus
That was the idea, yes.
-- daniel
On Wed, Jul 2, 2008 at 1:00 PM, Brianna Laugher [snip]
Thoughts? Good idea or waste of time?
It's a good idea to connect the page view stats to provide stats per image and offer aggregates.
But at the same time we don't want to further the belief that views directly relate to value. Providing the right image to the right person has a lot more value than simply showing an image to lots and lots of people.
When an image is placed in some obscure Wikipedia article it might not get a lot of page views, but when it is seen it is probably of substantial interest and value, far more so than yet-another-image scrolling by in a flickr feed.
Wikipedia sees an absolutely enormous amount of traffic but it is distributed over an enormous number of articles, so many pages get fairly few views... but if you took away all the low traffic pages Wikipedia would lose its value almost completely. When something like a Powerhouse Museum image becomes part of a Wikipedia article it becomes more than a single image, it helps complete this enormous and widely used reference work, and its contribution is far greater than the sum of its pageviews.
Just a thought....
2008/7/3 Gregory Maxwell gmaxwell@gmail.com:
On Wed, Jul 2, 2008 at 1:00 PM, Brianna Laugher [snip]
Thoughts? Good idea or waste of time?
It's a good idea to connect the page view stats to provide stats per image and offer aggregates.
But at the same time we don't want to further the belief that views directly relate to value. Providing the right image to the right person has a lot more value than simply showing an image to lots and lots of people.
I'm sure when they explain that to the people holding the purse strings, they'll be understanding :P
Page views hardly tell the whole story, but currently they get nada from us.
cheers Brianna
Dear Greg,
I cannot agree more with you: raw statistics such as page counts or links do not take into account whether a given picture was really helpful to some category of users. (I may sound elitist, but I would much rather see a student get content on, say, history or science, than a thousand people going to see the latest news on some worthless TV persona.)
Funding agencies like numbers. In the scientific fields, this has led to so-called "bibliometrics", and, unsurprisingly, to various strategies meant to raise these metrics, often at the expense of the best interests of science.
One reason why funding providers and their management (in fine, answerable to politicians in the case of public agencies, and businesspeople in the case of private foundations) like numbers is that they give an illusion of objectivity, and they are easier to obtain than human evaluations.
Unfortunately, we have to make do with these quirks. Scientists watch their h-index, and museums want to know whether their images get hits. We should provide them with this data if we can.
Regards DM
2008/7/2 Gregory Maxwell gmaxwell@gmail.com:
But at the same time we don't want to further the belief that views directly relate to value. Providing the right image to the right person has a lot more value than simply showing an image to lots and lots of people.
When an image is placed in some obscure Wikipedia article it might not get a lot of page views, but when it is seen it is probably of substantial interest and value, far more so than yet-another-image scrolling by in a flickr feed.
Thing is, these institutions - and by extension the people wanting these figures - don't themselves desperately believe that views relate to value. (Ask a librarian about how meaningful they think their circulation figures are as a metric!)
But they *are* impressive. They are, for want of anything better, a first step as evidence that something is being used at all. They're a number you can quote and wave around and put in your reports and your funding requests and your cheerful press releases.
When it boils down to it, we're helping people play the game in order that they can give us better content, and if we need to do vaguely pointless things in order to do so, I say go for it :-)