From a message Erik posted to mediawiki-l (
http://lists.wikimedia.org/pipermail/mediawiki-l/2007-March/018708.html ):
There's a first demo impl of InstantCommons up at: http://141.13.22.239/ic-client/index.php/Main_Page
What this does: Allows any MediaWiki installation to transparently load (and cache) images from a central repository like Commons, provided the repository permits it. Ideally, Wikimedia Commons will do so, and any installation in the world will be able to use files from Commons as if they were locally uploaded.
The current code is in the instantcommons branch. There are still a few issues, and it's not yet feature-complete, so it can't go to the official review yet.
=============== (note: it's not working on Commons, but a fake Commons)
It doesn't seem to create a log entry for the 'borrowed' images, but still creates image pages for them (but doesn't link to the source in any way, which is a bit weird). Probably wiki admins would like to know which images have been 'borrowed'.
No idea why Erik didn't post here as well, as if we're not interested :)
cheers Brianna user:pfctdayelise
Brianna Laugher wrote:
What this does: Allows any MediaWiki installation to transparently load (and cache) images from a central repository like Commons, provided the repository permits it.
Cool! I have been waiting for something like that for quite a while.
It doesn't seem to create a log entry for the 'borrowed' images, but still creates image pages for them (but doesn't link to the source in any way, which is a bit weird). Probably wiki admins would like to know which images have been 'borrowed'.
Linking back to the original page is essential, of course. Also, links to the uploader's user page (and also the author's page, if different) have to work to fulfill the attribution requirement. (I can't check if they do, the demo site appears to be down).
But most importantly: what happens if an image is deleted from commons? Does it vanish from the sites using it via InstantCommons? Or does it just stay? That would not be good legally...
Ideally, the image would be replaced by a warning icon, and local admins can decide if they want to delete, or "really" import their cached copy, and take responsibility.
No idea why Erik didn't post here as well, as if we're not interested :)
Well, maybe he'll tell us :)
-- Daniel
On 4/1/07, Daniel Kinzler daniel@brightbyte.de wrote:
But most importantly: what happens if an image is deleted from commons? Does it vanish from the sites using it via InstantCommons? Or does it just stay? That would not be good legally...
Ideally, the image would be replaced by a warning icon, and local admins can decide if they want to delete, or "really" import their cached copy, and take responsibility.
Probably the same mess as now at the WM wikis :) They would need a CommonsTicker, or whatever.
Bryan
On 4/1/07, Daniel Kinzler daniel@brightbyte.de wrote:
Linking back to the original page is essential, of course. Also, links to the uploader's user page (and also the author's page, if different) have to work to fulfill the attribution requirement. (I can't check if they do, the demo site appears to be down).
The uploader is often not the copyright holder. My view is that it should transclude the page text, like commons does on the other projects, and correctly fix up all the links.
But most importantly: what happens if an image is deleted from commons? Does it vanish from the sites using it via InstantCommons? Or does it just stay? That would not be good legally...
Image revocation is a must-solve feature before we could activate it on commons. Part of the reason that we don't get in deep-crap with more copyright holders is that when they contact us we make the file go bye-bye quickly. If many other sites are invisibly and automatically mirroring commons content without human oversight and ignoring our deletions, then this kills our ability to effectively take down infringements and keep copyright holders happy.
Ideally, the image would be replaced by a warning icon, and local admins can decide if they want to delete, or "really" import their cached copy, and take responsibility.
It would be in our interest to block access to sites which continue to distribute copyright infringements which we have deleted.
On 4/1/07, Gregory Maxwell gmaxwell@gmail.com wrote:
On 4/1/07, Daniel Kinzler daniel@brightbyte.de wrote:
It would be in our interest to block access to sites which continue to distribute copyright infringements which we have deleted.
Right. On another note, has a policy been thought out about mediawiki installations which would bar their users from uploading images locally and ask them to do so on Commons?
Does the Wikimedia Commons community have the strength to deal with potentially thousands of new users not aware of our "free" policies and uploading images to Commons?
Is there a requirement that the websites allowed to use InstantCommons host free content? Or can every Tom, Dick or Harry use InstantCommons?
Delphine
On 4/1/07, Delphine Ménard notafishz@gmail.com wrote:
On 4/1/07, Gregory Maxwell gmaxwell@gmail.com wrote:
On 4/1/07, Daniel Kinzler daniel@brightbyte.de wrote:
It would be in our interest to block access to sites which continue to distribute copyright infringements which we have deleted.
Right. On another note, has a policy been thought out about mediawiki installations which would bar their users from uploading images locally and ask them to do so on Commons?
Does the Wikimedia Commons community have the strength to deal with potentially thousands of new users not aware of our "free" policies and uploading images to Commons?
Is there a requirement that the websites allowed to use InstantCommons host free content? Or can every Tom, Dick or Harry use InstantCommons?
It's just a software feature. We might not use it. I would strongly suggest that we not use it until we have good answers to these questions.
As far as handling new users goes, ... we need to improve that for ourselves even without the extra pressure of instantcommons. Many things have been proposed, few implemented.
As far as free-content-or-not sites go, it would be very interesting when some of our commons users start enforcing the copyleft terms of their images' licenses against non-free sites using instant commons images.
For the copyright related concerns I raised earlier in the thread, here is a fun datapoint:
177,734 files uploaded to commons in Jan and Feb 2007, of those files 20,741 have already been deleted which is a bit under 12%. There have been 45,810 total image deletions on commons since Jan 1st.
This graph is also interesting: http://72.165.205.81/dimage_life3.png
It shows the distribution of age at time of deletion for files deleted on commons. The green curve is the most recent quarter, the purple is prior quarter. It's clear that the introduction of bot-deletion for backlogs has had a substantial impact on the timing of our deletions.
The slope of the CDF after it initially stabilises would indicate that once a file has passed 15 or so days old the probability of us deleting on any given day is just a small constant. This wouldn't be bad if old deletions weren't such a substantial fraction of our total deletions.
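The "a bit under 12%" figure above can be checked directly from the two counts quoted in this message. A minimal sketch; the variable names are only for illustration:

```python
# Counts quoted in the message for Jan and Feb 2007 uploads.
uploaded = 177_734
deleted = 20_741

# Fraction of those uploads that had already been deleted.
deleted_fraction = deleted / uploaded

print(f"{deleted_fraction:.2%} of Jan/Feb 2007 uploads already deleted")
```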
Gregory Maxwell wrote:
Ideally, the image would be replaced by a warning icon, and local admins can decide if they want to delete, or "really" import their cached copy, and take responsibility.
It would be in our interest to block access to sites which continue to distribute copyright infringements which we have deleted.
How about things we delete for violation of policy, not copyright (e.g. NC/ND stuff)? Or as dupes? It would be reasonable for people to want to keep using such images even after we deleted them.
-- Daniel
On 4/1/07, Daniel Kinzler daniel@brightbyte.de wrote:
How about things we delete for violation of policy, not copyright (e.g. NC/ND stuff)? Or as dupes? It would be reasonable for people to want to keep using such images even after we deleted them.
The dupe issue will hopefully be solved after we get image redirects. ;)
Even in the case of NC/ND material we don't want our name being associated with such files.
Other policy issues, sure, but this isn't trivially addressable so long as all deletions look the same to the software. :)
On 4/1/07, Brianna Laugher brianna.laugher@gmail.com wrote:
No idea why Erik didn't post here as well, as if we're not interested :)
Because the feature still needs quite a bit of work, and the full specifications at http://meta.wikimedia.org/wiki/InstantCommons answer some of the questions that would inevitably come up. We've already dealt with these questions, as well, when we originally proposed the project to WMF and the Board of Trustees. I will answer them again in some depth when the feature has been implemented up to specs.
On 4/1/07, Erik Moeller erik@wikimedia.org wrote:
Because the feature still needs quite a bit of work, and the full specifications at http://meta.wikimedia.org/wiki/InstantCommons answer some of the questions that would inevitably come up. We've already dealt with these questions, as well, when we originally proposed the project to WMF and the Board of Trustees. I will answer them again in some depth when the feature has been implemented up to specs.
Yes, and I whined about the copyright issues then, and I've yet to see fully satisfactory answers.
With around 200k out of 1.5 million images deleted thus far and more every day, I do not believe commons is yet at a point of maturity where the automatic redistribution of our images would be a socially responsible action.
In order to get there I think we need to take at least two major steps forward in the quality of our work. We're making good progress but, like all things, it takes time.
We need to be mindful that one of the major differentiators between commons and other user contributed online image repositories is our responsiveness to copyright infringement concerns. We are remarkably more responsive than most other user contributed image repositories, and an automated live image mirror system without automated revocation puts that status in jeopardy.
It can be argued that we already have the mirror problem with text, but text has a long term trend of far fewer copyright complaints per item than media, and with things like automated text copyright violation detection (http://en.wikipedia.org/wiki/Wikipedia:Suspected_copyright_violations) our review mechanisms for text are clearly more sophisticated.
Text mirrors also tend to be either fairly spammy sites whose violations no one would fault us for, or live feed recipients which preserve our responsiveness.
Gregory Maxwell wrote:
We need to be mindful that one of the major differentiators between commons and other user contributed online image repositories is our responsiveness to copyright infringement concerns. We are remarkably more responsive than most other user contributed image repositories, and an automated live image mirror system without automated revocation puts that status in jeopardy.
I think the problem is that our image table has no metadata information about the image, it only knows if we have it or not. We would need at least another bit: a good/bad flag. We could have more states:
- Not found.
- Untagged/Unknown.
- Copyright violation.
- No source/license.
- Fair use
- Non suitable license for commons (nd, nc...)
- Duplicated
- Superseded
- Free license
Added within the templates we use, the change could be transparent for the users.
They could even be available for the deleted images (identified by their sha1 on the FileStore).
When we have this data ready we will be able to do a better job. Images without a valid license wouldn't be shared via InstantCommons, and this could apply to WMF projects too (can't show the image because it doesn't have enough information, give it a "Copyrighted" style...).
InstantCommons users should check (with a maintenance script) that their images are still free, and decide what to do based on the reason (e.g. delete fair use, keep superseded). We should also provide a system to automatically check the files once a month, with 'proper' defaults (they will always be able to keep violations, but don't make that configuration easy).
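The status list and the "decide upon the reason" step can be sketched roughly as follows. This is only an illustration: the enum, the names and the default keep/delete choices are assumptions, except "delete fair use, keep superseded", which comes from the message above. Nothing like this exists in MediaWiki's image table.

```python
from enum import Enum


class ReviewStatus(Enum):
    """Hypothetical per-image states, following the list in the message."""
    NOT_FOUND = "not found"
    UNKNOWN = "untagged/unknown"
    COPYRIGHT_VIOLATION = "copyright violation"
    NO_SOURCE = "no source/license"
    FAIR_USE = "fair use"
    UNSUITABLE_LICENSE = "non suitable license (nd, nc...)"
    DUPLICATED = "duplicated"
    SUPERSEDED = "superseded"
    FREE_LICENSE = "free license"


# Assumed 'proper' defaults: keep only states where the file itself is
# still free. Keeping DUPLICATED is an assumption, not from the thread.
KEEP_BY_DEFAULT = {
    ReviewStatus.FREE_LICENSE,
    ReviewStatus.SUPERSEDED,
    ReviewStatus.DUPLICATED,
}


def default_action(status: ReviewStatus) -> str:
    """Return 'keep' or 'delete' for a locally cached copy."""
    return "keep" if status in KEEP_BY_DEFAULT else "delete"


# What a monthly maintenance run might decide for two cached files.
for status in (ReviewStatus.FAIR_USE, ReviewStatus.SUPERSEDED):
    print(status.value, "->", default_action(status))
```

A site operator could override the table, which matches the point that keeping violations stays possible but should not be the easy path.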
On 4/2/07, Platonides Platonides@gmail.com wrote:
I think the problem is that our image table has no metadata information about the image, it only knows if we have it or not. We would need at least another bit: a good/bad flag.
[snip]
InstantCommons users should check (with a maintenance script) that their images are still free, and decide what to do based on the reason (e.g. delete fair use, keep superseded). We should also provide a system to automatically check the files once a month, with 'proper' defaults (they will always be able to keep violations, but don't make that configuration easy).
After giving it some more thought I think it should work something like this:
* To get instant commons access to commons, sites must agree to:
** Display an instant commons notice (like what the wikimedia wikis do for commons content) and include the attribution and license data.
** Poll a deletion feed at least once a day.
If an image is indicated as deleted, the software will then remove the commons notices from the image, and the image will look like a local upload with no mention of commons. This should be a mandatory requirement for instantcommons access to Wikimedia commons.
The instant commons software should also have a feature to automatically 'delete' these images, so that they can be undeleted by the site operator. This feature should default to on, but site operators can turn it off at their own peril.
If we do end up implementing something like what Platonides suggested, a review status for all images (including deleted ones) then the instant commons auto-deletion setting could depend on the review status.
More simply, we could add a deletion reason dropdown to the deletion dialog and just store the deletion cause code in the deletion log.
/* Non-free open content license (nd, nc...) */
/* Duplicated image */
/* Superseded */
/* Quality problems */
/* other */
/* Missing copyright status */
/* Known copyright violation */
/* Privacy or Libel issues */
I am pretty sure that requiring sites to remove the commons notice on images is something we must do. We need to make it crystal clear that we can not, do not, and will not take any responsibility for an image we've deleted.
Nor should we be encouraging anyone to automatically keep images commons has deleted. Sites should only keep images auto-pulled from commons then deleted on commons after they have made an informed decision to do so, not as a result of the software.
I would be inclined to say that we should not permit instantcommons access to sites which won't use autodeletion for at least the last two causes, if we do end up with detailed deletion cause codes. But I'm less sure about that.
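A rough sketch of how a receiving site might process one day's deletion feed under these rules. The feed format (pairs of file name and cause code), the `CachedImage` fields and the function name are all hypothetical; no such feed exists yet. The two mandatory causes are the last two in the list above.

```python
from dataclasses import dataclass

# Causes for which autodeletion would be required regardless of settings.
MANDATORY_CAUSES = {"known copyright violation", "privacy or libel issues"}


@dataclass
class CachedImage:
    name: str
    shows_commons_notice: bool = True
    locally_deleted: bool = False  # 'deleted' but undeletable by an admin


def apply_deletion_feed(cache, feed, auto_delete=True):
    """Apply (name, cause) entries from a hypothetical deletion feed."""
    for name, cause in feed:
        image = cache.get(name)
        if image is None:
            continue  # we never pulled this file
        # Always drop the commons notice: commons takes no responsibility
        # for a file it has deleted.
        image.shows_commons_notice = False
        # Autodelete by default; the operator can opt out except for the
        # mandatory causes.
        if auto_delete or cause in MANDATORY_CAUSES:
            image.locally_deleted = True
    return cache


cache = {n: CachedImage(n) for n in ("A.jpg", "B.jpg", "C.jpg")}
feed = [("A.jpg", "duplicated image"), ("B.jpg", "known copyright violation")]
apply_deletion_feed(cache, feed, auto_delete=False)
for image in cache.values():
    print(image)
```

With autodeletion switched off, A.jpg only loses its notice, B.jpg is still removed, and C.jpg is untouched.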
Gregory Maxwell wrote:
If an image is indicated as deleted, the software will then remove the commons notices from the image, and the image will look like a local upload with no mention of commons. This should be a mandatory requirement for instantcommons access to Wikimedia commons.
(...)
I am pretty sure that requiring sites to remove the commons notice on images is something we must do. We need to make it crystal clear that we can not, do not, and will not take any responsibility for an image we've deleted.
Plus, the message wording must clearly state that it *was* pulled from commons, better with some legal wording such as "we're not responsible for it" and preferably, with a link to the user who uploaded it (similar to what was done on en: stating that they are archived versions).
Another problem I see is a vandal uploads 'goatse' to commons marked as GFDL, and immediately inserts it on one hundred MediaWiki installs. We will delete it almost immediately but he now has it on a lot of mirrors. Requiring the file to be at least X minutes old would help with this, but makes InstantCommons useless for those who want their users to upload at Commons (which may or may not be a good idea).
On 4/2/07, Platonides Platonides@gmail.com wrote:
Plus, the message wording must clearly state that it *was* pulled from commons, better with some legal wording such as "we're not responsible for it" and preferably, with a link to the user who uploaded it (similar to what was done on en: stating that they are archived versions).
Yes, we have some good users who can craft a lovely clear message when the time comes.
Another problem I see is a vandal uploads 'goatse' to commons marked as GFDL, and immediately inserts it on one hundred MediaWiki installs. We will delete it almost immediately but he now has it on a lot of mirrors. Requiring the file to be at least X minutes old would help with this, but makes InstantCommons useless for those who want their users to upload at Commons (which may or may not be a good idea).
Another reason why autodeletion and deletion cause codes are important... "Maintaining a user submitted image library is *hard*, let the experts at Wikimedia Commons handle it for you" .... The initial access to the content is just a little part of the cost of having media on your website. :)
The delay makes sense. Based on our deletion rates a ~10-day delay would be best.
Perhaps the way to deal with that is to include an override on the receiving site and figure out a way to auto-fire it for content submitted by the site's users?
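The delay plus override could look something like this on the receiving site. A minimal sketch: the function and parameter names are made up, and how the site would learn that an upload came from one of its own users is left open here.

```python
from datetime import datetime, timedelta

# The ~10 day figure suggested above; a site could tune it.
MIN_AGE = timedelta(days=10)


def may_fetch(uploaded_at, now, min_age=MIN_AGE, from_local_user=False):
    """Decide whether a commons file is old enough to be pulled.

    from_local_user is the hypothetical override: files submitted by
    the receiving site's own users skip the delay.
    """
    if from_local_user:
        return True
    return now - uploaded_at >= min_age


now = datetime(2007, 4, 2, 12, 0)
fresh = datetime(2007, 4, 2, 11, 55)   # uploaded five minutes ago
old = datetime(2007, 3, 1, 12, 0)      # uploaded about a month ago

print(may_fetch(fresh, now))                        # too new, blocked
print(may_fetch(old, now))                          # old enough
print(may_fetch(fresh, now, from_local_user=True))  # override fires
```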
Gregory Maxwell wrote:
On 4/2/07, Platonides Platonides@gmail.com wrote:
Plus, the message wording must clearly state that it *was* pulled from commons, better with some legal wording such as "we're not responsible for it" and preferably, with a link to the user who uploaded it (similar to what was done on en: stating that they are archived versions).
Yes, we have some good users who can craft a lovely clear message when the time comes.
Another problem I see is a vandal uploads 'goatse' to commons marked as GFDL, and immediately inserts it on one hundred MediaWiki installs. We will delete it almost immediately but he now has it on a lot of mirrors. Requiring the file to be at least X minutes old would help with this, but makes InstantCommons useless for those who want their users to upload at Commons (which may or may not be a good idea).
Another reason why autodeletion and deletion cause codes are important...
Just to point out that my original idea wasn't to add them ''on deletion'' but as a db field filled by the page templates, though being able to change the code at deletion seems a good idea.
The delay makes sense. Based on our deletion rates a ~10-day delay would be best.
Perhaps the way to deal with that is to include an override on the receiving site and figure out a way to auto-fire it for content submitted by the site's users?
A good method, but... How would we know when it came from the site's users? It's almost impossible.
On 4/3/07, Platonides Platonides@gmail.com wrote:
A good method, but... How would we know when it came from the site's users? It's almost impossible.
Not at all impossible, it would just take a little work. We set up a landing page that they direct their "upload to commons" link to, ... it sets a cookie indicating the site they came from (either one landing page per site, or grab the referrer). Make upload grab the cookie and stuff it in the database someplace.
I think that this would be very useful data even among our own projects. SUL will reduce the need for it internally, but it would still be useful with instantcommons.
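The cookie-or-referrer lookup might be sketched as below. The cookie name `ic_origin` and the function are invented for illustration; this is not an existing MediaWiki mechanism.

```python
from urllib.parse import urlparse

# Hypothetical cookie set by the landing page.
ORIGIN_COOKIE = "ic_origin"


def origin_site(cookies, referrer=None):
    """Guess which wiki an uploader came from.

    Prefer the landing-page cookie; fall back to the referrer's host.
    Returns None when neither is available, e.g. when the user went to
    commons directly.
    """
    site = cookies.get(ORIGIN_COOKIE)
    if site:
        return site
    if referrer:
        return urlparse(referrer).hostname
    return None


print(origin_site({"ic_origin": "wiki.example.org"}))
print(origin_site({}, "http://wiki.example.org/wiki/Special:Upload"))
print(origin_site({}))
```

The value returned at upload time is what would be stored alongside the upload record.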
Gregory Maxwell wrote:
On 4/3/07, Platonides wrote:
A good method, but... How would we know when it came from the site's users? It's almost impossible.
Not at all impossible, it would just take a little work. We set up a landing page that they direct their "upload to commons" link to, ... it sets a cookie indicating the site they came from (either one landing page per site, or grab the referrer). Make upload grab the cookie and stuff it in the database someplace.
I think that this would be very useful data even among our own projects. SUL will reduce the need for it internally, but it would still be useful with instantcommons.
I thought about it, but I found several problems:
- Most wikis will also allow local upload, not only commons.
- Upload instructions will say something like "You need to upload to Wikimedia Commons http://commons.wikimedia.org", i.e. they won't provide an esoteric extra parameter.
- Even if they do provide it, some people will go there directly, not getting the cookie / not providing the referrer.
- Asking to provide their local wiki URL won't be popular.
- Some sites will have several wikis, wanting to use it on all (e.g. a local wikia and wikia central).
However, I like the idea of having a "landing page" where we can give information to InstantCommoners (commons scope, licenses, delays...), set cookies, etc.