I've created a frontend and backend for selecting freely licensed photos from Flickr to be uploaded to the Commons:
http://commons.wikimedia.org/wiki/User:FlickrLickr
Using the Flickr API, I am building a database of free photos from Flickr, and users can apply for access to the frontend to review slices of 1,000 photos each. After a slice is finished, I review it and run the upload bot to upload the selected photos to the Commons. See the above page for more information.
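As a rough illustration of the slicing step described above, here is a minimal sketch. It is not the actual FlickrLickr code: the `Photo` record, its field names, and the `make_slices` helper are all hypothetical, and the upload-date ordering is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List

SLICE_SIZE = 1000  # review slices of 1,000 photos each, as described above


@dataclass(frozen=True)
class Photo:
    # Hypothetical record layout; the real database schema is not described.
    photo_id: str
    uploaded: int  # Unix timestamp of the upload to Flickr
    license: str   # e.g. "CC-BY"


def make_slices(photos: Iterable[Photo], size: int = SLICE_SIZE) -> List[List[Photo]]:
    """Sort photos by upload time (oldest first) and cut them into slices.

    Every slice holds `size` photos except possibly the last one.
    """
    ordered = sorted(photos, key=lambda p: p.uploaded)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


if __name__ == "__main__":
    # Fabricated demo records, only to show the shape of the result.
    demo = [Photo(str(n), uploaded=n, license="CC-BY") for n in range(2500)]
    for number, batch in enumerate(make_slices(demo), start=1):
        print(f"slice {number}: {len(batch)} photos")
```

Each reviewer would then be handed one such slice, and the final pre-upload check would run over the photos selected from it.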
There are currently almost half a million CC-BY photos on Flickr, and new ones are uploaded every day. I hope that a systematic effort to review these photos will greatly enrich the Commons.
Please help by applying for access to a slice of Flickr. Best send me a private email with a link to your username so I can look at your past contributions.
Best,
Erik
I would be interested in helping out.
Andre Engels
On 9/18/05, Erik Moeller erik_moeller@gmx.de wrote:
I've created a frontend and backend for selecting freely licensed photos from Flickr to be uploaded to the Commons:
http://commons.wikimedia.org/wiki/User:FlickrLickr
Using the Flickr API, I am building a database of free photos from Flickr, and users can apply for access to the frontend to review slices of 1,000 photos each. After a slice is finished, I review it and run the upload bot to upload the selected photos to the Commons. See the above page for more information.
There are currently almost half a million CC-BY photos on Flickr, and new ones are uploaded every day. I hope that a systematic effort to review these photos will greatly enrich the Commons.
Please help by applying for access to a slice of Flickr. Best send me a private email with a link to your username so I can look at your past contributions.
Best,
Erik
Erik Moeller wrote:
Using the Flickr API, I am building a database of free photos from Flickr, and users can apply for access to the frontend to review slices of 1,000 photos each. After a slice is finished, I review it and run the upload bot to upload the selected photos to the Commons. See the above page for more information.
This sounds fantastic, but I worry about a few things.
How accurate is the metadata at Flickr? Presumably for photos that people take themselves and upload, it is 100% accurate by definition.
But I worry about copyvios at Flickr leaking into Commons.
One of the things that prevents rampant copyvios at Wikimedia projects generally is community reputation. It is essentially impossible to imagine any prominent contributor uploading copyvios and lying about the license data to Wikipedia itself. And if we ever caught someone doing so, we would quickly review all of their contributions and nuke them all.
But if we're importing large quantities of questionably-licensed data from Flickr, and then Flickr bans the person for doing something wrong, how do we know about it?
This is not an insurmountable problem of course.
Reviewing things 1,000 at a time sounds reasonable, but we need to be pretty rigorous somehow.
Please help by applying for access to a slice of Flickr. Best send me a private email with a link to your username so I can look at your past contributions.
I'm known as user Jimbo Wales in most projects. I have the most edits in English Wikipedia, but still not that many. I think if you ask around, though, despite my weak history of editing, a lot of people know me and will tell you that I'm ok. :-)
--Jimbo
Jimmy Wales:
How accurate is the metadata at Flickr? Presumably for photos that people take themselves and upload, it is 100% accurate by definition.
But I worry about copyvios at Flickr leaking into Commons.
The metadata at Flickr is decent, if very POV and personal. Photographers often tell little stories about how a photo came to be. Some tags are a bit silly -- for some reason, people feel the need to tag all their photos with "photo", or with their first name.
It's important to recognize that Flickr is first of all used as a repository for people to directly upload photos from their digital cameras to the Net. As such, there is an abundance of wedding and birthday photos, holiday and tourist shots, road trip series, and the like. Many photos are part of such a sequence.
This is very different from the Commons, where the primary usage pattern is "I want to put this image in an article", and newbies tend to resort very liberally to external, copyrighted content to do so. A very rigorous content screening process also seems to keep Flickr clean of both porn and copyright violations.
From the point of view of the FlickrLickr project, it helps that we are processing the content oldest first. The material we are looking through at the moment has been on Flickr for more than a year. Most copyright questions I've had to think about in the first few batches were more of the kind: "Is it OK for a logo to be visible in this shot?" "Is this sculpture copyrighted?"
Finally, at the moment, every batch is reviewed twice (the preselection by the FlickrLickr user, and the final pre-upload copyedit by myself). This has already led to a number of problem cases being filtered, such as a photo taken from Indymedia and uploaded to Flickr under CC-BY with the note "from Indymedia". I think we can handle the copyright issue, using common sense and good judgment.
But if we're importing large quantities of questionably-licensed data from Flickr, and then Flickr bans the person for doing something wrong, how do we know about it?
Should there be a truly dramatic case of someone managing to get away with uploading hundreds of copyvios over a year or so, there's a chance we'll hear about it. But if we want to be really sure, it would be cool if Flickr could make its user block or deletion log visible somehow.
I've sent you a FlickrLickr account by separate mail. I'll also write a little report after we've finished looking through the first 10,000 images.
Best,
Erik
On 9/20/05, Erik Moeller erik_moeller@gmx.de wrote:
Jimmy Wales:
How accurate is the metadata at Flickr? Presumably for photos that people take themselves and upload, it is 100% accurate by definition.
But I worry about copyvios at Flickr leaking into Commons.
The metadata at Flickr is decent, if very POV and personal. Photographers often tell little stories about how a photo came to be. Some tags are a bit silly -- for some reason, people feel the need to tag all their photos with "photo", or with their first name.
I'd also add that a Creative Commons license is NOT the default at Flickr; you have to explicitly change the license to that. It is possible to change your account settings there so that everything you upload is marked with the Creative Commons license you choose, but again, that is not the default, and it is generally too much effort for someone uploading copyright violations.
A bigger problem would be people uploading material that someone else has made freely available under a Commons-incompatible license, and incorrectly marking it with the 'nearest matching' CC license out of a desire to display its free status.
-Matt
Erik Moeller erik_moeller@gmx.de writes:
There are currently almost half a million CC-BY photos on Flickr, and new ones are uploaded every day.
Do you also harvest CC-BY-SA photos? Is it possible to select photos by user name? I'd like to copy my own photos to the Commons server using a batch upload tool.
If selecting photos by user name is not possible, I could provide a list of filenames instead.
Karl Eichwalder:
Erik Moeller erik_moeller@gmx.de writes:
There are currently almost half a million CC-BY photos on Flickr, and new ones are uploaded every day.
Do you also harvest CC-BY-SA photos? Is it possible to select photos by user name? I'd like to copy my own photos to the Commons server using a batch upload tool.
The goal of FlickrLickr is to incrementally build a database of free photos on Flickr (oldest first) and to collaboratively review them, so it does not have per-user functionality. How many photos are we talking about here? If it's worth it, I can write a small script to retrieve and upload them.
Regards,
Erik