Jimmy Wales:
How accurate is the metadata at Flickr? Presumably for photos that people take themselves and upload, it is 100% accurate by definition.
But I worry about copyvios at Flickr leaking into Commons.
The metadata at Flickr is decent, if very POV and personal. Photographers often tell little stories about how a photo came to be. Some tags are a bit silly -- for some reason, people feel the need to tag all their photos with "photo", or with their first name.
It's important to recognize that Flickr is first of all used as a repository for people to directly upload photos from their digital cameras to the Net. As such, there is an abundance of wedding and birthday photos, holiday and tourist shots, road trip series, and the like. Many photos are part of such a sequence.
This is very different from the Commons, where the primary usage pattern is "I want to put this image in an article", and newbies tend to resort very liberally to external, copyrighted content to do so. A very rigorous content screening process also seems to keep Flickr clean of both porn and copyright violations.
From the point of view of the FlickrLickr project, it helps that we are processing the content oldest first. The material we are looking through at the moment has been on Flickr for more than a year. Most copyright questions I've had to think about in the first few batches were more of the kind: "Is it OK for a logo to be visible in this shot?" "Is this sculpture copyrighted?"
Finally, at the moment, every batch is reviewed twice (the preselection by the FlickrLickr user, and the final pre-upload copyedit by myself). This has already led to a number of problem cases being filtered, such as a photo taken from Indymedia and uploaded to Flickr under CC-BY with the note "from Indymedia". I think we can handle the copyright issue, using common sense and good judgment.
But if we're importing large quantities of questionably-licensed data from Flickr, and then Flickr bans the person for doing something wrong, how do we know about it?
Should there be a truly dramatic case of someone managing to get away with uploading hundreds of copyvios over a year or so, there's a chance we'll hear about it - but if we want to be really sure, it would be cool if Flickr could make its user block or deletion log visible somehow.
I've sent you a FlickrLickr account by separate mail. I'll also write a little report after we've finished looking through the first 10,000 images.
Best,
Erik