Thanks for raising this, Rob. Some comments inline. [+Stephen, who is now leading on the structured metadata for multimedia project]

On Fri, Sep 26, 2014 at 1:59 PM, Rob Lanphier <robla@wikimedia.org> wrote:
(+cc Nemo and Wikidata-tech)

On Fri, Sep 26, 2014 at 5:33 AM, Gergo Tisza <gtisza@wikimedia.org> wrote:
On Fri, Sep 26, 2014 at 2:49 AM, Rob Lanphier <robla@wikimedia.org> wrote:
There's an item that Luis Villa added to the MW Core backlog that I'd like to move to the Multimedia backlog:

I'm assuming everything that he describes fits nicely into what is planned for Structured Data.  Assuming that's true, should I just copy/paste into a new card in Mingle, or a new page on mw.org or what?

This seems to be about article text, or mainly about article text (articles imported from other wikis and so on).

Yeah, that's correct. I hadn't raised it myself in the metadata context for exactly that reason. But certainly there is a lot of overlap between 

https://www.mediawiki.org/wiki/Files_and_licenses_concept

and 

https://www.mediawiki.org/wiki/Multimedia/Structured_Data 

Even if the goals aren't completely the same, if nothing else, some of the schemas should really be made to line up.
 
The plan for the structured data project is to create Wikidata properties for legalese, install Wikibase on Commons (and possibly other wikis which have local images), make that Wikibase use Wikidata properties (and sometimes Wikidata items as values), create a new entity type called mediainfo (which is like a Wikibase item, but associated with a file), and add legal information to the mediainfo entries.
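As a rough illustration of the shape described above, here is a minimal Python sketch of a mediainfo-like record: an entity tied to a file that carries statements whose properties (and sometimes values) are defined on Wikidata. Everything in it is an assumption made for illustration: the class names, the `P_LICENSE` / `P_SOURCE` / `Q_CC_BY_SA` identifiers and the example file title are hypothetical placeholders, not real Wikidata IDs or the actual Wikibase data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Statement:
    # Hypothetical id of a property that would be defined on Wikidata.
    property_id: str
    # Either a literal (e.g. a URL) or a hypothetical Wikidata item id.
    value: str


@dataclass
class MediaInfo:
    # The file this entity is associated with (placeholder title).
    file_title: str
    statements: List[Statement] = field(default_factory=list)

    def values(self, property_id: str) -> List[str]:
        """Return every value recorded for the given property."""
        return [s.value for s in self.statements if s.property_id == property_id]


info = MediaInfo(
    file_title="File:Example.jpg",
    statements=[
        Statement("P_LICENSE", "Q_CC_BY_SA"),  # item used as a value
        Statement("P_SOURCE", "https://example.org/original"),  # literal value
    ],
)

print(info.values("P_LICENSE"))
```

The only design point the sketch tries to capture is that the property definitions live in one place (Wikidata) while the statements live with the file.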

Part of that (the Wikidata properties) could be reused for articles and other non-file content - the source, license etc. properties are generic enough. However, if we want to use this structure to attribute articles, we would either have to make mediainfo into some more generic thing that can be attached to any wiki page, or abuse the langlink/badge feature to serve a similar purpose. That is a major course correction; if we want to do something like that, it should be discussed (with the involvement of the Wikidata team) as soon as possible.

Thanks for the analysis, Gergo!  I was going to split Luis' proposal into a separate wiki page, but I see Nemo has linked to this page as the "Canonical page on the topic":

Without a deep reading that I'm admittedly just not going to have time for, it's hard to tell how related the page that Nemo linked to is to the concepts that Luis is trying to capture. 

Seems to address part but not all of them, if I understand correctly:

- non-editing authors: Does seem to be able to cope with the idea of authors who aren't in the edit history (e.g., because they edited a prior version of the work that was uploaded as a seed to the wiki)
- source: Doesn't seem to have the notion that a work might have an alternate source (e.g., something copied/pasted in from another CC BY-SA source).
- license: not clear if this copes with the notion that there might be multiple compatible licenses on the page.
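To make the three points above concrete, here is a rough Python sketch of what a per-page attribution record would need to hold: authors who are not in the edit history, one or more alternate sources, and more than one license. All names here (`PageAttribution`, its fields, the example authors and URL) are hypothetical and only illustrate the requirements; this is not the schema from either linked page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PageAttribution:
    page_title: str
    # Authors recoverable from the wiki's own edit history.
    history_authors: List[str] = field(default_factory=list)
    # Authors of a prior version that was uploaded or pasted in as a seed.
    non_editing_authors: List[str] = field(default_factory=list)
    # External works that text was copied from.
    sources: List[str] = field(default_factory=list)
    # A page may carry several compatible licenses at once.
    licenses: List[str] = field(default_factory=list)

    def all_authors(self) -> List[str]:
        """Merge both author lists, keeping first-seen order, without duplicates."""
        seen: List[str] = []
        for name in self.history_authors + self.non_editing_authors:
            if name not in seen:
                seen.append(name)
        return seen


page = PageAttribution(
    page_title="Example article",
    history_authors=["Alice", "Bob"],
    non_editing_authors=["Carol", "Alice"],
    sources=["https://example.org/seed-text"],
    licenses=["CC BY-SA 3.0", "GFDL"],
)

print(page.all_authors())
```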
 
One thing I'll note, though: before we get too complacent in thinking that files are somehow simpler than articles, we should consider these relatively common scenarios:
*  Group photo with potentially different per-person personality rights
*  PDF of a slide deck with many images
*  PDF of a Wikipedia article  :-)

Or simply the case of "I copied and pasted an article from a different CC source into the Wikipedia article" - that's what got me thinking about this a while back (though of course, as Nemo points out, PDFs are the canonical problem child here).

Luis 


--
Luis Villa
Deputy General Counsel
Wikimedia Foundation
415.839.6885 ext. 6810
