Hi folks,
There's an item that Luis Villa added to the MW Core backlog that I'd like to move to the Multimedia backlog: https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_Core_Team/Backlog#Structu...
I'm assuming everything that he describes fits nicely into what is planned for Structured Data. Assuming that's true, should I just copy/paste into a new card in Mingle, or a new page on mw.org or what?
Rob
On Fri, Sep 26, 2014 at 2:49 AM, Rob Lanphier robla@wikimedia.org wrote:
There's an item that's Luis Villa added to the MW Core backlog that I'd like to move to the Multimedia backlog:
https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_Core_Team/Backlog#Structu...
I'm assuming everything that he describes fits nicely into what is planned for Structured Data. Assuming that's true, should I just copy/paste into a new card in Mingle, or a new page on mw.org or what?
This seems to be about article text, or mainly about article text (articles imported from other wikis and so on).
The plan for the structured data project is to create Wikidata properties for legalese, install Wikibase on Commons (and possibly other wikis which have local images), make that Wikibase use Wikidata properties (and sometimes Wikidata items as values), create a new entity type called mediainfo (which is like a Wikibase item, but associated with a file), and add legal information to the mediainfo entries.
Part of that (the Wikidata properties) could be reused for articles and other non-file content - the source, license etc. properties are generic enough. However, if we want to use this structure to attribute files, we would either have to make mediainfo into some more generic thing that can be attached to any wiki page, or abuse the langlink/badge feature to serve a similar purpose. That is a major course correction; if we want to do something like that, that should be discussed (with the involvement of the Wikidata team) as soon as possible.
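To make the plan above a bit more concrete, here is a minimal sketch of the idea in Python. It is only an illustration under stated assumptions: the class names (`Statement`, `MediaInfo`), the property IDs (`P_LICENSE`, `P_SOURCE`) and the item ID (`Q_CC_BY_SA`) are hypothetical and are not taken from the actual Wikibase data model or API.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical property IDs, for illustration only. In the plan described
# above these would be defined once on Wikidata and shared by other wikis.
SHARED_PROPERTIES = {"P_LICENSE": "license", "P_SOURCE": "source"}


@dataclass
class Statement:
    property_id: str  # refers to a shared property
    value: str        # a plain value, or a (hypothetical) item ID


@dataclass
class MediaInfo:
    """Like an item, but associated with one file instead of a concept."""
    file_title: str
    statements: List[Statement] = field(default_factory=list)

    def add(self, property_id: str, value: str) -> None:
        # Only properties known to the shared repository are accepted.
        if property_id not in SHARED_PROPERTIES:
            raise KeyError(f"unknown shared property: {property_id}")
        self.statements.append(Statement(property_id, value))

    def values(self, property_id: str) -> List[str]:
        return [s.value for s in self.statements if s.property_id == property_id]


info = MediaInfo("File:Example.jpg")
info.add("P_LICENSE", "Q_CC_BY_SA")                   # item used as a value
info.add("P_SOURCE", "https://example.org/original")  # plain value
```

In this sketch the properties are generic, but the record itself is keyed by a file title, which is the part that would need to change before the same structure could describe article text.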
(+cc Nemo and Wikidata-tech)
On Fri, Sep 26, 2014 at 5:33 AM, Gergo Tisza gtisza@wikimedia.org wrote:
On Fri, Sep 26, 2014 at 2:49 AM, Rob Lanphier robla@wikimedia.org wrote:
There's an item that's Luis Villa added to the MW Core backlog that I'd like to move to the Multimedia backlog:
https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_Core_Team/Backlog#Structu...
I'm assuming everything that he describes fits nicely into what is planned for Structured Data. Assuming that's true, should I just copy/paste into a new card in Mingle, or a new page on mw.org or what?
This seems to be about article text, or mainly about article text (articles imported from other wikis and so on).
The plan for the structured data project is to create Wikidata properties for legalese, install Wikibase on Commons (and possibly other wikis which have local images), make that Wikibase use Wikidata properties (and sometimes Wikidata items as values), create a new entity type called mediainfo (which is like a Wikibase item, but associated with a file), and add legal information to the mediainfo entries.
Part of that (the Wikidata properties) could be reused for articles and other non-file content - the source, license etc. properties are generic enough. However, if we want to use this structure to attribute files, we would either have to make mediainfo into some more generic thing that can be attached to any wiki page, or abuse the langlink/badge feature to serve a similar purpose. That is a major course correction; if we want to do something like that, that should be discussed (with the involvement of the Wikidata team) as soon as possible.
Thanks for the analysis, Gergo! I was going to split Luis' proposal into a separate wiki page, but I see Nemo has linked to this page as the "Canonical page on the topic": https://www.mediawiki.org/wiki/Files_and_licenses_concept
Without a deep reading that I'm admittedly just not going to have time for, it's hard to tell how related the page that Nemo linked to is to the concepts that Luis is trying to capture. Could someone (Nemo? Luis?) merge Luis's requirements into the "canonical page" to Luis' satisfaction, so I can delete most of the information from our backlog? I'll keep the item on the MW Core backlog, since I don't know where else to put it, but it's probably going to be relatively low priority for that team.
Multimedia team and Wikidata team, could you make sure you're considering the requirements that Luis brought up as you build your solution? Even if you decide to punt on some of the things that aren't strictly necessary for files, it's still good to make sure you don't paint us into a corner if/when we do try to do something more sophisticated for articles.
One thing I'll note, though, before we get too complacent in thinking that files are somehow simpler than articles, we should consider these relatively common scenarios:
- Group photo with potentially different per-person personality rights
- PDF of a slide deck with many images
- PDF of a Wikipedia article :-)
Rob
Rob Lanphier, 26/09/2014 22:59:
Thanks for the analysis, Gergo! I was going to split Luis' proposal into a separate wiki page, but I see Nemo has linked to this page as the "Canonical page on the topic": https://www.mediawiki.org/wiki/Files_and_licenses_concept
Without a deep reading that I'm admittedly just not going to have time for, it's hard to tell how related the page that Nemo linked to is to the concepts that Luis is trying to capture.
The gist of that idea is: associate actual copyright metadata to content; use ContentHandler for certain blobs of information. Krinkle and others were the main authors of that page and the idea was never worked on, but it can be extended in many ways... except that it's a bit pointless to expand it further when even a smaller scope is hard to work on.
Other than files, the two classic pain points about copyright metadata are:
a) display of page authors https://bugzilla.wikimedia.org/show_bug.cgi?id=2994#c14
b) metadata about works (e.g. books) stored across pages https://bugzilla.wikimedia.org/show_bug.cgi?id=15071
Could someone (Nemo? Luis?) merge Luis's requirements into the "canonical page" to Luis' satisfaction, so I can delete most of the information from our backlog? I'll keep the item on the MW Core backlog, since I don't know where else to put it, but it's probably going to be relatively low priority for that team.
Multimedia team and Wikidata team, could you make sure you're considering the requirements that Luis brought up as you build your solution? Even if you decide to punt on some of the things that aren't strictly necessary for files, it's still good to make sure you don't paint us in a corner when if/when we do try to do something more sophisticated for articles.
A solution based on an external wiki (Wikidata) as for files... may work for (b) but won't for (a). That said, the original idea for files could be reused for both (a) and (b).
One thing I'll note, though, before we get too complacent in thinking that files are somehow simpler than articles, we should consider these relatively common scenarios:
- Group photo with potentially different per-person personality rights
- PDF of a slide deck with many images
- PDF of a Wikipedia article :-)
The last point being bug 2994 (and friends).
Nemo
Thanks, Robla and Nemo!
We will bring up these points when we meet with the Wikidata team in a week to discuss the Structured Data project (1).
My initial impression is that this would be a significant scope increase for an already large project. So for practical reasons, we may not be able to take it on until after we have implemented structured data for multimedia files.
That said, it would be important to consider this request before we start development on structured data for multimedia files, so we can investigate solutions that could make this second phase possible. So we will add this discussion to our meeting agenda.
We will update the relevant pages after we’ve had a chance to discuss this as a team.
To be continued,
Fabrice
P.S.: I have also Cc:d Luis and Stephen from our legal team, so they can remain part of that discussion as well.
(1) https://commons.wikimedia.org/wiki/Commons:Structured_data
I agree that we should not broaden the scope of the current structured data project, but we should not ignore this earlier proposal either:
1) We should look at the use cases and requirements that Krinkle & Co analyzed, and see to what conclusions they came. I'm sure there is something to be learned there.
2) When working on the media metadata spec, we should avoid the assumption that the subject of that metadata is a file managed in the File namespace. It might also be the content of a wiki page, or a work not present in the wiki at all. It will probably not be practical to avoid this assumption all the way, but perhaps we can keep it to the UI level, so the same code and data structures can be re-used to describe works managed elsewhere.
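As a rough sketch of point 2, assuming nothing about the eventual spec: the metadata record below does not care what kind of thing it describes, and only a UI-level helper inspects the subject type. All names here (`FileSubject`, `PageSubject`, `ExternalWork`, `WorkMetadata`, `credit_line`, `ui_heading`) are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class FileSubject:
    title: str  # a file managed in the File namespace


@dataclass(frozen=True)
class PageSubject:
    title: str  # the content of an ordinary wiki page


@dataclass(frozen=True)
class ExternalWork:
    label: str  # a work that is not present in the wiki at all


Subject = Union[FileSubject, PageSubject, ExternalWork]


@dataclass
class WorkMetadata:
    subject: Subject
    authors: List[str]
    licenses: List[str]


def credit_line(meta: WorkMetadata) -> str:
    """Subject-agnostic logic: identical for files, pages and external works."""
    return f"{', '.join(meta.authors)} ({' / '.join(meta.licenses)})"


def ui_heading(meta: WorkMetadata) -> str:
    """Only the presentation layer looks at the kind of subject."""
    if isinstance(meta.subject, FileSubject):
        return f"File: {meta.subject.title}"
    if isinstance(meta.subject, PageSubject):
        return f"Page: {meta.subject.title}"
    return f"External work: {meta.subject.label}"
```

The design choice being illustrated is simply that the file-specific assumption lives in one small function rather than in the data structure.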
-- daniel
Thanks for raising this, Rob. Some comments inline. [+Stephen, who is now leading on the structured metadata for multimedia project]
On Fri, Sep 26, 2014 at 1:59 PM, Rob Lanphier robla@wikimedia.org wrote:
(+cc Nemo and Wikidata-tech)
On Fri, Sep 26, 2014 at 5:33 AM, Gergo Tisza gtisza@wikimedia.org wrote:
On Fri, Sep 26, 2014 at 2:49 AM, Rob Lanphier robla@wikimedia.org wrote:
There's an item that's Luis Villa added to the MW Core backlog that I'd like to move to the Multimedia backlog:
https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_Core_Team/Backlog#Structu...
I'm assuming everything that he describes fits nicely into what is planned for Structured Data. Assuming that's true, should I just copy/paste into a new card in Mingle, or a new page on mw.org or what?
This seems to be about article text, or mainly about article text (articles imported from other wikis and so on).
Yeah, that's correct. I hadn't raised it myself in the metadata context for exactly that reason. But certainly there is a lot of overlap between
https://www.mediawiki.org/wiki/Files_and_licenses_concept
and
https://www.mediawiki.org/wiki/Multimedia/Structured_Data
Even if the goals aren't completely the same, if nothing else, some of the schemas should really be made to line up.
The plan for the structured data project is to create Wikidata properties for legalese, install Wikibase on Commons (and possibly other wikis which have local images), make that Wikibase use Wikidata properties (and sometimes Wikidata items as values), create a new entity type called mediainfo (which is like a Wikibase item, but associated with a file), and add legal information to the mediainfo entries.
Part of that (the Wikidata properties) could be reused for articles and other non-file content - the source, license etc. properties are generic enough. However, if we want to use this structure to attribute files, we would either have to make mediainfo into some more generic thing that can be attached to any wiki page, or abuse the langlink/badge feature to serve a similar purpose. That is a major course correction; if we want to do something like that, that should be discussed (with the involvement of the Wikidata team) as soon as possible.
Thanks for the analysis, Gergo! I was going to split Luis' proposal into a separate wiki page, but I see Nemo has linked to this page as the "Canonical page on the topic": https://www.mediawiki.org/wiki/Files_and_licenses_concept
Without a deep reading that I'm admittedly just not going to have time for, it's hard to tell how related the page that Nemo linked to is to the concepts that Luis is trying to capture.
Seems to address part but not all of them, if I understand correctly:
- *non-editing authors:* Does seem to be able to cope with the idea of authors who aren't in the edit history (e.g., because they edited a prior version of the work that was uploaded as a seed to the wiki)
- *source:* Doesn't seem to have the notion that a work might have an alternate source (e.g., something copied/pasted in from another CC BY-SA source).
- *license:* not clear if this copes with the notion that there might be multiple compatible licenses on the page.
One thing I'll note, though, before we get too complacent in thinking that files are somehow simpler than articles, we should consider these relatively common scenarios:
- Group photo with potentially different per-person personality rights
- PDF of a slide deck with many images
- PDF of a Wikipedia article :-)
Or simply the case of "I copied and pasted an article from a different CC source into the Wikipedia article" - that's what got me thinking about this a while back (though of course, as Nemo points out, PDFs are the canonical problem child here).
Luis
Luis Villa, 26/09/2014 23:55:
https://www.mediawiki.org/wiki/Files_and_licenses_concept
Without a deep reading that I'm admittedly just not going to have time for, it's hard to tell how related the page that Nemo linked to is to the concepts that Luis is trying to capture.
Seems to address part but not all of them, if I understand correctly:
- *non-editing authors:* Does seem to be able to cope with the idea of authors who aren't in the edit history (e.g., because they edited a prior version of the work that was uploaded as a seed to the wiki)
- *source:* Doesn't seem to have the notion that a work might have an alternate source (e.g., something copied/pasted in from another CC BY-SA source).
- *license:* not clear if this copes with the notion that there might be multiple compatible licenses on the page.
Please integrate these points straight into that page, they certainly fit well AFAICS even if maybe it will not be implemented that way for pages.
Nemo
On Fri, Sep 26, 2014 at 11:33 PM, Federico Leva (Nemo) nemowiki@gmail.com wrote:
Please integrate these points straight into that page, they certainly fit well AFAICS even if maybe it will not be implemented that way for pages.
https://www.mediawiki.org/wiki/Files_and_licenses_concept#Purpose_and_use_ca...