License information (was: PDF/Collection feature live on de.wikibooks)


Johannes Beigel

14 Oct 2008

3:12 a.m.

On 10.10.2008 at 21:22, Erik Moeller wrote:

...

2008/10/10 Derbeth derbeth@wp.pl:

...
I wonder about the legal aspects. In my opinion, when you create a ready-to-print version, you have to attach the text of the GFDL license to it directly, not as a link, as is done in http://en.wikibooks.org/wiki/Image:LaTeX.pdf.

As Erik wrote: This is already implemented (either a title of an article or a URL to some license text can be set in LocalSettings.php), but it's currently not configured.

...

...
Secondly, the current version of the tool commits plagiarism, because it does not mention image authors and does not provide any means (such as making images clickable) to check who these authors are.

Ouch, thanks for pointing that out. Tricky to do this automatically since it's all wiki-text with templates, but we'll investigate a solution here.

We'd highly appreciate input from the community regarding this topic!

The printed books from PediaPress contain a list of figures where the license of each image is listed, together with the URL to the image description page. As some kind of "hotfix" this solution could be implemented in the PDF export of the Collection extension, too. But this doesn't really solve the problem.

We think it's more of a technical/software thing, so I cross-posted (and set Reply-To) to Wikitech-l.

In our opinion, license management/handling must be a core feature of MediaWiki, because the software is explicitly developed for the collaborative distribution of free content. Licenses of the contained articles and images should not be represented via some agreed-upon convention but via structured (and machine-readable) information, available for each relevant object in the wiki.

Some information that would be desired:

- Full (official) name of the license(s).
- Whether the full text of the license has to be included or a reference is sufficient.
- Reference to the full text of the license(s) (in some rigidly defined format like wikitext).
- Whether attribution is required. If so: the list of required attributions.

So, basically all the information that's required to check if it's possible to take some part of the MediaWiki and use it somewhere else and all the information that has to be included in that other place. This information could be made accessible via MediaWiki API, but ideally it's contained in the wikitext and/or XHTML, too.
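To make the list above concrete, here is a minimal sketch of what such a per-object record could look like. The class name `LicenseInfo`, its field names, and the helper `reuse_checklist` are assumptions made up for illustration; nothing like this exists in MediaWiki as described in this thread, and the author name is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LicenseInfo:
    """Hypothetical license record for one wiki object (article or image)."""

    full_name: str                  # full (official) name of the license
    full_text_required: bool        # True: full text must be included; False: a reference suffices
    text_reference: Optional[str]   # where the full license text can be found
    attribution_required: bool      # whether attribution is required
    attributions: List[str] = field(default_factory=list)  # required attributions


def reuse_checklist(info: LicenseInfo) -> List[str]:
    """List what has to accompany the content when it is used somewhere else."""
    items = []
    if info.full_text_required:
        items.append(f"full text of {info.full_name}")
    else:
        items.append(f"reference to {info.full_name}")
    if info.attribution_required:
        items.extend(f"attribution: {name}" for name in info.attributions)
    return items


# Example record; "Example Author" is a placeholder, not real data.
gfdl_image = LicenseInfo(
    full_name="GNU Free Documentation License",
    full_text_required=True,
    text_reference="http://www.gnu.org/licenses/fdl.txt",
    attribution_required=True,
    attributions=["Example Author"],
)
print(reuse_checklist(gfdl_image))
```

A PDF exporter could run such a check for every article and image in a collection and append the resulting items to the generated document.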

All this could be handled via microformats, even inside templates. The main point, though, is that any new technique has to be enforced, ideally by the MediaWiki software itself. In the commons wikis there are some conventions that software written by people and companies like us can rely on (although we have to work with hacks and workarounds), but in wikis with smaller communities this information often doesn't exist at all.

-- Johannes Beigel


Johannes Beigel

14 Oct

4:56 a.m.

New subject: License information (was: PDF/Collection feature live on de.wikibooks)

BTW: PediaPress has a stand on the Frankfurter Buchmesse (Frankfurt Book Fair), booth E427 in hall 4.2. We'd be really happy to meet people from the community to talk about all kinds of MediaWiki related stuff.

So, if some of you are there and can make it... we're looking forward to meeting you!

-- Johannes Beigel

Brianna Laugher

29 Jan

4:48 a.m.

New subject: [Textbook-l] License information (was: PDF/Collection feature live on de.wikibooks)

2008/10/14 Johannes Beigel johannes.beigel@pediapress.com:

...

...
...
Secondly, the current version of the tool commits plagiarism, because it does not mention image authors and does not provide any means (such as making images clickable) to check who these authors are.

Ouch, thanks for pointing that out. Tricky to do this automatically since it's all wiki-text with templates, but we'll investigate a solution here.

We'd highly appreciate input from the community regarding this topic!

The printed books from PediaPress contain a list of figures where the license of each image is listed, together with the URL to the image description page. As some kind of "hotfix" this solution could be implemented in the PDF export of the Collection extension, too. But this doesn't really solve the problem.

We think it's more of a technical/software thing, so I cross-posted (and set Reply-To) to Wikitech-l.

In our opinion, license management/handling must be a core feature of MediaWiki, because the software is explicitly developed for the collaborative distribution of free content. Licenses of the contained articles and images should not be represented via some agreed-upon convention but via structured (and machine-readable) information, available for each relevant object in the wiki.

Some information that would be desired:

- Full (official) name of the license(s).
- Whether the full text of the license has to be included or a reference is sufficient.
- Reference to the full text of the license(s) (in some rigidly defined format like wikitext).
- Whether attribution is required. If so: the list of required attributions.

So, basically all the information that's required to check if it's possible to take some part of the MediaWiki and use it somewhere else and all the information that has to be included in that other place. This information could be made accessible via MediaWiki API, but ideally it's contained in the wikitext and/or XHTML, too.

Because different wikis implement licenses in different ways (i.e. there are no naming conventions for license templates), I am not sure this license information would belong in MediaWiki core. But I think that Wikimedia Commons definitely, and perhaps other Wikimedia wikis that accept freely licensed uploads, should work on providing a "community API" layer. My thinking behind this is that the communities build a lot of structure into their content via templates or categories or whatever. It makes sense to provide an API so that every third-party user does not have to reinvent the wheel.

On Wikimedia Commons a little bit of work has been done to this end: http://commons.wikimedia.org/wiki/Commons:Commons_API

In particular this contains some of the license info you mentioned. e.g. below is the info for the GFDL.

GFDL

full_name GNU Free Documentation License
attach_full_license_text 1
attribute_author 1
keep_under_same_license 1
keep_under_similar_license 0
license_logo_url http://upload.wikimedia.org/wikipedia/commons/thumb/2/22/Heckert_GNU_white.s...
license_info_url http://www.gnu.org/copyleft/fdl.html
license_text_url http://www.gnu.org/licenses/fdl.txt
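As a rough illustration of how a third party might consume the GFDL record above, the sketch below turns its 0/1 fields into booleans. It assumes the values arrive as the strings "0" and "1", which this thread does not confirm; the function name `obligations` is made up, and `license_logo_url` is left out because its value is truncated in the quoted record.

```python
from typing import Dict

# Field names and values copied from the GFDL record quoted above.
GFDL_RECORD: Dict[str, str] = {
    "full_name": "GNU Free Documentation License",
    "attach_full_license_text": "1",
    "attribute_author": "1",
    "keep_under_same_license": "1",
    "keep_under_similar_license": "0",
    "license_info_url": "http://www.gnu.org/copyleft/fdl.html",
    "license_text_url": "http://www.gnu.org/licenses/fdl.txt",
}

# The fields that describe obligations of the reuser.
FLAG_FIELDS = (
    "attach_full_license_text",
    "attribute_author",
    "keep_under_same_license",
    "keep_under_similar_license",
)


def obligations(record: Dict[str, str]) -> Dict[str, bool]:
    """Map each obligation field to True/False; a missing field counts as False."""
    return {name: record.get(name, "0") == "1" for name in FLAG_FIELDS}


flags = obligations(GFDL_RECORD)
print(flags)
```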

The "Commons API" also has an author field: http://toolserver.org/~magnus/commonsapi.php?image=Sa-warthog.jpg&meta
I think at the moment this is being taken from the {{information}} template. You can see that in this example it includes a wiki link; that link should already have been resolved to a full URL, so there is definitely still work to be done.
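For reference, a request URL of the shape shown above can be assembled as follows. This is only a sketch: the function name `commons_api_url` and the `with_meta` switch are illustrative, the only parameters used are the two visible in the example URL (`image` and the bare `meta` flag), and the code builds the URL without contacting the server.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above.
COMMONS_API = "http://toolserver.org/~magnus/commonsapi.php"


def commons_api_url(image: str, with_meta: bool = True) -> str:
    """Build a Commons API request URL for one image file name."""
    query = urlencode({"image": image})  # escapes spaces and special characters
    if with_meta:
        query += "&meta"  # bare flag, as in the example URL
    return f"{COMMONS_API}?{query}"


url = commons_api_url("Sa-warthog.jpg")
print(url)
```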

I would be interested to know if further development of the Commons API would be "heading in the right direction" for PediaPress.

cheers, Brianna

-- They've just been waiting in a mountain for the right moment: http://modernthings.org/

Johannes Beigel

8:22 a.m.

New subject: License information (was: PDF/Collection feature live on de.wikibooks)

On 29.01.2009, at 13:48, Brianna Laugher wrote:

...

Because different wikis implement licenses in different ways (ie there are no naming conventions for license templates), I am not sure this license information would belong in MediaWiki core.

It could be opt-in for each wiki, but MediaWiki API is already there and would be a perfect way to make this information available.

Of course one has to think about the kind of data and the format -- e.g. Magnus has, and you mention:

...

On Wikimedia Commons a little bit of work has been done to this end: http://commons.wikimedia.org/wiki/Commons:Commons_API

We've been aware of this page and Magnus' implementation, and we think it looks really good!

The information is (AFAIK) scraped from the rendered XHTML of articles. This could be done in a less error-prone way (and more efficiently) if the data were stored in and accessed via the database in some way. Of course this would require some discussion, formal decisions and code changes. But as I stated in an earlier post: I think MediaWiki is so widely used by people who want to share and collaborate on free content that it's not too far-fetched to build some "license infrastructure" into the software itself.

...

I would be interested to know if further development of the Commons API would be "heading in the right direction" for PediaPress.

It's definitely heading in the right direction! It should become "more official" though. :-)

-- Johannes

Brianna Laugher

4:55 p.m.

New subject: License information (was: PDF/Collection feature live on de.wikibooks)

2009/1/30 Johannes Beigel johannes.beigel@pediapress.com:

...

On 29.01.2009, at 13:48, Brianna Laugher wrote:

...
On Wikimedia Commons a little bit of work has been done to this end: http://commons.wikimedia.org/wiki/Commons:Commons_API

We've been aware of this page and Magnus' implementation, and we think it looks really good!

The information is (AFAIK) scraped from the rendered XHTML of articles. This could be done in a less error-prone way (and more efficiently) if the data were stored in and accessed via the database in some way. Of course this would require some discussion, formal decisions and code changes. But as I stated in an earlier post: I think MediaWiki is so widely used by people who want to share and collaborate on free content that it's not too far-fetched to build some "license infrastructure" into the software itself.

I agree that it makes a lot of sense. But because it would be a big change, I fear that unless the lead developers show great enthusiasm for the idea, it will take a very long time to be accepted and completed. Building an "add-on" tool, by contrast, can get to the point of functionality faster.

It may be a good idea to try and build the Commons API to mimic the MediaWiki API, imagining that in the future such information will be available via that. So then hopefully for now people could use the Commons API, and in the future switch to the MediaWiki API by just changing the API URL, and all their queries could stay the same.
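Roughly what I have in mind, as a sketch only: the client keeps the endpoint configurable, so the same query could later be pointed at a different API. The class name `LicenseInfoClient` is made up, the second endpoint is a placeholder (no such MediaWiki API module for license information exists in this discussion), and the code only builds URLs without sending requests.

```python
from urllib.parse import urlencode


class LicenseInfoClient:
    """Builds query URLs against a configurable endpoint (no network access)."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def query_url(self, **params: str) -> str:
        # Sort parameters so the same query always yields the same string.
        return f"{self.base_url}?{urlencode(sorted(params.items()))}"


# Endpoint taken from the example URL earlier in this thread.
commons = LicenseInfoClient("http://toolserver.org/~magnus/commonsapi.php")
# Hypothetical future endpoint; example.org is a placeholder, not a real wiki.
future = LicenseInfoClient("https://example.org/w/api.php")

query = {"image": "Sa-warthog.jpg"}
commons_url = commons.query_url(**query)
future_url = future.query_url(**query)
print(commons_url)
print(future_url)
```

Only the base URL differs between the two results; the query string is identical, which is the property that would let callers switch APIs without rewriting their queries.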

How does that sound? Other ideas about how to approach it are welcome...

cheers Brianna

-- They've just been waiting in a mountain for the right moment: http://modernthings.org/

Brion Vibber

5:30 p.m.

New subject: License information

On 1/29/09 4:55 PM, Brianna Laugher wrote:

...

I agree that it makes a lot of sense. But because it would be a big change, I fear that unless the lead developers show great enthusiasm for the idea, it will take a very long time to be accepted and completed. Building an "add-on" tool, by contrast, can get to the point of functionality faster.

/me shows enthusiasm :)

-- brion

Daniel Kinzler

30 Jan

12:24 a.m.

New subject: License information

Brianna Laugher wrote:

...

I agree that it makes a lot of sense. But because it would be a big change, I fear that unless the lead developers show great enthusiasm for the idea, it will take a very long time to be accepted and completed. Building an "add-on" tool, by contrast, can get to the point of functionality faster.

Guys, before re-inventing several wheels, please look at what we already have.

Please have a look at http://commons.wikimedia.org/wiki/Commons:Tag_categories, which defines a way to make license tags machine readable. Using that scheme, it would be easy to build a script on the toolserver that delivers metadata in a machine readable form. No need for screen scraping.

Also, please consider http://www.mediawiki.org/wiki/Extension:RDF, which provides a way for MediaWiki to serve machine readable metadata about anything and everything. It would be easy to integrate it into license tags. It has been around for years; all it needs is a little push from the community and some code review.

-- daniel

Gerard Meijssen

12:52 a.m.

New subject: License information

Hoi, There is RDF, there is Semantic MediaWiki. Why should one get a push and the other not? Semantic MediaWiki is used on production websites. Its usability is continuously being improved. No cobwebs there.

Having machine readable information is great, but would it not make more sense to have human readable text, and not only in English? Thanks, GerardM


Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Daniel Kinzler

1:28 a.m.

New subject: License information

Gerard Meijssen wrote:

...

Hoi, There is RDF, there is Semantic MediaWiki. Why should one get a push and the other not? Semantic MediaWiki is used on production websites. Its usability is continuously being improved. No cobwebs there.

SMW is of course an option for integrating metadata, but I expect it will take considerably more time to review it and get it usable on WMF sites.

...

Having machine readable information is great, but would it not make more sense to have human readable text, and not only in English?

Sure, but I don't see the connection. The RDF extension just adds the machine readable stuff to the human readable stuff we already have. It's basically for annotating templates, and retrieving that annotation.

-- daniel

Gerard Meijssen

2:42 a.m.

New subject: License information

Hoi, When we invest time in implementing the RDF extension, the chances of eventual support for Semantic MediaWiki are severely diminished. It may take less time to get the RDF extension into shape (that is your hunch), but it is a choice made only because it is quick, not because it provides the most benefits.

What is a translation but another type of annotation? Thanks, GerardM


Daniel Kinzler

2:48 a.m.

New subject: License information

...

What is a translation but another type of annotation?

This *could* be modeled like that in theory. But I don't see an easy way to implement it with a low cost of transition. Basically, it would require license info not to be handled via templates at all.

I don't see that happening anytime soon, also because it would cause new problems, such as the question of how to introduce new license tags.

-- daniel

wikitech-l@lists.wikimedia.org

participants (5)

Brianna Laugher
Brion Vibber
Daniel Kinzler
Gerard Meijssen
Johannes Beigel