Hi,
I'm not sure where to post this, but I'm pretty sure if I put it here someone from PediaPress will probably read it.
I just found out about the PediaPress bookmarklet "create a collection from any MediaWiki" thing and I thought I would try it. I added the "Melbourne" article from en.wikipedia, wikitravel and Wikimedia Commons, as well as an article called "Street press" from somewhere else. When I downloaded the PDF, it contained 3 copies of the en.wikipedia "Melbourne" article and the "Street press" article. So I guess there is some bug with multiple articles from different sources that happen to have the same name.
Secondly, this is not a bug but a feature request: en.wikipedia in particular produces an awful amount of crud that is not that useful for printing: references, external links etc. For the [[Melbourne]] article, there are 22 pages of beautiful text and images, and no less than 11 1/2 pages of crud, mostly consisting of 184 references. Would it be possible to have an option to exclude references? Maybe replace them all with a note like "To see original references, please visit [url]."
thanks, Brianna
oops, one more thing: It appears that for pages collected using the bookmarklet, license information is not recorded or included. I don't particularly want a copy of the GFDL any more than references, but a notice to that effect would be nice.
*looks at the API* hm... bizarrely, I don't see the license info being one of the API parameters. Did I miss it or is it a bug?
Brianna
-- They've just been waiting in a mountain for the right moment: http://modernthings.org/
Hi,
On Jan 28, 2009, at 2:42 PM, Brianna Laugher wrote:
oops, one more thing: It appears that for pages collected using the bookmarklet, license information is not recorded or included. I don't particularly want a copy of the GFDL any more than references, but a notice to that effect would be nice.
*looks at the API* hm... bizarrely, I don't see the license info being one of the API parameters. Did I miss it or is it a bug?
Assuming you are talking about the MediaWiki API: yes, this is a bug with MediaWiki.
In our opinion, license management/handling should be a core feature of MediaWiki, because the software is explicitly developed for the collaborative generation of free content. But from the perspective of a re-user, such support is almost non-existent[1].
Heiko
[1] http://markmail.org/message/cfshfwqse3gno372
Textbook-l mailing list Textbook-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/textbook-l
-- Heiko Hees / brainbot technologies AG Boppstrasse 64 / 55118 Mainz Fon +49 (0) 61 31 - 2 11 63 91
(No need to forward my messages to mwlib anymore; I subscribed there and will write there if appropriate :))
I see what you mean. I will reply in more detail to your post to wikitech-l.
I meant something much simpler: getting the info that appears in the footer of MediaWiki sites, which names the license or links to the rights page. I made a patch to add that info to the API. https://bugzilla.wikimedia.org/show_bug.cgi?id=17224 Although giving only the name & URL of the GFDL doesn't comply with it, it is definitely better than nothing (and for many licenses it is almost sufficient).
But as I said I will reply on your other post.
thanks, Brianna
You can now find out the license of a wiki via the API :)
http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=...
This still doesn't help with finding the license for files, or with knowing what the license requires (e.g. including the full text, or listing the authors), but at least it lets you include the name of the license and a link to it.
cheers Brianna
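A minimal sketch of the siteinfo lookup described above, for anyone who wants to try it. The URL in the message is truncated, so the `siprop` value here is an assumption: `rightsinfo` is a siteinfo property in the MediaWiki API, but whether the original link used it is not confirmed by the thread. The function names and the sample response are illustrative only, and the response shape (`query.rightsinfo` with `text` and `url`) should be checked against a live wiki.

```python
import json
from urllib.parse import urlencode


def siteinfo_url(api_base, siprop="rightsinfo"):
    """Build a siteinfo query URL for a MediaWiki api.php endpoint.

    siprop="rightsinfo" is an assumption; the URL in the thread is elided.
    """
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": siprop,
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def license_notice(response_text):
    """Turn a siteinfo JSON response into a one-line license notice."""
    rights = json.loads(response_text).get("query", {}).get("rightsinfo", {})
    name = rights.get("text") or "an unspecified license"
    url = rights.get("url") or "no URL given"
    return f"Licensed under {name} ({url})"


# Illustrative response only; not copied from any real wiki.
SAMPLE_RESPONSE = json.dumps(
    {"query": {"rightsinfo": {"url": "http://example.org/license",
                              "text": "Example License"}}}
)

if __name__ == "__main__":
    print(siteinfo_url("http://en.wikipedia.org/w/api.php"))
    print(license_notice(SAMPLE_RESPONSE))
```

As the message says, this yields only the license name and a link for the wiki as a whole. It says nothing about individual files or about what the license requires.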
I'm forwarding this email to the mwlib mailinglist.
--Andrew Whitworth
---------- Forwarded message ----------
From: Brianna Laugher brianna.laugher@gmail.com
Date: Wed, Jan 28, 2009 at 8:37 AM
Subject: [Textbook-l] PediaPress error with multiple pages with the same name
To: Wikimedia textbook discussion textbook-l@lists.wikimedia.org
Hi,
On Jan 28, 2009, at 2:37 PM, Brianna Laugher wrote:
I'm not sure where to post this, but I'm pretty sure if I put it here someone from PediaPress will probably read it.
yes :)
I just found out about the PediaPress bookmarklet "create a collection from any MediaWiki" thing and I thought I would try it. I added the "Melbourne" article from en.wikipedia, wikitravel and Wikimedia Commons, as well as an article called "Street press" from somewhere else. When I downloaded the PDF, it contained 3 copies of the en.wikipedia "Melbourne" article and the "Street press" article. So I guess there is some bug with multiple articles from different sources that happen to have the same name.
Yes, this is a bug. The feature[1] is still rather a prototype. Implementing it correctly would involve much more work, especially to get licence handling right.
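To illustrate the kind of bug guessed at above, here is a minimal sketch of how a cache keyed by article title alone returns the first wiki's text for every same-named article, and how keying by (source, title) avoids that. This is a guess at the failure mode, not PediaPress's actual code. `render_by_title`, `render_by_source_and_title` and `fake_fetch` are hypothetical names, and the source labels are placeholders.

```python
def fake_fetch(source, title):
    """Hypothetical stand-in for downloading an article from a wiki."""
    return f"{title} from {source}"


def render_by_title(entries, fetch):
    """Buggy variant: the cache key ignores which wiki the article came from."""
    cache = {}
    output = []
    for source, title in entries:
        if title not in cache:
            cache[title] = fetch(source, title)
        output.append(cache[title])
    return output


def render_by_source_and_title(entries, fetch):
    """Fixed variant: the cache key includes the source wiki."""
    cache = {}
    output = []
    for source, title in entries:
        key = (source, title)
        if key not in cache:
            cache[key] = fetch(source, title)
        output.append(cache[key])
    return output


ENTRIES = [
    ("en.wikipedia", "Melbourne"),
    ("wikitravel", "Melbourne"),
    ("commons", "Melbourne"),
    ("elsewhere", "Street press"),
]

if __name__ == "__main__":
    print(render_by_title(ENTRIES, fake_fetch))
    print(render_by_source_and_title(ENTRIES, fake_fetch))
```

With the title-only key, the three "Melbourne" entries all come out as the en.wikipedia text, which matches the PDF described in the report.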
Secondly, this is not a bug but a feature request: en.wikipedia in particular produces an awful amount of crud that is not that useful for printing: references, external links etc. For the [[Melbourne]] article, there are 22 pages of beautiful text and images, and no less than 11 1/2 pages of crud, mostly consisting of 184 references. Would it be possible to have an option to exclude references? Maybe replace them all with a note like "To see original references, please visit [url]."
Good idea! I'll add it on the "customization of PDF-output"-wishlist[2].
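As a rough sketch of what the requested option could do, the snippet below removes `<ref>` footnotes from wikitext and replaces the `<references/>` list with the note suggested in the request. It is a regex approximation for illustration only, not how the PDF renderer works. `strip_references` is a hypothetical name, and templates such as `{{reflist}}` are not handled.

```python
import re

# Self-closing reuse of a named footnote, e.g. <ref name="a"/>.
SELF_CLOSING_REF = re.compile(r"<ref\b[^>]*/>", re.IGNORECASE)
# Footnote with a body, e.g. <ref name="a">Source</ref>.
PAIRED_REF = re.compile(r"<ref\b[^>]*>.*?</ref>", re.IGNORECASE | re.DOTALL)
# Placeholder where the list of footnotes is rendered.
REFERENCES_TAG = re.compile(r"<references\s*/>", re.IGNORECASE)


def strip_references(wikitext, source_url):
    """Drop footnotes and put a pointer to the online article in their place."""
    note = f"To see original references, please visit {source_url}."
    # Self-closing tags go first so the paired pattern cannot start on one.
    text = SELF_CLOSING_REF.sub("", wikitext)
    text = PAIRED_REF.sub("", text)
    return REFERENCES_TAG.sub(note, text)


SAMPLE = (
    'Melbourne is a city.<ref name="a">Source A</ref> '
    'It has trams.<ref name="a"/>\n'
    "<references/>"
)

if __name__ == "__main__":
    print(strip_references(SAMPLE, "http://example.org/Melbourne"))
```

A real implementation would more likely work on the parsed article than on raw text, since footnotes can be nested in templates.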
Heiko
[1] http://pediapress.com/collection/
[2] http://code.pediapress.com/wiki/ticket/419#comment:1