[Foundation-l] Printed on Demand Books from Wikipedia articles

List overview All Threads
Download

newer

older

[Foundation-l] Native Cherokee XML...

Re: [Foundation-l] ..

Volker Haas

12 Jul 2006 12 Jul '06

3:06 p.m.

Hi everybody,

as the title suggests I would like to introduce a website to the wikimedia community that offers books consisting of wikipedia articles that areprinted on demand. The unique feature is that every user can individually pick arbitrary articles from the wikipedia (currently only from the english version) and add them to his book. Once the user is satisfied with the book's content, he can download a pdf-preview and check how the book will look like when printed. The last step is to order the book and get a unique encyclopedia containing only selected wikipedia articles.

The service is located at http://www.pediapress.com and has not yet been publicly announced.

I would like to know the opinion of the wikimedia community about this service, since I am aware of the fact that a couple of copyright and also "moral" issues are related to selling wikipedia content.

Some of these issues are pointed out in the thread "Offering Wikibooks content for sale" (http://www.gossamer-threads.com/lists/wiki/foundation/63024).

I would like to give you some details about our efforts to comply with the GFDL and also the spirit of the wikipedia content and community:

* on the pediapress.com website and in the printed books it is clearly stated that all articles have their origin in the wikipedia project

* below each article a link (in case of the book the plain URL) to the original wikipedia article is shown

* all principle authors are listed below an article in the book if possible

* pediapress does not generate any unnecessary traffic on wikipedia since we use the wikipedia dumps

* we e-mailed with Angela Beesley (member of the Board of Trustees of the Wikimedia Foundation) and Brad Patrick (general counsel to the Wikimedia Foundation) who did not see anything problematic with our service

Last but not least I want to point out that the pediapress affiliate program has the goal to generate donations to the wikimedia foundation: after ordering a book, users have the option to publish and promote their book. Thereby others are able to buy the published book. The publisher receives a provision of 10% of the books price which he can donate to the wikimedia foundation.Please note that pediapress is still in a "beta" phase even though compiling, printing and ordering books is perfectly possible. But since all processing and layouting is done automatically there are still some shortcomings that will be fixed asap.

I don't want to deny that my company (brainbot technologies AG - http://brainbot.com) and I hope that pediapress supports itself financially. The main motivation nevertheless was to give something back to the wikipedia community. Since me and my colleagues aren't exactly good writers but heavy consumers of the wikipedia, we thought it would be a better idea to try to promote the offline usage of the wikipedia by building pediapress.

Thanks for your patience - I hope some of you check out pediapress and post their opinions.

Volker Haas

Show replies by date

Oldak Quill

12 Jul 12 Jul

3:57 p.m.

"Morally", as you put it, this project raises no concerns. I would say that it is "morally" good - using the work we have created, providing Wikipedia in more media and raising money for Wikimedia. I wish you the best of luck. Perhaps the only problem in this respect is that the articles will be unedited (I assume?) and unverified. Our mistakes will forever be distilled in ink. This is not a great concern since there is currently no way to get 100% verified Wikipedia articles yet anyway. Will you include a description of what Wikipedia is and a disclaimer at the front of each book?

Legally, there are some issues to take note of. As I'm sure you know, each book would need to contain a full copy of the GFDL 1.2 license and would have to provide the names of a certain number of authors for each article (as you mention in your post).

There have been some attempts at this kind of thing from within Wikipedia. They are called "Readers" - compilations of articles (on, say, Mammals) which are edited and presented in quite a pleasant format. I suggest you attempt to work with the users who have worked on Readers to get ideas for presentation/editing. It would be better for all parties if you stay in communication with the community rather than simply exist coldly outside.

May I ask how you are planning on printing and binding these books? Will they be hardback or softback? What kind of presses, papers and inks will you be using? Will you provide "template books" (like the above readers) with precollated selections of articles on mammals or whatnot?

Wikipedia's allowance for commercial use has always had these kind of projects in mind. I wish you luck again and hope you keep in communication with us.

On 12/07/06, Volker Haas volker.haas@brainbot.com wrote:

...

Hi everybody,

as the title suggests I would like to introduce a website to the wikimedia community that offers books consisting of wikipedia articles that areprinted on demand. The unique feature is that every user can individually pick arbitrary articles from the wikipedia (currently only from the english version) and add them to his book. Once the user is satisfied with the book's content, he can download a pdf-preview and check how the book will look like when printed. The last step is to order the book and get a unique encyclopedia containing only selected wikipedia articles.

The service is located at http://www.pediapress.com and has not yet been publicly announced.

I would like to know the opinion of the wikimedia community about this service, since I am aware of the fact that a couple of copyright and also "moral" issues are related to selling wikipedia content.

Some of these issues are pointed out in the thread "Offering Wikibooks content for sale" (http://www.gossamer-threads.com/lists/wiki/foundation/63024).

I would like to give you some details about our efforts to comply with the GFDL and also the spirit of the wikipedia content and community:

on the pediapress.com website and in the printed books it is clearly

stated that all articles have their origin in the wikipedia project

below each article a link (in case of the book the plain URL) to the

original wikipedia article is shown

all principle authors are listed below an article in the book if possible

pediapress does not generate any unnecessary traffic on wikipedia

since we use the wikipedia dumps

we e-mailed with Angela Beesley (member of the Board of Trustees of

the Wikimedia Foundation) and Brad Patrick (general counsel to the Wikimedia Foundation) who did not see anything problematic with our service

Last but not least I want to point out that the pediapress affiliate program has the goal to generate donations to the wikimedia foundation: after ordering a book, users have the option to publish and promote their book. Thereby others are able to buy the published book. The publisher receives a provision of 10% of the books price which he can donate to the wikimedia foundation.Please note that pediapress is still in a "beta" phase even though compiling, printing and ordering books is perfectly possible. But since all processing and layouting is done automatically there are still some shortcomings that will be fixed asap.

I don't want to deny that my company (brainbot technologies AG - http://brainbot.com) and I hope that pediapress supports itself financially. The main motivation nevertheless was to give something back to the wikipedia community. Since me and my colleagues aren't exactly good writers but heavy consumers of the wikipedia, we thought it would be a better idea to try to promote the offline usage of the wikipedia by building pediapress.

Thanks for your patience - I hope some of you check out pediapress and post their opinions.

Volker Haas

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

-- Oldak Quill (oldakquill@gmail.com)

Volker Haas

13 Jul 13 Jul

9:34 a.m.

Hi Oldak,

thanks for your comments!

Oldak Quill wrote:

...

Perhaps the only problem in this respect is that the articles will be unedited (I assume?) and unverified. Our mistakes will forever be distilled in ink. This is not a great concern since there is currently no way to get 100% verified Wikipedia articles yet anyway.

Your assumtion is correct: all articles are not edited or verified by us. The articles we print are exact "copies" of the articles in the wikipedia dump we use.

...

Will you include a description of what Wikipedia is and a disclaimer at the front of each book?

We do not give a description of what Wikipedia is. But as you have probably seen, we clearly state on the website and in the books that the articles are from Wikipedia. In the footer of the pediapress website we also link wikipedia.org. If you think that we should include a brief description about Wikipedia and make it more clear in the books where the articles are from we could try to improve on that issue.

...

Legally, there are some issues to take note of. As I'm sure you know, each book would need to contain a full copy of the GFDL 1.2 license and would have to provide the names of a certain number of authors for each article (as you mention in your post).

The GFDL is included in every book.

...

There have been some attempts at this kind of thing from within Wikipedia. They are called "Readers" - compilations of articles (on, say, Mammals) which are edited and presented in quite a pleasant format. I suggest you attempt to work with the users who have worked on Readers to get ideas for presentation/editing.

We are familiar with the Wikipedia Readers. But since all layouting of books with pediapress is done automatically we can't really focus on improving the layout/presentation of individual articles. Nonetheless we try to find "general rules" on how to improve the layout of the books. This is a rather hard problem - and we sometimes fail...

...

It would be better for all parties if you stay in communication with the community rather than simply exist coldly outside.

It is absolutely not our intent to develop pediapress on an "isolated island". We want to stay in touch with the community for the mutual benefit.

...

May I ask how you are planning on printing and binding these books?

The books are printed by Instabook (http://www.instabook.net/). You can find more infos on our info site or their website ;-) http://pediapress.com/info/

...

Will they be hardback or softback?

Softback with glossy finish.

...

Will you provide "template books" (like the above readers) with precollated selections of articles on mammals or whatnot?

We will not provide "template books". On the main page of pediapress we feature a couple of books. These books were ordered by pediapress users and chosen by us to be featured. The featured books are also not edited in any way. Featured books are books that have been "published" by users who have ordered the book. More info on publishing books can be found at http://pediapress.com/info/#affiliate

...

Wikipedia's allowance for commercial use has always had these kind of projects in mind. I wish you luck again and hope you keep in communication with us.

Thanks again for the comments! I hope I could answer your questions in a satisfactory manner!

- Volker

...

On 12/07/06, Volker Haas volker.haas@brainbot.com wrote:

...
Hi everybody,

....

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

Erik Moeller

12 Jul 12 Jul

4:11 p.m.

Volker,

this is very powerful stuff, and it's great that you're doing it. A couple of technical questions:

1) Are you using the highest quality images available for the PDF generation? Is the quality setting deliberately low? Comparing, for instance, http://upload.wikimedia.org/wikipedia/en/8/8e/Chirpedpulse.jpg with the image used in the PDF under "Wigner quasi-probability distribution", the PDF image seems to have much more pronounced compression artifacts.

2) What mechanism do you use to generate the PDF files? Is there any chance that part of the software might become open source, if it isn't already? I know many people who would be interested in this functionality outside the context of Wikipedia.

If this service works well, from my perspective (I don't have any Foundation authority) it would be great if this could be developed into a larger partnership, with the project being featured prominently on our side (perhaps in return for a larger percentage of the profits).

Erik

Oldak Quill

4:26 p.m.

Sorry, just a one more question (shame mailing lists don't have edit buttons!).

Will you be avoiding inclusion of fair use images? I would highly recommend this as it is quite hard to know where you are with fair use.

As Erik suggests, working within Wikipedia itself might be useful, might I suggest that you operate transparently. What I mean by this is that you tell us that you have had an order for a book and the client has specified a particular list of articles. Knowing this we can help you select the best versions of the article from the history, edit these versions for spelling and grammar, help you remove fallacious statements within these versions and help select which images to include. If you agree with this it would be a good idea to set up a Wiki so that this kind of editing can easily be done.

On 12/07/06, Erik Moeller eloquence@gmail.com wrote:

...

Volker,

this is very powerful stuff, and it's great that you're doing it. A couple of technical questions:

Are you using the highest quality images available for the PDF

generation? Is the quality setting deliberately low? Comparing, for instance, http://upload.wikimedia.org/wikipedia/en/8/8e/Chirpedpulse.jpg with the image used in the PDF under "Wigner quasi-probability distribution", the PDF image seems to have much more pronounced compression artifacts.

What mechanism do you use to generate the PDF files? Is there any

chance that part of the software might become open source, if it isn't already? I know many people who would be interested in this functionality outside the context of Wikipedia.

If this service works well, from my perspective (I don't have any Foundation authority) it would be great if this could be developed into a larger partnership, with the project being featured prominently on our side (perhaps in return for a larger percentage of the profits).

Erik _______________________________________________ foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

-- Oldak Quill (oldakquill@gmail.com)

Volker Haas

13 Jul 13 Jul

10:23 a.m.

Hi Oldak

Oldak Quill wrote:

...

Will you be avoiding inclusion of fair use images? I would highly recommend this as it is quite hard to know where you are with fair use.

Currently we also include fair use images. Each image's copyright is stated in the list of figures of the book (which can be seen in the preview as well). It is the users obligation to check whether he has problematic images in his book (as stated in the terms of service). Do you think this is problematic?

...

As Erik suggests, working within Wikipedia itself might be useful, might I suggest that you operate transparently. What I mean by this is that you tell us that you have had an order for a book and the client has specified a particular list of articles. Knowing this we can help you select the best versions of the article from the history, edit these versions for spelling and grammar, help you remove fallacious statements within these versions and help select which images to include. If you agree with this it would be a good idea to set up a Wiki so that this kind of editing can easily be done.

This is a nice idea but confronts us with a couple of problems: a user compiles a book and previews it and he expects to get a book which is identical to the preview. Therefore we would not want to have the articles changed after the user saw the preview.

Nonetheless I think that it would be a mutual benefit, if pediapress could help discover articles that need "attention". If we can figure out a good way of doing so we are open to discussion!

...

...

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

Jack

10:56 a.m.

May I suggest that pediapress not require cookies to just look at the site, tho I can understand requiring cookies when a customer builds up a book.

Wikipedia for instance only creates cookies when account holders wish to be automatically logged in - no cookies are required to just read articles.

-- user:Jeandré

Volker Haas

1:39 p.m.

Hi Jack,

thanks for your comment! It is now possible to look at the pediapress main page without having cookies enabled. But it is still not possible to order books without cookies. The reason is that we store session information in the cookie.

-- Volker

Jack wrote:

...

May I suggest that pediapress not require cookies to just look at the site, tho I can understand requiring cookies when a customer builds up a book.

Wikipedia for instance only creates cookies when account holders wish to be automatically logged in - no cookies are required to just read articles.

Jack

3:05 p.m.

You may want to take the minor edit http://en.wikipedia.org/wiki/Help:Minor_edit flag into account when determining the "at least five of the principal authors of the Document".

E.g. in the "USS Nautilus (SSN-571)" article in Tim Cruze's book "Big Science", "Furry" is listed as one of the principal authors, eventho user:Furry only made 2 minor edits on 2005-09-30 to the article (deleting a space and a full stop). user:The Epopt completely rewrote the article on 2004-01-05t23:52:26z, but is not listed as a principal author.

Ignoring edits that were reverted http://en.wikipedia.org/wiki/WP:RV would also improve the authors list.

The contact form at http://pediapress.com/contact/ is reporting "500 - Internal Server Error" on send.

-- user:Jeandré Beans: The VD on WP book.

Volker Haas

3:17 p.m.

Hi Jack, thanks for your comment about the principle author list. We try to improve that. Sorry for the inconvenience with the contact form - that bug is fixed.

-- Volker

Jack wrote:

...

You may want to take the minor edit http://en.wikipedia.org/wiki/Help:Minor_edit flag into account when determining the "at least five of the principal authors of the Document".

E.g. in the "USS Nautilus (SSN-571)" article in Tim Cruze's book "Big Science", "Furry" is listed as one of the principal authors, eventho user:Furry only made 2 minor edits on 2005-09-30 to the article (deleting a space and a full stop). user:The Epopt completely rewrote the article on 2004-01-05t23:52:26z, but is not listed as a principal author.

Ignoring edits that were reverted http://en.wikipedia.org/wiki/WP:RV would also improve the authors list.

The contact form at http://pediapress.com/contact/ is reporting "500

Internal Server Error" on send.

Ben Yates

18 Jul 18 Jul

10:34 p.m.

One major usability issue is that articles directly in a category are added when the category is added, but not articles within subcategories. This can lead to some strange results.

On 7/13/06, Volker Haas volker.haas@brainbot.com wrote:

...

Hi Jack, thanks for your comment about the principle author list. We try to improve that. Sorry for the inconvenience with the contact form - that bug is fixed.

-- Volker

Jack wrote:

...
You may want to take the minor edit http://en.wikipedia.org/wiki/Help:Minor_edit flag into account when determining the "at least five of the principal authors of the Document".

E.g. in the "USS Nautilus (SSN-571)" article in Tim Cruze's book "Big Science", "Furry" is listed as one of the principal authors, eventho user:Furry only made 2 minor edits on 2005-09-30 to the article (deleting a space and a full stop). user:The Epopt completely rewrote the article on 2004-01-05t23:52:26z, but is not listed as a principal author.

Ignoring edits that were reverted http://en.wikipedia.org/wiki/WP:RV would also improve the authors list.

The contact form at http://pediapress.com/contact/ is reporting "500

Internal Server Error" on send.

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

-- Ben Yates Wikipedia blog - http://wikip.blogspot.com

James Hare

11:02 p.m.

Yes. And why doesn't Wikipedia get DPLs? They could prove useful for us -- they're not just for automagically displaying news articles on Wikinews.

On 7/18/06, Ben Yates bluephonic@gmail.com wrote:

...

One major usability issue is that articles directly in a category are added when the category is added, but not articles within subcategories. This can lead to some strange results.

On 7/13/06, Volker Haas volker.haas@brainbot.com wrote:

...
Hi Jack, thanks for your comment about the principle author list. We try to improve that. Sorry for the inconvenience with the contact form - that bug is fixed.

-- Volker

Jack wrote:

...
You may want to take the minor edit http://en.wikipedia.org/wiki/Help:Minor_edit flag into account when determining the "at least five of the principal authors of the Document".

E.g. in the "USS Nautilus (SSN-571)" article in Tim Cruze's book "Big Science", "Furry" is listed as one of the principal authors, eventho user:Furry only made 2 minor edits on 2005-09-30 to the article (deleting a space and a full stop). user:The Epopt completely rewrote the article on 2004-01-05t23:52:26z, but is not listed as a principal author.

Ignoring edits that were reverted http://en.wikipedia.org/wiki/WP:RV would also improve the authors list.

The contact form at http://pediapress.com/contact/ is reporting "500

Internal Server Error" on send.

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

-- Ben Yates Wikipedia blog - http://wikip.blogspot.com _______________________________________________ foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

Brion Vibber

19 Jul 19 Jul

7:54 a.m.

James Hare wrote:

...

Yes. And why doesn't Wikipedia get DPLs?

Because we don't want to spend a million dollars on servers to keep the site from crashing when people cross multiple 200-thousand-page categories?

-- brion vibber (brion @ pobox.com)

Volker Haas

8:58 a.m.

I believe that adding all articles in a category and also in all subcategories would result in many unwanted articles in the book. By carefully selecting a category with only a few subcategories this problem could be avoided, but I believe in most cases adding subcategories as well has undesired effects.

By using the "suggested articles" button on the build page you should ideally get the "best" articles in the subcategories as well (after you added some articles you want in your book). I am aware that currently it is not possible to add all suggested articles - we are thinking about adding this functionality.

-- Volker

Since many categories have lots of articles assigned to them and also

Ben Yates wrote:

...

One major usability issue is that articles directly in a category are added when the category is added, but not articles within subcategories. This can lead to some strange results.

On 7/13/06, Volker Haas volker.haas@brainbot.com wrote:

...
Hi Jack, thanks for your comment about the principle author list. We try to improve that. Sorry for the inconvenience with the contact form - that bug is fixed.

-- Volker

Jack wrote:

...
You may want to take the minor edit http://en.wikipedia.org/wiki/Help:Minor_edit flag into account when determining the "at least five of the principal authors of the Document".

E.g. in the "USS Nautilus (SSN-571)" article in Tim Cruze's book "Big Science", "Furry" is listed as one of the principal authors, eventho user:Furry only made 2 minor edits on 2005-09-30 to the article (deleting a space and a full stop). user:The Epopt completely rewrote the article on 2004-01-05t23:52:26z, but is not listed as a principal author.

Ignoring edits that were reverted http://en.wikipedia.org/wiki/WP:RV would also improve the authors list.

The contact form at http://pediapress.com/contact/ is reporting "500

Internal Server Error" on send.

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

Volker Haas

13 Jul 13 Jul

9:56 a.m.

Hi Erik,

I also thank you for the comments and questions!

Erik Moeller wrote:

...

Are you using the highest quality images available for the PDF

generation? Is the quality setting deliberately low? Comparing, for instance, http://upload.wikimedia.org/wikipedia/en/8/8e/Chirpedpulse.jpg with the image used in the PDF under "Wigner quasi-probability distribution", the PDF image seems to have much more pronounced compression artifacts.

We always try using the best image available (in terms of resolution). Currently we restrict the resolution of the images in the book to 150 dpi. This is done to prevent the pdfs to get to large. However, the example you provided indicates that this might be to low. We will check that and decide if we increase the maximum resolution.

...

What mechanism do you use to generate the PDF files?

We implemented a mediawiki parser from scratch. The result of the parsing process is an intermediate format that then gets transformed to latex (for the books) and to html for displaying the articles on the pediapress website. As you probably guess, this is a non-trivial process - to be precise: it's a rather difficult challenge ;-)

...

Is there any chance that part of the software might become open source, if it isn't already?

So far it is not planned to release the parser software as open source. But if we can figure out a good cooperation model we can talk about that in the future.

...

I know many people who would be interested in this functionality outside the context of Wikipedia.

If this service works well, from my perspective (I don't have any Foundation authority) it would be great if this could be developed into a larger partnership, with the project being featured prominently on our side (perhaps in return for a larger percentage of the profits).

We would be happy if the pediapress project would result in a partnership with the wikimedia foundation. I currently do not know how the details of such a partnership would "look like" - but we are absolutely open to discussion in that matter.

...

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

James Hare

12 Jul 12 Jul

4:29 p.m.

This is a very good idea.

Y'know, I came up with this same idea yesterday!

On 7/12/06, Volker Haas volker.haas@brainbot.com wrote:

...

Hi everybody,

as the title suggests I would like to introduce a website to the wikimedia community that offers books consisting of wikipedia articles that areprinted on demand. The unique feature is that every user can individually pick arbitrary articles from the wikipedia (currently only from the english version) and add them to his book. Once the user is satisfied with the book's content, he can download a pdf-preview and check how the book will look like when printed. The last step is to order the book and get a unique encyclopedia containing only selected wikipedia articles.

The service is located at http://www.pediapress.com and has not yet been publicly announced.

I would like to know the opinion of the wikimedia community about this service, since I am aware of the fact that a couple of copyright and also "moral" issues are related to selling wikipedia content.

Some of these issues are pointed out in the thread "Offering Wikibooks content for sale" (http://www.gossamer-threads.com/lists/wiki/foundation/63024).

I would like to give you some details about our efforts to comply with the GFDL and also the spirit of the wikipedia content and community:

on the pediapress.com website and in the printed books it is clearly

stated that all articles have their origin in the wikipedia project

below each article a link (in case of the book the plain URL) to the

original wikipedia article is shown

all principle authors are listed below an article in the book if

possible

pediapress does not generate any unnecessary traffic on wikipedia

since we use the wikipedia dumps

we e-mailed with Angela Beesley (member of the Board of Trustees of

the Wikimedia Foundation) and Brad Patrick (general counsel to the Wikimedia Foundation) who did not see anything problematic with our service

Last but not least I want to point out that the pediapress affiliate program has the goal to generate donations to the wikimedia foundation: after ordering a book, users have the option to publish and promote their book. Thereby others are able to buy the published book. The publisher receives a provision of 10% of the books price which he can donate to the wikimedia foundation.Please note that pediapress is still in a "beta" phase even though compiling, printing and ordering books is perfectly possible. But since all processing and layouting is done automatically there are still some shortcomings that will be fixed asap.

I don't want to deny that my company (brainbot technologies AG - http://brainbot.com) and I hope that pediapress supports itself financially. The main motivation nevertheless was to give something back to the wikipedia community. Since me and my colleagues aren't exactly good writers but heavy consumers of the wikipedia, we thought it would be a better idea to try to promote the offline usage of the wikipedia by building pediapress.

Thanks for your patience - I hope some of you check out pediapress and post their opinions.

Volker Haas

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

Oldak Quill

4:32 p.m.

I invented something quite similar to Wikipedia back in August 2001 - shame I didn't implement it! ;)

On 12/07/06, James Hare messedrocker@gmail.com wrote:

...

This is a very good idea.

Y'know, I came up with this same idea yesterday!

On 7/12/06, Volker Haas volker.haas@brainbot.com wrote:

...
Hi everybody,

as the title suggests I would like to introduce a website to the wikimedia community that offers books consisting of wikipedia articles that areprinted on demand. The unique feature is that every user can individually pick arbitrary articles from the wikipedia (currently only from the english version) and add them to his book. Once the user is satisfied with the book's content, he can download a pdf-preview and check how the book will look like when printed. The last step is to order the book and get a unique encyclopedia containing only selected wikipedia articles.

The service is located at http://www.pediapress.com and has not yet been publicly announced.

I would like to know the opinion of the wikimedia community about this service, since I am aware of the fact that a couple of copyright and also "moral" issues are related to selling wikipedia content.

Some of these issues are pointed out in the thread "Offering Wikibooks content for sale" (http://www.gossamer-threads.com/lists/wiki/foundation/63024).

I would like to give you some details about our efforts to comply with the GFDL and also the spirit of the wikipedia content and community:

on the pediapress.com website and in the printed books it is clearly

stated that all articles have their origin in the wikipedia project

below each article a link (in case of the book the plain URL) to the

original wikipedia article is shown

all principle authors are listed below an article in the book if

possible

pediapress does not generate any unnecessary traffic on wikipedia

since we use the wikipedia dumps

we e-mailed with Angela Beesley (member of the Board of Trustees of

the Wikimedia Foundation) and Brad Patrick (general counsel to the Wikimedia Foundation) who did not see anything problematic with our service

Last but not least I want to point out that the pediapress affiliate program has the goal to generate donations to the wikimedia foundation: after ordering a book, users have the option to publish and promote their book. Thereby others are able to buy the published book. The publisher receives a provision of 10% of the books price which he can donate to the wikimedia foundation.Please note that pediapress is still in a "beta" phase even though compiling, printing and ordering books is perfectly possible. But since all processing and layouting is done automatically there are still some shortcomings that will be fixed asap.

I don't want to deny that my company (brainbot technologies AG - http://brainbot.com) and I hope that pediapress supports itself financially. The main motivation nevertheless was to give something back to the wikipedia community. Since me and my colleagues aren't exactly good writers but heavy consumers of the wikipedia, we thought it would be a better idea to try to promote the offline usage of the wikipedia by building pediapress.

Thanks for your patience - I hope some of you check out pediapress and post their opinions.

Volker Haas

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l

-- Oldak Quill (oldakquill@gmail.com)

Robert Scott Horning

9:06 p.m.

Volker Haas wrote:

...

Hi everybody,

as the title suggests I would like to introduce a website to the wikimedia community that offers books consisting of wikipedia articles that areprinted on demand.

I am curious how you plan to compile these books in a format that looks good on paper and not strictly as a web format. While this can be automated to an extent, I do think there are some issues that come from trying to move content from a web format to a printed page, and not all of these can be completely automated. As I havn't been able to preview a book yet,

...

we e-mailed with Angela Beesley (member of the Board of Trustees of

the Wikimedia Foundation) and Brad Patrick (general counsel to the Wikimedia Foundation) who did not see anything problematic with our service.

While notifying the Wikimedia Foundation board may be demonstrating good faith, I want to emphasis that it is not necessary to seek permission in order to republish Wikimedia project content. The GFDL spells out the terms and conditions quite clearly and in theory should be sufficient for non-lawyers to get into print versions of Wikimedia content.

This does get back to the point in question that I raised with my first posting about print versions of Wikimedia content. What should reasonable project guidelines be for links to print versions like this? As I've pointed out, this is but the beginning of a huge number of request that will be coming for services like this, and telling everybody to get their own legal counsel and duke it out in the courtroom is not going to be a pleasant experience for anybody.

Not permitting any links to services like this might be a solution, and encouraging independent advertising of these services outside of Wikimedia projects as the only real advertising venue.

There is, however, some value for individuals who want to purchase print versions and it would be logical to offer some links to printed versions. This would be a service to our "readers" that may want to get some added value to Wikimedia projects. Some sort of standard should be applied *if* we permit some external links, even if of the variety like the ISBN links to on-line booksellers. That we do offer links to commercial services on Wikimedia projects already for books should demonstrate at least some sort of precedence to permit reprint links as well, as long as they are just as descrete and don't clutter up the article/project pages.

-- Robert Scott Horning

Volker Haas

13 Jul 13 Jul

10:45 a.m.

Hi Robert

Robert Scott Horning wrote:

...

I am curious how you plan to compile these books in a format that looks good on paper and not strictly as a web format. While this can be automated to an extent, I do think there are some issues that come from trying to move content from a web format to a printed page, and not all of these can be completely automated.

As you have pointed out, an automatic conversion of html (or mediawiki markup) into latex can not be automated in a way that the book is absolutely perfect. But right now we are pretty satisfied with the results - even though we are still working on improvments of the conversion. The conversion is done with a parser we developed from scratch - as mentioned in a previous post. The mediawiki markup is translated into an internal representation which then gets transformed to latex (to be more precise "context") - this is the hard part. It is slightly easier to transform the internal representation back to html since some css-style information can be maintained.

...

As I havn't been able to preview a book yet,

I hope you did not encounter any technical problems with the pediapress website?

- Volker

Anthony

12:05 p.m.

On 7/13/06, Volker Haas volker.haas@brainbot.com wrote:

...

Hi Robert

Robert Scott Horning wrote:

...
I am curious how you plan to compile these books in a format that looks good on paper and not strictly as a web format. While this can be automated to an extent, I do think there are some issues that come from trying to move content from a web format to a printed page, and not all of these can be completely automated.

As you have pointed out, an automatic conversion of html (or mediawiki markup) into latex can not be automated in a way that the book is absolutely perfect. But right now we are pretty satisfied with the results - even though we are still working on improvments of the conversion. The conversion is done with a parser we developed from scratch - as mentioned in a previous post. The mediawiki markup is translated into an internal representation which then gets transformed to latex (to be more precise "context") - this is the hard part. It is slightly easier to transform the internal representation back to html since some css-style information can be maintained.

As I've mentioned before, I'd love to see a Wikimedia project which does exactly this. Then the latex could be edited collaboratively to make things more "absolutely perfect". It's nice to see it's at least somewhat possible, though I'd say the quality of the previews right now is fairly low.

Anthony

Robert Scott Horning

1:03 p.m.

Anthony wrote:

...

On 7/13/06, Volker Haas volker.haas@brainbot.com wrote:

...
Robert Scott Horning wrote:

...
I am curious how you plan to compile these books in a format that looks good on paper and not strictly as a web format. While this can be automated to an extent, I do think there are some issues that come from trying to move content from a web format to a printed page, and not all of these can be completely automated.

As you have pointed out, an automatic conversion of html (or mediawiki markup) into latex can not be automated in a way that the book is absolutely perfect.

As I've mentioned before, I'd love to see a Wikimedia project which does exactly this. Then the latex could be edited collaboratively to make things more "absolutely perfect". It's nice to see it's at least somewhat possible, though I'd say the quality of the previews right now is fairly low.

Anthony

While this is something that perhaps should be moved to the tech list, I'm curious about what the user interface of something like this ought to look like if we did something like this?

MediaWiki software certainly is capable of storing and retrieving raw ASCII text, and perhaps we could throw another "tab" that would be an editable LaTeX version of the content that could also be "regenerated" from the Wiki markup content. This would be another independent page with edit histories, like the talk page.

Other kinds of options might be available, but the user experience would have to be smooth and consistant with other aspects of editing Wikimedia content.

I like the idea too, and might want to get involved with trying to put it together.

-- Robert Scott Horning

Anthony

1:41 p.m.

On 7/13/06, Robert Scott Horning robert_horning@netzero.net wrote:

...

Anthony wrote:

...
On 7/13/06, Volker Haas volker.haas@brainbot.com wrote:

...
Robert Scott Horning wrote:

...
I am curious how you plan to compile these books in a format that looks good on paper and not strictly as a web format. While this can be automated to an extent, I do think there are some issues that come from trying to move content from a web format to a printed page, and not all of these can be completely automated.

As you have pointed out, an automatic conversion of html (or mediawiki markup) into latex can not be automated in a way that the book is absolutely perfect.

As I've mentioned before, I'd love to see a Wikimedia project which does exactly this. Then the latex could be edited collaboratively to make things more "absolutely perfect". It's nice to see it's at least somewhat possible, though I'd say the quality of the previews right now is fairly low.

Anthony

While this is something that perhaps should be moved to the tech list, I'm curious about what the user interface of something like this ought to look like if we did something like this?

Well, the easiest way would unfortunately involve a fork of the content. You'd have to pick a particular edition to export to LaTeX, and then edit it from there. There would be rules, of course, that no major changes to the content could be made, and that minor fixes (e.g. grammatical) would have to be backported, but that's messy.

It'd be nicer if the merging could also be automated, but that's complicated. The best solution would probably involve a way to put the formatting directives directly into the wikitext, but that has its own problems.

...

MediaWiki software certainly is capable of storing and retrieving raw ASCII text, and perhaps we could throw another "tab" that would be an editable LaTeX version of the content that could also be "regenerated" from the Wiki markup content. This would be another independent page with edit histories, like the talk page.

Yeah, that's basically what I'd see it as, except I'm not sure how you'd be able to regenerate the page if significant formatting changes had been made without losing those changes (or performing a manual merge).

...

Other kinds of options might be available, but the user experience would have to be smooth and consistant with other aspects of editing Wikimedia content.

I like the idea too, and might want to get involved with trying to put it together.

Well, the first step would be to create the wikitext->LaTeX converter. Really I think that alone is a time consuming job, and the rest of the details can wait for it. I haven't gotten a look at wiki2xml yet though. Maybe it could be easily converted into something like that.

It's nice to see that there's some level of interest. Being that it's such a complicated project I'm not sure there's enough yet, though.

Anthony

Magnus Manske

1:15 p.m.

Anthony schrieb:

...

On 7/13/06, Volker Haas volker.haas@brainbot.com wrote:

...
Hi Robert

Robert Scott Horning wrote:

...
I am curious how you plan to compile these books in a format that looks good on paper and not strictly as a web format. While this can be automated to an extent, I do think there are some issues that come from trying to move content from a web format to a printed page, and not all of these can be completely automated.

As you have pointed out, an automatic conversion of html (or mediawiki markup) into latex can not be automated in a way that the book is absolutely perfect. But right now we are pretty satisfied with the results - even though we are still working on improvments of the conversion. The conversion is done with a parser we developed from scratch - as mentioned in a previous post. The mediawiki markup is translated into an internal representation which then gets transformed to latex (to be more precise "context") - this is the hard part. It is slightly easier to transform the internal representation back to html since some css-style information can be maintained.

As I've mentioned before, I'd love to see a Wikimedia project which does exactly this. Then the latex could be edited collaboratively to make things more "absolutely perfect". It's nice to see it's at least somewhat possible, though I'd say the quality of the previews right now is fairly low.

My wiki2xml converter is in the subversion repository. It is a set of scripts to convert MediaWiki markup into XML, and from there into other formats, including plain text, HTML, DocBook, and ODT (OpenDocument/OpenOffice format). OpenOffice is also open source and can generate PDFs natively.

I admit that wiki2xml is currently less advanced than the impressive pediapress software, as it does not render <math> tags, does not make a list of figures, etc. which is due to me being busy, and being too lazy merging in patches by others ;-)

Magnus

Shlomi Fish

1:18 p.m.

On Thursday 13 July 2006 16:15, Magnus Manske wrote:

...

Anthony schrieb:

...
On 7/13/06, Volker Haas volker.haas@brainbot.com wrote:

...
Hi Robert

Robert Scott Horning wrote:

...
I am curious how you plan to compile these books in a format that looks good on paper and not strictly as a web format. While this can be automated to an extent, I do think there are some issues that come from trying to move content from a web format to a printed page, and not all of these can be completely automated.

As you have pointed out, an automatic conversion of html (or mediawiki markup) into latex can not be automated in a way that the book is absolutely perfect. But right now we are pretty satisfied with the results - even though we are still working on improvments of the conversion. The conversion is done with a parser we developed from scratch - as mentioned in a previous post. The mediawiki markup is translated into an internal representation which then gets transformed to latex (to be more precise "context") - this is the hard part. It is slightly easier to transform the internal representation back to html since some css-style information can be maintained.

As I've mentioned before, I'd love to see a Wikimedia project which does exactly this. Then the latex could be edited collaboratively to make things more "absolutely perfect". It's nice to see it's at least somewhat possible, though I'd say the quality of the previews right now is fairly low.

My wiki2xml converter is in the subversion repository. It is a set of scripts to convert MediaWiki markup into XML, and from there into other formats, including plain text, HTML, DocBook, and ODT (OpenDocument/OpenOffice format). OpenOffice is also open source and can generate PDFs natively.

Nice. I've been meaning to write something like that. But we also need a way to convert from DocBook/XML to MediaWiki markup.

Regards,

Shlomi Fish

--------------------------------------------------------------------- Shlomi Fish shlomif@iglu.org.il Homepage: http://www.shlomifish.org/

Chuck Norris wrote a complete Perl 6 implementation in a day but then destroyed all evidence with his bare hands, so no one will know his secrets.

6714

Age (days ago)

6721

Last active (days ago)

wikimedia-l@lists.wikimedia.org

23 comments

11 participants

tags (0)

participants (11)

Anthony
Ben Yates
Brion Vibber
Erik Moeller
Jack
James Hare
Magnus Manske
Oldak Quill
Robert Scott Horning
Shlomi Fish
Volker Haas