--- Rowan Collins <rowan.collins@gmail.com> wrote:
Actually, http://download.wikimedia.org *does* now carry filtered dumps which don't carry User: and Wikipedia: pages.
Ah, that needs to be made much more explicit. Titles like all_titles_in_ns0.gz are buried in a rather long list at http://download.wikimedia.org/wikipedia/en/ and are fairly cryptic to anybody not familiar with MediaWiki.
But this doesn't solve the problem in hand, because we don't currently filter the *image* dumps in any way. This is actually a big problem for all sorts of licensing issues, because all the hard work of permission tagging is likely to be lost in transmission.
If the images are dissociated from their image description pages, then *all* of them are legally useless regardless of license.
But one very relevant implication is that images will be redistributed even if they're not used *anywhere*, let alone if they're only used on user pages. Remember: images are not attached to pages, they are just referenced from them.
Any use would need to follow the license on the image description page. If the GNU FDL is used, the reuser would need to follow that license; if CC-BY/SA is used, the reuser would need to follow that license; if CC-BY/NC is used, the reuser would have to follow that license; and if the image can only be used on Wikipedia, the reuser can't use it at all. The point is that downloading the image database en masse means the reuser will need to follow the license of each image individually, regardless of the presence of NC and used-with-permission images.
Recap: Since we allow multiple licenses anyway, any reuser would need to follow a whole bunch of different licenses. Adding NC and used-with-permission images in ways that do not infect encyclopedia articles would not be an undue burden, or really be anything special, given that fact. So I don't see your point.
-- mav