In a message dated 12/10/2010 1:10:26 PM Pacific Standard Time, jamesmikedupont@googlemail.com writes:
My point is that we should index them ourselves. The pages used as references should first be listed in an easy-to-use manner, and if possible we should cache them. If they cannot be cached because of some restriction, the references should be marked somehow as lower quality, so that people might find better ones. In the end, as with CiteSeer, you will find that pages that are available, open, and cacheable are cited and used more than pages that are not.
Right now, I don't know of a simple way to even get this list of references from WP. There is a lot of work to do, and if we do it, it will benefit Wikipedia. Another thing to do is to translate the pages referenced.
mike
I understand your point, but you're avoiding the points I raised. They are archived at archive.org by permission. You tell archive.org to archive your site, and they do. You tell them to stop, and they do. What advantage would we gain by repeating caching that archive.org is already doing? You haven't answered that.
No matter what happens, you're going to have trouble retrieving the list of refs from a WP page (or any web page) without knowing a programming language such as PHP. Using PHP, it's a fairly trivial parsing task. If that's your only problem, I can write you a script to do it for twenty bucks.
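To give an idea of how small that parsing job is, here is a rough sketch in Python rather than PHP, using only the standard library. It assumes the rendered page puts its footnotes in an <ol class="references"> list and marks outbound links with an "external" class; that markup is an assumption about the page, and the names ReferenceLinkParser and extract_reference_links are made up for this example. Fetching the page is left out; the function just takes the HTML as a string.

```python
from html.parser import HTMLParser


class ReferenceLinkParser(HTMLParser):
    """Collect external link URLs found inside <ol class="references"> lists.

    Assumption: footnotes live in <ol class="references"> and outbound
    links carry an "external" class. Adjust if the page markup differs.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0   # nesting depth of <ol> tags inside a references list
        self.links = []  # URLs in order of first appearance

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "ol":
            # Enter a references list, or track a nested <ol> inside one.
            if self.depth > 0 or "references" in classes:
                self.depth += 1
        elif tag == "a" and self.depth > 0 and "external" in classes:
            href = attrs.get("href")
            if href and href not in self.links:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "ol" and self.depth > 0:
            self.depth -= 1


def extract_reference_links(html):
    """Return the de-duplicated list of reference URLs in an HTML string."""
    parser = ReferenceLinkParser()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    sample = """
    <p><a class="external text" href="http://example.org/body">body link</a></p>
    <ol class="references">
    <li><a class="external text" href="http://example.com/a">A</a></li>
    <li><a class="external free" href="http://example.com/b">B</a></li>
    </ol>
    """
    print(extract_reference_links(sample))
```

Links in the body of the page are skipped; only those inside the references list are returned, each URL once.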
You cannot translate a work that is under copyright protection without violating that copyright. Copyright extends to any effort that substantially mimics the underlying work, and a translation is found to violate copyright. You could, however, make a parody :)
W