On Fri, Jan 21, 2011 at 3:58 AM, Rajarshi Guha <rajarshi.guha@gmail.com> wrote:
On Jan 19, 2011, at 10:19 AM, Carcharoth wrote:
On Wed, Jan 19, 2011 at 3:10 PM, Andrew Gray <andrew.gray@dunelm.org.uk> wrote:
I'm curious as well. I'm also curious as to why the user wants to extract this information, given that they should (going by their signature) have access to databases that already have this sort of information (the sort of databases that should be supplying the information in the Wikipedia infoboxes). There probably is a reason, but I can't immediately think of one.
Partly because Wikipedia has already aggregated the multiple data sources.
It might be better to extract links to the sources rather than the actual data itself, which could be in a vandalised state at the time of extraction. What I guess I'm saying is that the data is better obtained from the sources than from Wikipedia. Or, at the least, cross-checking against the sources needs to be done, depending on what the data will be used for.
Carcharoth