On Fri, Jan 21, 2011 at 3:58 AM, Rajarshi Guha <rajarshi.guha(a)gmail.com> wrote:
On Jan 19, 2011, at 10:19 AM, Carcharoth wrote:
On Wed, Jan 19, 2011 at 3:10 PM, Andrew Gray <andrew.gray(a)dunelm.org.uk> wrote:
I'm curious as well. I'm also curious as to why the user wants to
extract this information, given that they should (going by their
signature) have access to databases that already have this sort of
information (the sort of databases that should be supplying the
information in the Wikipedia infoboxes). There probably is a reason,
but I can't immediately think of one.
Partly because Wikipedia has already aggregated the multiple data
sources.
It might be better to extract links to the sources, rather than the
actual data itself, which could be in a vandalised state at the time
of extraction. What I guess I'm saying is that the data is better
obtained from the sources, rather than Wikipedia. Or at the least
cross-checking with the sources needs to be done, depending on what
the data will be used for.
Carcharoth
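
A minimal sketch of the suggestion above, pulling the cited source links out of an article's wikitext rather than the infobox values themselves. It assumes the wikitext is already available as a string; the function name `extract_source_links` and the sample infobox are made up for illustration, and the regular expressions only cover simple `<ref>` usage, not every citation style found on Wikipedia.

```python
import re

# Matches the body of a non-self-closing <ref ...>...</ref> tag in wikitext.
REF_BODY = re.compile(r"<ref\b[^>/]*>(.*?)</ref>", re.DOTALL | re.IGNORECASE)
# Matches an http(s) URL up to whitespace or a common wikitext delimiter.
URL = re.compile(r"https?://[^\s|\]}<]+")


def extract_source_links(wikitext):
    """Return URLs cited inside <ref> tags, in order, without duplicates."""
    links = []
    for body in REF_BODY.findall(wikitext):
        for url in URL.findall(body):
            if url not in links:
                links.append(url)
    return links


# Hypothetical infobox fragment, for illustration only.
sample = (
    "{{Chembox\n"
    "| MeltingPtC = 100<ref>{{cite web |url=http://example.org/a |title=A}}</ref>\n"
    "| BoilingPtC = 200<ref name=\"b\">[http://example.org/b Source B]</ref>\n"
    "| Density = 1.0<ref name=\"b\" />\n"
    "}}"
)

print(extract_source_links(sample))
```

The links collected this way could then be followed to cross-check the values against the original sources, which does not depend on the state of the article text at extraction time.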