On Thu, Aug 5, 2010 at 4:23 PM, Bod Notbod <bodnotbod(a)gmail.com> wrote:
> Google has attempted to answer the question of how many books exist in
> a very interesting blog post.
Interesting! This, in a nutshell, is why projects to collect all the
world's bibliographic data face a hard challenge.
> Why am I posting this to Foundation-l?
> Well, one of the things it reveals is the difficulty of answering this
> question and I hope that it has some relation to Wikimedia projects;
> in particular, I didn't know that multiple books (entirely unrelated
> books) have shared ISBNs. So, if nothing else, it might impact...
AFAIK, this is a fairly uncommon problem; I've never run across it in
6+ years of working with lots of books & library catalogs every day.
What is a much, much, much bigger problem is the issue of serials
holdings: "serials" are normally taken to be things like magazines and
journals, but in library land also might refer to, say, book series,
or government reports that are published with serial numbers. All
sorts of stuff, in other words, and it's cataloged and referred to in
all sorts of ways, which makes it tough for people looking for good
unique identifiers (or trying to figure out what counts as "a book").
> And I also thought that Google's attempt to catalogue all books was
> parallel to our goal of... well, I'm not sure that we ever say we're
> attempting to catalogue ALL knowledge... but we seem to be making a
> decent fist of it so far.
It's certainly related to recent thoughts about a bibliographic wiki;
obviously relevant to Wikibooks; and it's interesting to think about
scale, which is something that's been on my mind lately. I don't know
how much effort Google made to get records from national libraries in
remote reaches of the world, but I'd imagine that there is still a big
chunk of stuff missing from this count that's not in OCLC etc.
Nonetheless I think posts like this help delineate the general scale
of the information universe that we are trying to usefully capture. I
don't have any idea how those 130M books might map onto topics, for
instance, but I'm guessing our 15M articles don't quite cover it yet.
* I use this address for lists; send personal messages to phoebe.ayers