Re: [libraries] [Wikidata] Internet Archive has launched a National Emergency Library

27 Mar 2020


      On 3/27/20 3:05 PM, Karen Coyle wrote:
...
Open Library (https://openlibrary.org) has some good features (full 
disclosure: I was on the original OL team at the Archive) but it doesn't 
solve the wheat/chaff problem - something that all large libraries have. 
It also doesn't have a way to provide a useful order of retrievals, 
which is also the case for the Google Books site (OCLC uses numbers of 
holdings, which is pretty good, but no one else has access to that data).
There are some other metrics one can use for a useful ordering.  Number
of views/downloads, which the Internet Archive tracks, is a popular one,
and useful to a point, though it shouldn't be used to the exclusion of all
else, as it can end up hiding useful materials on less popular
topics.
For subject browsing on The Online Books Page, I use some metadata-
based measures to rank within subject categories.  Dates are used for
ranking, where boosts are given not just for recency (as also occurs
in many OPACs) but also for temporal proximity to the subject, if
we have relevant dates in the heading.  (So books on the American Civil
War published during or near the time of the war get a boost, for
instance.) I also use the ordering of subjects in my records as an
estimate of their importance, so listings for subject X will turn up
books with X as their first subject above books with that as their fifth
subject. (That's one reason I really hope libraries don't drop
support for librarian-assigned subject ordering, as some newer
systems do.)  We also cluster similar subjects, an effect that's
most relevant for subjects that don't have many books filed
under them.
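As a rough illustration of the date and subject-order boosts described
above, here is a minimal Python sketch. It is not the code behind The
Online Books Page: the Book class, the function names, the weights, and
the decay spans (200 years for recency, 50 years for proximity to the
subject's dates) are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Book:
    title: str
    year: int            # publication year
    subjects: List[str]  # in the order the cataloger assigned them


def score(book: Book, subject: str,
          subject_years: Optional[Tuple[int, int]] = None,
          current_year: int = 2020) -> float:
    """Hypothetical ranking score for one book under one subject heading."""
    if subject not in book.subjects:
        return 0.0

    # Subject order: first-listed subject counts most (1, 1/2, 1/3, ...).
    position = book.subjects.index(subject)
    total = 1.0 / (1 + position)

    # Recency boost (assumed weight 0.5, fading linearly over 200 years).
    age = max(0, current_year - book.year)
    total += 0.5 * max(0.0, 1 - age / 200)

    # Proximity boost, only when the heading carries dates
    # (assumed weight 1.0, fading linearly over 50 years).
    if subject_years is not None:
        start, end = subject_years
        if start <= book.year <= end:
            gap = 0
        else:
            gap = min(abs(book.year - start), abs(book.year - end))
        total += 1.0 * max(0.0, 1 - gap / 50)

    return total


def rank(books: List[Book], subject: str,
         subject_years: Optional[Tuple[int, int]] = None,
         current_year: int = 2020) -> List[Book]:
    """Return books sorted from highest to lowest score for the subject."""
    return sorted(
        books,
        key=lambda b: score(b, subject, subject_years, current_year),
        reverse=True,
    )
```

With a heading dated 1861-1865, a book published in 1866 with that heading
first would outrank a 2005 book with the same first heading, and both would
outrank a book that lists the heading fifth.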
I also give a boost to "work" clusters, which in my case are created
manually rather than automatically. In an automated system one could use
the number of editions, or the amount of metadata recorded for them, as
a rough estimate of how important publishers and librarians have found
a work, at least if the clustering is reasonably accurate.
There are other techniques one can use for useful ordering.  These
are ones I've found worth implementing on my sites, and could also
be used elsewhere if one saw fit.
John
...
I would love to see curated collections from these book databases. Open 
Library has lists, but they are personal lists and not well managed. How 
can we create useful collections from these online materials?
I'll mention that one project I did was comparing the holdings in a 
public library to the Open Library open access books so that the library 
could offer unlimited access to books where they would generally have 
only a few hard copy items. This was in keeping with the sense of their 
collection but also expanded access. If we could link from digitized 
copies to library collections that would be a huge gain. It solves the 
wheat/chaff problem, although not the ranking one. The problem there is 
matching works/expressions (ISBN is not good enough).
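To make the matching problem concrete, here is a minimal Python sketch of
one naive alternative to ISBN matching: comparing a normalized main title
plus author surname. It is not the method used in the project described
above. The function names are hypothetical, and it assumes authors are
recorded as "Surname, Forename". A key this crude will produce both false
matches and misses, which is part of why the problem is hard.

```python
import re
import unicodedata
from typing import List, Tuple

Record = Tuple[str, str]  # (title, author) -- assumed record shape


def _normalize(text: str) -> str:
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(title: str, author: str) -> Tuple[str, str]:
    """Hypothetical work-level key: main title plus author surname."""
    # Drop subtitles and statements of responsibility after : ; or /
    main_title = re.split(r"[:;/]", title, maxsplit=1)[0]
    # Assumes "Surname, Forename[, dates]" ordering.
    surname = author.split(",")[0]
    return (_normalize(main_title), _normalize(surname))


def overlap(library_records: List[Record],
            open_records: List[Record]) -> List[Record]:
    """Return library records whose key also appears among the open records."""
    open_keys = {match_key(t, a) for t, a in open_records}
    return [(t, a) for t, a in library_records if match_key(t, a) in open_keys]
```

Here "Moby-Dick; or, The Whale" by "Melville, Herman" and "Moby Dick" by
"Melville, Herman, 1819-1891" reduce to the same key, even though their
ISBNs (if any) would differ across editions.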
Anyway, onward - and if anyone wishes to manage a project, please post 
widely as I think a crowd-sourced solution is much needed.
kc
On 3/26/20 2:12 PM, Federico Leva (Nemo) wrote:
...
Karen Coyle, 26/03/20 17:44:
...
Unfortunately, until someone turns this into a library it's just a 
random pile of books.
I think the general idea is that archive.org is indeed the "pile of 
books" while the actual library (aspirationally) is openlibrary.org. 
Looking at the collection on archive.org is like looking at the 
compactus room or the inventory books.
Federico
