On 3/27/20 3:05 PM, Karen Coyle wrote:
Open Library (https://openlibrary.org) has some good features (full
disclosure: I was on the original OL team at the Archive) but it doesn't
solve the wheat/chaff problem - something that all large libraries have.
It also doesn't have a way to provide a useful order of retrievals,
which is also the case for the Google Books site (OCLC uses numbers of
holdings, which is pretty good, but no one else has access to that data).
There are some other metrics one can use for a useful ordering. Number
of views/downloads, which the Internet Archive tracks, is a popular one,
and useful to a point, though it shouldn't be used to the exclusion of all
else, as it can end up hiding useful materials on less popular
topics.
For subject browsing on The Online Books Page, I use some metadata-
based measures to rank within subject categories. Dates are used for
ranking, where boosts are given not just for recency (as also occurs
in many OPACs) but also for temporal proximity to the subject, if
we have relevant dates in the heading. (So books on the American Civil
War published during or near the time of the war get a boost, for
instance.) I also use the ordering of subjects in my records as an
estimate of their importance, so listings for subject X will turn up
books with X as their first subject above books with that as their fifth
subject. (That's one reason I really hope libraries don't drop
support for librarian-assigned subject ordering, as some newer
systems do.) We also cluster similar subjects, an effect that's
most relevant for subjects that don't have many books filed
under them.
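
As a rough illustration of the kind of scoring described above, here is a
minimal Python sketch. It is not the actual Online Books Page code: the
Book record, the function names, the weights (2.0, 1.0, 1.5), the 200-year
recency horizon, and the 25-year proximity window are all assumptions made
up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Book:
    title: str
    year: int
    subjects: List[str]  # kept in the order the cataloger assigned them


def subject_position_score(book: Book, subject: str) -> float:
    """1.0 when the subject is listed first, smaller for later positions."""
    if subject not in book.subjects:
        return 0.0
    return 1.0 / (1 + book.subjects.index(subject))


def recency_score(year: int, current_year: int = 2020, horizon: int = 200) -> float:
    """Linear boost for newer books, reaching 0 at `horizon` years old."""
    age = max(0, current_year - year)
    return max(0.0, 1.0 - age / horizon)


def proximity_score(year: int, span: Optional[Tuple[int, int]], window: int = 25) -> float:
    """Boost for books published during or near the dates in the heading."""
    if span is None:
        return 0.0
    start, end = span
    if start <= year <= end:
        return 1.0
    gap = start - year if year < start else year - end
    return max(0.0, 1.0 - gap / window)


def rank(books: List[Book], subject: str,
         span: Optional[Tuple[int, int]] = None) -> List[Book]:
    """Order the books filed under `subject`, highest combined score first."""
    def score(b: Book) -> float:
        # Hypothetical weights; a real site would tune these by hand.
        return (2.0 * subject_position_score(b, subject)
                + 1.0 * recency_score(b.year)
                + 1.5 * proximity_score(b.year, span))
    return sorted((b for b in books if subject in b.subjects),
                  key=score, reverse=True)
```

With a heading that carries dates, for example calling rank(books,
"United States -- History -- Civil War, 1861-1865", span=(1861, 1865)),
a book published in 1863 with that heading listed first would sort ahead
of a recent book that lists it fifth.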
I also give a boost to "work" clusters. In my case these are created
manually rather than automatically, though in an automated system one
could use the number of editions, or the amount of metadata recorded for
them, as a rough estimate of how important publishers and librarians have
found a work, at least if the clustering is reasonably accurate.
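
For the automated case, one possible shape for such an estimate is
sketched below. The function name, the dictionary-per-edition input, and
the 0.1 weight are assumptions for illustration only; the log is there so
that a work with hundreds of editions does not swamp everything else.

```python
import math
from typing import Dict, List


def work_weight(editions: List[Dict[str, str]]) -> float:
    """Rough importance estimate for one work cluster.

    `editions` holds one metadata dict per edition in the cluster.
    """
    if not editions:
        return 0.0
    # Count non-empty metadata fields across all editions.
    filled = sum(1 for ed in editions for value in ed.values() if value)
    avg_filled = filled / len(editions)
    # Hypothetical combination: edition count (damped) plus metadata richness.
    return math.log1p(len(editions)) + 0.1 * avg_filled
```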
There are other techniques one can use for useful ordering. These
are ones I've found worth implementing on my sites, and could also
be used elsewhere if one saw fit.
John
I would love to see curated collections from these book databases. Open
Library has lists, but they are personal lists and not well managed. How
can we create useful collections from these online materials?
I'll mention that one project I did was comparing the holdings in a
public library to the Open Library open access books so that the library
could offer unlimited access to books where they would generally have
only a few hard copy items. This was in keeping with the sense of their
collection but also expanded access. If we could link from digitized
copies to library collections that would be a huge gain. It solves the
wheat/chaff problem, although not the ranking one. The problem there is
matching works/expressions (ISBN is not good enough).
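
To make the matching problem concrete, here is a minimal Python sketch of
one common fallback when ISBNs are missing or differ between editions:
comparing a normalized title plus author surname. The function names and
the (id, title, author) tuple layout are assumptions for the example, and
this is not how the project above was implemented. A key like this still
produces false matches and misses, and it does not distinguish
expressions such as translations.

```python
import re
import unicodedata
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, str]  # (identifier, title, author)


def _fold(text: str) -> str:
    """Lowercase, strip diacritics, and replace punctuation with spaces."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower())


def work_key(title: str, author: str) -> Tuple[str, str]:
    """Build a crude work-level key from a title and an author string."""
    # Drop subtitles and statements of responsibility after : ; or /
    main_title = re.split(r"[:;/]", title, maxsplit=1)[0]
    words = _fold(main_title).split()
    if words and words[0] in {"a", "an", "the"}:
        words = words[1:]
    # Accept "Surname, Forename, dates" or "Forename Surname".
    if "," in author:
        surname = author.split(",", 1)[0]
    else:
        parts = author.split()
        surname = parts[-1] if parts else ""
    return " ".join(words), " ".join(_fold(surname).split())


def match_holdings(holdings: Iterable[Record],
                   open_books: Iterable[Record]) -> Dict[str, List[str]]:
    """Map each library holding id to the open access ids sharing its key."""
    index: Dict[Tuple[str, str], List[str]] = {}
    for book_id, title, author in open_books:
        index.setdefault(work_key(title, author), []).append(book_id)
    matches: Dict[str, List[str]] = {}
    for holding_id, title, author in holdings:
        key = work_key(title, author)
        if key in index:
            matches[holding_id] = index[key]
    return matches
```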
Anyway, onward - and if anyone wishes to manage a project, please post
widely as I think a crowd-sourced solution is much needed.
kc
On 3/26/20 2:12 PM, Federico Leva (Nemo) wrote:
Karen Coyle, 26/03/20 17:44:
Unfortunately, until someone turns this into a library it's just a
random pile of books.
I think the general idea is that archive.org is indeed the "pile of
books" while the actual library (aspirationally) is openlibrary.org.
Looking at the collection on archive.org is like looking at the
compactus room or the inventory books.
Federico