[WikiEN-l] CAS Discourages Using SciFinder to Help Curate Wikipedia

Steve Summit scs at eskimo.com
Sat Mar 8 16:46:20 UTC 2008


David Gerard wrote:
> http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=997
>
> They have a specific hate-on for Wikipedia:
>
> "Chemical Abstracts Service (CAS) objects to anyone encouraging the
> use of SciFinder - and STN - to curate third-party databases or
> chemical substance collections, including the one found in Wikipedia."

By an interesting coincidence, just this morning I've been working
on my own little database of CAS numbers.  You won't find,
anywhere on the net, a nice little tab-delimited file of chemical
information containing columns for name, formula, CAS number, etc.,
precisely because of this CAS claim on the copyright of their numbers.
I was wondering how long it'd be until CAS complained about Wikipedia.

Whether they're non-profit or not, CAS is every bit as jealous
of its set of numbers as any commercial database company.
The impression I get is that this *is* a significant nuisance for
chemists and other scientists.  Other entities (I can probably
find the details) have attempted to establish their own, freer
sets of unique identifiers for chemical compounds, precisely in
hopes of avoiding the cumbersome restrictions placed on the use
of CAS numbers.  But CAS has sued -- and I think successfully --
to discourage this, claiming either that they own the idea of a
single master database of unique identifiers for chemical
compounds, or that having a competing set of identifiers would
sow confusion.

Whatever the legality of the situation, I'm betting Wikipedia
isn't the first information service to have incurred CAS's wrath.
We can certainly learn from the experiences of others.  For
example, here's a set of URLs I collected a little while ago
which permit free web-based lookup, by name or CAS number, of
information about chemicals, including their CAS numbers:

	http://www.emolecules.com/
	http://www.cdc.gov/niosh/npg/
	http://www.inchem.org/documents/icsc
	http://pubchem.ncbi.nlm.nih.gov/
	http://chemfinder.cambridgesoft.com/

I get the impression that CAS is grudgingly tolerant of services
which allow one-at-a-time, interactive lookup of chemicals, but
what they're adamant about is that no one create a simple
database of chemicals using the CAS number as a primary or
secondary key.  Wikipedia isn't quite one of those, of course,
but you could do a decent job of creating one by writing a script
to go through every article in [[:Category:Chemical compounds]]
and extract the relevant information from the {{Chembox}} at the
top -- which is precisely what I've currently got a script doing
in the background.
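
For illustration only, here is a minimal Python sketch of the
extraction step such a script might perform on one article's
wikitext. It is not the script mentioned above. The parameter names
(Name, Formula, CASNo), the flat "| key = value" layout, and the
helper names are all assumptions; real {{Chembox}} templates are
nested and vary between articles, and fetching the category's
articles is left out entirely.

```python
import re

# Hypothetical parameter names; real {{Chembox}} templates vary.
FIELDS = ("Name", "Formula", "CASNo")


def extract_chembox_fields(wikitext, fields=FIELDS):
    """Return {field: value} for simple '| key = value' lines.

    Assumes each parameter sits on its own line; a missing
    parameter yields an empty string.
    """
    found = {}
    for field in fields:
        pattern = (
            r"^[ \t]*\|[ \t]*" + re.escape(field) + r"[ \t]*=[ \t]*(.*?)[ \t]*$"
        )
        match = re.search(pattern, wikitext, flags=re.MULTILINE)
        found[field] = match.group(1) if match else ""
    return found


def to_tsv_row(record, fields=FIELDS):
    """Join the extracted values into one tab-delimited line."""
    return "\t".join(record[f] for f in fields)


# Simplified, made-up template text for demonstration.
sample = """{{Chembox
| Name = Ethanol
| Formula = C2H6O
| CASNo = 64-17-5
}}"""

if __name__ == "__main__":
    print(to_tsv_row(extract_chembox_fields(sample)))
```

Run over every article in the category, the rows would add up to
the kind of tab-delimited file described earlier.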

I work with a guy who used to work at CambridgeSoft; on Monday
I'll ask him what he knows about the CAS situation.


