Hi BHL Wiki Workers,
Thank you for letting me sit in on Monday’s meeting, and thanks again to @Tiago Lubiana for allowing time to discuss the inclusion of BHL’s Acknowledgment of Harmful Content on BHL’s meta wiki page: https://meta.wikimedia.org/wiki/Biodiversity_Heritage_Library/Harmful_Conte….
I submitted a Digital Library Federation (DLF) Forum Lightning Talk proposal on this topic. If you would be so kind as to take a few minutes, I would appreciate your vote to help my talk get selected.
Please go to https://airtable.com/applZThXOH2co8de5/shr2l3Jv2kMTOYfXm and vote for “Upholding truth (and sanity) through Wiki: securing BHL's Acknowledgment of Harmful Content”.
Abstract and Proposal submitted
Bianca Crowley
Smithsonian Libraries and Archives, Biodiversity Heritage Library, United States of America
Brief Abstract
The semantic web offers a safe-harbor for language subject to new U.S. government imposed scrutiny. By harnessing the linked data environment, organizations like the Biodiversity Heritage Library (BHL) can secure anti-racist statements in democratized knowledge spaces that enable contextualization, multilingual translation, and uphold truth (and sanity) for historical accuracy.
Full Proposal
The volatile U.S. political climate spurred by the new administration has required federal institutions to reconsider strategic goals under a different lens that is explicitly prohibitive about language and particular institutional values. Headquartered at Smithsonian Libraries and Archives, the Biodiversity Heritage Library (BHL) is the largest free and open access digital library of biodiversity relevant publications and archival materials. Its 63 million+ pages document over 500 years of natural science knowledge, a portion of which describes the origins, justifications, and consequences of “scientifically proven” White male supremacy. Published in September 2021, BHL’s Acknowledgement of Harmful Content is an example of language subject to U.S. government imposed scrutiny. By placing a copy of this language on Wikimedia’s Meta Wiki, BHL seeks to secure its acknowledgment from potential revision or removal, facilitate multilingual translation of the text, provide space for critique, and permit concepts described in the acknowledgment to be semantically connected to the historical contexts, scholarly research, and evidence that accurately demonstrate why the acknowledgment is necessary and long overdue. As an organizational value statement embedded within a semantic web environment, BHL’s acknowledgment becomes part of a critical digital infrastructure that is informing search results, data networks, and machine learning. This talk will discuss how the Wikimedia Foundation’s semantic web implementation is effective for democratizing access to knowledge and providing invaluable context via linked open data that can uphold truth (and sanity) for historical accuracy.
Thank you for your consideration,
Bianca
--
Bianca Crowley
Digital Collections Manager, Biodiversity Heritage Library
she/her/hers
phone: 202.633.2239
crowleyb(a)si.edu<mailto:crowleyb@si.edu>
biodiversitylibrary.org<https://biodiversitylibrary.org/>
Smithsonian Libraries and Archives
LibrariesArchives.si.edu<https://librariesarchives.si.edu/>
Hello, everyone, here is what happened in the last 2 weeks (and a few
days):
(adapted from
https://meta.wikimedia.org/wiki/BHL/Our_outcomes/WiR/Status_updates/2025-04…)
04 April 2025 - 21 April 2025
This biweekly update took a bit longer due to (1) me getting a nasty flu
last week and (2) Easter celebrations around Brazil. Better late than
never, so here it is!
General Updates
- *We got a session approved for Living Data 2025
<https://www.livingdata2025.com/>*: Anabela Plos (GBIF Argentina) and I are
organizing the session *Wikimedia and Biodiversity Data: A Mutualistic
Relationship in the Open Knowledge Ecosystem* at Living Data in October.
The session description says that it *explores the intersection of Wikimedia
projects (Wikipedia, Wikidata, and Commons) with global biodiversity
infrastructures like GBIF, iNaturalist, and the Biodiversity Heritage
Library (BHL), emphasizing their alignment for data mobilization,
standardization, and public engagement and knowledge dissemination (...)*.
The session might run anywhere from 60 to 120 minutes in total. *Abstracts
are super welcome, including virtual presentations*, as Living Data will
be a hybrid conference. Registration is already open
<https://www.livingdata2025.com/registration.html>. Note, though, that
there is a registration fee for virtual participants (USD 120).
- *Grant submitted for the Wikimedia Research Fund 2025*: Titled
*Biodiversity
Knowledge Gaps on Wikipedia: A Cross-Lingual Analysis of Species Coverage
and Contribution Patterns*, the grant is a possible way to try to extend
this work. It is a 9-month research project to investigate the flow of
biodiversity content across the Portuguese, Spanish and English Wikipedias.
It is not directly about BHL, but it has tight links to strategies for making
BHL content reach wider audiences. I also made the 12-page project proposal
available on Zenodo <https://zenodo.org/records/15236084>. The grant could
also fund my travel to Living Data; let's see how it goes!
Technical Updates
- *Reporting the WiR work as a paper:* I have been investigating
the Biodiversity
Data Journal <https://bdj.pensoft.net/about#Author-Guidelines> as a
venue for publishing a little article about the Wikimedian-in-Residence
process. I lean towards a *Data Paper* treating the BHL-SDC subset as
a dataset in its own right. It would be a matter of extracting the relevant
triples from the Commons + Wikidata pair and making them available for reuse.
Possible uses include machine learning applications (like Mike Trizna
did for the Flickr subset
<https://huggingface.co/spaces/MikeTrizna/bhl_flickr_search>) or
interactive art applications, as suggested in BHL's Annual Meeting. The BHL
Image Explorer <https://bhl-gallery.toolforge.org/> is also, on its own,
an example of reuse of the dataset. Other options for reporting on what we
did include GigaScience <https://academic.oup.com/gigascience> and the RIO
Journal <https://riojournal.com/>, as well as releasing a pre-print in
the ArphaHub <https://preprints.arphahub.com/>.
- *Acknowledgement of Harmful Content*: Due to the pace of changes in
the U.S. federal landscape for DEI support, Bianca raised the possibility
of hosting BHL's Acknowledgement of Harmful Content on the Meta Wiki page.
Benefits include the possibility of translating the information into
multiple languages. I have transcluded it into Meta Wiki
<https://meta.wikimedia.org/wiki/Biodiversity_Heritage_Library/Harmful_Conte…>
as a fork of the content on the BHL website; we should probably discuss
whether to make this the main source of the information, so that the two
copies do not drift apart over time.
- *One more QuickStatements generator*: The BHL Title QuickStatements
generator <https://bhl-qs-generator-production.up.railway.app/> is
online. Rod Page’s excellent bhl2wiki <https://bhl2wiki.herokuapp.com/>
should still be the go-to tool for adding BHL DOIs to Wikidata. This tool
is just slightly different: it (1) is built in Python/Flask; (2) looks up
BHL authors reconciled to Wikidata via BHL creator ID (P4081)
<https://www.wikidata.org/wiki/Property:P4081>; (3) adds BHL bibliography
ID (P4327) <https://www.wikidata.org/wiki/Property:P4327>; (4) uses "written work"
<https://www.wikidata.org/wiki/Q47461344> as the instance of (P31)
<https://www.wikidata.org/wiki/Property:P31> value; (5) includes an API
endpoint at /api/quickstatements; and (6) uses the new multiple-language
system
<https://www.wikidata.org/wiki/Help:Default_values_for_labels_and_aliases>
for the titles. Eventually, though, there should be only one *bhl2wiki*
tool. Merging them will need some coordination and will take more time than
this quick workaround; I apologize for any confusion.
- *The Wikimedia Hackathon*: I am joining the Wikimedia Hackathon
<https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2025> from May 2nd
to May 4th in Istanbul, Turkey. It should be fun. I still haven't picked a
hackathon project. I might leverage the gathering of knowledgeable
tech-savvy Wikimedians to try and start the tool for direct upload of
images from BHL --> Commons. There are many possible projects, though, and
other ideas may appear!
- *Cleaning code*: I kept working on cleaning the code for the
structured data uploads
<https://github.com/lubianat/bhl_sdc_data_curation>, making it more
readable and reusable in the future. The idea is that after this
Wikimedian-in-Residence contract is finished, the script is ready for
bite-sized, fun, volunteer-driven contributions (by myself and others).
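To make the QuickStatements generator item above a bit more concrete, here is a minimal sketch of the kind of commands such a tool could emit for one new title. This is not the tool's actual code: the function name, its parameters, and the example values are hypothetical, and the use of `Lmul` for the multiple-language label is an assumption about how QuickStatements accepts the "mul" language code. Only the property and item IDs mentioned above (P31, Q47461344, P4327) plus the standard author property (P50) are used.

```python
def build_quickstatements(title, bibliography_id, author_qids=()):
    """Build QuickStatements V1 commands (tab-separated) for one new item.

    Hypothetical helper for illustration only.
    title           -- title string, used as the "mul" (default values) label
    bibliography_id -- BHL bibliography ID, stored with P4327
    author_qids     -- Wikidata QIDs of authors already reconciled
                       (for example via BHL creator ID, P4081)
    Note: titles containing double quotes are not handled in this sketch.
    """
    rows = [
        "CREATE",
        # Assumption: "Lmul" sets the label in the "mul" language code.
        f'LAST\tLmul\t"{title}"',
        # instance of (P31) -> written work (Q47461344)
        "LAST\tP31\tQ47461344",
        # BHL bibliography ID (P4327), an external identifier (string value)
        f'LAST\tP4327\t"{bibliography_id}"',
    ]
    for qid in author_qids:
        # author (P50) pointing at an existing Wikidata item
        rows.append(f"LAST\tP50\t{qid}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Placeholder values only; Q4115189 is the Wikidata Sandbox item.
    print(build_quickstatements("Example title", "12345", ["Q4115189"]))
```

The author lookup is left out on purpose: in this sketch the caller passes QIDs that were already resolved, which keeps the command building separate from any network access.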
And that is it! Thank you again for reading!
Have a great week,
Tiago
*——————————————————————————*
*Tiago Lubiana*
*Wikimedian-in-Residence, Biodiversity Heritage Library
<https://www.biodiversitylibrary.org/>*
*tiago.bio.br <https://tiago.bio.br>*
Hi, everyone, here is what happened in the last 2 weeks of our Wiki work:
(adapted from
https://meta.wikimedia.org/wiki/BHL/Our_outcomes/WiR/Status_updates/2025-04…)
21 March 2025 - 04 April 2025
There is only *one month left* and I am already missing it, though I will
definitely be around in the BHL-Wiki Working Group! The updates for these
two weeks are shorter, as the focus was entirely on the events and on
documenting/cleaning up the code.
General Updates
- *We hosted three 1Pic1Bio events* to increase usage of BHL images on
Wikipedia. The events took place in Spanish on March 26
<https://meta.wikimedia.org/wiki/Event:1Pic1Bio_(Spanish)>, in French on
March 28 <https://meta.wikimedia.org/wiki/Event:1Pic1Bio_(French)> and
in Portuguese on April 2
<https://meta.wikimedia.org/wiki/Event:1Pic1Bio_(Portuguese)> and were
hosted by me, Giovanna, Siobhan and Lidia <https://lidiapv.com/>. Each
event took around 1h45 and brought different insights into the
relationship between Wikimedia communities and BHL. As the events were related
to equity, diversity and inclusion, they were sponsored by the Wikimedia
Foundation but not by the Smithsonian, because of the need to comply with
U.S. federal orders. The events were recorded via Zoom and the recordings will
be made available by WMF in the near future.
Thanks to the events, this curious killer whale illustration
<https://commons.wikimedia.org/wiki/File:Gemeinn%C3%BCzzige_Naturgeschichte_…>
from 1780 now decorates the section on "killer whales through history
<https://pt.wikipedia.org/wiki/Orca>" in the Portuguese Wikipedia.
- *The BHL Day and Annual Meeting are coming up next week* and Siobhan
and Sabine will be in Berlin to discuss all kinds of nice Wiki things. I'll
join them virtually for a 45-minute workshop on the morning of April 10
discussing reuse of the BHL collection. If you are in Berlin, join us!
Technical Updates
- *BHL Image Explorer*: The events were also great testing grounds for
the BHL Image Explorer <https://bhl-gallery.toolforge.org/>, and now it
is moving towards becoming a mature tool to explore BHL Images in Commons.
There were several technical changes based on user feedback / pain points,
including:
- *Wikidata autocomplete*: the taxon search now uses Wikidata as a
backend for the taxon selection. This means searching for common names
(e.g. *Baobab* instead of *Adansonia digitata*) is possible, as the
tool uses the full information on Wikidata to suggest candidates. Only
candidates with GBIF identifiers are shown.
- *Click-based navigation*: I have added a clickable taxonomic
hierarchy box, enabling users to navigate to higher taxa by clicking around
(e.g. if I am on the page for *Adansonia digitata
<https://bhl-gallery.toolforge.org/?taxonKey=5406695>* I can go to
the order *Malvales <https://bhl-gallery.toolforge.org/?taxonKey=941>*
with a single click). The species names are also clickable, redirecting the
user to the BHL Explorer page for the species.
- *Distribution map*: A map of GBIF occurrences is displayed next to
the taxonomy, giving some visual feedback on where a taxon is expected to
occur. This may help users tell whether a species is merely *present*
in South America or *exclusive* to South America, for example.
- *Wikimedia reuse counter*: Image boxes now show a Wikimedia reuse
counter for each of the images, giving an idea of their impact inside
the Wikimedia ecosystem. Most images have 0 global Wikimedia uses, so there
are a lot of opportunities for volunteers! :)
- *Refactor and clean-up*: A core activity now is a bit invisible: going
through the code base and making sure it is reusable in the future. I am
focusing the work on two repositories, one for the BHL Image Explorer
<https://github.com/lubianat/bhl-gallery> and the other one for the scripts
to update structured data on Commons
<https://github.com/lubianat/bhl_sdc_data_curation>.
- *Internet Archive / ARCH*: On March 25, I met with Karl Blumenthal
from the Internet Archive to discuss possible uses of ARCH to improve
HTR/OCR and taxonomic name recognition for plates in the BHL Image
collection. I have had no follow-up from him since then, but it may
eventually lead to something.
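The Wikidata autocomplete described above (search by any name, keep only candidates with GBIF identifiers) could be sketched roughly as follows. This is an illustration under stated assumptions, not the BHL Image Explorer's own code: the function names are hypothetical, the lookup is assumed to go through the public Wikidata API modules `wbsearchentities` and `wbgetentities`, and "GBIF identifier" is assumed to mean the GBIF taxon ID property (P846).

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
GBIF_TAXON_ID = "P846"  # assumption: the GBIF identifier used for filtering


def _get_json(params):
    """Call the Wikidata API with the given query parameters."""
    url = WIKIDATA_API + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "taxon-autocomplete-sketch/0.1"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def gbif_ids(entity):
    """Return the GBIF taxon IDs found in one wbgetentities entity dict."""
    ids = []
    for claim in entity.get("claims", {}).get(GBIF_TAXON_ID, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            ids.append(snak["datavalue"]["value"])
    return ids


def keep_candidates_with_gbif(search_hits, entities):
    """Keep only the search hits whose entity has a GBIF taxon ID."""
    kept = []
    for hit in search_hits:
        ids = gbif_ids(entities.get(hit["id"], {}))
        if ids:
            kept.append(
                {"qid": hit["id"], "label": hit.get("label", ""), "taxonKey": ids[0]}
            )
    return kept


def suggest_taxa(text, language="en", limit=10):
    """Search Wikidata labels and aliases, then filter by GBIF taxon ID."""
    hits = _get_json({
        "action": "wbsearchentities", "search": text, "language": language,
        "type": "item", "limit": limit, "format": "json",
    }).get("search", [])
    if not hits:
        return []
    entities = _get_json({
        "action": "wbgetentities", "ids": "|".join(h["id"] for h in hits),
        "props": "claims", "format": "json",
    }).get("entities", {})
    return keep_candidates_with_gbif(hits, entities)
```

Calling `suggest_taxa("Baobab")` would query Wikidata live, so results depend on the current state of Wikidata. The filtering step is kept in its own function so it can be checked without network access.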
And that is it; see you in a couple of weeks. For those going to
Berlin, have a safe trip and a great event!
Cheers,
Tiago