Hello, all,
Here is what has happened in the Wikimedian-in-Residence work over the past two weeks, 24 January 2025 - 07 February 2025 (transcluded from the Meta-Wiki page https://meta.wikimedia.org/wiki/BHL/Our_outcomes/WiR/Status_updates/2025-02-07 ):
This report will be shorter than the last one https://meta.wikimedia.org/wiki/BHL/Our_outcomes/WiR/Status_updates/2025-01-24, just to keep everything concise:
https://meta.wikimedia.org/wiki/File:Histoire_naturelle_des_perroquets_(9949396895).jpg
A *perroquet* (Amazona aestiva https://www.wikidata.org/wiki/Q525897), now with taxonomic links to Wikidata on Commons https://commons.wikimedia.org/wiki/File:Histoire_naturelle_des_perroquets_(9949396895).jpg#P180 based on Flickr tags. The BHL OCR https://www.biodiversitylibrary.org/page/40064449 did not contain the information. An opportunity for round-tripping?
- After suggestions at the PIWG meeting, a new BHL Creator ID Mix'n'Match catalog https://mix-n-match.toolforge.org/#/catalog/6686 is now available on Wikidata. I have streamed https://www.youtube.com/watch?v=T-q8vgVOrQM&t=3709s a bit of the workflow for working with it.
- The metadata model now has a cardinality column https://docs.google.com/spreadsheets/d/1ocqDQBFaKAQvPsP3HMlrh52faiHiaDU-D9P3yz1oV_M/edit?gid=0#gid=0, i.e. how many such values are expected per item. Thank you, Susan, for the idea!
- The bot code for adding metadata to Commons https://github.com/lubianat/bhl_sdc_data_curation now includes parsing the Flickr API and reconciling to Wikidata to get "depicts" statements with "prominent" rank, facilitating downstream reuse on Wikipedia.
- As we focus on South American and African taxa, a question arises: how do we determine that a given taxon occurs in those regions? The current workflow goes like this:
1. Add structured metadata, including depicts (P180) https://www.wikidata.org/wiki/Property:P180 statements, connecting images to Wikidata.
2. Add taxon range (P9714) https://www.wikidata.org/wiki/Property:P9714 values to taxa on Wikidata from credible sources (maybe IUCN, GBIF or others, e.g. this batch for South American birds https://quickstatements.toolforge.org/#/batch/243501).
3. Combine both pieces of information to infer categorization to the South America https://commons.wikimedia.org/wiki/Category:Files_from_the_Biodiversity_Heritage_Library_in_South_America and Africa https://commons.wikimedia.org/wiki/Category: categories.
4. Add categories in a batch using pywikibot https://www.mediawiki.org/wiki/Manual:Pywikibot/category.py.
This means that categorization *may be done after the SDC is added*, which greatly widens the range of target works. For that reason, I am switching to larger-scoped works with many images, e.g. reviews of taxonomic groups such as Histoire naturelle des perroquets (1805) https://docs.google.com/document/d/1yVmbbK5wXQKjHQI96nq2vRSaZldN00hD9C3vYACmegE/edit?tab=t.0#heading=h.qmeubp341147. *Tell me your favorite BHL work, and I'll be happy to work on it!*
There is a Google Spreadsheet https://docs.google.com/spreadsheets/d/1YhMSb_iBylJaWPX37kZbVzdyWoFidT9a31Pl0oY3buc/edit?gid=0#gid=0 containing the case study documents and other metadata about the ongoing uploads.
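For readers who like to see it in code, step 3 of the workflow above (combining depicts and taxon range to infer categories) could look roughly like the sketch below. This is only an illustration and not the code in the repository: the function name `infer_categories` and the three input dictionaries are hypothetical, the example data is hard-coded rather than queried from Commons or Wikidata, and only the South America category is mapped because it is the one named above.

```python
# Minimal sketch: infer Commons categories from depicts (P180) + taxon range (P9714).
# All names and data here are illustrative assumptions, not the actual bot code.

SOUTH_AMERICA_QID = "Q18"  # Wikidata item for South America

def infer_categories(depicts_by_file, range_by_taxon, category_by_region):
    """Return {file title: sorted list of inferred categories}.

    depicts_by_file:    file title -> list of taxon QIDs (from P180 on Commons)
    range_by_taxon:     taxon QID -> list of region QIDs (from P9714 on Wikidata)
    category_by_region: region QID -> Commons category name
    """
    result = {}
    for file_title, taxa in depicts_by_file.items():
        categories = set()
        for taxon in taxa:
            for region in range_by_taxon.get(taxon, ()):
                category = category_by_region.get(region)
                if category:
                    categories.add(category)
        if categories:
            result[file_title] = sorted(categories)
    return result

# Illustrative input: one file depicting Amazona aestiva (Q525897).
depicts_by_file = {
    "File:Histoire naturelle des perroquets (9949396895).jpg": ["Q525897"],
}
range_by_taxon = {"Q525897": [SOUTH_AMERICA_QID]}
category_by_region = {
    SOUTH_AMERICA_QID:
        "Category:Files from the Biodiversity Heritage Library in South America",
}

inferred = infer_categories(depicts_by_file, range_by_taxon, category_by_region)
for title, categories in inferred.items():
    print(title, "->", categories)
```

Files whose depicted taxa have no taxon range yet are simply skipped, so the same script can be rerun as more P9714 values land on Wikidata.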
- The bottlenecks for uploads seem to be (1) sorting the images into the correct volumes on Commons + Wikidata and (2) figuring out who *actually* illustrated/painted/engraved the images. I am trying to automate some of the steps https://github.com/lubianat/bhl_sdc_data_curation/blob/main/volume_creation_app/app.py, but it is still time-consuming. It is fun, though.
- There may be a case for a more automated approach that sticks to the super basics, i.e. adding depicts (P180) https://www.wikidata.org/wiki/Property:P180 (inferred from Flickr) and BHL page ID (P687) https://www.wikidata.org/wiki/Property:P687 .
- A bot request https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/TiagoLubianaBot_5 is under way on Wikidata to add illustrations to up to 30,000 taxon items. This may significantly increase the BHL image presence on Wikidata (currently around 15,700) and Wikipedia (currently around 5,100) https://glamtools.toolforge.org/glamorous.php?doit=1&category=Files+from+the+Biodiversity+Heritage+Library&use_globalusage=1&show_details=1&projects%5Bwikipedia%5D=1&projects%5Bwikibooks%5D=1&projects%5Bwikispecies%5D=1&projects%5Bwikidata%5D=1&projects%5Bwikiversity%5D=1 .
- On Feb 14, I'll present the BHL work at the Wikimedia Brasil Open Meeting, in Portuguese (link not yet available).
- On Feb 20, I'll be answering questions in an Ask Me Anything format on the BHL Staff call, and there is an open working document https://docs.google.com/document/d/1aZv-fJlF3lD16oocAS9BgFfKLK1acyO-JP-LuCowi_0/edit?tab=t.0#heading=h.f7oq4smrfwlj .
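As a small illustration of the Flickr-based depicts inference mentioned in the bullets above, here is a rough sketch of pulling taxon names out of Flickr tags and matching them to Wikidata items. It assumes the tags arrive as machine tags of the form `taxonomy:binomial=...`, and it replaces the real Flickr API call and Wikidata reconciliation with a plain list and a hypothetical lookup dictionary (`qid_by_name`); the function names are made up for this example and are not taken from the repository.

```python
import re

# Assumed tag format: taxonomy:binomial=Genus species (optionally quoted).
BINOMIAL_TAG = re.compile(r"^taxonomy:binomial=(.+)$", re.IGNORECASE)

def binomials_from_tags(raw_tags):
    """Extract unique binomial names from a list of raw tag strings."""
    names = []
    for tag in raw_tags:
        match = BINOMIAL_TAG.match(tag.strip())
        if not match:
            continue
        # Drop surrounding quotes and collapse repeated whitespace.
        name = " ".join(match.group(1).strip().strip('"').split())
        if name and name not in names:
            names.append(name)
    return names

def depicts_candidates(raw_tags, qid_by_name):
    """Pair each recognized binomial with a QID from a local lookup table.

    qid_by_name stands in for a real reconciliation step against Wikidata.
    Names without a match are left out rather than guessed.
    """
    return [
        (name, qid_by_name[name])
        for name in binomials_from_tags(raw_tags)
        if name in qid_by_name
    ]

# Illustrative tags and lookup table.
tags = [
    "bhl:page=40064449",
    'taxonomy:binomial="Amazona aestiva"',
    "taxonomy:binomial=Amazona  aestiva",
]
lookup = {"Amazona aestiva": "Q525897"}
candidates = depicts_candidates(tags, lookup)
print(candidates)
```

Each resulting pair would then become one depicts (P180) statement on the file, with the rank set separately.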
And that is it, and thank you once more for reading it. As always, if you have questions or comments, just let me know,
Cheers, Tiago
*——————————————————————————* *Tiago Lubiana* *Wikimedian-in-Residence, Biodiversity Heritage Library https://www.biodiversitylibrary.org/*
*tiago.bio.br https://tiago.bio.br*