It is a little while ago now that I was asked, by Roberta Wedge, about the "missing women" on the English Wikipedia, who are identified in the Oxford Dictionary of National Biography (ODNB). I'm glad to say that, with a recent technical development, I now have a list.
(Recent means "since this morning", by the way - thanks as ever to Magnus Manske for his assiduous pushing of the tech support envelope for Wikidata. The new PagePile tool at https://tools.wmflabs.org/pagepile/ filled the final gap.)
So I can reveal a list of 2,345 women with their Wikidata IDs at https://www.wikidata.org/wiki/User:Charles_Matthews/PagePile
This comes out as pile 188 for PagePile. If you look at the wikitext on my page, the data is actually in two columns, Wikidata ID followed by name. (This is tabbed data, though the first four or so tabs seemed not to come out straight. Doubtless a teething problem.) It was a question of combining two Wikidata queries with the new tool, which generally speaking handles and combines lists in a clean way that wasn't previously available.
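As a rough illustration of the kind of list combination described above, here is a minimal Python sketch. It assumes the two query results have already been exported as tab-separated text (Wikidata ID, then name). The sample rows, the placeholder IDs and the function names are all made up for illustration, and this is not a description of how PagePile itself works.

```python
def parse_tsv(text):
    """Parse 'QID<TAB>name' lines into a dict keyed by Wikidata ID."""
    items = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        qid, _, name = line.partition("\t")
        items[qid] = name
    return items


def missing_items(candidates, have_article):
    """Return (ID, name) pairs from candidates whose ID is absent from have_article."""
    missing_ids = set(candidates) - set(have_article)
    # Sort by the numeric part of the ID so the output order is stable.
    return sorted(
        ((qid, candidates[qid]) for qid in missing_ids),
        key=lambda pair: int(pair[0][1:]),
    )


# Hypothetical sample data with placeholder IDs, not real query output.
women_with_odnb_id = parse_tsv(
    "Q100001\tExample Person A\nQ100002\tExample Person B\nQ100003\tExample Person C\n"
)
with_enwp_article = parse_tsv("Q100002\tExample Person B\n")

for qid, name in missing_items(women_with_odnb_id, with_enwp_article):
    print(f"{qid}\t{name}")
```

Keying on the Wikidata ID rather than the name avoids false matches between people who share a name.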
So much for the technical advance. A few comments on the content. These are females with an OBIN (Oxford biographical identifier), rather than those with an article to themselves. They are not automatically notable, therefore. "Missing" is relative: Myra Hindley is on the list, but that is because the name is a redirect on enWP.
Also, this list is not warranted complete, since we are aware of a couple of percent of OBINs that do not yet have a Wikidata entry. Around 150 or 200 more names may have to be added, in the end.
All said, however, this is the first sight of a realistic list of the ODNB missing women. Translation lists, such as those with existing articles on the German or French Wikipedias, can easily be added.
Charles
On Thu, Jul 30, 2015 at 2:13 PM Charles Matthews <charles.r.matthews@ntlworld.com> wrote:
This comes out as pile 188 for PagePile. If you look at the wikitext on my page, the data is actually in two columns, Wikidata ID followed by name. (This is tabbed data, though the first four or so tabs seemed not to come out straight. Doubtless a teething problem.)
FWIW, they are perfectly fine, but because the first three item IDs are one digit shorter, the tab display in the browser "shifts". You can make them display on-wiki by wrapping the entire block in <pre>. I'll throw in an "export to wikitext" for "get item names" soon.
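The "shift" can be reproduced with a small Python sketch. It assumes tab stops every 8 columns (a common default) and uses placeholder IDs of 7 and 8 characters; the real IDs and the browser's actual tab width may differ.

```python
# Placeholder rows: the first ID is one character shorter than the second.
rows = ["Q123456\tFirst Name", "Q1234567\tSecond Name"]


def name_column(row, tabsize=8):
    """Column at which the text after the tab starts once tabs are expanded."""
    qid, _, _ = row.partition("\t")
    return len((qid + "\t").expandtabs(tabsize))


for row in rows:
    # expandtabs(8) mimics a renderer with tab stops every 8 columns.
    print(row.expandtabs(8))
    print("name starts at column", name_column(row))
```

A 7-character ID leaves room before the first tab stop, while an 8-character ID already fills it, so its tab jumps to the next stop. The data is the same either way; only the rendering differs.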
<sigh> I have a little list....
https://meta.wikimedia.org/wiki/User:Rich_Farmbrough/ODNB_articles_on_women_... (needs splitting up because it's so large).
Unfortunately I am prohibited from putting this on English Wikipedia.
It has been intimated that were I to encourage anyone else to do so it would be equivalent to doing so myself.
I shall have to have a word with The Powers That Be, I suppose.
All the best.
Richard.
Hi Richard,
Your list seems to have some technical problems: most of the entries seem to have many red crosses next to them (for no apparent reason), and the rest seem to blindly link to https://meta.wikimedia.org/wiki/User:Rich_Farmbrough/ODNB_entry Can you fix those issues? If so, it sounds like a useful list that I'd be happy to copy over to enwp if there are editors who would find it useful.
Thanks, Mike
Wikimedia UK mailing list wikimediauk-l@wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikimediauk-l WMUK: https://wikimedia.org.uk
It's because Rich is using that page as a template to create the entries. I assume it's easier for the script that scrapes the database to just supply parameters to a template. The three crosses (or ticks) are for "confirmed", "linked" and "used".
HTH
I had hoped we were beyond manual lists and English-Wikipedia-only mentality...
I'll see if I can run it against my mix'n'match ODNB set, and cross-check with "female" on Wikidata later.
On Mon, Aug 3, 2015 at 1:33 AM rexx rexx@blueyonder.co.uk wrote:
It's because Rich is using that page as a template to create the entries. I assume it's easier for the script that scrapes the database to just supply parameters to a template. The three crosses (or ticks) are for "confirmed", "linked" and "used".
HTH
Rexx
On 3 August 2015 at 08:04, Magnus Manske magnusmanske@googlemail.com wrote:
I had hoped we were beyond manual lists and English-Wikipedia-only mentality...
They still have their uses, e.g.
https://en.wikipedia.org/wiki/User:Charles_Matthews/Longer_DNB
and
https://en.wikipedia.org/w/index.php?title=Wikipedia:WikiProject_Dictionary_...
within the "DNB universe" alone. But I can testify that these are labour-intensive efforts. Putting data into Wikidata, and pulling it out in a custom fashion via queries, is hugely slicker.
Charles
Richard,
I tried to extract the names from your list (which was a challenge, given that some are stated with "name", some with "fornames", some "forenames", "surname", not always in that order, "alt name"...)
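For anyone wanting to repeat this kind of extraction, a minimal Python sketch follows. It is not the script actually used here. It assumes each entry is a single `{{User:Rich Farmbrough/ODNB entry|...}}` call with no nested templates or piped links inside the parameter values, and it only handles the "name", "forenames"/"fornames" and "surname" fields mentioned above. The second sample entry is hypothetical.

```python
import re

# Matches one template call and captures everything after the first pipe.
TEMPLATE_RE = re.compile(
    r"\{\{User:Rich Farmbrough/ODNB entry\|(.*?)\}\}", re.DOTALL
)
FORENAME_KEYS = ("forenames", "fornames")  # both spellings occur in the list


def parse_params(body):
    """Split 'key=value|key=value' into a dict, ignoring unnamed parts."""
    params = {}
    for part in body.split("|"):
        key, sep, value = part.partition("=")
        if sep:
            params[key.strip().lower()] = value.strip()
    return params


def display_name(params):
    """Prefer an explicit 'name'; otherwise join forenames and surname."""
    if params.get("name"):
        return params["name"]
    forenames = next((params[k] for k in FORENAME_KEYS if params.get(k)), "")
    return " ".join(p for p in (forenames, params.get("surname", "")) if p)


def extract_names(wikitext):
    """Return one display name per template call, in document order."""
    return [
        display_name(parse_params(m.group(1)))
        for m in TEMPLATE_RE.finditer(wikitext)
    ]


sample = (
    "# {{User:Rich Farmbrough/ODNB entry|image=1|known for=army medical officer "
    "and transvestite|born=c.1799|died=1865|forenames=James |surname=Barry}}\n"
    # Hypothetical entry: surname first, with the alternative field spelling.
    "# {{User:Rich Farmbrough/ODNB entry|surname=Example|fornames=Ann}}\n"
)
print(extract_names(sample))
```

Looking fields up by key rather than by position is what copes with parameters appearing in different orders. The "alt name" field is ignored in this sketch.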
After some tweaking, I ended up with 6570 names (some of them double, see above).
Of these, 3481 matched the ODNB names in mix'n'match.
Of these, 24 are not women on Wikidata. I made a PagePile for those: https://tools.wmflabs.org/pagepile/api.php?id=251&action=get_data&fo...
It appears that at least some of the women on your list are not women. Example from your list:
# {{User:Rich Farmbrough/ODNB entry|image=1|known for=army medical officer and transvestite|born=c.1799|died=1865|forenames=James |surname=Barry}}
"transvestite" does, AFAIK, not qualify as "woman"; even if so, it would be more a transgender case?
Extrapolating from this, there would be less than 50 entries in your list (if one could extract the proper ODNB names) that are not marked as women in Wikidata, most of them correctly so.
On the other hand, there are 6241 Wikidata items with an ODNB ID that are marked as women [1]. Compared with your 6570, that would mean at least 329 women on Wikidata are not marked as such, or do not exist at all.
All of the ODNB items on Wikidata that are marked as human have a gender assigned, so unless something/someone went very wrong, missing gender assignment is not the issue. There are also no ODNB items that do not have an "instance of".
Which leaves these explanations:
* There are ~330 women in your list that are neither in Mix'n'match, nor in Wikidata
* There are ~330 bogus women in your list
* Some combination of the above
As women make up ~10% of the ODNB, 330 missing women in Wikidata would mean we are missing a set of at least 3000 ODNB entries somewhere.
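The arithmetic behind that extrapolation can be sketched in Python as below. The function name is invented for illustration, and the calculation assumes that the missing entries have the same ~10% share of women as the ODNB overall, which is an assumption rather than an established fact.

```python
def implied_missing_entries(missing_women, share_of_women):
    """Scale a count of missing women up to a count of missing entries overall."""
    if not 0 < share_of_women <= 1:
        raise ValueError("share_of_women must be in (0, 1]")
    return missing_women / share_of_women


# Names extracted from the list minus Wikidata items with an ODNB ID marked as women.
gap = 6570 - 6241
estimate = implied_missing_entries(gap, 0.10)
print(gap, round(estimate))
```

If the share of women among the missing entries is higher than 10%, the implied number of missing entries is correspondingly lower.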
That's all I could come up with quickly.
Cheers, Magnus
[1] https://tools.wmflabs.org/autolist/?language=en&project=wikipedia&ca...
On Sun, Aug 2, 2015 at 2:24 PM Richard Farmbrough richard@farmbrough.co.uk wrote:
<sigh> I have a little list....
https://meta.wikimedia.org/wiki/User:Rich_Farmbrough/ODNB_articles_on_women_... (needs splitting up because it's so large).
Unfortunately I am prohibited from putting this on English Wikipedia.
It has been intimated that were I to encourage anyone else to do so it would be equivalent to doing so myself.
I shall have to have a word with The Powers That Be, I suppose.
All the best.
Richard.
Hi Magnus,
It appears that at least some of the women on your list are not women. Example from your list:
# {{User:Rich Farmbrough/ODNB entry|image=1|known for=army medical officer and transvestite|born=c.1799|died=1865|forenames=James |surname=Barry}}
"transvestite" does, AFAIK, not qualify as "woman"; even if so, it would be more a transgender case?
If there are cases that are 'disputable', it may be an idea to take a list to the LGBT noticeboard for confirmation. In the case of James Barry, she lived as a man but was born a woman. As this is a historical case, today's terminology (such as 'transgender') does not fit well, but certainly Barry can be accurately identified as a woman and is regularly quoted as having notability for being the first British woman doctor, albeit by deception during her life time.
Fae
On 03/08/15 11:18, Fæ wrote:
In the case of James Barry, she lived as a man but was born a woman. As this is a historical case, today's terminology (such as 'transgender') does not fit well, but certainly Barry can be accurately identified as a woman and is regularly quoted as having notability for being the first British woman doctor, albeit by deception during her life time.
There is a comment here:
https://en.wikipedia.org/wiki/James_Barry
James Barry (surgeon) (c. 1790s–1865), physician with disputed gender
Gordo
On 3 August 2015 at 11:04, Magnus Manske magnusmanske@googlemail.com wrote:
After some tweaking, I ended up with 6570 names (some of them double, see above).
Of these, 3481 matched the ODNB names in mix'n'match.
Of these, 24 are not women on Wikidata. I made a PagePile for those:
https://tools.wmflabs.org/pagepile/api.php?id=251&action=get_data&fo...
It appears that at least some of the women on your list are not women. Example from your list:
# {{User:Rich Farmbrough/ODNB entry|image=1|known for=army medical officer and transvestite|born=c.1799|died=1865|forenames=James |surname=Barry}}
"transvestite" does, AFAIK, not qualify as "woman"; even if so, it would be more a transgender case?
Quite a well-known case:
https://en.wikipedia.org/wiki/James_Barry_(surgeon)
of a woman living as a man.
Extrapolating from this, there would be less than 50 entries in your list (if one could extract the proper ODNB names) that are not marked as women in Wikidata, most of them correctly so.
On the other hand, there are 6241 Wikidata items with an ODNB ID that are marked as women [1]. Compared with your 6570, that would mean at least 329 women on Wikidata are not marked as such, or do not exist at all.
All of the ODNB items on Wikidata that are marked as human have a gender assigned, so unless something/someone went very wrong, missing gender assignment is not the issue. There are also no ODNB items that do not have an "instance of".
Which leaves these explanations:
- There are ~330 women in your list that are neither in Mix'n'match, nor in Wikidata
- There are ~330 bogus women in your list
- Some combination of the above
The first bullet is probably close enough to the truth.
As women make up ~10% of the ODNB, 330 missing women in Wikidata would mean we are missing a set of at least 3000 ODNB entries somewhere.
The inference needs tweaking, though.
Work on the women in the first edition and first two supplements should have ensured that no missing women are from the older half (in which women represent a lower proportion of entries). Allowing for that, we get more like the current estimate of say 1500 to 2000 ODNB entries missing from the first pass. (There is a plan for doing more, BTW, which I have discussed with Andrew Gray.)
Charles
On 3 August 2015 at 11:04, Magnus Manske magnusmanske@googlemail.com wrote:
Of these, 24 are not women on Wikidata. I made a PagePile for those:
https://tools.wmflabs.org/pagepile/api.php?id=251&action=get_data&fo...
Rich's list comes well out of closer examination. Some genders were wrong (turns out Philip or Ray can be a girl). There are confusing duos, a pseudonym, a Celtic case where the ODNB is inexplicit, fictional people, and a barmaid conflated with a ballad about her. All good stuff.
Charles
All the articles are described as "about women" by ODNB themselves. James Barry is generally considered a woman; her motivation for assuming male identity is presumed to be the desire to practice medicine. Women historically have passed as men to join male-exclusive professions, though one might imagine the successes were usually undocumented.
The red crosses are to be used as the items are found, confirmed to be the same person, and the information from the ODNB article is used in the Wikipedia article. Clearly this is something for which the OBIN will be fantastically useful.
The forename, name, nickname, married name, etc. fields are for the template to construct whatever alternative article names or redirects are needed. Human names are complex, and those of the British upper classes doubly so, though at least they use mainly the basic Latin alphabet.
The list is sufficiently long that it would need to be split, presumably by letter, to avoid exceeding the transclude limits.
Thanks for your many and various comments, I hope that covers the queries.
Richard.
wikimediauk-l@lists.wikimedia.org