I feel I should clarify here. Most editors do not gender-identify in a
public manner on projects. There aren't many who have "This user is
female/male" userboxes (in fact, most editors don't have userboxes). They
don't use the male/female contributor categories. We cannot be certain how
many people choose to use gender-specific userpages on the projects that
can differentiate between male and female users.

That is completely separate from the editor surveys, individual results of
which are non-public. I'm hard pressed to suggest that people are
misidentifying their gender there any more than they might in any other
survey process (which typically comes with disclaimers such as
"accurate within 1%, 19 times out of 20").

Laura is proposing to build a dataset from publicly accessible
information, and my comment relates to what information she will be able to
derive from the publicly stated genders of the users working in the
research topic area.

Risker/Anne