On Thu, May 8, 2014 at 8:26 PM, phoebe ayers <phoebe.wiki@gmail.com> wrote:
---------- Forwarded message ----------
From: David Gerard <dgerard@gmail.com>
While acknowledging the likely truth of the flaws in scientific knowledge production as it stands (single studies in medicine being literally useless, as 80% are actually wrong) ... I think you'll have a bit of an uphill battle attempting to enforce stronger standards in Wikipedia than exist in the field itself. We could go to requiring all medical sources to be Cochrane-level studies of studies of studies,
That actually is the current best practice for medical articles in English, I believe, and I think it's a good one: https://en.wikipedia.org/wiki/Wikipedia:MEDRS
Indeed so, and I agree it is a good idea.
Sourcing to reviews when possible is particularly relevant for a field (like medicine) that has a well-established tradition of conducting and publishing systematic reviews -- but I find it a useful practice in lots of areas, on the theory that reviews are generally more helpful for someone trying to find out more about a topic.
This is of course part of the same scholarly system that I was referring to earlier in this discussion.
Within Wikipedia, peer-reviewed publications and/or systematic reviews of such studies are considered among the most valuable and high-quality sources. They're a vital building block of the knowledge that Wikipedia seeks to disseminate. We know that all human methods are imperfect; but we're also agreed that the scholarly method is, by and large, superior to other methods of knowledge production.
Now, when I suggested that the Foundation bring these established methods to bear on Wikipedia itself, you (and one or two others) chimed in with concerns about real and potential flaws of scholarly studies and the peer review system. It seemed to me as though underlying these comments there was some sense that, while scholarly methods were good to illuminate any topic under the sun that Wikipedia writes about, they wouldn't be welcome as a method to illuminate Wikipedia itself.
I am well aware of the various documented problems with peer review, and its occasional failures. They haven't led Wikipedia to abandon its view that, by and large, peer-reviewed studies are among the best sources available. So I didn't think your raising problems with aspects of the scholarly method was particularly germane to this discussion of content quality studies. If we didn't believe in the scholarly method, we wouldn't privilege its output in Wikipedia.
Anthony: I hear you about veracity being particularly important in medical articles; and I don't mean to get us too far in the weeds about what quality means -- there's lots to do on lots of articles that I think would be pretty obvious quality improvement, including straight-up fact-checking.
I think any research programme evaluating the quality of Wikipedia content should first and foremost focus on such basics: veracity and fact checking.
Given that the post that started this thread referenced medical content, are you telling me that you think it would be useless to have qualified medical experts reviewing Wikipedia's medical content, because the process would be "opaque, messy, prone to failure and doesn't always support innovation"?
No, that is not what I am saying; and leaping to that conclusion seems like a rather pointy and bad-faith approach, which makes it just that much more of an effort to participate in this conversation -- if you want to have a dialog with other people, please try to be more generous in your assumptions.
I hope I have explained why I reacted the way I did. Your comments led me to believe that you were simply not very keen on Wikipedia being subjected to a test, using the most objective method available.
What I was trying to say is that I don't think your implication that there is already a well-designed solution that will fix all our problems is correct -- both because it's difficult to apply peer review in this context, and because peer review has plenty of problems itself. I think blind-review quality studies can be useful, but I don't think they're a panacea, any more than scholarly peer review is itself a panacea for making sure good scholarly work gets published.
There are well-established methods for assessing the quality of written work. I should think that a team composed of both academics well-versed in study design and statistics and Wikimedians familiar with Wikipedia content would over time be able to come up with a methodology that produces good results in assessing project content in various topic areas against the Wikimedia vision.
Once the basic framework has been established, the academics concerned should be given full intellectual freedom to assess the content as they see fit.
I think such efforts would demonstrate leadership, and reflect well on the Foundation.
Anyway, reviewer studies are one tool for assessing quality, but imho they are mostly good for raising awareness of Wikipedia within a particular field (thus possibly gaining new editors), and occasionally for correcting the few articles that do get reviewed.
Article quality has lots of dimensions, including those that reviewers might look for, and others that might not be apparent:
- factual accuracy -- that seems pretty straightforward, though of course it's not always -- cf historical debates, new evidence coming to light, etc.
- important facets of the topic being highlighted and appropriate coverage -- also pretty straightforward, except when it's not: what if a new and emerging theory isn't noted, or a historical one given short shrift? More to the point for reviewers, what if *my* theory isn't highlighted?
- A good bibliography and references -- I think experts can particularly weigh in on this, though standards vary widely across fields and articles for what gets cited, and what's good/seminal/classic is of course never easy to determine and is always under debate.
For some of these aspects, the Wikimedia movement has standards that could be communicated to reviewers. For example, the requirement that content be neutral, reflect prevalent opinions in proportion to their prevalence in the best sources, and so on – a reviewer should not complain that a theory she or he doesn't like, but which is part of scholarly discourse, is given due visibility in Wikipedia. Failures might occur, but we know no system is perfect. All you can do is impress upon reviewers what ideal you are pursuing, and trust in their intellectual honesty to assess articles in terms of their being an effort to meet that goal.
- clear writing -- sometimes we get accused of being too dry or pedantic, when that's our house style. What to do with this?
- Accessibility -- depends entirely on who is reading it, doesn't it? Are our physics articles accessible to grad students? Usually. Accessible to laypeople or 10th graders? Rarely.
In other areas, like the one you mention here, standards are lacking. I cannot recall the Wikimedia Foundation board ever having provided guidance on whether maths content, say, should be written so that it is helpful to kids doing their schoolwork, to maths students doing their coursework, or to maths professors looking to brush up on an area. This is a point Anne touched upon, and there have been many complaints over the years that some Wikipedia content is not written in a way that would be helpful to the average reader.*
I think this reflects a lack of vision on the part of the Foundation as to what kind of reference work Wikipedia should be. And I believe the reason is that opinions on the matter vary, and that people in the Foundation feel that whatever guidance they might provide on such an issue would be disliked by some section of the Wikipedia community.
(Personally, I think that every maths article should at least have an introduction that a 9th-grader would be able to understand.)
- Answers readers' questions -- hard to know without something like article feedback or another measuring mechanism. The questions of a new student are rarely those of an expert. Using medicine as an example: does the article on cancer answer the questions of doctors, or of newly-diagnosed patients (who are likely to be reading it)? Or the patients' relatives and caregivers? (Or none of the above?)
So yes, we should do reviewer studies to review for "objective" quality. And if we're serious about seeing how our articles meet reader needs [certainly one dimension of quality], we should also do reviewer studies with lots of groups of reviewers (medical experts, high school students, cancer patients!). And we should look at automated quality metrics, since reviewing 31 million articles by hand does not necessarily scale. And we should look into ways to follow up on quality studies with things like to-do lists generated from reviewers, getting people in societies and universities engaged in editing based on the outcome of reviewing, etc. -- so that all of this work has the outcome of measurably improved quality.
Personally (not speaking for the WMF or other trustees here) I think the best thing the WMF can do is provide a platform for this kind of work: yes, we can (and do) fund research studies, but in line with our general mission to provide the infrastructure for the projects to grow on, we can also help build tools to make this work easier, so that groups like Wiki Project Med etc. can get studies done easily as well. And we (the community) should develop a list of tools that those interested in doing this work need and want -- and those tools could be developed anywhere, under the aegis of the WMF or not.
I disagree. I do think the Foundation, and you as a board member, have a responsibility here. The provision of "high quality content" is part of the Foundation's core aims and values.
You have money – about ten times as much as five or six years ago. I would urge you to invest some of it in seeing how good the present system of content production is at delivering content that meets the aspirations expressed in the Foundation's core values.
Gaining objective data on this would be instructive to both the movement and the public, and provide an important stimulus for quality improvement and measurement efforts, and the recruitment of qualified editors. I understand that quantitative metrics (number of edits, editors, articles and page views) are easier to collect, but still find it disappointing that you haven't made more vigorous efforts to evaluate quality, using the input of subject matter experts.
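(To illustrate the gap: even a very crude automated content signal is cheap to compute. The sketch below -- assuming only the public MediaWiki API, with feature weights invented purely for illustration -- counts references, sections and length; its very crudeness shows both why such metrics are tempting to collect and why they are no substitute for expert review.)

import re
import requests

API = "https://en.wikipedia.org/w/api.php"

def fetch_wikitext(title):
    """Fetch the current wikitext of an article via the public MediaWiki API."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    page = requests.get(API, params=params).json()["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]

def crude_quality_score(title):
    """Combine a few cheap wikitext features into one number.
    The weights are invented for illustration only; a real metric would
    need calibrating against human assessments (e.g. WikiProject ratings)."""
    text = fetch_wikitext(title)
    n_refs = len(re.findall(r"<ref[ >]", text))           # inline citations
    n_sections = len(re.findall(r"^==[^=]", text, re.M))  # level-2 headings
    length_kb = len(text) / 1024
    return 2.0 * n_refs + 3.0 * n_sections + 0.5 * min(length_kb, 50)

print(crude_quality_score("Cancer"))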
Identifying problems is useful. Neither the Foundation nor the community should be afraid of some being found, and becoming public. They should be glad, because whenever a problem is brought into focus, it presents the Wikimedia movement with a chance to overcome it, and become a better and more effective movement as a result.
Likewise, when Wikipedians get an article through scholarly peer review, as WikiProject Medicine/Wiki Project Med Foundation have just now managed to do, this motivates further such efforts and fosters learning within the community, based on outside expert input. That is a really good thing, and I would like you to support such grassroots efforts to the best of your ability.
(Off the top of my head, these could include: tools to pull a random/blind sample from a category, perhaps across already-rated articles, that could be replicated across topics to do multiple comparable reviewer studies. Tools to consolidate editor-rating metrics from across languages; maybe representing those ratings in Wikidata. A strong to-do list functionality, and a strong category/quality rating intersection functionality, so that, say, an oncologist interested in working on poor-quality cancer articles could easily get to editing. Displaying all this data easily in the projects, by article. etc. etc.)
These are good ideas that would complement any research initiative undertaken by the Foundation itself. I for one would be happy to see resources invested in pursuing them.
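To take just the first of these tool ideas: pulling a random, blind sample of articles from a category is only a few lines against the standard MediaWiki API. A minimal sketch follows -- the category name and sample size are placeholders, and a full tool would page through the API's continue tokens to exhaust large categories:

import random
import requests

API = "https://en.wikipedia.org/w/api.php"

def category_members(category, limit=500):
    """List article titles in a category (one API page only; a full tool
    would follow 'continue' tokens to cover large categories)."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmnamespace": "0",   # main/article namespace only
        "cmlimit": str(limit),
        "format": "json",
    }
    data = requests.get(API, params=params).json()
    return [m["title"] for m in data["query"]["categorymembers"]]

# Draw a blind sample for reviewers; a fixed seed makes the draw
# reproducible, so parallel studies can review the same articles.
random.seed(42)
sample = random.sample(category_members("Oncology"), 10)
print("\n".join(sample))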
But please consider funding your own content quality research programme, and supporting and encouraging such research being done. The aim in this should not be to be given a clean bill of health that validates the status quo, but the identification of things gone wrong, and improvement opportunities.
Content quality is not the only area worth studying. Wikipedian anthropology and sociology – investigating behavioural patterns, interaction patterns and the effectiveness of administrative structures in the community – would be another worthwhile topic of study. There are plenty of anecdotes of social dysfunction in the community (evidence can be found any day at AN/I, or in documented failures such as that of the Croatian Wikipedia highlighted in the press last autumn).** There have been a handful of studies focused on this area, but I would love to see the Foundation do more to advance and publicise such research.
Basically, I believe there are unexploited potentials here. Academic research programmes would create synergies, lead to an influx of new people and ideas, and vitalise the movement. The media coverage that Wiki Project Med Foundation/WikiProject Medicine has generated is a good example of that. Such public debates have multiple benefits: they make Wikipedia less opaque, they explain to the public how Wikipedia content is generated and why relying on Wikipedia content may sometimes be a bad idea, they provide visibility for the quality improvement efforts that are underway, and demonstrate social responsibility.
* See the discussion of Wikipedia's maths articles by Alan Riskin of the Maths Department at Whittier College: http://wikipediocracy.com/2013/10/20/elementary-mathematics-on-wikipedia-2/
** http://www.dailydot.com/politics/croatian-wikipedia-fascist-takeover-controv...