Adrianne,
You might also be interested in posting this on wiki-research-l, https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
There are quite a lot of academic researchers reading that list, and I know at least one of them has done research into references & citations, so in addition to code you might also get some good feedback on your methods.
Regards, Morten
On 2 January 2013 22:04, Adrianne Wadewitz wadewitz@gmail.com wrote:
User:Sadads and I are currently working on an academic article related to the coverage of historical information on Wikipedia (see the outline of our methods below). We were thinking that some of the existing bots or perhaps people in the bot-writing community might be able to help us write some scripts that will pull the information we need off wiki. We are not script writers, but we think what we want to do is pretty easy. Please let us know if you can help us out! Thanks!
Adrianne (User:Wadewitz) and Alex (User:Sadads)
In this study, we analyze how Wikipedia articles approach historiography in two ways: quantitatively and qualitatively. In our quantitative approach, we followed these procedures:
First, we looked at the number of different sources each article cites. We determined this by running a script over the article that counted the number of discrete citations in the footnotes and works cited. Because many articles have a large number of sources but rely on a small number of them for much of their information, we also looked at how often each source is used and whether any one source is used disproportionately. While some reliable sources could legitimately be used in this way, we have found that heavy reliance on a single source is a marker of an article that presents only one historiographical viewpoint.
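A minimal sketch of what such a counting script might look like, assuming the article's raw wikitext is already available as a string and that citations are written with <ref> tags. The function names here are hypothetical, and the sketch treats every distinct ref name (or, for unnamed refs, every distinct footnote text) as one source, so short footnotes such as "Smith 2001, p. 4" and "Smith 2001, p. 10" would be counted separately and would need extra grouping.

```python
import re
from collections import Counter

# Matches <ref>...</ref>, <ref name="x">...</ref>, and the reuse form <ref name="x" />.
# The \b keeps the pattern from matching the <references /> tag.
REF_RE = re.compile(
    r'<ref\b(?P<attrs>[^>/]*)(?:/>|>(?P<body>.*?)</ref>)',
    re.DOTALL | re.IGNORECASE,
)
# Pulls the value out of name="x", name='x', or name=x.
NAME_RE = re.compile(
    r'name\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s"\'>/]+))',
    re.IGNORECASE,
)


def count_source_usage(wikitext):
    """Return a Counter mapping each source key to the number of times it is cited."""
    usage = Counter()
    for match in REF_RE.finditer(wikitext):
        name_match = NAME_RE.search(match.group("attrs") or "")
        if name_match:
            # Named ref: every reuse of the same name counts toward one source.
            key = next(g for g in name_match.groups() if g is not None).strip()
        else:
            # Unnamed ref: fall back to the whitespace-normalized footnote text.
            key = " ".join((match.group("body") or "").split())
        usage[key] += 1
    return usage


def top_source_share(usage):
    """Return (source key, fraction of all citations) for the most-cited source."""
    total = sum(usage.values())
    if total == 0:
        return None, 0.0
    key, count = usage.most_common(1)[0]
    return key, count / total


sample = (
    'Text.<ref name="smith">Smith 2001.</ref> More.<ref name="smith" /> '
    'And.<ref>Jones 1999.</ref><ref name=smith/>\n<references />'
)
usage = count_source_usage(sample)
print(len(usage), "distinct sources;", top_source_share(usage))
```

The number of keys in the counter approximates the number of discrete sources, and the share returned by top_source_share is one rough way to flag an article that leans disproportionately on a single source.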
Second, we were also interested in the types of sources used. Using a script to check each source's publication information and citation template, we analyzed the ratio of journal to book to newspaper to web sources. Moreover, because articles whose sources span a wide range of publication dates tend to represent historiography well, we also analyzed the sources' publication dates.
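As a rough illustration of the source-type and date check, the sketch below assumes the sources are formatted with the common citation templates ({{cite journal}}, {{cite book}}, {{cite news}}, {{cite web}}) and that the publication year appears in a year= or date= parameter. The function name is hypothetical, and the simple regular expression will miss citations written by hand or templates that contain other templates nested inside them.

```python
import re
from collections import Counter

# Captures the template type and its parameter text, up to the first closing braces.
CITE_RE = re.compile(
    r'\{\{\s*cite\s+(journal|book|news|web)\b(.*?)\}\}',
    re.IGNORECASE | re.DOTALL,
)
# Finds a four-digit year (1500-2099) inside a |year= or |date= parameter.
# Parameters such as |access-date= are skipped because the pipe must come first.
YEAR_RE = re.compile(
    r'\|\s*(?:year|date)\s*=\s*[^|}]*?\b(1[5-9]\d{2}|20\d{2})\b',
    re.IGNORECASE,
)


def summarize_sources(wikitext):
    """Return (Counter of template types, (earliest year, latest year) or None)."""
    types = Counter()
    years = []
    for match in CITE_RE.finditer(wikitext):
        types[match.group(1).lower()] += 1
        year_match = YEAR_RE.search(match.group(2))
        if year_match:
            years.append(int(year_match.group(1)))
    span = (min(years), max(years)) if years else None
    return types, span


sample = (
    "{{cite journal |last=Doe |title=An Essay |year=1975}}\n"
    "{{cite book |last=Smith |title=A History |year=1987}}\n"
    "{{Cite book |last=Brown |title=Memoirs |date=12 March 1901}}\n"
    "{{cite web |url=http://example.org/page |title=Example |access-date=2013-01-02}}\n"
)
types, span = summarize_sources(sample)
print(dict(types), span)
```

The counts per type give the journal : book : newspaper : web ratio, and the year span gives a first approximation of how wide a range of scholarship the article draws on.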
Third, we searched the articles for the following words, which we chose from a preliminary survey of 25 articles that we used as an initial sample. These words indicated that the articles approached history and historiography from an ambiguous or debatable position: “probably”, “possibly”, “on the other hand”, “one view”, “bias”, “perspectives”. We also searched for sections such as “Historiography”, “Modern view”, “Legacy”, and “Assessment”.
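The word and section search could be sketched along the following lines, assuming the article's wikitext is available as a string. The function names are hypothetical. Matching is case-insensitive and on whole words, so "bias" would not match "biased"; whether such variants should count is a choice the sketch does not try to settle.

```python
import re

HEDGE_TERMS = ["probably", "possibly", "on the other hand",
               "one view", "bias", "perspectives"]
SECTION_TITLES = ["Historiography", "Modern view", "Legacy", "Assessment"]

# Wikitext headings look like "== Title ==" with two to six equals signs on each side.
HEADING_RE = re.compile(r'^(={2,6})\s*(.+?)\s*\1\s*$', re.MULTILINE)


def count_terms(text):
    """Return a dict mapping each hedging term to its number of whole-word matches."""
    lowered = text.lower()
    return {
        term: len(re.findall(r'\b' + re.escape(term) + r'\b', lowered))
        for term in HEDGE_TERMS
    }


def find_sections(wikitext):
    """Return the headings in the article whose titles are on the watch list."""
    wanted = {title.lower() for title in SECTION_TITLES}
    return [
        match.group(2)
        for match in HEADING_RE.finditer(wikitext)
        if match.group(2).lower() in wanted
    ]


sample = (
    "== Background ==\n"
    "He was probably born in 1820; possibly earlier. "
    "On the other hand, one view holds otherwise.\n"
    "== Legacy ==\n"
    "Some bias remains.\n"
    "===Modern view===\n"
    "Text.\n"
)
print(count_terms(sample))
print(find_sections(sample))
```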
We chose to analyze 19th-century FA, GA, and B articles. The GA and FA articles have undergone a review process on Wikipedia and thus should be better. We excluded any B article that had been through a peer review on the site, as we wanted to contrast articles that had been through Wikipedia's content revision process with articles that had not. We wanted to see the “best” articles Wikipedia had to offer before and after comment by the community. We also chose this field because both of us have some familiarity with the time period but neither of us had worked extensively on the articles, so there was no conflict of interest. We also excluded any military history articles because of the significant difference in historiographic focus of the military history community. Additionally, the Wikipedia community has significantly more coverage of military history, both in number of articles and in level of commitment to that subtopic, with WikiProject Military History being one of the most active projects and having a different standard of topic coverage.
--
Dr. Adrianne Wadewitz
Mellon Digital Scholarship Fellow
Center for Digital Learning + Research
Occidental College
http://www.oxy.edu/center-digital-learning-research/about
https://sites.google.com/site/wadewitz/
Wikibots-l mailing list Wikibots-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikibots-l