Andrew -- Good question. I have an answer, though it's a few years old. If you like my method, I can bring the data up to date.

I used my exploratory parsing mechanism to look for [[File: ... ]] links to media files. I first ignored files with familiar suffixes like jpg, png, gif and pdf. That left lots of ogg and ogv files, which I separated out as videos. It also left a couple of oga files and some strange suffixes I didn't recognize, like djvu, shivg and ext. I ignored them.

In total I found 878 video files on 707 pages, 227 of which were flagged as "Articles containing video clips".

I also looked for {{cite video ... }} templates and found 9,716 of them.

I'm scraping this information from an enwiki.xml dump file downloaded Sep 22, 2010. It was 12,162,183,168 bytes uncompressed and contained 2,598,517 pages.

I'm attaching a text file with one line for each page on which I found at least one video. The tab-separated columns are: page-title, media-file, clips-flag.

I'd be happy to adjust my methods if there are other ways to mark up a video. I hope this is useful.

Best regards. -- Ward

On Jan 21, 2013, at 3:17 PM, Andrew Lih wrote:

Hi all,

I'm wondering if anyone has done any research into identifying which articles in Wikipedia have associated video?

There is this category, which only has 280 or so articles:

It seems far from complete. I'd appreciate any advice or pointers to previous work in this area.

The background: I'm working with some grad students on staging a Wiki Makes Video contest in April, and we'd like to do some measurement of the current state of video in Wikipedia.

Thanks, and email me if you'd like to know more about the video project for April.

-Andrew
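
For anyone who wants to try something similar, the suffix-based scan described in Ward's reply could be sketched roughly as below. This is not Ward's actual parsing mechanism, which the message does not show. The function names, the regular expressions, and the "yes" value used for the clips-flag column are assumptions made for illustration; only the suffix lists, the flag text, and the three-column tab-separated layout come from the message.

```python
import re

# Suffixes named in the reply: familiar non-video types are skipped,
# and ogg/ogv are treated as video (ogg can also be audio, so this
# mirrors the reply rather than a strict media-type check).
IGNORED_SUFFIXES = {"jpg", "png", "gif", "pdf"}
VIDEO_SUFFIXES = {"ogg", "ogv"}

# Hypothetical patterns; real wikitext has more variants (e.g. Image: links).
FILE_LINK = re.compile(r"\[\[File:([^\]|]+)", re.IGNORECASE)
CITE_VIDEO = re.compile(r"\{\{\s*cite video\b", re.IGNORECASE)
CLIPS_FLAG = "Articles containing video clips"


def find_videos(title, wikitext):
    """Return (page-title, media-file, clips-flag) rows for one page."""
    flag = "yes" if CLIPS_FLAG in wikitext else ""
    rows = []
    for match in FILE_LINK.finditer(wikitext):
        name = match.group(1).strip()
        suffix = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        if suffix in VIDEO_SUFFIXES:
            rows.append((title, name, flag))
    return rows


def count_cite_video(wikitext):
    """Count {{cite video ...}} templates in one page's wikitext."""
    return len(CITE_VIDEO.findall(wikitext))


def to_tsv(rows):
    """Format rows as tab-separated lines, one per video found."""
    return "\n".join("\t".join(row) for row in rows)
```

Feeding each page's title and text from the dump through find_videos and writing to_tsv of the results would give a file in the same three-column shape as the attachment.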
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l