Thanks Ward -- very useful! It would be interesting to run it again on a
recent dump and to find whether certain categories are getting better video
treatment, though the set
Fascinating that in about 2.5 years, the number of videos in that category
has not changed much.
By coincidence, I was looking at a 2009 blog post I wrote about Encarta and
Wikipedia's lack of video/multimedia.
"There is a loss to the world with the absence of Encarta’s historic images
[and video]. Because Wikipedia has a strict “free” edict on content,
especially images and multimedia, it will always be at a disadvantage in
having visuals that are unique and under copyright protection. For that,
the community will have to wait until copyright runs out on those
materials. Technology may be fast, but that’s one area that will be slow."
-Andrew
On Mon, Jan 21, 2013 at 6:04 PM, Ward Cunningham <ward(a)c2.com> wrote:
Andrew -- Good question. I have an answer. It's a few years old, but if
you like my method, I can bring the data up to date.
I used my exploratory parsing mechanism to look for [[File: ... ]] links
to media files. I first ignored files with familiar suffixes like jpg, png,
gif and pdf. This left lots of ogg and ogv files which I separated out as
videos. This left a couple of oga files and some strange suffixes I didn't
recognize like djvu, shivg and ext. I ignored them.
In total I found 878 video files on 707 pages, 227 of which were flagged
as "Articles containing video clips".
I also looked for {{cite video ... }} templates and found 9,716 of them.
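For anyone wanting to try something similar, the suffix filtering and template counting described above might be sketched roughly as follows. This is not the exploratory parser itself, just a minimal regex-based approximation. The function names and the sample wikitext are hypothetical, and treating every ogg file as video follows the description above even though ogg files can also be audio only.

```python
import re

# Suffix groups taken from the description above; anything else is "other".
IGNORED_SUFFIXES = {"jpg", "png", "gif", "pdf"}
VIDEO_SUFFIXES = {"ogg", "ogv"}

# Capture the file name in [[File:Name.ext|...]], stopping at "|" or "]".
FILE_LINK = re.compile(r"\[\[File:([^\]|]+)", re.IGNORECASE)
# Match the opening of a {{cite video ...}} template.
CITE_VIDEO = re.compile(r"\{\{\s*cite video\b", re.IGNORECASE)


def classify(file_name):
    """Return 'ignored', 'video', or 'other' based on the file suffix."""
    name = file_name.strip()
    suffix = name.rsplit(".", 1)[1].lower() if "." in name else ""
    if suffix in IGNORED_SUFFIXES:
        return "ignored"
    if suffix in VIDEO_SUFFIXES:
        return "video"
    return "other"


def video_files(wikitext):
    """List the file names in [[File:...]] links that look like videos."""
    names = (m.strip() for m in FILE_LINK.findall(wikitext))
    return [n for n in names if classify(n) == "video"]


def count_cite_video(wikitext):
    """Count {{cite video ...}} template openings in the page text."""
    return len(CITE_VIDEO.findall(wikitext))


# Hypothetical page text, for illustration only.
sample = (
    "[[File:Foo.jpg|thumb]] [[File:Bar.ogv|thumb|A clip]] "
    "{{cite video |title=X}} [[file:Baz.OGG]] [[File:Scan.djvu]]"
)
print(video_files(sample))
print(count_cite_video(sample))
```

A regex pass like this will miss videos embedded through templates or galleries, so it should be read as a lower bound rather than a full count.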
I'm scraping this information from an enwiki.xml dump file downloaded Sep
22, 2010. It was 12,162,183,168 bytes uncompressed and contained 2,598,517
pages.
I'm attaching a text file with one line for each page on which I found (at
least) one video. The tab-separated columns are: page-title, media-file,
clips-flag.
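As a rough illustration of that three-column layout, a writer for the tab-separated file could look like the sketch below. The function name, the sample row, and the "1"/"0" encoding of clips-flag are assumptions; the actual attachment may encode the flag differently.

```python
import csv
import io


def write_rows(rows, out):
    """Write (page_title, media_file, flagged) tuples as tab-separated lines."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for page_title, media_file, flagged in rows:
        # Assumed encoding: "1" if the page is in the video clips category.
        writer.writerow([page_title, media_file, "1" if flagged else "0"])


# Hypothetical rows, for illustration only.
buf = io.StringIO()
write_rows([("Moon", "Moon.ogv", True), ("Tide", "Tide.ogg", False)], buf)
print(buf.getvalue(), end="")
```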
I'd be happy to adjust my methods if there are other ways to mark up a
video. I hope this is useful.
Best regards. -- Ward
On Jan 21, 2013, at 3:17 PM, Andrew Lih wrote:
Hi all,
I'm wondering if anyone has done any research into identifying which
articles in Wikipedia have associated video?
There is this category, which only has 280 or so articles:
http://en.wikipedia.org/wiki/Category:Articles_containing_video_clips
It seems far from complete. Appreciate any advice or previous work in this
area.
The background: I'm working with some grad students on staging a Wiki
Makes Video contest in April, and we'd like to do some measurement of the
current state of video in Wikipedia.
Thanks, and email me if you'd like to know more about the video project
for April.
-Andrew
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l