I'm wondering what people have found to be the best practices for identifying video in Wikipedia articles.
A number of issues:
- One of the problems is that OGG is a container format (it can hold audio or video), so simply parsing article Wikimarkup may not be sufficient to identify video content.
- You can go by category, but this is not always fully accurate.
- Are GIFs that are animated considered video? Some are, and some aren't.
Interested in hearing what people think, or whether we have a taxonomy of video types that are well defined.
-Andrew
On Fri, Dec 5, 2014 at 10:24 AM, Andrew Lih andrew@andrewlih.com wrote:
Are GIFs that are animated considered video? Some are, and some aren't.
I would argue that 'motion image' should be the criterion from a viewer/reader perspective, which is the perspective a library should be serving, so I would say that they should be included.
Wikivideo-l mailing list Wikivideo-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikivideo-l
When structured data on Commons is live, this will be quite easy (allowing time for media to be tagged also).
If you are looking for a solution that works now I do not have any better ideas.
*Kind regards, Jan Ainali*
Executive Director, Wikimedia Sverige http://wikimedia.se 0729 - 67 29 48
*Imagine a world in which every single human being has free access to the sum of all human knowledge. That is what we do.* Become a member. http://blimedlem.wikimedia.se
Are you looking for a list of articles with videos? We can probably do that now with a DB query. There may be a small number of false negatives on the Ogg front, but probably 98% of them can be identified from the database. GIFs present a complicating factor, but that is probably still doable.
--bawolff
Brian, thanks yes that would be what I'd be looking for.
In fact, a monthly report on a regular basis would be really interesting to see.
I've worked in the past with Ward Cunningham on his fast parser to get some initial data for last year, but it'd be great to get an update.
FYI, here were some of our findings then:
From a January 2013 dump of the English Wikipedia database, we were able to identify 4,061 instances of video files embedded in Wikipedia articles.
Count  Video file
169    Articleevolution.ogg
21     Verifiability and Neutral point of view (Common Craft)-600px-en.ogv
13     Commercial-LBJ1964ElectionAdDaisyGirl.ogv
9      SeaSnails.ogg
8      Machinima sample reindeer full size.ogg
8      1946-10-08 21 Nazi Chiefs Guilty.ogv
6      Wikipedia video tutorial-1-Editing-en.ogv
6      The Impact Of Wikipedia.webm
6      LightningCNP.ogg
6      Camouflage (1944).ogv
The number of unique videos used in articles was 3,100, after removing duplicate uses of the same video in multiple articles. Overall, the number of videos used in English Wikipedia articles is fairly low, at a rate of roughly 0.1% when compared to the 4.2 million articles in January 2013.
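As a quick check of the rate quoted above, the arithmetic can be reproduced directly. This is only a sketch using the figures from this message; the variable and function names are illustrative and not part of any tool mentioned in the thread.

```python
# Figures quoted in the message above (January 2013 English Wikipedia dump).
video_embeds = 4_061      # instances of video files embedded in articles
unique_videos = 3_100     # distinct video files after removing repeat uses
articles = 4_200_000      # approximate article count in January 2013


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


embed_rate = percent(video_embeds, articles)
unique_rate = percent(unique_videos, articles)

print(f"embeds relative to articles:        {embed_rate:.3f}%")
print(f"unique videos relative to articles: {unique_rate:.3f}%")
```

Either way of counting rounds to about 0.1% at one decimal place.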
-Andrew
-Andrew Lih Associate professor of journalism, American University Email: andrew@andrewlih.com WEB: http://www.andrewlih.com BOOK: The Wikipedia Revolution: http://www.wikipediarevolution.com PROJECT: Wiki Makes Video http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wiki_Makes_Video
Andrew,
Thanks for these helpful stats!
We’re painfully aware that video remains under-utilised on our sites.
Until we can devote more resources to make it easier to transcode, upload and share video, it may be helpful to start discussions with both video contributors and article editors, to better understand what would be needed for video to become more frequently integrated in articles. This could inform future plans to allocate more resources to this important content type.
Are any conversations taking place along those lines? If not, would anyone like to spearhead a public discussion in coming months?
Fabrice
_______________________________
Fabrice Florin Product Manager, Multimedia Wikimedia Foundation
On 12/5/14, Andrew Lih andrew@andrewlih.com wrote:
Brian, thanks yes that would be what I'd be looking for.
In fact, a monthly report on a regular basis would be really interesting to see.
Alright, here is my first attempt:
http://tools.wmflabs.org/bawolff/usedVideos.htm (Data formatted as tsv if anyone wants to do further processing: http://tools.wmflabs.org/bawolff/usedVideos.txt )
It gives a mostly alphabetical list of articles with videos on them. A video is defined as one of the following:
* A WebM file.
* An Ogg file registered as video in the database. This roughly means that it has the string "theora" somewhere in the first 256 bytes of the file, not counting the string "ffmpeg2theora". There are caveats: some older files might still be counted because of "ffmpeg2theora"; there's no guarantee that an Ogg Theora file has a Theora data packet in the first 255 bytes; and it's also very possible for non-Theora files to have that string in the header. Consider this a "rough" metric. In practice I think it works most of the time, but do your own checking before using it for anything serious.
* An animated GIF file that is at least 10 seconds long. I figured this very roughly separates non-video-like GIFs from video-ish GIFs.
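The Ogg and GIF rules described above can be sketched roughly as follows. This is an illustration of the description in this message only, not the code MediaWiki actually runs; the function names, the `window` parameter (the message mentions both 256 and 255 bytes) and the synthetic byte strings are assumptions made for the example.

```python
GIF_MIN_SECONDS = 10.0  # threshold from the message: "at least 10 seconds long"


def looks_like_theora(data: bytes, window: int = 256) -> bool:
    """Rough check: is "theora" present in the first `window` bytes,
    ignoring occurrences that are part of "ffmpeg2theora"?"""
    head = data[:window]
    # Replace the encoder name with a NUL byte so it cannot count as a marker
    # and cannot accidentally join neighbouring bytes into "theora".
    head = head.replace(b"ffmpeg2theora", b"\x00")
    return b"theora" in head


def gif_counts_as_video(duration_seconds: float) -> bool:
    """Treat an animated GIF as video-like if it runs at least 10 seconds."""
    return duration_seconds >= GIF_MIN_SECONDS


# Synthetic headers (not real files), just to exercise the rule.
theora_like = b"OggS" + bytes(24) + b"\x80theora" + bytes(40)
encoder_only = b"OggS" + bytes(24) + b"\x01vorbis" + bytes(20) + b"ffmpeg2theora-0.29"
late_marker = bytes(300) + b"theora"  # marker falls outside the window

for name, blob in [("theora_like", theora_like),
                   ("encoder_only", encoder_only),
                   ("late_marker", late_marker)]:
    print(name, looks_like_theora(blob))
```

The `late_marker` case shows the false-negative risk mentioned above: a file whose marker falls outside the inspected window is not recognised.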
Based on that metric, there are currently 8464 articles on enwikipedia that have videos on them (6442 if you exclude the GIF files longer than 10 seconds).
Before setting this up to update itself, is this the sort of thing you are looking for? Would it be more useful with different definitions of a "video", or instead of listing it as an alphabetical list of articles, orient it around which video is used the most places? Or would some other ordering be best?
I guess I'm asking, what questions about videos are you actually looking to answer, and how could this type of report be modified to better answer them?
--bawolff
P.S. For those interested in this sort of thing, the SQL query I used was:

select
  page_title,
  GROUP_CONCAT( i2.img_name separator ', ' ) as "commons videos",
  GROUP_CONCAT( i1.img_name separator ', ' ) as "enwiki videos",
  GROUP_CONCAT( i3.img_name separator ', ' ) as "commons long gifs",
  GROUP_CONCAT( i4.img_name separator ', ' ) as "enwiki long gifs"
from page
inner join imagelinks on il_from = page_id
-- i1: video files hosted locally on enwiki
left join image i1
  on il_to = i1.img_name
  and i1.img_media_type = 'VIDEO'
-- i2: video files hosted on Commons
left join commonswiki_p.image i2
  on il_to = i2.img_name
  and i2.img_media_type = 'VIDEO'
-- i3: Commons GIFs whose duration metadata starts with two or more digits
left join commonswiki_p.image i3
  on il_to = i3.img_name
  and i3.img_media_type = 'BITMAP'
  and i3.img_major_mime = 'image'
  and i3.img_minor_mime = 'gif'
  and i3.img_metadata regexp '"duration";d:\d{2,}'
-- i4: the same GIF test for files hosted locally on enwiki
left join image i4
  on il_to = i4.img_name
  and i4.img_media_type = 'BITMAP'
  and i4.img_major_mime = 'image'
  and i4.img_minor_mime = 'gif'
  and i4.img_metadata regexp '"duration";d:\d{2,}'
where page_namespace = 0  -- article namespace only
  and (i1.img_name is not null
    or i2.img_name is not null
    or i3.img_name is not null
    or i4.img_name is not null)
group by page_title;
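To see what the `"duration";d:\d{2,}` pattern in the query above selects, here is a small sketch using Python's `re` module. The metadata strings are made-up samples modelled on PHP-serialized data, not real rows from the `image` table, and Python's regex engine is only a stand-in: how the database's own `regexp` operator treats `\d` and backslash escaping may differ.

```python
import re

# Same pattern text as in the query: the duration value must begin with
# two or more digits, i.e. its whole-seconds part is 10 or greater.
LONG_GIF = re.compile(r'"duration";d:\d{2,}')


def is_long_gif(metadata: str) -> bool:
    """Return True if the serialized metadata matches the long-GIF pattern."""
    return LONG_GIF.search(metadata) is not None


# Hypothetical metadata fragments for illustration only.
samples = {
    "short loop": 'a:2:{s:10:"frameCount";i:4;s:8:"duration";d:0.8;}',
    "nine seconds": 'a:2:{s:10:"frameCount";i:90;s:8:"duration";d:9.99;}',
    "ten seconds": 'a:2:{s:10:"frameCount";i:100;s:8:"duration";d:10;}',
    "long animation": 'a:2:{s:10:"frameCount";i:53;s:8:"duration";d:53.2;}',
    "no duration": 'a:1:{s:10:"frameCount";i:1;}',
}

for label, metadata in samples.items():
    print(f"{label:15s} {is_long_gif(metadata)}")
```

Because the test is textual, it depends entirely on how the duration happens to be serialized, which is one reason to treat the GIF filter as approximate.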
Brian, thanks much for running this. I'll spend some time in the next day to run some metrics to see how it compares with our Jan 2013 results.
In general, this is what I'm looking for and I'll post some interesting stats when I process this.
-Andrew
Brian, there were some interesting results in the data you filtered from the database. The good news is that it syncs quite well with the data we had from January 2013, in terms of ogg, ogv and webm. A few notes:
1. These are the most popular Commons videos in en.wp. Pretty much the same as January 2013 except for #2, where someone really wanted to embed that Reagan Speech in a lot of places.

Video file                                          Count
Commercial-LBJ1964ElectionAdDaisyGirl.ogv           13
Reagan Speech Beirut Bombing.ogv                    12
Machinima sample reindeer full size.ogg             9
1946-10-08 21 Nazi Chiefs Guilty.ogv                9
SeaSnails.ogg                                       8
Shakinghands high.OGG                               7
The Impact Of Wikipedia.webm                        6
CollateralMurder.ogv                                6
1946-07-15 Philippines Independence Proclaimed.ogv  6

2. These are the most popular long GIFs on Commons, used in en.wp:

GIF file                                 Count
EC-EU-enlargement animation.gif          53
Linguistic map Southwestern Europe.gif   18
Canada provinces evolution 2.gif         12
Pangea animation 03.gif                  11
Mohammad adil-Rashidun empire-slide.gif  10

3. We may have to tweak the GIF filter. For some reason, it picked up some odd results like classifying these LOCAL en.wp Mexico-related stub GIF icons as video. The metadata page does not suggest they should be seen as long animations. The files are, from the table listing:

GIF file(s)                  Count
Mx-actor.gif                 275
Mx-singer.gif                49
Mx-actor.gif, Mx-singer.gif  43
https://en.wikipedia.org/wiki/File:Mx-actor.gif
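For anyone who wants to reproduce per-file counts like the ones above from the TSV export, here is a minimal sketch. It assumes the export has one row per article with columns in the same order as the query's select list (page title, Commons videos, enwiki videos, Commons long GIFs, enwiki long GIFs), that empty columns appear as `NULL` or as an empty string, and that file names within a column are separated by `", "`. The sample rows and file names are hypothetical.

```python
from collections import Counter


def file_usage(tsv_text: str) -> Counter:
    """Count, for each file name, how many articles use it."""
    usage = Counter()
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        _title, *cells = line.split("\t")
        files = set()  # a file is counted once per article
        for cell in cells:
            if cell in ("", "NULL"):
                continue
            # Each cell is a list of file names joined by ", ".
            files.update(name for name in cell.split(", ") if name)
        usage.update(files)
    return usage


# Hypothetical rows in the assumed column layout.
sample = (
    "Article_A\tClip_one.ogv, Clip_two.webm\tNULL\tNULL\tNULL\n"
    "Article_B\tClip_one.ogv\tNULL\tMap_animation.gif\tNULL\n"
    "Article_C\tNULL\tNULL\tMap_animation.gif\tNULL\n"
)

usage = file_usage(sample)
for name, count in sorted(usage.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{name}\t{count}")
```

Splitting each cell into individual names means a combined cell such as "Mx-actor.gif, Mx-singer.gif" adds to each file's own count rather than being tallied as a separate entry.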
-Andrew
I removed some of the transclusions of the Reagan video.
--Tom
Some more fun with stats and video. Here's a list of articles with the most "videos."
Number one is a catalog of Ronald Reagan speeches which explains its top rank, but number two is almost entirely animated GIFs showing how to do visual algorithms from China's Ming dynasty.
-Andrew
Article                                         Videos
Speeches_and_debates_of_Ronald_Reagan           20
Rod_calculus                                    18
Meter_(music)                                   15
Time_signature                                  15
Ebola_virus_disease                             14
Prevention_of_viral_hemorrhagic_fever           14
Atacama_Large_Millimeter_Array                  13
Solar_cycle_24                                  12
Apollo_15                                       11
Hasta_Vinyasas                                  11
Multi-touch                                     11
Suez_Crisis                                     11
Les_Vampires                                    10
Notes_inégales                                  10
Private_Snafu                                   10
Winsor_McCay                                    10
Principles_of_Hindu_Reckoning                   9
Behind_the_Screen                               8
Biology_of_Diptera                              8
China:_The_Roots_of_Madness                     8
Colpoda                                         8
European_Extremely_Large_Telescope              8
Festa_del_Santissimo_Salvatore_a_Pazzano        8
First_Motion_Picture_Unit                       8
Glossary_of_ballet                              8
La_Silla_Observatory                            8
Solar_flare                                     8
Why_We_Fight                                    8
Dwight_Buycks                                   7
History_of_the_Delft_University_of_Technology   7
Luke_Harangody                                  7
Rede_Tupi                                       7
Rep-tile                                        7
Shire_Hall,_Monmouth                            7
STS-131                                         7
Yevgeni_Bauer                                   7
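A ranking like the one above could be produced from the same TSV export by counting files per article instead of articles per file. The sketch below is only an illustration under the same assumptions about the export (title first, then tab-separated columns of `", "`-joined file names, with `NULL` or an empty string for no match); the function names and sample rows are hypothetical.

```python
def files_per_article(tsv_text: str) -> dict:
    """Map each article title to the number of distinct files listed for it."""
    counts = {}
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        title, *cells = line.split("\t")
        files = set()
        for cell in cells:
            if cell in ("", "NULL"):
                continue
            files.update(name for name in cell.split(", ") if name)
        counts[title] = len(files)
    return counts


def top_articles(counts: dict, n: int) -> list:
    """Return the n articles with the most files, ties broken by title."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical rows in the assumed column layout.
rows = (
    "Article_A\tClip_one.ogv, Clip_two.webm\tNULL\tNULL\tNULL\n"
    "Article_B\tClip_one.ogv\tNULL\tMap_animation.gif\tNULL\n"
    "Article_C\tNULL\tNULL\tMap_animation.gif\tNULL\n"
)

per_article = files_per_article(rows)
for title, count in top_articles(per_article, 3):
    print(f"{title}\t{count}")
```

Note that this counts long GIFs together with videos, which matches the quotation marks around "videos" in the message above.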
-Andrew Lih Associate professor of journalism, American University Email: andrew@andrewlih.com WEB: http://www.andrewlih.com BOOK: The Wikipedia Revolution: http://www.wikipediarevolution.com PROJECT: Wiki Makes Video http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wiki_Makes_Video
On Wed, Dec 10, 2014 at 6:16 PM, Tom Fish guerillero.wikipedia@gmail.com wrote:
I removed some of the translusions of the Regan video.
--Tom
On Wed, Dec 10, 2014 at 4:11 PM, Andrew Lih andrew@andrewlih.com wrote:
Brian, there were some interesting results in the data you filtered from
the
database. The good news is that it syncs quite well with the data we had from January 2013, in terms of ogg, ogv and webm. A few notes:
- These are the most popular Commons videos in en.wp. Pretty much the
same
as January 2013 except for #2, where someone really wanted to embed that Reagan Speech in a lot of places.
Commercial-LBJ1964ElectionAdDaisyGirl.ogv 13 Reagan Speech Beirut Bombing.ogv 12 Machinima sample reindeer full size.ogg 9 1946-10-08 21 Nazi Chiefs Guilty.ogv 9 SeaSnails.ogg 8 Shakinghands high.OGG 7 The Impact Of Wikipedia.webm 6 CollateralMurder.ogv 6 1946-07-15 Philippines Independence Proclaimed.ogv 6
- These are the most popular long GIFs on Commons, used in en.wp:
EC-EU-enlargement animation.gif 53 Linguistic map Southwestern Europe.gif 18 Canada provinces evolution 2.gif 12 Pangea animation 03.gif 11 Mohammad adil-Rashidun empire-slide.gif 10
- We may have to tweak the GIF filter. For some reason, it picked up
some
odd results like classifying these LOCAL en.wp Mexico-related stub GIF
icons
as video. The metadata page does not suggest they should be seen as long animations. The files are, from the table listing:
Mx-actor.gif 275 Mx-singer.gif 49 Mx-actor.gif, Mx-singer.gif 43
https://en.wikipedia.org/wiki/File:Mx-actor.gif
-Andrew
-Andrew Lih Associate professor of journalism, American University Email: andrew@andrewlih.com WEB: http://www.andrewlih.com BOOK: The Wikipedia Revolution: http://www.wikipediarevolution.com PROJECT: Wiki Makes Video http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wiki_Makes_Video
On Tue, Dec 9, 2014 at 3:12 PM, Andrew Lih andrew@andrewlih.com wrote:
Brian, thanks much for running this. I'll spend some time in the next
day
to run some metrics to see how it compares with our Jan 2013 results.
In general, this is what I'm looking for and I'll post some interesting stats when I process this.
-Andrew
On Sun, Dec 7, 2014 at 12:22 AM, Brian Wolff bawolff@gmail.com wrote:
On 12/5/14, Andrew Lih andrew@andrewlih.com wrote:
Brian, thanks yes that would be what I'd be looking for.
In fact, a monthly report on a regular basis would be really interesting to see.
Alright, here is my first attempt:
http://tools.wmflabs.org/bawolff/usedVideos.htm (Data formatted as tsv if anyone wants to do further processing: http://tools.wmflabs.org/bawolff/usedVideos.txt )
It gives a mostly alphabetical list of articles with videos on them. A video is defined as follows:
* A webm file
* An ogg file registered as video in the database. (This roughly means that it has the string "theora" somewhere in the first 256 bytes of the file, not counting the string "ffmpeg2theora", except that some older files might still count the ffmpeg2theora. Also, there's no guarantee that an ogg theora file has a theora data packet in the first 255 bytes, and it's also very possible for non-theora files to have that string in the header. Consider this a "rough" metric. In practice I think it works most of the time, but do your own checking before using it for anything serious.)
* An animated gif file that is at least 10 seconds long. I figured this very roughly separates non-video-esque gifs from video-ish gifs.
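The ogg rule described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual database or MediaWiki code; the function name `looks_like_theora` and the way "ffmpeg2theora" is masked out are assumptions made for the example.

```python
def looks_like_theora(header: bytes, window: int = 256) -> bool:
    """Rough heuristic: True if "theora" appears in the first `window`
    bytes, ignoring occurrences that belong to "ffmpeg2theora".

    Hypothetical helper for illustration only.
    """
    chunk = header[:window]
    # Overwrite the encoder name with NUL bytes of the same length so it
    # cannot satisfy the search and no new match is created by joining
    # the bytes on either side of it.
    encoder = b"ffmpeg2theora"
    chunk = chunk.replace(encoder, b"\x00" * len(encoder))
    return b"theora" in chunk
```

As the message says, this is only a rough metric: a Theora stream whose marker falls outside the window is missed, and any file that happens to carry the string early on is counted.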
Based on that metric, there are currently 8464 articles on enwikipedia that have videos on them (6442 if you leave out the GIF files of 10 seconds or longer).
Before setting this up to update itself, is this the sort of thing you are looking for? Would it be more useful with different definitions of a "video", or instead of listing it as an alphabetical list of articles, orient it around which video is used the most places? Or would some other ordering be best?
I guess I'm asking, what questions about videos are you actually looking to answer, and how could this type of report be modified to better answer them?
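For anyone who wants to reorient the TSV around "which file is used in the most places", a minimal Python sketch could look like this. It assumes the columns follow the order of the select list (page title first, then the four comma-separated file columns) and that empty cells appear as an empty string or the literal NULL; neither assumption has been checked against the published file, and `count_file_usage` is a hypothetical name.

```python
import csv
import io
from collections import Counter


def count_file_usage(tsv_text: str, column: int) -> Counter:
    """Count in how many rows (articles) each file name appears in one
    column of the report.

    Assumes file names in a cell are joined with ", " and that an empty
    cell is "" or "NULL" (both are assumptions about the export).
    """
    counts = Counter()
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        if column >= len(row):
            continue  # short or malformed row
        cell = row[column]
        if not cell or cell == "NULL":
            continue  # no file of this kind on the page
        for name in cell.split(", "):
            counts[name] += 1
    return counts


# Hypothetical two-row sample, not real report data.
sample = "Page_A\tFoo.ogv, Bar.webm\tNULL\nPage_B\tFoo.ogv\tNULL\n"
usage = count_file_usage(sample, column=1)
```

Splitting on ", " will mis-split a file name that itself contains a comma followed by a space, so treat the counts as approximate.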
--bawolff
p.s. For those interested in this sort of thing, the sql query I used was:
select page_title,
       GROUP_CONCAT( i2.img_name separator ', ' ) as "commons videos",
       GROUP_CONCAT( i1.img_name separator ', ' ) as "enwiki videos",
       GROUP_CONCAT( i3.img_name separator ', ' ) as "commons long gifs",
       GROUP_CONCAT( i4.img_name separator ', ' ) as "enwiki long gifs"
from page
inner join imagelinks on il_from = page_id
-- i1 / i2: files registered as VIDEO locally / on Commons
left join image i1
       on il_to = i1.img_name and i1.img_media_type = 'VIDEO'
left join commonswiki_p.image i2
       on il_to = i2.img_name and i2.img_media_type = 'VIDEO'
-- i3 / i4: GIFs on Commons / locally whose stored duration has at least two integer digits
left join commonswiki_p.image i3
       on il_to = i3.img_name and i3.img_media_type = 'BITMAP'
      and i3.img_major_mime = 'image' and i3.img_minor_mime = 'gif'
      and i3.img_metadata regexp '"duration";d:\d{2,}'
left join image i4
       on il_to = i4.img_name and i4.img_media_type = 'BITMAP'
      and i4.img_major_mime = 'image' and i4.img_minor_mime = 'gif'
      and i4.img_metadata regexp '"duration";d:\d{2,}'
-- articles only, and only pages where at least one join matched
where page_namespace = 0
  and (i1.img_name is not null or i2.img_name is not null
       or i3.img_name is not null or i4.img_name is not null)
group by page_title;
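The "at least 10 seconds" test in the query rests on the pattern `"duration";d:\d{2,}`: it matches when the stored duration value starts with two or more digits. A small Python sketch of the same pattern is shown below; the sample metadata fragments are hypothetical, and Python's `re` module may not treat escapes exactly as the database's regex engine does.

```python
import re

# Same pattern text as the regexp clauses in the query.
LONG_DURATION = re.compile(r'"duration";d:\d{2,}')


def has_long_duration(serialized_metadata: str) -> bool:
    """True if the duration value begins with at least two digits,
    i.e. is 10 or more when written as a plain decimal number."""
    return LONG_DURATION.search(serialized_metadata) is not None


# Hypothetical fragments of serialized metadata, for illustration only.
examples = {
    's:8:"duration";d:12.5;': True,   # 12.5 s: two leading digits
    's:8:"duration";d:10;': True,     # exactly 10 s
    's:8:"duration";d:9.96;': False,  # one leading digit
    's:8:"duration";d:0.5;': False,
}
results = {text: has_long_duration(text) for text in examples}
```

Because the check is purely textual, it says nothing about how many frames the GIF has.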
Wikivideo-l mailing list Wikivideo-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikivideo-l
Thanks for sharing, Andrew. This is fascinating. I imagine the CDC may have produced those clips with Wikipedia in mind?
Is there a similar list of articles that the community has voted *should* have video?
Perhaps film, video and motion image topics themselves?
On Fri, Dec 12, 2014 at 12:53 PM, Andrew Lih andrew@andrewlih.com wrote:
Some more fun with stats and video. Here's a list of articles with the most "videos."
Number one is a catalog of Ronald Reagan speeches which explains its top rank, but number two is almost entirely animated GIFs showing how to do visual algorithms from China's Ming dynasty.
-Andrew
Speeches_and_debates_of_Ronald_Reagan 20
Rod_calculus 18
Meter_(music) 15
Time_signature 15
Ebola_virus_disease 14
Prevention_of_viral_hemorrhagic_fever 14
Atacama_Large_Millimeter_Array 13
Solar_cycle_24 12
Apollo_15 11
Hasta_Vinyasas 11
Multi-touch 11
Suez_Crisis 11
Les_Vampires 10
Notes_inégales 10
Private_Snafu 10
Winsor_McCay 10
Principles_of_Hindu_Reckoning 9
Behind_the_Screen 8
Biology_of_Diptera 8
China:_The_Roots_of_Madness 8
Colpoda 8
European_Extremely_Large_Telescope 8
Festa_del_Santissimo_Salvatore_a_Pazzano 8
First_Motion_Picture_Unit 8
Glossary_of_ballet 8
La_Silla_Observatory 8
Solar_flare 8
Why_We_Fight 8
Dwight_Buycks 7
History_of_the_Delft_University_of_Technology 7
Luke_Harangody 7
Rede_Tupi 7
Rep-tile 7
Shire_Hall,_Monmouth 7
STS-131 7
Yevgeni_Bauer 7
On Wed, Dec 10, 2014 at 6:16 PM, Tom Fish guerillero.wikipedia@gmail.com wrote:
I removed some of the transclusions of the Reagan video.
--Tom
On 12/10/14, Andrew Lih andrew@andrewlih.com wrote:
- We may have to tweak the GIF filter. For some reason, it picked up some odd results like classifying these LOCAL en.wp Mexico-related stub GIF icons as video. The metadata page does not suggest they should be seen as long animations. The files are, from the table listing:
Mx-actor.gif 275
Mx-singer.gif 49
Mx-actor.gif, Mx-singer.gif 43
https://en.wikipedia.org/wiki/File:Mx-actor.gif
-Andrew
According to the metadata, Mx-actor.gif is an animated gif consisting of 1 frame that's shown for 10 seconds... which is odd. I've now excluded all animated GIFs that are only a single frame long.
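A filter that combines the 10-second rule with the single-frame exclusion could be sketched in Python as below. This is not the query actually used for the report; it assumes the serialized metadata carries a `frameCount` integer next to `duration`, which is an assumption about the stored layout, and the function name and sample strings are hypothetical.

```python
import re

# Assumed field layout of the serialized GIF metadata (not verified here).
DURATION = re.compile(r'"duration";d:(\d+(?:\.\d+)?)')
FRAME_COUNT = re.compile(r'"frameCount";i:(\d+)')


def is_video_like_gif(metadata: str, min_seconds: float = 10.0) -> bool:
    """True only for GIFs with more than one frame that run for at
    least `min_seconds`. Missing fields count as "not video-like"."""
    duration = DURATION.search(metadata)
    frames = FRAME_COUNT.search(metadata)
    if duration is None or frames is None:
        return False
    return float(duration.group(1)) >= min_seconds and int(frames.group(1)) > 1


# Hypothetical samples: a one-frame, 10-second icon and a real animation.
icon = 's:10:"frameCount";i:1;s:8:"duration";d:10;'
animation = 's:10:"frameCount";i:25;s:8:"duration";d:12.5;'
```

With these samples, the one-frame icon is rejected even though its duration reaches the threshold, while the 25-frame animation passes.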
This report should automatically update once a week on Tuesdays at roughly 7am UTC.
One thing I should note about that report is that the columns will get cut off if they exceed 4096 characters.
I also created a second report for videos on commons that are used on any wiki in any namespace. It's at https://tools.wmflabs.org/bawolff/usedVideosCommons.htm (The query for this report is actually a lot more efficient than the query of the other one. This suggests that if performance ever became an issue, the other query could probably be optimized, but I don't see it being an issue.) That report is updated every Wednesday at about 7am.
Cheers, --bawolff
p.s. The Reagan videos being everywhere is amusing.
Hi all,
Interesting statistics. Would it be possible to generate these for other language versions as well? Wikimedia Netherlands held its New Year's gathering at the Netherlands Institute for Sound and Vision, a great opportunity to generate buzz around video on Wikipedia. Some Wikipedians indicated that they would be interested in picking up the challenge of adding more video on Wikipedia. So I am setting up a project page (much like the English WP:video https://en.wikipedia.org/wiki/Wikipedia:Videos and WP:Wiki makes video https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wiki_Makes_Video) on which we can share knowledge about the production, transcoding and use of video. I would like to generate a list of Commons categories containing video, but the search function doesn't allow me to make such a list. Does anyone have a suggestion how to proceed?
Cheers, Jesse
On 1/22/15, Jesse de Vos jdvos@beeldengeluid.nl wrote:
Hi all,
Interesting statistics, would it be possible to generate these for other language versions as well?
Which language are you interested in? Here it is for nl: https://tools.wmflabs.org/bawolff/usedVideosNl.htm
Wikimedia Netherlands held its New Year's gathering at the Netherlands Institute for Sound and Vision, a great opportunity to generate buzz around video on Wikipedia. Some Wikipedians indicated that they would be interested in picking up the challenge of adding more video on Wikipedia. So I am setting up a project page (much like the English WP:video https://en.wikipedia.org/wiki/Wikipedia:Videos and WP:Wiki makes video https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wiki_Makes_Video) on which we can share knowledge about the production, transcoding and use of video. I would like to generate a list of Commons categories containing video, but the search function doesn't allow me to make such a list. Does anyone have a suggestion how to proceed?
This is actually a surprisingly fast sql query (under 1 minute): https://tools.wmflabs.org/bawolff/categoryVid.htm
Let me know if those are what you need.
Cheers, Bawolff
Interesting to see an Italian video is the #1 video used across the projects, as it's included in the welcome template to new/anonymous editors on Italian Wikipedia: https://commons.wikimedia.org/wiki/File:Wikipedia_ridotto.ogv
Similarly, #2 is an Arabic introduction video used in an ar.wp welcome message: https://commons.wikimedia.org/wiki/File:Arabic_Wikipedia_Introduction.ogg
And... the introduction to Tamil Wikipedia is #3: https://commons.wikimedia.org/wiki/File:Bala_talks_about_Tamil_Wikipedia.ogv
However, I do wonder how often these videos are actually played, given how OGV and WebM are not widely supported by default.
Full list: https://tools.wmflabs.org/bawolff/usedVideosCommons.htm
-Andrew
Andrew Lih, 10/12/2014 22:11:
- These are the most popular long GIFs on Commons, used in en.wp:
EC-EU-enlargement animation.gif 53
Linguistic map Southwestern Europe.gif 18
Cute, language issues win the laurel of complexity requiring animation?
Andrew Lih, 22/01/2015 20:06:
However, I do wonder how often these videos are actually played, given how OGV and WebM are not widely supported by default.
We'd certainly love to know that as well (although the video cost only a few thousand euros). We only have collateral metrics, i.e. visits to the file description page, as usual.
Full list: https://tools.wmflabs.org/bawolff/usedVideosCommons.htm
And the first by usages in articles is probably https://commons.wikimedia.org/wiki/File:Second_world_war_europe_animation_sm... Still less than EC-EU-enlargement animation.gif perhaps? Peace prevails. ;-)
Nemo