Some more fun with stats and video. Here's a list of articles with the
most "videos."
Number one is a catalog of Ronald Reagan speeches which explains its top
rank, but number two is almost entirely animated GIFs showing how to do
visual algorithms from China's Ming dynasty.
-Andrew
Speeches_and_debates_of_Ronald_Reagan 20
Rod_calculus 18
Meter_(music) 15
Time_signature 15
Ebola_virus_disease 14
Prevention_of_viral_hemorrhagic_fever 14
Atacama_Large_Millimeter_Array 13
Solar_cycle_24 12
Apollo_15 11
Hasta_Vinyasas 11
Multi-touch 11
Suez_Crisis 11
Les_Vampires 10
Notes_inégales 10
Private_Snafu 10
Winsor_McCay 10
Principles_of_Hindu_Reckoning 9
Behind_the_Screen 8
Biology_of_Diptera 8
China:_The_Roots_of_Madness 8
Colpoda 8
European_Extremely_Large_Telescope 8
Festa_del_Santissimo_Salvatore_a_Pazzano 8
First_Motion_Picture_Unit 8
Glossary_of_ballet 8
La_Silla_Observatory 8
Solar_flare 8
Why_We_Fight 8
Dwight_Buycks 7
History_of_the_Delft_University_of_Technology 7
Luke_Harangody 7
Rede_Tupi 7
Rep-tile 7
Shire_Hall,_Monmouth 7
STS-131 7
Yevgeni_Bauer 7
-Andrew Lih
Associate professor of journalism, American University
Email: andrew(a)andrewlih.com
WEB:
On Wed, Dec 10, 2014 at 6:16 PM, Tom Fish <guerillero.wikipedia(a)gmail.com>
wrote:
I removed some of the transclusions of the Reagan video.
--Tom
On Wed, Dec 10, 2014 at 4:11 PM, Andrew Lih <andrew(a)andrewlih.com> wrote:
Brian, there were some interesting results in the data you filtered
from the database. The good news is that it syncs quite well with the
data we had from January 2013, in terms of ogg, ogv and webm. A few
notes:
1. These are the most popular Commons videos in en.wp. Pretty much the
same as January 2013 except for #2, where someone really wanted to
embed that Reagan Speech in a lot of places.
Commercial-LBJ1964ElectionAdDaisyGirl.ogv 13
Reagan Speech Beirut Bombing.ogv 12
Machinima sample reindeer full size.ogg 9
1946-10-08 21 Nazi Chiefs Guilty.ogv 9
SeaSnails.ogg 8
Shakinghands high.OGG 7
The Impact Of Wikipedia.webm 6
CollateralMurder.ogv 6
1946-07-15 Philippines Independence Proclaimed.ogv 6
2. These are the most popular long GIFs on Commons, used in en.wp:
EC-EU-enlargement animation.gif 53
Linguistic map Southwestern Europe.gif 18
Canada provinces evolution 2.gif 12
Pangea animation 03.gif 11
Mohammad adil-Rashidun empire-slide.gif 10
3. We may have to tweak the GIF filter. For some reason, it picked up
some odd results like classifying these LOCAL en.wp Mexico-related stub
GIF icons as video. The metadata page does not suggest they should be
seen as long animations. The files are, from the table listing:
Mx-actor.gif 275
Mx-singer.gif 49
Mx-actor.gif, Mx-singer.gif 43
https://en.wikipedia.org/wiki/File:Mx-actor.gif
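
For anyone who wants to reproduce per-file tallies like the ones above,
here is a minimal sketch. It assumes the TSV has one row per article
with columns in the same order as the query's select list (page title,
commons videos, enwiki videos, commons long gifs, enwiki long gifs), and
that each cell holds file names joined by ", ". The function name and
the sample rows are made up for illustration. Splitting each cell before
counting may also explain combined rows such as
"Mx-actor.gif, Mx-singer.gif 43", which look like whole cells counted as
a single name.

```python
from collections import Counter


def count_file_usage(tsv_text, column):
    """Count how many articles use each file listed in one TSV column."""
    counts = Counter()
    for line in tsv_text.splitlines():
        cells = line.split("\t")
        if len(cells) <= column:
            continue  # short or malformed row
        cell = cells[column].strip()
        if not cell or cell == "NULL":
            continue  # no files of this kind on the article
        # Assumption: names are joined with ", " as GROUP_CONCAT does in
        # the query; a file name that itself contains ", " would be split.
        counts.update(cell.split(", "))
    return counts


# Hypothetical rows, not real data from the report.
sample = (
    "Article_A\t\t\t\tMx-actor.gif, Mx-singer.gif\n"
    "Article_B\t\t\t\tMx-actor.gif\n"
    "Article_C\tFoo.ogv\t\t\t\n"
)

for name, n in count_file_usage(sample, 4).most_common():
    print(name, n)
```

Column index 4 here is the assumed "enwiki long gifs" column; pass 1 for
the assumed "commons videos" column.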
-Andrew
-Andrew Lih
Associate professor of journalism, American University
Email: andrew(a)andrewlih.com
WEB:
http://www.andrewlih.com
BOOK: The Wikipedia Revolution:
http://www.wikipediarevolution.com
PROJECT: Wiki Makes Video
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wiki_Makes_Video
On Tue, Dec 9, 2014 at 3:12 PM, Andrew Lih <andrew(a)andrewlih.com>
wrote:
>
> Brian, thanks much for running this. I'll spend some time in the next
> day to run some metrics to see how it compares with our Jan 2013
> results.
>
> In general, this is what I'm looking for and I'll post some interesting
> stats when I process this.
>
> -Andrew
>
>
>
> On Sun, Dec 7, 2014 at 12:22 AM, Brian Wolff <bawolff(a)gmail.com>
> wrote:
>>
>> On 12/5/14, Andrew Lih <andrew(a)andrewlih.com> wrote:
>> > Brian, thanks yes that would be what I'd be looking for.
>> >
>> > In fact, a monthly report on a regular basis would be really
>> > interesting to
>> > see.
>> >
>>
>> Alright, here is my first attempt:
>>
>>
>> http://tools.wmflabs.org/bawolff/usedVideos.htm (Data formatted as tsv
>> if anyone wants to do further processing:
>> http://tools.wmflabs.org/bawolff/usedVideos.txt )
>
> It gives a mostly alphabetical list of articles with videos on them. A
> video is defined as follows:
> *A webm file
> *An ogg file, registered as video in the database (This roughly means
> that it has the string "theora" somewhere in the first 256 bytes of
> the file, not counting the string "ffmpeg2theora", except for some
> older files, which might still count the ffmpeg2theora, and also
> there's no guarantee that an ogg theora file has a theora data packet
> in the first 255 bytes, and it's also very possible for non-theora
> files to have that string in the header. Consider this a "rough"
> metric. In practice I think it works most of the time, but do your own
> checking before using for anything serious).
> *An animated gif file that is at least 10 seconds long. I figured this
> very roughly separates non-video-esque gifs from video-ish gifs.
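
The ogg rule described above can be sketched roughly as follows. This is
only an illustration of the heuristic as worded in the message, not the
code the database actually runs; the function name and the default
window of 256 bytes are assumptions taken from the description.

```python
def looks_like_theora(data: bytes, window: int = 256) -> bool:
    """Rough check: does "theora" appear near the start of the file?"""
    head = data[:window]
    # Blank out the encoder name so it does not count as a match. A NUL
    # byte is substituted so the neighbouring bytes cannot join up into
    # a new "theora".
    head = head.replace(b"ffmpeg2theora", b"\x00")
    return b"theora" in head


# Hypothetical byte strings, not real file headers.
print(looks_like_theora(b"OggS" + b"\x00" * 24 + b"\x80theora"))
print(looks_like_theora(b"OggS ffmpeg2theora 0.29"))
```

As the message warns, a check like this can miss theora files whose
header packet starts later, and can match unrelated files that happen to
contain the string.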
>
> Based on that metric, there are currently 8464 articles on enwikipedia
> that have videos on them (6442 if you take out the GIF files of 10
> seconds or longer).
>
> Before setting this up to update itself, is this the sort of thing you
> are looking for? Would it be more useful with different definitions of
> a "video", or instead of listing it as an alphabetical list of
> articles, orient it around which video is used the most places? Or
> would some other ordering be best?
>
> I guess I'm asking, what questions about videos are you actually
> looking to answer, and how could this type of report be modified to
> better answer them?
>
> --bawolff
>
> p.s. For those interested in this sort of thing, the sql query I used
> was:
>
> select page_title, GROUP_CONCAT( i2.img_name separator ', ' ) as
> "commons videos", GROUP_CONCAT( i1.img_name separator ', ' ) as
> "enwiki videos", GROUP_CONCAT( i3.img_name separator ', ' ) as
> "commons long gifs", GROUP_CONCAT( i4.img_name separator ', ' ) as
> "enwiki long gifs" from page inner join imagelinks on il_from =
> page_id left join image i1 on il_to = i1.img_name and
> i1.img_media_type = 'VIDEO' left join commonswiki_p.image i2 on il_to
> = i2.img_name and i2.img_media_type = 'VIDEO' left join
> commonswiki_p.image i3 on il_to = i3.img_name and i3.img_media_type =
> 'BITMAP' and i3.img_major_mime = 'image' and i3.img_minor_mime = 'gif'
> and i3.img_metadata regexp '"duration";d:\\d{2,}' left join image i4
> on il_to = i4.img_name and i4.img_media_type = 'BITMAP' and
> i4.img_major_mime = 'image' and i4.img_minor_mime = 'gif' and
> i4.img_metadata regexp '"duration";d:\\d{2,}' where page_namespace = 0
> and (i1.img_name is not null or i2.img_name is not null or i3.img_name
> is not null or i4.img_name is not null) group by page_title;
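
To see what the "long gif" condition in the query matches, here is a
small sketch using the same pattern in Python. It assumes the metadata
column holds a serialized string in which the duration appears as
"duration";d:<seconds>, which is what the pattern itself suggests; the
sample strings and the function name are invented for illustration. The
doubled backslash in the SQL string literal should reach the regex
engine as a single \d.

```python
import re

# Same pattern as in the query, with the SQL string escaping removed.
LONG_GIF = re.compile(r'"duration";d:\d{2,}')


def is_long_gif(metadata: str) -> bool:
    """True when the duration's integer part has two or more digits."""
    return LONG_GIF.search(metadata) is not None


# Hypothetical metadata fragments, not copied from any real file.
for fragment in ('s:8:"duration";d:12.5;',
                 's:8:"duration";d:9.96;',
                 's:8:"duration";d:0.1;'):
    print(fragment, is_long_gif(fragment))
```

Requiring at least two digits before the decimal point is what turns the
pattern into a "10 seconds or more" test, so a 9.96 second animation is
left out.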
>
> _______________________________________________
> Wikivideo-l mailing list
> Wikivideo-l(a)lists.wikimedia.org
>
https://lists.wikimedia.org/mailman/listinfo/wikivideo-l