Hi,
I wanted to let you know that I extended youtube-dl [1] to allow uploading to Wikimedia Commons (or any MediaWiki).
youtube-dl is a Python utility for downloading videos from a variety of websites. It handles the actual download and the metadata extraction, which is most of the work.
The extension adds:
* License retrieval from YouTube and Vimeo (AFAIK the only services providing CC licensing)
* Conversion of the extracted video to Ogg Theora (using ffmpeg2theora [2])
* Formatting of the metadata as {{Information}}
* License checking for compatibility with Wikimedia Commons
* Upload to Wikimedia Commons using Pywikipedia
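To illustrate the metadata formatting and license checking steps, here is a minimal Python sketch. It is not the code from the extension: the function names, the metadata keys and the short license codes are all hypothetical, chosen only for the example. The one firm assumption is the Commons policy that non-commercial (NC) and no-derivatives (ND) licenses are not accepted.

```python
# Hypothetical short license codes; the real extension may represent
# licenses differently. Commons rejects NC and ND variants.
COMMONS_COMPATIBLE = {"cc-by", "cc-by-sa", "cc0"}


def is_commons_compatible(license_code):
    """Return True if the (hypothetical) license code is acceptable on Commons."""
    return license_code.lower() in COMMONS_COMPATIBLE


def format_information(meta):
    """Render a metadata dict as {{Information}} wikitext.

    The dict keys used here are illustrative, not youtube-dl's actual ones.
    """
    lines = [
        "{{Information",
        "|description=" + meta["description"],
        "|date=" + meta["date"],
        "|source=" + meta["url"],
        "|author=" + meta["uploader"],
        "|permission=" + meta["license"],
        "}}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    meta = {
        "description": "A test clip",
        "date": "2012-09-08",
        "url": "http://vimeo.com/46348011",
        "uploader": "Example uploader",
        "license": "cc-by-sa",
    }
    if is_commons_compatible(meta["license"]):
        print(format_information(meta))
```

Checking the license before converting the video avoids spending time on a transcode that could not be uploaded anyway.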
In the end, my extension merely glues together youtube-dl (with added license handling), ffmpeg2theora and Pywikipedia.
It is called from the command line like this:

    ./youtube-dl --wikimedia-commons-export --convert-theora --theora-audio-quality 8 --theora-video-quality 8 --theora-optimise http://vimeo.com/46348011
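For readers wondering how the Theora options might reach ffmpeg2theora, here is a rough Python sketch that builds the converter command line. The helper name and its parameters are made up for illustration, and the mapping from the --theora-* options to ffmpeg2theora's --videoquality, --audioquality and --optimize flags is an assumption, not a description of the actual code.

```python
import os


def theora_command(source, video_quality=8, audio_quality=8, optimise=True):
    """Build an ffmpeg2theora argument list for the given input file.

    Hypothetical helper: the output name simply swaps the extension for .ogv.
    """
    target = os.path.splitext(source)[0] + ".ogv"
    cmd = [
        "ffmpeg2theora", source,
        "--videoquality", str(video_quality),
        "--audioquality", str(audio_quality),
    ]
    if optimise:
        cmd.append("--optimize")
    cmd += ["--output", target]
    return cmd


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.check_call to run it.
    print(" ".join(theora_command("46348011.mp4")))
```

Building the command as a list rather than a single string means it can be handed to subprocess without shell quoting problems for unusual file names.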
Code is available on GitHub: https://github.com/JeanFred/youtube-dl/tree/WikimediaCommonsPP
And some test uploads: https://test.wikipedia.org/wiki/Category:Uploaded_with_youtube-dl/WikimediaC...
Afterthoughts: it would certainly make more sense to have a dedicated Pywikipedia module that relies on youtube-dl only for the video and metadata extraction and takes care of the rest itself, rather than the other way around as here. I went for this approach because it was the most straightforward; I may rewrite it if I find the time.
Any feedback is welcome! :)
[1] http://rg3.github.com/youtube-dl/
[2] http://v2v.cc/~j/ffmpeg2theora/
Jean-Frédéric, 08/09/2012 14:15:
I wanted to let you know that I extended youtube-dl [1] to allow upload to Wikimedia Commons (or any MediaWiki).
Nice! YouTube can be a very useful source. I've imported several thousand videos to archive.org on behalf of emijrp with his script:
https://code.google.com/p/emijrp/source/browse/trunk/scrapers/youtube2internetarchive.py
In that case it's faster because the video conversion is done by archive.org. I don't know if such scripts should be coordinated in some way.
Nemo