On Wed, Feb 10, 2010 at 12:51 AM, tstarling@svn.wikimedia.org wrote:
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/62223
Revision: 62223 Author: tstarling Date: 2010-02-10 05:51:56 +0000 (Wed, 10 Feb 2010)
Log Message:
- In preparation for deployment, revert the bulk of Michael's unreviewed work. Time for review has run out. The code has many obvious problems with it. Comparing against r38714 will give you an idea of which changes I am accepting. Fixes bug 22388.
- Removed magic word hook, doesn't do anything useful.
- OggPlayer.js still needs some work.
Looks like this change removed both the Oggthumb support as well as the code that handles the cases where ffmpeg fails.
Ffmpeg will fail to generate a thumb if there is no keyframe in the file after the point in time that you requested a thumb. This was causing a failure to generate thumbs for many files because they are short and only have a single keyframe at the beginning.
Gregory Maxwell wrote:
On Wed, Feb 10, 2010 at 12:51 AM, tstarling@svn.wikimedia.org wrote:
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/62223
Revision: 62223 Author: tstarling Date: 2010-02-10 05:51:56 +0000 (Wed, 10 Feb 2010)
Log Message:
- In preparation for deployment, revert the bulk of Michael's unreviewed work. Time for review has run out. The code has many obvious problems with it. Comparing against r38714 will give you an idea of which changes I am accepting. Fixes bug 22388.
- Removed magic word hook, doesn't do anything useful.
- OggPlayer.js still needs some work.
Looks like this change removed both the Oggthumb support as well as the code that handles the cases where ffmpeg fails.
Ffmpeg will fail to generate a thumb if there is no keyframe in the file after the point in time that you requested a thumb. This was causing a failure to generate thumbs for many files because they are short and only have a single keyframe at the beginning.
That sounds like it needs a one-line fix in OggHandler::normaliseParams(), not 50 lines of code and a new decoder. Do you have a test file or a bug report or something?
-- Tim Starling
On Wed, Feb 10, 2010 at 8:38 AM, Tim Starling tstarling@wikimedia.org wrote:
That sounds like it needs a one-line fix in OggHandler::normaliseParams(), not 50 lines of code and a new decoder. Do you have a test file or a bug report or something?
Just switching the thumbnailer should be sufficient, I agree the pile of code and the retries were fairly lame (and I think I complained about it on IRC). I'm not sure why any support for thumbnailing ogv's with ffmpeg was retained.
I don't see how you can fix it in normalizeParams call unless you've scanned the stream and know where the keyframes are. Ffmpeg could be fixed, of course, but the ogg demuxer basically needs a rewrite... I think the patch you did to ffmpeg a while back was a lot better than the code they ultimately included.
Here is a file that won't thumbnail under the current code: http://myrandomnode.dyndns.org:8080/~gmaxwell/theora/only_one_keyframe.ogv
ffmpeg -y -ss 5 -i only_one_keyframe.ogv -f mjpeg -an -vframes 1 foo.jpeg throws a pile of errors then foo.jpeg is a zero byte file.
Gregory Maxwell wrote:
I don't see how you can fix it in normalizeParams call unless you've scanned the stream and know where the keyframes are.
Well, we do actually scan the stream. We interpret the granule position of each page and record the last granule position in the stream for the purposes of calculating stream length. It would be easy enough to make a note of the last keyframe position as well.
But when I said a one-line fix, I was thinking along the lines of a fixed heuristic distance between keyframes, either in time or frame count, that roughly matches the characteristics of common encoders. If the requested seek time is too close to the end, it would be moved back, possibly right back to the start. This is already implemented, but there's a bug in it for short streams which may be causing trouble. Or the period may not be long enough.
Obviously if there is some encoder that, with the default settings, is generating 10 second files with a single keyframe and 240 inter frames, and this encoder is significant for Wikimedia, then I will have to rethink this approach.
-- Tim Starling
On Wed, Feb 10, 2010 at 8:17 PM, Tim Starling tstarling@wikimedia.org wrote:
Obviously if there is some encoder that, with the default settings, is generating 10 second files with a single keyframe and 240 inter frames, and this encoder is significant for Wikimedia, then I will have to rethink this approach.
256-frame keyframe interval is the default for anything libtheora 1.1 based in two-pass mode... So, yes. I'm pretty sure thats relevant.
I wouldn't have even realized the existing processing chain was failing on files that contained no keyframes past the midpoint were it not for the fact that I'd bumped into failed thumbnailing on short files. ::shrugs::
There isn't really a reasonable "fixed heuristic distance between keyframes", the encoder can put them anywhere subject to the limitation that they not be more than KEYINT away from the last one. So even knowing the interval the locations can't be reliably determined without scanning the file.
The alternative "one line fix" is switching to a thumbnailer that actually thumbnails the requested location. All the better when the thumbnailer correctly handles 4:2:2 and 4:4:4 colorspaces. Just saying——
Gregory Maxwell wrote:
Looks like this change removed both the Oggthumb support as well as the code that handles the cases where ffmpeg fails.
The usual problem with deploying new solutions for equivalent tasks is that you substitute known issues with unknown ones.
I've looked at oggThumb now, I downloaded the latest tarball. Here are some of the ways in which it sucks:
* The output filename cannot be specified on the command line, it is generated from the input filename. OggHandler uses a -n option for destination path which just gives an error for me. I don't know if it's a patch or an alpha version feature, but it's not documented either way.
* Unlike the current version of FFmpeg, it does not implement bisection seeking. It scans the entire input file to find the relevant frames. For an 85MB test file, it was 30 times slower than FFmpeg.
* It unconditionally writes a progress message to stdout on every frame in the input file.
* It unconditionally pollutes stderr with verbose stream metadata information.
* It ignores error return values from libtheora functions like th_decode_packetin(), meaning that essentially the *only* thing on stdout/stderr is useless noise.
-- Tim Starling
On Wed, Feb 10, 2010 at 9:36 PM, Tim Starling tstarling@wikimedia.org wrote:
Gregory Maxwell wrote:
Looks like this change removed both the Oggthumb support as well as the code that handles the cases where ffmpeg fails.
The usual problem with deploying new solutions for equivalent tasks is that you substitute known issues with unknown ones.
I've looked at oggThumb now, I downloaded the latest tarball. Here are some of the ways in which it sucks:
So a couple months back mdale suggested using oggThumb, with the old installed ffmpeg was making spaghetti of some files (old ffmpeg not completely implementing the Theora spec) and the new one that (whomever) tried installing mage spaghetti in a different way (failing to thumb because ffmpeg didn't take your seeking patch eons ago).
I'd never heard of it, went to look, and recoiled in horror. Then I sent a patch.
- Unlike the current version of FFmpeg, it does not implement
bisection seeking. It scans the entire input file to find the relevant frames. For an 85MB test file, it was 30 times slower than FFmpeg.
Of the issues I raised, seeking was the only I didn't fix. Unfortunately oggvideotools reimplements libogg in C++ so it could use C++ memory management, my patience ran out before I got around to implementing it.
If you search the archive you can see how strongly opposed I am to tools that linear scan unnecessarily. But 30x slower on a file that small sounds a bit odd.
- The output filename cannot be specified on the command line, it is
generated from the input filename. OggHandler uses a -n option for destination path which just gives an error for me. I don't know if it's a patch or an alpha version feature, but it's not documented either way.
It's in SVN.
After the author of the package applied my patches (on the same day I sent them) Mdale asked if he should delay Wikimedia deployment until the fixes I sent in went in, the author offered to simply do a new release. No one took him up on the author.
- It unconditionally writes a progress message to stdout on every
frame in the input file.
- It unconditionally pollutes stderr with verbose stream metadata
information.
- It ignores error return values from libtheora functions like
th_decode_packetin(), meaning that essentially the *only* thing on stdout/stderr is useless noise.
I'm also not especially keen on its rather non-unixy style. Then again, I think C++ is pretty much crap too, so you can see what my opinion is worth. What I can say is that speaking from personal experience the author of this package is friendly, pleasant to work with, and responsive. Though 'submit patches' takes me out of the 'one line fix' I advertised, — sorry, I'd assumed that Mdale had already worked out the operational angles and my only concerns were correct output and not allowing it to be an enormous DOS vector.
Cheers.
wikitech-l@lists.wikimedia.org