This reverts commit f8bed30d8b. The reason
for this is that the overlap filter, which runs after IDCT, should run
on unclamped values, and thus IDCT and put_pixels() cannot be merged if
we want to attempt to be bitexact.
PROFILE_ADVANCED doesn't set res_fasttx, so make that a special case
in the condition that decides which IDCT to use (and whether to read
coefficients transposed or not).
Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
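For illustration only, the special case described above could be sketched as a single predicate. The enum, the function name, and the direction of the res_fasttx test (nonzero selects the VC-1 specific IDCT) are assumptions made for this sketch, not the decoder's actual code.

```c
#include <stdbool.h>

/* Hypothetical profile constants for this sketch; the real decoder
 * defines its own values. */
enum {
    SKETCH_PROFILE_SIMPLE,
    SKETCH_PROFILE_MAIN,
    SKETCH_PROFILE_COMPLEX,
    SKETCH_PROFILE_ADVANCED
};

/* Returns true when the VC-1 specific IDCT (and transposed coefficient
 * reading) should be used. Advanced profile never sets res_fasttx, so
 * it has to be checked explicitly instead of relying on the flag.
 * Assumption: res_fasttx != 0 selects the VC-1 IDCT in other profiles. */
static bool sketch_use_vc1_idct(int profile, int res_fasttx)
{
    return profile == SKETCH_PROFILE_ADVANCED || res_fasttx != 0;
}
```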
IDCT coefficients are read transposed, but simple_idct does not expect
this. Therefore, only read coefficients transposed if we're not
using simple_idct.
Fixes http://forum.videolan.org/viewtopic.php?f=14&t=89651
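A minimal sketch of conditional transposed reading, assuming the transposition is applied to a row-major 8x8 scan table at init time. The function names and the using_simple_idct parameter are hypothetical and only serve to show the idea.

```c
#include <stdint.h>

/* Swap the row and column of an index into a row-major 8x8 block. */
static inline int sketch_transpose_index(int idx)
{
    return (idx >> 3) | ((idx & 7) << 3);
}

/* Fill dst with the scan order the coefficient reader should use.
 * The table is transposed only when simple_idct is NOT in use, since
 * simple_idct expects untransposed input. */
static void sketch_init_scan(uint8_t dst[64], const uint8_t src[64],
                             int using_simple_idct)
{
    for (int i = 0; i < 64; i++)
        dst[i] = using_simple_idct
                     ? src[i]
                     : (uint8_t)sketch_transpose_index(src[i]);
}
```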
Advantage is that it allows us to combine several loops into a single
one, and these can eventually be merged into the IDCT itself. Also, it
allows us to remove vc1_put_block(), and makes CODEC_FLAG_GRAY faster.
Advanced profile never uses "range reduction", so vc1_put_block() quite
literally just calls put_pixels_clamped() from vc1_decode_i_blocks_adv().
By inlining the function, we can prevent calling IDCT8x8 if
CODEC_FLAG_GRAY is set, and we don't have to scale the coeffs in the
[0,256] range, but can instead use put_signed_pixels_clamped().
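As a rough sketch of why the scaling step becomes unnecessary: a signed store can add the 128 bias and clamp in one pass. The helper names below are hypothetical and this is not the DSP routine itself, only an illustration of the behaviour assumed for put_signed_pixels_clamped().

```c
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range. */
static inline uint8_t sketch_clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Store an 8x8 block of signed values centered on zero:
 * add the 128 bias, clamp to [0,255], write to the destination. */
static void sketch_put_signed_pixels_clamped(const int16_t block[64],
                                             uint8_t *dst, int stride)
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            dst[y * stride + x] = sketch_clip_uint8(block[y * 8 + x] + 128);
}
```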
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.
Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
MPV_common_init(), so calling both is redundant and leads to memory
leaks in the WMV3/VC-1 decoder. Thus use only the first function in
WMV3/VC-1 decoder initialization.
Originally committed as revision 22024 to svn://svn.ffmpeg.org/ffmpeg/trunk
search for real extradata start instead of always skipping one byte.
Patch by Andrew Dennison gmailify(${name}d, lists)
Thread: [PATCH] Fix VC1 "Incomplete extradata" for mkv files generated by eac3to
Originally committed as revision 20178 to svn://svn.ffmpeg.org/ffmpeg/trunk
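A sketch of what searching for the real extradata start might look like, assuming the start is marked by a 00 00 01 start-code prefix. That assumption and the function name are illustrative only; the message above does not spell out the search criterion.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the offset of the first 00 00 01 prefix in buf, or -1 if the
 * buffer contains none. The caller would then parse from that offset
 * instead of unconditionally skipping one leading byte. */
static ptrdiff_t sketch_find_start_code(const uint8_t *buf, size_t size)
{
    for (size_t i = 0; i + 3 <= size; i++)
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return (ptrdiff_t)i;
    return -1;
}
```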
VLC tables should only be initialized from one place.
This initializes/calculates more VLC tables than necessary for VC1 decoding,
but since this is done only once and adds only a small overhead in time and
space (maybe 30 kB), it does not seem worth adding a separate function.
Originally committed as revision 20010 to svn://svn.ffmpeg.org/ffmpeg/trunk
These are only supposed to be called once per row, not once per macroblock.
~1.5% faster according to oprofile.
Originally committed as revision 19213 to svn://svn.ffmpeg.org/ffmpeg/trunk
~8% faster VC-1 decoding.
Possible future optimization: clear blocks after use instead of before, and for
DC-only blocks, only clear the DC coefficient.
Originally committed as revision 19205 to svn://svn.ffmpeg.org/ffmpeg/trunk
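The possible future optimization mentioned above could be sketched roughly as follows. The function name and the dc_only flag are hypothetical; the sketch assumes the caller knows whether any coefficient other than DC was written.

```c
#include <stdint.h>
#include <string.h>

/* Reset a coefficient block after it has been consumed, so the next
 * macroblock starts from zeros without a separate clear beforehand.
 * For DC-only blocks a single store is enough, because every other
 * coefficient is already zero. */
static void sketch_clear_block_after_use(int16_t block[64], int dc_only)
{
    if (dc_only)
        block[0] = 0;
    else
        memset(block, 0, 64 * sizeof(*block));
}
```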
Includes mmx2 asm for the various functions.
Note that the actual idct still does not have an x86 SIMD implementation.
For wmv3 files using regular idct, the decoder just falls back to simple_idct,
since simple_idct_dc doesn't exist (yet).
Originally committed as revision 19204 to svn://svn.ffmpeg.org/ffmpeg/trunk