Jason Garrett-Glaser
6c32576548
H.264: optimize CABAC x86 asm for Atom
13 years ago
Diego Biurrun
657ccb5ac7
Eliminate FF_COMMON_FRAME macro.
FF_COMMON_FRAME holds the contents of the AVFrame structure and is also copied
to struct Picture. Replace by an embedded AVFrame structure in struct Picture.
14 years ago
Jason Garrett-Glaser
99b6d2c065
H.264: use fill_rectangle in CABAC decoding
14 years ago
Jason Garrett-Glaser
556f8a066c
H.264: template left MB handling
Faster H.264 decoding with ALLOW_INTERLACE off.
14 years ago
Jason Garrett-Glaser
3b7ebeb4d5
H.264: faster write_back_*
Avoid aliasing, unroll loops, and inline more functions.
14 years ago
Jason Garrett-Glaser
c90b94424c
4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
14 years ago
Jason Garrett-Glaser
504811baea
Roll back 4:4:4 H.264 for now
Needs some ARM/PPC asm modifications.
14 years ago
Jason Garrett-Glaser
c9c493872c
4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
14 years ago
Oskar Arvidsson
fcc0224e4f
Add support for higher QP values in h264.
In high bit depth, the QP values may now be up to (51 + 6*(bit_depth-8)).
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
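The QP bound quoted in the commit message above can be checked with a few lines of C. This is only a minimal sketch: the helper name `max_qp_for_bit_depth` is hypothetical and is not taken from the decoder source.

```c
/* Hypothetical helper (name is an assumption, not from the decoder):
 * largest QP value for a given bit depth, using the formula quoted in
 * the commit message: 51 + 6 * (bit_depth - 8). */
static int max_qp_for_bit_depth(int bit_depth)
{
    return 51 + 6 * (bit_depth - 8);
}
```

Under this formula, 8-bit content keeps the usual limit of 51, and each extra bit of depth raises the limit by 6.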
Oskar Arvidsson
6e3ef511d7
Add the notion of pixel size in h264 related functions.
In high bit depth the pixels will not be stored in uint8_t like in the
normal case, but in uint16_t. The pixel size is thus 1 in normal bit
depth and 2 in high bit depth.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
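A rough sketch of what a pixel size of 1 or 2 means for addressing samples in a plane. The helper names `pixel_size` and `sample_offset` are hypothetical and serve only as illustration; they are not the functions the commit touched.

```c
#include <stddef.h>

/* Hypothetical helper: bytes per sample. As the commit message describes,
 * samples are stored in uint8_t at normal bit depth and in uint16_t at
 * high bit depth, so the size is 1 or 2. */
static int pixel_size(int bit_depth)
{
    return bit_depth > 8 ? 2 : 1;
}

/* Hypothetical helper: byte offset of sample (x, y) in a plane whose rows
 * are `linesize` bytes apart. The x coordinate is scaled by the pixel size. */
static size_t sample_offset(int x, int y, size_t linesize, int bit_depth)
{
    return (size_t)y * linesize + (size_t)x * (size_t)pixel_size(bit_depth);
}
```

With a 64-byte line size, sample (3, 2) sits at byte 131 in 8-bit content and at byte 134 in 10-bit content.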
Stefano Sabatini
975a1447f7
Replace deprecated FF_*_TYPE symbols with AV_PICTURE_TYPE_*.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
14 years ago
Mans Rullgard
2912e87a6c
Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Ronald S. Bultje
66c6b5e2a5
Revert 2a1f431d38, it broke H264 lossless.
14 years ago
Ronald S. Bultje
8bcfe7f7fd
Set gray (128) U/V planes for chroma-less samples. Fixes two fate samples
when played with -flags emu_edge.
14 years ago
Jason Garrett-Glaser
b9af15402d
Remove evil timers that snuck their way into r26375.
Originally committed as revision 26377 to svn://svn.ffmpeg.org/ffmpeg/trunk
14 years ago
Jason Garrett-Glaser
fb2734c8a6
Fix r26375 on non-x86.
Originally committed as revision 26376 to svn://svn.ffmpeg.org/ffmpeg/trunk
14 years ago
Jason Garrett-Glaser
f14bdd8e75
H.264: Partially inline CABAC residual decoding
Improves CABAC performance by about 1.2%.
Trick originates from x264 and has also been used in ffvp8. It's useful because
coded block flags are usually zero, so it helps to have the early termination
inlined into the main function.
Originally committed as revision 26375 to svn://svn.ffmpeg.org/ffmpeg/trunk
14 years ago
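The partial-inlining idea described in the commit message above can be sketched as follows. This is not the decoder's code: every name here (`decode_residual`, `decode_residual_slow`, `NOINLINE`) is hypothetical, and the slow path just counts nonzero coefficients as a stand-in for real residual decoding.

```c
#include <stdint.h>

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Hypothetical slow path, kept out of line. It stands in for the full
 * residual decode and simply counts nonzero coefficients. */
static NOINLINE int decode_residual_slow(const int16_t *coeffs, int n)
{
    int nonzero = 0;
    for (int i = 0; i < n; i++)
        nonzero += coeffs[i] != 0;
    return nonzero;
}

/* Hypothetical inlined wrapper: the early-termination test lives in the
 * caller, so the common "coded block flag is zero" case never pays for
 * a function call. */
static inline int decode_residual(int coded_block_flag,
                                  const int16_t *coeffs, int n)
{
    if (!coded_block_flag)
        return 0;
    return decode_residual_slow(coeffs, n);
}
```

The design choice is to inline only the cheap check and leave the large body as a single out-of-line copy, which keeps code size down while removing call overhead from the frequent case.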
Jason Garrett-Glaser
2a1f431d38
H.264/SVQ3: make chroma DC work the same way as luma DC
No speed improvement, but necessary for some future stuff.
Also opens up the possibility of asm chroma dc idct/dequant.
Originally committed as revision 26349 to svn://svn.ffmpeg.org/ffmpeg/trunk
14 years ago
Jason Garrett-Glaser
5657d14094
H.264: switch to x264-style tracking of luma/chroma DC NNZ
Useful so that we don't have to run the hierarchical DC iDCT if there aren't
any coefficients. Opens up some future opportunities for optimization as well.
Originally committed as revision 26337 to svn://svn.ffmpeg.org/ffmpeg/trunk
14 years ago
Jason Garrett-Glaser
19fb234e4a
H.264: split luma dc idct out and implement MMX/SSE2 versions
About 2.5x the speed.
NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.
Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
14 years ago
Diego Biurrun
ba87f0801d
Remove explicit filename from Doxygen @file commands.
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.
Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Benoit Fouet
32e543f866
Replace @returns by @return .
Originally committed as revision 22729 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Alexander Strange
767738f7a3
h264: Use + instead of | in some places
6 fewer instructions on x86-64/gcc 4.2.
Originally committed as revision 22692 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
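One general reason such a swap can be safe: when two operands never have a set bit in the same position, `+` and `|` produce the same result. Whether that is exactly the condition the commit above relied on is an assumption; the function names below are hypothetical.

```c
/* Hypothetical illustration: a and b are each 0 or 1. a occupies bit 0
 * and (b << 1) occupies bit 1, so the operands never overlap and the
 * two combining operators agree. */
static int combine_or(int a, int b)
{
    return a | (b << 1);
}

static int combine_add(int a, int b)
{
    return a + (b << 1);
}
```

The addition form can give a compiler more freedom, for example to fold the shift and add into a single address-calculation instruction on x86.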
Alexander Strange
601ca8c55c
h264: Remove unused function argument
Originally committed as revision 22690 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Alexander Strange
f7ba470d58
h264: Simplify decode_cabac_residual() specialization
Gives more consistent inlining with some compilers (such as llvm).
Originally committed as revision 22689 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
8897b247a5
Remove some unneeded fill_rectangle() for 16x16 blocks.
Originally committed as revision 22124 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Zhou Zongyi
821fe7f3e6
Optimize (amvd>2)+(amvd>32), about 1 cpu cycle faster.
patch by Zhou Zongyi @ zhouzy () os punkt pku dot edu speck cn
Originally committed as revision 22084 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
b5bd070029
Change mvd_cache & mvd_table to 8bit, this is overall a bit faster
for high resolution videos.
About 20 cycles faster per MB for cathedral.
Originally committed as revision 22038 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
81b5e4ee92
Calculate mvd without abs()
Same speed (ask gcc why, I don't know).
Originally committed as revision 22035 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
855a1ba5e8
Switch back to (amvd>2)+(amvd>32), it's 5 cpu cycles faster now.
Originally committed as revision 22032 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
01b35be14a
Factorize common code from the top of decode_cabac_mb_mvd()
10-15 cpu cycles faster.
Originally committed as revision 22029 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
6d0155c79c
Replace (mvd>2)+(mvd>32) by MIN((mvd+28)*17>>9, 2)
Same speed as far as I can measure, but it might have fewer branches on some
archs.
Idea from x264 / Jason
Originally committed as revision 22027 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
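The equivalence between the comparison form and the multiply-and-shift form in the commit above can be checked directly. The sketch below assumes the input is a non-negative absolute motion vector difference; the function names are hypothetical and do not come from the decoder.

```c
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Comparison form: 0 for amvd <= 2, 1 for 3..32, 2 for amvd > 32. */
static int ctx_compare(int amvd)
{
    return (amvd > 2) + (amvd > 32);
}

/* Multiply-and-shift form from the commit message. Since 512/17 is just
 * above 30 and 1024/17 is just above 60, the shifted product first
 * reaches 1 at amvd = 3 and first reaches 2 at amvd = 33. */
static int ctx_mulshift(int amvd)
{
    return MIN(((amvd + 28) * 17) >> 9, 2);
}

/* Returns 1 if both forms agree for every amvd in [0, limit]. */
static int forms_agree(int limit)
{
    for (int amvd = 0; amvd <= limit; amvd++)
        if (ctx_compare(amvd) != ctx_mulshift(amvd))
            return 0;
    return 1;
}
```

The MIN clamp is what keeps the result at 2 for large inputs, where the shifted product alone would keep growing.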
Michael Niedermayer
90332debfe
Replace ad-hoc fill rectangle by fill_rectangle().
Originally committed as revision 22025 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
f4ce853125
Get rid of an if(), 1 cpu cycle faster.
Originally committed as revision 21889 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
e69bfde6b2
Get rid of a local variable, 10 cpu cycles faster.
Originally committed as revision 21888 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
a305449df6
Move abs() from decode_cabac_mb_mvd() to the code that writes mvd_cache.
4-8 cycles faster
Originally committed as revision 21887 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
90a5849efd
Speedup decode_cabac_field_decoding_flag() by 9 cpu cycles.
Originally committed as revision 21875 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
69cc31832f
Move check for and call of predict_field_decoding_flag() from the mb code to
the row code. This function would only be needed on an MB basis for MBAFF+FMO.
Originally committed as revision 21860 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
59f733d1b1
2x faster ff_h264_init_cabac_states(), 4k cpu cycles less.
Sadly this is just per slice so the speedup with normal files should be negligible.
Originally committed as revision 21859 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
37a9719a97
2 cpu cycles faster context calculation for decode_cabac_intra_mb_type()
Originally committed as revision 21845 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
5806e8cd1f
Drop a few redundant slice_num checks.
Originally committed as revision 21844 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
053074276b
Drop compute_mb_neighbors() and move fill_decode_neighbors() up to take its
role.
Should be faster as this is a strict code removal.
Originally committed as revision 21843 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
c1bb66ac19
Split setting neighboring MBs from fill_decode_caches()
no speed change.
Originally committed as revision 21842 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
cf55f59d5e
Simplify decode_cabac_mb_intra4x4_pred_mode().
same speed
Originally committed as revision 21839 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
f4060611e9
Merge decode_cabac_mb_type_b() into calling code.
This avoids a conditional branch and is about 3 cpu cycles faster.
Originally committed as revision 21838 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
64dd1b0a1d
Merge the single line function decode_cabac_mb_transform_size()
into the calling code.
8 cpu cycles faster
Originally committed as revision 21828 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
8b38d10761
indent
Originally committed as revision 21827 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
f4b8b82514
Merge decode_cabac_mb_dqp() with surrounding code.
~20 cpu cycles faster
Originally committed as revision 21826 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
a59b9ee33d
Set sub_mb_type in direct_cache instead of just the direct flag.
Simpler, cleaner and faster.
Originally committed as revision 21822 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Michael Niedermayer
2dc380ca8e
Store sub_mb_type in direct_cache/direct_table.
This is equal complexity but could be more useful.
Originally committed as revision 21821 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago