FFmpeg

Commit Graph

Author	SHA1	Message	Date
Alex Converse	81824fe059	aacdec: Only load and write each predictor variable once. This is slightly faster and opens the door for further optimization. Originally committed as revision 24475 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Alex Converse	70c99adb48	aacdec: 4% faster main profile decoding. Originally committed as revision 24474 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Alex Converse	51ffd3a62f	aacenc: Favor log2f() and sqrtf() over log2() and sqrt(). Originally committed as revision 24473 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Alex Converse	04d72abf17	aacenc: Factorize some scalefactor utilities. Originally committed as revision 24472 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Eli Friedman	3611e7a309	Inline asm for VP56 arith coder. This is a lot more reliable to get cmov rather than trying to trick gcc into generating it, useful since it's 2% faster overall. Patch by Eli Friedman <eli.friedman at gmail>. Originally committed as revision 24471 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
David Conrad	ca18a478e3	VP8: Inline traversing vp8_small_mvtree. Much faster read_mv_component, slightly faster overall. Originally committed as revision 24470 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
David Conrad	7697cdcf95	VP8: Use vp56_rac_get_prob_branchy when the bit is only used by an if() Originally committed as revision 24469 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
David Conrad	fe1b5d974a	Decode DCT tokens by branching to a different code path for each branch on the huffman tree, instead of traversing the tree in a while loop. Based on the similar optimization in libvpx's detokenize.c. 10% faster at normal bitrates, and 30% faster for high-bitrate intra-only. Originally committed as revision 24468 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
David Conrad	5474ec2ac8	Move renormalization of the VP56 arith decoder to before decoding a bit. No difference at the moment, but allows a future branchy variant of vp56_rac_get_prob to be significantly faster. Originally committed as revision 24467 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
David Conrad	b3d755ec8b	Split renorm of vp56 arith decoder to its own function Originally committed as revision 24466 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
David Conrad	24675b8093	vp56's arith decoder's code_word is only 16 bits, no need for unsigned long Originally committed as revision 24465 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	13a1304bb3	Add myself to VP8 copyright and maintainers. Also add Ronald to maintainers. Originally committed as revision 24464 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	414ac27d8f	VP8: always_inline some things to force gcc to do the right thing. Mostly seems to help in the MC code, which gets a hundred cycles faster. Originally committed as revision 24463 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	06d50ca804	VP8: use AV_RL24 instead of defining a new RL24. Originally committed as revision 24462 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	9fddd14a8e	VP8: Slightly faster MV selection. Don't clamp best mv unless it's actually used. Originally committed as revision 24461 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	14767f35ed	VP8: use AV_ZERO32 instead of AV_WN32A where relevant Originally committed as revision 24460 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	09959ec46e	VP8: eliminate redundant code in r24458 Originally committed as revision 24459 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	a71abb714e	VP8: shave a few clocks off check_intra_pred_mode Originally committed as revision 24458 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	0087aa47d0	VP8: fix broken sign bias code in MV pred. Apparently the official conformance test vectors don't test this feature, even though libvpx uses it. Originally committed as revision 24456 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	3ae079a3c8	VP8: optimize DC-only chroma case in the same way as luma. Add MMX idct_dc_add4uv function for this case. ~40% faster chroma idct. Originally committed as revision 24455 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	3df56f4118	VP8: Clean up some variable shadowing. Originally committed as revision 24454 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	51c9156438	VP8 asm: cosmetics (spacing) Originally committed as revision 24453 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	8a467b2d44	VP8: 30% faster idct_mb. Take shortcuts based on statistically common situations. Add 4-at-a-time idct_dc function (mmx and sse2) since rows of 4 DC-only DCT blocks are common. TODO: tie this more directly into the MB mode, since the DC-level transform is only used for non-splitmv blocks? Originally committed as revision 24452 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	ef38842f0b	VP8: smarter prefetching. Don't prefetch reference frames that were used less than 1/32nd of the time so far in the frame. This helps speed up to ~2% on videos that, in many frames, make near-zero (but not entirely zero) use of golden and/or alt-refs. This is a very common property of videos encoded by libvpx. Originally committed as revision 24451 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Baptiste Coudurier	9479415e4e	In h264 parser, return immediately if buf_size is 0, avoid printing erroneous message for last frame. Originally committed as revision 24450 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	c25c776708	VP8: clear DCT blocks in iDCT instead of using clear_blocks. ~0.3% faster overall. Originally committed as revision 24448 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	b74f70d646	VP8: avoid a memset for non-i4x4 blocks with no coefficients Originally committed as revision 24447 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	145d31865d	Get rid of more unnecessary dereferences in VP8 deblocking Originally committed as revision 24446 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	867215336d	Shut up an uninitialized variable GCC warning in VP8. Originally committed as revision 24445 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	c4211046d2	Smarter VP8 prefetching. Prefetch all refs (including altref), but only if they've been used so far this frame. ~2.5% faster overall. TODO: Do something even smarter, like using how often each ref has been used so far, so that a couple blocks of a rarely-used ref don't force us to prefetch it. Originally committed as revision 24444 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	8cfae560ad	Fix stupid bug in VP8 prefetching code Originally committed as revision 24443 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	2a38c2e99a	Eliminate a LUT in escape decoding in VP8 decode_block_coeffs Originally committed as revision 24441 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	d292c3455e	Eliminate some repeated dereferences in VP8 inter_predict Originally committed as revision 24438 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Ronald S. Bultje	dc5eec8085	Use pextrw for SSE4 mbedge filter result writing, speedup 5-10 cycles on CPUs supporting it. Originally committed as revision 24437 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
James Zern	7eb185e0a3	Map settings for 2-pass libvpx encoding. Patch by James Zern, jzern at google. Originally committed as revision 24430 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	b946111fde	Eliminate a pointless memset for intra blocks in P-frames in VP8 Originally committed as revision 24429 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	b9a7186bf4	VP8: Don't store segment in macroblock struct anymore. Not necessary with the previous patch. Originally committed as revision 24427 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	c55e0d34ba	Convert VP8 macroblock structures to a ring buffer. Uses a slightly nonintuitive ring buffer size of (width+height*2) to simplify addressing logic. Also split out the segmentation map to a separate structure, necessary to implement the ring buffer. Originally committed as revision 24426 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	968570d65f	Calculate deblock strength per-MB instead of per-row. Gives better cache locality, since the VP8Macroblock structs are still in cache. Inspired by the way x264 does it. Originally committed as revision 24417 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	d1c58fce20	Avoid tracking i4x4 modes in P-frames in VP8. As in the previous commit, they aren't used for context selection, so it saves memory this way. Originally committed as revision 24416 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	158e062c95	Avoid useless fill_rectangle in P-frames in VP8. In VP8, i4x4 only uses contexts based on neighbors in I-frames. Originally committed as revision 24415 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	7bf254c41d	Optimize partition mv decoding in VP8 Originally committed as revision 24414 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	c0498b3031	Take shortcuts for mv0 case in VP8 MC. Avoid edge emulation -- it isn't needed if there isn't any subpel. Originally committed as revision 24413 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	702e8d3376	Much faster VP8 mv and mode prediction Originally committed as revision 24412 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	d229ae2b62	Convert vp56_mv to 16-bit. Saves nothing except a bit of memory/cache now, but will allow future optimizations. Originally committed as revision 24411 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	d864dee8ab	Add prefetching to VP8 decoder. ~5% faster overall, probably depends on CPU and resolution. Originally committed as revision 24410 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Ronald S. Bultje	003243c3c2	Fix and enable horizontal >=SSE2 mbedge loopfilter. Originally committed as revision 24409 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Loren Merritt	c7b1d9768c	relicense h264 deblock sse2 to lgpl Originally committed as revision 24408 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Loren Merritt	532e769701	sync yasm macros from x264 Originally committed as revision 24406 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Jason Garrett-Glaser	8731dbd890	Eliminate one instruction in VP8 dc_add_sse4 Originally committed as revision 24405 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago


12337 Commits (0eb1a3569e17b4afb1328df12f1b72b21025ef5e)