FFmpeg

Commit Graph

Author	SHA1	Message	Date
James Almer	52ec81c67d	x86/hevc_res_add: add missing guards to hevc_transform_add32_8_avx2 Should fix compilation with old Yasm/Nasm versions. Signed-off-by: James Almer <jamrial@gmail.com>	10 years ago
James Almer	c3d2426cca	x86/hevc_res_add: add ff_hevc_transform_add32_8_avx2 ~20% faster than AVX. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	10 years ago
James Darnley	46ef45ab59	lavc/x86/v210: give cpuflag to INIT macro This lets the cglobal macro automatically append a suffix to the function name. This means that INIT_XMM avx must be used rather than INIT_AVX. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Pascal Massimino	7a1d6ddd2c	xvid: Add C IDCT Thanks to Pascal Massimino and Michael Militzer for relicensing as LGPL. Signed-off-by: Diego Biurrun <diego@biurrun.de>	10 years ago
Diego Biurrun	95c0cec03a	idctdsp: Add global function pointers for {add|put}_pixels_clamped functions These function pointers already existed in the ARM code. Adding them globally allows calls to the function pointers to access arch-optimized versions of the functions transparently.	10 years ago
Reimar Döffinger	d9e2aceb7f	Add missing "const" all over the place. Only "./configure --enable-gpl" on x86 was tested. Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>	10 years ago
Diego Biurrun	8d27bf1cff	x86: xvid: K&R formatting cosmetics	10 years ago
Diego Biurrun	dcb7c868ec	cosmetics: Make naming scheme of Xvid IDCT consistent with other IDCTs	10 years ago
Diego Biurrun	1f156af427	x86: xvid_idct: Drop unused definitions	10 years ago
Christophe Gisquet	3e892b2bcd	x86: hevc_mc: split differently calls In some cases, 2 or 3 calls are performed to functions for unusual widths. Instead, perform 2 calls for different widths to split the workload. The 8+16 and 4+8 widths for respectively 8 and more than 8 bits can't be processed that way without modifications: some calls use unaligned buffers, and having branches to handle this was resulting in no micro-benchmark benefit. For block_w == 12 (around 1% of the pixels of the sequence): Before: 12758 decicycles in epel_uni, 4093 runs, 3 skips 19389 decicycles in qpel_uni, 8187 runs, 5 skips 22699 decicycles in epel_bi, 32743 runs, 25 skips 34736 decicycles in qpel_bi, 32733 runs, 35 skips After: 11929 decicycles in epel_uni, 4096 runs, 0 skips 18131 decicycles in qpel_uni, 8184 runs, 8 skips 20065 decicycles in epel_bi, 32750 runs, 18 skips 31458 decicycles in qpel_bi, 32753 runs, 15 skips Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Christophe Gisquet	38e2aa3759	x86: hevc_mc: correct unneeded use of SSE4 code Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Christophe Gisquet	2346f2b5db	x86: hevcdsp: use compilation-time-fixed constant The stride for some buffers is known. Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	dad7f15567	hevcdsp: remove more instances of compile-time-fixed parameters Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	d4f44b66d3	hevcdsp: remove compilation-time-fixed parameter The dststride parameter is always MAX_PB_SIZE. Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	fb1a98ec5b	x86: hevc_mc: assume 2nd source stride is 64 Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	54ca4dd43b	x86/hevc_res_add: refactor ff_hevc_transform_add{16,32}_8 * Reduced xmm register count to 7 (As such they are now enabled for x86_32). * Removed four movdqa (affects the sse2 version only). * pxor is now used to clear m0 only once. ~5% faster. Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	11 years ago
James Almer	76a99d467f	x86/hevc_res_add: add ff_hevc_transform_add{8,16,32}_8_avx ~15% faster than sse2 Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	11 years ago
James Almer	9f498f4e6f	x86/hevc_res_add: fix register count in hevc_transform_add{16,32}_10_avx2 Signed-off-by: James Almer <jamrial@gmail.com>	11 years ago
Pierre Edouard Lepere	a6af4bf64d	x86: hevc: adding transform_add Reviewed-by: James Almer <jamrial@gmail.com> Approved-by: Ronald S. Bultje Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	efd26bedec	build: Add explanatory comments to (optimization) blocks in the Makefiles	11 years ago
Diego Biurrun	835f798c7d	mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes	11 years ago
James Darnley	54a51d3840	lavc/flacenc: partially unroll loop in flac_enc_lpc_16 It now does 12 samples per iteration, up from 4. From 1.8 to 3.2 times faster again. 3.6 to 5.7 times faster overall. Runtime is reduced by a further 2 to 18%. Overall runtime reduced by 4 to 50%. Same conditions as before apply. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Darnley	0081a14e7d	lavc/flacenc: add sse4 version of the 16-bit lpc encoder From 1.8 to 2.4 times faster. Runtime is reduced by 2 to 39%. The speed-up generally increases with compression_level. This lpc encoder is not used with levels < 3 so it provides no speed-up in these cases. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	45bed0ab30	vp9/x86: fix bug in intra_pred_hd_32x32. Fixes mismatch in first keyframe in sample ffvp9_fails_where_libvpx.succeeds.webm from ticket 3849. There's still a second mismatch a few frames into the sample. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	c97870d1a1	x86/dca: remove unused header Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	e20ff251a6	x86/ttadsp: remove an unnecessary mova Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	d35b94fbab	avcodec: Rename xvidmmx IDCT to xvid The Xvid IDCT is not MMX-specific.	11 years ago
Diego Biurrun	84d173d3de	xvididct: Ensure that the scantable permutation is always set correctly This fixes cases where the scantable permutation would get overwritten by the general idctdsp initialization.	11 years ago
Christophe Gisquet	75837e9add	x86: sbrdsp/fft: reuse ps_neg constant Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	51dd80e751	x86: diracdsp: reuse constants Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	6622a6cff3	x86: dwt: better share constants Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	71db2d08b1	x86: better share ff_pw_2 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	4e128ab0b1	x86: vpx/h264/hevc/mpeg2: share constants Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	305f72aee7	avcodec: Change get_pixels() to ptrdiff_t linesize Found-by: ubitux Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	6786848585	hevc_deblock: change tc type The x86 asm expects int32_t so use that type. Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	de417982e8	x86/vp9lpf: use fewer instructions in SPLATB_MIX Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	e8c003edd2	x86: hevc_deblock: remove unnecessary masking The unpacks/shuffles later on makes it unnecessary. Before: 1508 decicycles in h, 2096759 runs, 393 skips 2512 decicycles in v, 2095422 runs, 1730 skips After: 1477 decicycles in h, 2096745 runs, 407 skips 2484 decicycles in v, 2095297 runs, 1855 skips Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	b7863c972c	x86/hevc_mc: use fewer instructions in hevc_put_hevc_{uni, bi}_w[24]_{8, 10, 12} Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	b1a44e6bf5	x86/hevc_mc: remove an unnecessary pxor Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	d0f56ca071	x86/hevc_deblock: improve 8bit transpose store macros Up to four instructions less depending on function and instruction set. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	a786c8259d	idct: Split off Xvid IDCT The Xvid IDCT is only required to decode some Xvid-encoded MPEG-4 files, so there is no point in having it as an unconditional part of idctdsp.	11 years ago
James Almer	62baf5b853	x86/hevc_deblock: use existing x86util transpose macro in chroma_{10, 12} Cosmetic change. No measurable difference in speed. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	a507623bad	x86: hevc_mc: fix register count usage A macro was using a fixed register, causing too many GPRs to be declared as used. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	73c4f63ba5	x86/hevc_deblock: add ff_hevc_[hv]_loop_filter_luma_{8, 10, 12}_avx ~5% faster than SSSE3 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	88ba821f23	x86/hevc_deblock: improve luma functions register allocation Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	c74b08c5c6	x86/hevc_deblock: remove some unnecessary instructions Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	4f91bb0ff0	x86/hevc_deblock: use psignw instead of pmullw where possible It's slightly faster Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	4f8cf0dc4e	x86: build: Restore ordering of OBJS lines	11 years ago
James Almer	664e9e4331	x86/hevc_deblock: load less data in hevc_h_loop_filter_luma_8 Reading 8 bytes is enough. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	f137876182	x86/hevc_idct: add a colon to labels This fixes a warning spam when using NASM Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago


1825 Commits (2762323c37511fbbc98b164c07620b9ebc59ec68)