FFmpeg

Commit Graph

Author	SHA1	Message	Date
Michael Niedermayer	b3ab281027	avcodec/x86/cabac: workaround llvm 4.2.1 bug x86_64 is affected by this too Fixes Ticket2156 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Michael Niedermayer	66bdc58550	get_cabac_inline_x86: workaround clang bug with disabled optimizations gcc produces binary identical output relative to before this change Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Mans Rullgard	8ec0204ee4	x86: cabac: allow building with suncc This fixes two issues preventing suncc from building this code. The undocumented 'a' operand modifier, causing gcc to omit a $ in front of immediate operands (as required in addresses), is not supported by suncc. Luckily, the also undocumented 'c' modifier has the same effect and is supported. On some asm statements with a large number of operands, suncc for no obvious reason fails to correctly substitute some of the operands. Fortunately, some of the operands in these statements are plain numbers which can be inserted directly into the code block instead of passed as operands. With these changes, the code builds correctly with both gcc and suncc. Signed-off-by: Mans Rullgard <mans@mansr.com>	12 years ago
Mans Rullgard	c318626ce2	x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h This puts x86-specific things in the x86/ subdirectory where they belong. Signed-off-by: Mans Rullgard <mans@mansr.com>	12 years ago
Ronald S. Bultje	8123e0901f	x86: place some inline asm under #if HAVE_INLINE_ASM Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Roland Scheidegger	82c71913e4	h264: new assembly version of get_cabac for x86_64 with PIC This adds a hand-optimized assembly version for get_cabac much like the existing one, but it works if the table offsets are RIP-relative. Compared to the non-RIP-relative version this adds 2 lea instructions and it needs one extra register. There is a surprisingly large performance improvement over the c version (more so than the generated assembly seems to suggest) just in get_cabac, I measured roughly 40% faster for get_cabac on a K8. However, overall the difference is not that big, I measured roughly 5% on a test clip on a K8 and a Core2. Hopefully it still compiles on x86 32bit... Now that only one table is used, there's some chance even darwin as compiles this (apparently the label arithmetic used previously doesn't work if it involves symbols defined in a different file, thanks to Ronald S. Bultje for helping me with this). Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Roland Scheidegger	7f668cd2b5	h264: use one table instead of several for cabac functions The reason is this is easier for PIC code (in particular on darwin...). Keep the old names as pointers (static in cabac_functions.h so gcc knows these are just immediate offsets) so the c code can nicely stay the same (alternatively could use offsets directly in the functions needing the tables). This should produce the same code as before with non-pic and better code (confirmed) with pic. The assembly uses the new table but still won't work for PIC case. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Roland Scheidegger	5520df6a8f	h264: (trivial) remove unneeded macro argument in x86/cabac.h Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Roland Scheidegger	9b9df1cdff	h264: new assembly version of get_cabac for x86_64 with PIC This adds a hand-optimized assembly version for get_cabac much like the existing one, but it works if the table offsets are RIP-relative. Compared to the non-RIP-relative version this adds 2 lea instructions and it needs one extra register. get_cabac() gets about 40% faster, for an overall speedup of about 5%. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Roland Scheidegger	14e9ffc1e4	h264: use one table instead of several for cabac functions The reason is this is easier for PIC code (in particular on darwin...). Keep the old names as pointers (static in cabac_functions.h so gcc knows these are just immediate offsets) so the c code can nicely stay the same (alternatively could use offsets directly in the functions needing the tables). This should produce the same code as before with non-pic and better code (confirmed) with pic. The assembly uses the new table but still won't work for PIC case. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Roland Scheidegger	444f47b55c	h264: (trivial) remove unneeded macro argument in x86/cabac.h Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Michael Niedermayer	9849515214	Revert "h264: assembly version of get_cabac for x86_64 with PIC (v4)" This broke compilation on darwin, revert until a better solution is found. This reverts commit `a812b599b5`.	13 years ago
Roland Scheidegger	a812b599b5	h264: assembly version of get_cabac for x86_64 with PIC (v4) This adds a hand-optimized assembly version for get_cabac much like the existing one, but it works if the table offsets are RIP-relative. Compared to the non-RIP-relative version this adds 2 lea instructions and it needs one extra register. There is a surprisingly large performance improvement over the c version (more so than the generated assembly seems to suggest) just in get_cabac, I measured roughly 40% faster for get_cabac on a K8. However, overall the difference is not that big, I measured roughly 5% on a test clip on a K8 and a Core2. Hopefully it still compiles on x86 32bit... v2: incorporated feedback from Loren Merritt to avoid rip-relative movs for every table, and got rid of unnecessary @GOTPCREL. v3: apply similar fixes to the decode_significance functions, and use same macro arguments for non-pic case. v4: prettify inline asm arguments, add a non-fast-cmov version (as I expect the c code to be faster otherwise since both cmov and sbb suck hard on a Prescott, even can't construct the mask with a 64bit shift as that's just as terrible - it's quite difficult to find usable instructions on that chip...). This is tested to work but not on a P4, in theory it _should_ be fast there. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Ronald S. Bultje	a940198130	cabac: add overread protection to BRANCHLESS_GET_CABAC(). Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind	13 years ago
Ronald S. Bultje	16f6e83f74	cabac: remove unused argument from BRANCHLESS_GET_CABAC_UPDATE().	13 years ago
Ronald S. Bultje	951014e5bb	cabac: use struct+offset instead of memory operand in BRANCHLESS_GET_CABAC().	13 years ago
Ronald S. Bultje	a0bdcb019e	h264: add overread protection to get_cabac_bypass_sign_x86().	13 years ago
Ronald S. Bultje	95bfa4ead7	h264: reindent get_cabac_bypass_sign_x86().	13 years ago
Ronald S. Bultje	db025929f2	h264: use struct offsets in get_cabac_bypass_sign_x86().	13 years ago
Michael Niedermayer	5387f9917f	cabac: Try to disable problematic ASM for gcc-llvm 4.2.1 This should fix compilation with gcc-llvm (see darwin fate box) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	f247f4cf47	cabac: 3rd try at working around a compiler bug in clang. Switch to a broader detection of versions. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	444632eae6	cabac: Disable get_cabac_inline_x86() for clang 2.9 on x86_32 This should finally fix the compilation issue on darwin Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	2138a89e71	Revert "Revert commit 599b4c6efddaed33b1667c386b34b07729ba732b" This reverts commit `c4f237a981`. This didn't fix compilation on darwin with current clang.	13 years ago
Michael Niedermayer	c4f237a981	Revert commit `599b4c6efd` Author: Mans Rullgard <mans@mansr.com> Date: Sun Dec 11 21:41:59 2011 +0000 x86: cabac: replace explicit memory references with "m" operands This replaces the explicit offset(reg) memory references with "m" operands for the same locations. As a result, one fewer register operand is needed for these inline asm statements. This change appears to have broken compilation on darwin, and subsequent fixes by martin (which did not fix compilation) removed the register advantage, thus this change seems not a good idea to keep. See: http://fate.ffmpeg.org/log.cgi?time=20120103122446&log=compile&slot=i386-darwin-llvm-gcc-4.2.1 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Martin Storsjö	8349dbfe46	x86: Require 7 registers for the cabac asm The change in `599b4c6ef` didn't turn out to work properly on i386 on OS X, where it broke building with PIC enabled. Signed-off-by: Martin Storsjö <martin@martin.st> (cherry picked from commit `f1dba9e498`) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Martin Storsjö	f1dba9e498	x86: Require 7 registers for the cabac asm The change in `599b4c6ef` didn't turn out to work properly on i386 on OS X, where it broke building with PIC enabled. Signed-off-by: Martin Storsjö <martin@martin.st>	13 years ago
Mans Rullgard	599b4c6efd	x86: cabac: replace explicit memory references with "m" operands This replaces the explicit offset(reg) memory references with "m" operands for the same locations. As a result, one fewer register operand is needed for these inline asm statements. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Diego Biurrun	276b995d85	x86: drop pointless ARCH_X86 #ifdef from files in x86 subdirectory	13 years ago
Carl Eugen Hoyos	324b8adca4	Fix a possible miscompilation of cabac with old (broken) compilers.	13 years ago
Mans Rullgard	3ad1684126	x86: cabac: add operand size suffixes missing from `6c32576` This fixes build with clang. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	f5f004bc5a	x86: cabac: don't load/store context values in asm Inspection of compiled code shows gcc handles these fine on its own. Benchmarking also shows no measurable speed difference. Removing the remaining cases in get_cabac_bypass_sign_x86() does cause more substantial changes to the compiled code with uncertain impact. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Jason Garrett-Glaser	6c32576548	H.264: optimize CABAC x86 asm for Atom	13 years ago
Mans Rullgard	c5ee740745	x86: cabac: fix register constraints for 32-bit mode Some operands need to be accessed in byte mode, which restricts the available registers in 32-bit mode. Using the 'q' constraint selects a suitable register. Signed-off-by: Mans Rullgard <mans@mansr.com>	14 years ago
Mans Rullgard	2143d69bdd	cabac: move x86 asm to libavcodec/x86/cabac.h Signed-off-by: Mans Rullgard <mans@mansr.com>	14 years ago

37 Commits (fb0df5c113e0dff2311201dc828db1648174972b)