Pierre Edouard Lepere
1a880b2fb8
hevc: SSE2 and SSSE3 loop filters
...
Additional contributions by James Almer <jamrial@gmail.com>,
Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and
Anton Khirnov <anton@khirnov.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
10 years ago
Christophe Gisquet
036f11bdb5
x86: hevc_mc: replace simple leas by adds
...
lea is detrimental for those simple cases. The change has no
significant overall impact, though.
Before:
15017 decicycles in q, 1016152 runs, 32424 skips
15382 decicycles in q_bi, 1013673 runs, 34903 skips
3713 decicycles in e, 2074534 runs, 22618 skips
3901 decicycles in e_bi, 2065509 runs, 31643 skips
7852 decicycles in q_uni, 520165 runs, 4123 skips
2398 decicycles in e_uni, 1043339 runs, 5237 skips
After:
14898 decicycles in q, 1016295 runs, 32281 skips
15119 decicycles in q_bi, 1015392 runs, 33184 skips
3682 decicycles in e, 2073224 runs, 23928 skips
3720 decicycles in e_bi, 2065043 runs, 32109 skips
7643 decicycles in q_uni, 520280 runs, 4008 skips
2363 decicycles in e_uni, 1043780 runs, 4796 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Mickaël Raulet
bd0f2d316f
x86/hevc: add 12bits support for MC
...
cherry picked from commit 3fcb7a4595a6f40100a22110a5805e3b7510c0fd
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Mickaël Raulet
7df98d8c4d
x86/hevc: remove unused constant in deblocking filter
...
cherry picked from commit a3f7282eaa6f1ab0524fb966c6eade50c3025f99
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Mickaël Raulet
7bdcf5c934
x86/hevc: add 12bits support for deblocking filter
...
cherry picked from commit 97d46afe320c7d61d7b9525e5f5588355cde4bb0
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Diego Biurrun
7fb993d338
qpeldsp: Mark source pointer in qpel_mc_func function pointer const
10 years ago
Christophe Gisquet
670b7f203a
x86: hevcdsp: align
...
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Carl Eugen Hoyos
c75fdee747
avcodec/x86/hevc_deblock: Fix compilation with nasm.
10 years ago
Michael Niedermayer
ca6b33b8bd
avcodec/x86/hevcdsp_init: Fix "warning: assignment from incompatible pointer type"
10 years ago
Anton Khirnov
d7e162d46b
hevcdsp: remove an unneeded variable in the loop filter
...
beta0 and beta1 will always be the same within a CU
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>
cherry picked from commit 4a23d824741a289c7d2d2f2871d1e2621b63fa1b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Anton Khirnov
ae2f048fd7
avcodec/x86/hevc_deblock: cosmetics
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Anton Khirnov
b435043abb
hevc: cleanups in SSE2 and SSSE3 loop filters, use fewer instructions
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Anton Khirnov
e8581b17a8
avcodec/x86/hevc_deblock: use test instead of cmp 0
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Anton Khirnov
dc69247de4
avcodec/x86/hevc_deblock: use of paddw instead of psllw
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Anton Khirnov
500a0394d5
avcodec/x86/hevc_deblock: add %ifs to avoid "do nothing instructions"
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Anton Khirnov
7a4cf67117
hevc: cleaning up SSE2 and SSSE3 deblocking filters
...
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>
cherry picked from commit b432041d7d1eca38831590f13b4e5baffff8186f
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Diego Biurrun
81b9bf3192
dct-test: Move arch-specific bits into arch-specific subdirectories
10 years ago
Diego Biurrun
5dcc201505
simple_idct: Move x86-specific declarations to a header in the x86 directory
10 years ago
Diego Biurrun
85cabb8d00
fdct: Move x86-specific declarations to a header in the x86 directory
10 years ago
Diego Biurrun
9e0b29911f
x86: dnxhdenc: Eliminate some unnecessary ifdefs
10 years ago
Diego Biurrun
8b0dd4942a
idctdsp: prettyprinting cosmetics
10 years ago
Diego Biurrun
b4987f7219
idct: Convert IDCT permutation #defines to an enum
...
Also rename the enum values to be consistent with other DCT permutations.
10 years ago
Diego Biurrun
2d60444331
dsputil: Split motion estimation compare bits off into their own context
10 years ago
Diego Biurrun
c23ce454b3
x86: dsputil: Coalesce all init files
...
This makes the init files match the structure of the dsputil split.
10 years ago
Diego Biurrun
acf91215c7
x86: dsputil: Avoid pointless CONFIG_ENCODERS indirection
...
The remaining dsputil bits are encoding-specific anyway.
10 years ago
James Almer
276bef5340
x86/hevc_deblock: add ff_hevc_[hv]_loop_filter_luma_{8, 10}_sse2
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Kieran Kunhya <kierank@obe.tv>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
123649dd19
x86/dsputilenc: remove some empty if statements
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Diego Biurrun
1173320249
dsputil: Drop unused bit_depth parameter from all init functions
11 years ago
Diego Biurrun
f46bb608d9
dsputil: Split off pixel block routines into their own context
11 years ago
Diego Biurrun
a9aee08d90
dsputil: Split off FDCT bits into their own context
11 years ago
Diego Biurrun
3c650efb81
dsputil: Move draw_edges() to mpegvideoencdsp
11 years ago
Diego Biurrun
c166148409
dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc
11 years ago
Diego Biurrun
8d686ca59d
dsputil: Split off *_8x8basis to a separate context
11 years ago
James Almer
195f7bd23d
x86/svq1enc: use unaligned mov on SSE2
...
Might fix fate failures on some systems
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
dad31083ae
x86/svq1enc: port ssd_int8_vs_int16 to yasm
...
Also add an SSE2 version
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Diego Biurrun
b0de1c7663
x86: build: Only compile FDCT code if MMX is enabled
...
All other files containing purely inline assembly are treated the same way.
11 years ago
Diego Biurrun
12f129e545
x86: Unconditionally compile blockdsp and svq1enc init files
...
This avoids a link failure with MMX disabled as the init functions
are referenced unconditionally.
11 years ago
Diego Biurrun
009331303a
x86: huffyuvdsp: Move inline assembly to init file
...
This avoids a link failure with MMX disabled as now code and
initialization are compiled under the same condition.
11 years ago
Diego Biurrun
391ecc961c
x86: mpegvideoenc: Change SIMD optimization name suffixes to lowercase
11 years ago
James Almer
a441a2437b
x86: rename dsputil.asm to idctdsp.asm
...
Its only function is no longer part of dsputil.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Diego Biurrun
79793f8337
Update Fiona's name in copyright statements.
11 years ago
Diego Biurrun
e3fcb14347
dsputil: Split off IDCT bits into their own context
11 years ago
Michael Niedermayer
5bca5f87d1
Revert "x86/videodsp: add emulated_edge_mc_mmxext"
...
The commit causes minor out of array reads and was mainly intended for
future optimizations which turned out not to be measurably faster.
By itself it was just 1 cpu cycle faster
Approved-by: jamrial
This reverts commit 057d2704e7.
11 years ago
Diego Biurrun
d2869aea04
dsputil: Move MMX/SSE2-optimized IDCT bits to the x86 subdirectory
11 years ago
James Almer
057d2704e7
x86/videodsp: add emulated_edge_mc_mmxext
...
This also changes hfix8_mmx and above to use mmx regs instead of
gprs, and makes emulated_edge_mc_sse and emulated_edge_mc_sse2 use
mmxext hfix and hvar functions instead of mmx where possible.
This is mostly in preparation for an ssse3 version.
Signed-off-by: James Almer <jamrial@gmail.com>
code is approximately 1 cpu cycle faster
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Diego Biurrun
5ab03e41e5
x86: h264dsp: Fix link failure with optimizations disabled
...
With optimizations disabled, compilers have trouble doing dead code
elimination on 'if (foo && 0)' expressions, while 'if (0 && foo)'
still works, so use the latter to avoid problems.
Bug-Id: 707
11 years ago
Michael Niedermayer
1ace0ca60f
avcodec/x86/hevc_idct: fix function name in comment
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
plepere
9ba6b17add
avcodec/x86/hevc_idct: fix number of sse registers
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
plepere
942e22c651
avcodec/x86/hevc: add avx2 dc idct
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
eab2509f8c
avcodec/x86/h264_qpel_10bit: locally define pb_0
...
Somehow old llvm-gcc manages to ignore the alignment from ff_pb_0, causing a crash on FreeBSD
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago