Diego Biurrun
9c112a6158
x86: dsputil: Move avg_pixels8_mmx() out of rnd_template.c
...
The function is only instantiated once, so there is no point
in keeping it in a template file.
12 years ago
Diego Biurrun
9b3a04d306
x86: Move duplicated put_pixels{8|16}_mmx functions into their own file
12 years ago
Diego Biurrun
f2e9d44a57
x86: Drop unnecessary ff_ name prefixes from static functions
12 years ago
Diego Biurrun
643e433bf7
mpegaudiosp: More consistent names for ppc/x86 optimization files
12 years ago
Diego Biurrun
97c56ad796
x86: dsputil: Remove a set of pointless #ifs around function declarations
12 years ago
Diego Biurrun
85f2f82af6
x86: dsputil: cosmetics: Group ff_{avg|put}_pixels16_mmxext() declarations
12 years ago
Diego Biurrun
20784aa678
x86: hpeldsp: Remove unused macro definitions
12 years ago
Diego Biurrun
7c00e9d8ae
x86: ac3dsp: Remove 3dnow version of ff_ac3_extract_exponents
...
The function requires increasing the fuzz factor for the ac3/eac3 encode
tests and even so makes fate fail. It only provides a slight encoding
speedup for legacy CPUs that do not support SS2. Thus its benefit is not
worth the trouble it creates and fixing it would be a waste of time.
12 years ago
Martin Storsjö
74685f6783
x86: Rename dsputil_rnd_template.c to rnd_template.c
...
This makes it less confusing when this template is shared both by
dsputil and by hpeldsp.
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Michael Niedermayer
fc69033371
avcodec/x86/sbrdsp_init: disable using the noise code in x86_64 MSVC, Try #2
...
This should fix building with MSVC until someone can change the
code so it works with MSVC
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Martin Storsjö
486f76f029
x86: Get rid of duplication between *_rnd_template.c
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Martin Storsjö
6a8561dbd7
x86: Factorize duplicated inline assembly snippets
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Michael Niedermayer
7a617d6c17
avcodec/x86/sbrdsp_init: disable using the noise code in x86_64 MSVC
...
This should fix building with MSVC until someone can change the
code so it works with MSVC
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Diego Biurrun
c1ad70c3cb
x86: Move some conditional code around to avoid unused variable warnings
12 years ago
Diego Biurrun
b4ad7c54c8
x86: cavs: Refactor duplicate dspfunc macro
12 years ago
Diego Biurrun
78fa0bd0f7
x86: cavs: Put mmx-specific code into its own init function
...
Before, this code was labeled as mmxext and enabled both for the
3dnow and the mmxext case.
12 years ago
Diego Biurrun
311a592dfc
x86: Remove some duplicate function declarations
12 years ago
Martin Storsjö
b71a0507b0
x86: Remove unused inline asm instruction defines
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Ronald S. Bultje
8db00081a3
x86: hpeldsp: Move half-pel assembly from dsputil to hpeldsp
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Christophe Gisquet
76c7277385
x86: sbrdsp: implement SSE2 hf_apply_noise
...
233 to 105 cycles on Arrandale and Win64.
Replacing the multiplication by s_m[m] by a pand and a pxor with
appropriate vectors is slower. Unrolling is a 15 cycles win.
A SSE version was 4 cycles slower.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Ronald S. Bultje
015821229f
vp3: Use full transpose for all IDCTs
...
This way, the special IDCT permutations are no longer needed. This
is similar to how H264 does it, and removes the dsputil dependency
imposed by the scantable code.
Also remove the unused type == 0 cases from the plain C version
of the idct.
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Ronald S. Bultje
c46819f229
x86: Move constants to the only place where they are used
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Diego Biurrun
a3cb865310
x86: dsputil: Move some ifdefs to avoid unused variable warnings
12 years ago
Diego Biurrun
2004c7c8f7
x86: dsputil: cosmetics: Remove two pointless variable indirections
12 years ago
Diego Biurrun
c51a3a5bd9
x86: dsputil: Refactor some ff_{avg|put}_pixels function declarations
12 years ago
Diego Biurrun
e027032fc6
x86: dsputil: ff_h263_*_loop_filter declarations to a more suitable place
12 years ago
Diego Biurrun
a89c05500f
x86: h264qpel: int --> ptrdiff_t for some line_size parameters
12 years ago
Diego Biurrun
ac9362c5d9
Move misplaced file author information where it belongs
12 years ago
Ronald S. Bultje
b93b27edb0
dsputil: Make dsputil selectable
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Ronald S. Bultje
62844c3fd6
h264: Integrate clear_blocks calls with IDCT
...
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Christophe Gisquet
2383068cbf
x86: sbrdsp: implement SSE2 qmf_pre_shuffle
...
From 253 to 51 cycles on Arrandale and Win64.
44 cycles on SandyBridge.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Ronald S. Bultje
610b18e2e3
x86: qpel: Move fullpel and l2 functions to a separate file
...
This way, they can be shared between mpeg4qpel and h264qpel without
requiring either one to be compiled unconditionally.
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Christophe Gisquet
e2946e5c34
x86: sbrdsp: implement SSE qmf_deint_bfly
...
From 312 to 89/68 (sse/sse2) cycles on Arrandale and Win64.
Sandybridge: 68/47 cycles.
Having a loop counter is a 7 cycle gain.
Unrolling is another 7 cycle gain.
Working in reverse scan is another 6 cycles.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Christophe Gisquet
f4b0d12f5b
x86: sbrdsp: Implement SSE neg_odd_64
...
Timing on Arrandale:
C SSE
Win32: 57 44
Win64: 47 38
Unrolling and not storing mask both save some cycles.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Christophe Gisquet
37a9708391
x86: sbrdsp: implement SSE neg_odd_64
...
Timing on Arrandale:
C SSE
Win32: 57 44
Win64: 47 38
Unrolling and not storing mask both save some cycles.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Carl Eugen Hoyos
670bb1c979
Fix compilation with --enable-decoder=webp --disable-decoder=vp8
12 years ago
Diego Biurrun
b6649ab503
cosmetics: Remove unnecessary extern keywords from function declarations
12 years ago
Martin Storsjö
a2acadd058
x86: vc1dsp: Fix indentation
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Michael Niedermayer
9b9205e760
x86/dsputil.asm: make unaligned bswap actually work
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
ea7b96af96
avcodec/x86/dsputil_qns_template: use av_assert
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Janne Grunau
e5c2794a71
x86: consistently use unaligned movs in the unaligned bswap
...
Fixes fate errors in asv1, ffvhuff and huffyuv on x86_32.
12 years ago
Martin Storsjö
285ff14413
x86: Change a missed occurrance of int to ptrdiff_t for strides
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Martin Storsjö
352dbdb96c
x86: Remove win64 xmm clobbering wrappers for the now removed avcodec_encode_video function
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Michael Niedermayer
b3e9f266e8
x86/mpegvideo: switch to av_assert
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
cdbf8409ef
x86/h264_qpel: switch to av_assert
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Carl Eugen Hoyos
d98a5318fd
Fix compilation with --disable-mmx.
12 years ago
Michael Niedermayer
0f95534669
h264_qpel: fix another forgotten int stride
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
c3bb2f7296
dsputil_mmx: remove unused variables
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Ronald S. Bultje
3ced55d51c
Move x86 half-pel assembly from dsputil to hpeldsp.
12 years ago
Ronald S. Bultje
d1293512cf
vp3: use hpeldsp instead of dsputil for half-pel functions.
...
This makes vp3 independent of dsputil.
12 years ago