Justin Ruggles
b57e38f52c
ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm
...
Adds a wrapper function for downmixing which detects channel count changes
and updates the selected downmix function accordingly.
Simplification and porting to current x86inc infrastructure by Diego Biurrun.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
8 years ago
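The wrapper described in b57e38f52c can be pictured roughly as follows. This is a minimal sketch, not the actual ac3dsp code: the names DownmixCtx, downmix_generic, downmix_stereo_to_mono and downmix are hypothetical, and the matrix[out][in] layout is an assumption based on the description in 43717469f9.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*downmix_fn)(float **samples, float **matrix,
                           int out_ch, int in_ch, int len);

/* Hypothetical context: caches the channel counts the current
 * function pointer was selected for. */
typedef struct DownmixCtx {
    int out_ch, in_ch;
    downmix_fn fn;
} DownmixCtx;

/* Generic in-place downmix to at most two output channels.
 * Assumed layout: matrix[out][in]. */
static void downmix_generic(float **samples, float **matrix,
                            int out_ch, int in_ch, int len)
{
    for (int i = 0; i < len; i++) {
        float acc[2] = { 0.0f, 0.0f };
        for (int o = 0; o < out_ch; o++)
            for (int j = 0; j < in_ch; j++)
                acc[o] += samples[j][i] * matrix[o][j];
        for (int o = 0; o < out_ch; o++)
            samples[o][i] = acc[o];
    }
}

/* Specialised case standing in for a SIMD version. */
static void downmix_stereo_to_mono(float **samples, float **matrix,
                                   int out_ch, int in_ch, int len)
{
    (void)out_ch; (void)in_ch;
    for (int i = 0; i < len; i++)
        samples[0][i] = samples[0][i] * matrix[0][0] +
                        samples[1][i] * matrix[0][1];
}

/* Wrapper: re-select the implementation only when the channel
 * counts differ from the cached ones, then dispatch. */
static void downmix(DownmixCtx *c, float **samples, float **matrix,
                    int out_ch, int in_ch, int len)
{
    assert(out_ch >= 1 && out_ch <= 2 && in_ch >= out_ch);
    if (c->fn == NULL || c->out_ch != out_ch || c->in_ch != in_ch) {
        c->out_ch = out_ch;
        c->in_ch  = in_ch;
        c->fn     = (in_ch == 2 && out_ch == 1) ? downmix_stereo_to_mono
                                                : downmix_generic;
    }
    c->fn(samples, matrix, out_ch, in_ch, len);
}
```

Caching the counts keeps the per-call cost to two integer comparisons when the stream layout is stable.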
Justin Ruggles
43717469f9
ac3dsp: Reverse matrix in/out order in downmix()
...
Also use (float **) instead of (float (*)[2]). This matches the matrix
layout in libavresample so we can reuse assembly code between the two.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
8 years ago
Hendrik Leppkes
8d1267932c
x86/h264_weight: use appropriate register size for weight parameters
...
This fixes decoding corruption on 64 bit windows.
Signed-off-by: Martin Storsjö <martin@martin.st>
8 years ago
Diego Biurrun
2caa93b813
mpegaudiodsp: Change type of array stride parameters to ptrdiff_t
...
This avoids SIMD-optimized functions having to sign-extend their
stride argument manually to be able to do pointer arithmetic.
8 years ago
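To illustrate the point of 2caa93b813 and the similar ptrdiff_t commits in this list: a stride declared as ptrdiff_t already has pointer width, so it can be added to a pointer directly, including when it is negative. The helper below is a hypothetical sketch, not a function from the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum one column of a 2-D byte array by
 * advancing `stride` bytes per row. */
static int sum_column(const uint8_t *src, ptrdiff_t stride, int rows)
{
    int sum = 0;
    assert(rows >= 0);
    for (int y = 0; y < rows; y++) {
        sum += *src;
        /* Only advance between rows, so the pointer never leaves
         * the buffer, even for a negative (bottom-up) stride. */
        if (y + 1 < rows)
            src += stride;
    }
    return sum;
}
```

With an int stride on a 64-bit target, hand-written assembly would first have to sign-extend the 32-bit argument before using it in address computations, which is the step these commits remove.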
Diego Biurrun
e4a94d8b36
h264chroma: Change type of stride parameters to ptrdiff_t
...
This avoids SIMD-optimized functions having to sign-extend their
stride argument manually to be able to do pointer arithmetic.
8 years ago
Diego Biurrun
2ec9fa5ec6
idct: Change type of array stride parameters to ptrdiff_t
...
ptrdiff_t is the correct type for array strides and similar.
8 years ago
Diego Biurrun
009adfd4fb
x86: fpel: Remove unnecessary sign extend
8 years ago
Anton Khirnov
de2ae3c1fa
lavc: add clobber tests for the new encoding/decoding API
8 years ago
Anton Khirnov
12004a9a7f
audiodsp/x86: yasmify vector_clipf_sse
8 years ago
Anton Khirnov
683da86aab
audiodsp: reorder arguments for vector_clipf
...
This will make the x86 asm simpler.
ARM conversion by Martin Storsjö <martin@martin.st> and Janne Grunau
<janne-libav@jannau.net>
8 years ago
Anton Khirnov
eea9857bfd
blockdsp: drop the high_bit_depth parameter
...
It has no effect, since the code is supposed to operate the same way for
any bit depth.
8 years ago
Anton Khirnov
75d98e30af
audiodsp/x86: clear the high bits of the order parameter on 64bit
...
Also change shl to add, since it can be faster on some CPUs.
CC: libav-stable@libav.org
8 years ago
Anton Khirnov
1d6c76e11f
audiodsp/x86: fix ff_vector_clip_int32_sse2
...
This version, which is the only one doing two processing cycles per loop
iteration, computes the load/store indices incorrectly for the second
cycle.
CC: libav-stable@libav.org
8 years ago
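The kind of bug fixed in 1d6c76e11f is easier to see in scalar form. The following is a hedged C sketch of a clip loop that handles two elements per iteration. It is not the SSE2 code or its fix, and the names clip_int32 and clip_int32_unrolled are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int32_t clip_int32(int32_t v, int32_t min, int32_t max)
{
    return v < min ? min : (v > max ? max : v);
}

/* Clip `len` values from src into dst, two per loop iteration. */
static void clip_int32_unrolled(int32_t *dst, const int32_t *src,
                                int32_t min, int32_t max, size_t len)
{
    size_t i = 0;
    assert(min <= max);
    for (; i + 1 < len; i += 2) {
        dst[i]     = clip_int32(src[i],     min, max);
        /* Second cycle: its load and store must both use i + 1. */
        dst[i + 1] = clip_int32(src[i + 1], min, max);
    }
    if (i < len) /* odd tail element */
        dst[i] = clip_int32(src[i], min, max);
}
```

In an unrolled loop, the second cycle's load and store offsets both have to advance by one full processing step relative to the first cycle. A mismatch between them silently corrupts part of the output.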
Diego Biurrun
de452e5037
pixblockdsp: Change type of stride parameters to ptrdiff_t
...
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Also adjust parameter names to be "stride" everywhere.
9 years ago
Diego Biurrun
721d57e608
vp56: Separate VP5 and VP6 dsp initialization
...
VP5 has no arch-specific optimizations (nor will it get any in the
future), so it makes no sense to try to share dsp init code with VP6.
9 years ago
Diego Biurrun
3fd22538bc
prores: Change type of stride parameters to ptrdiff_t
...
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Also adjust parameter names to be "linesize" everywhere.
9 years ago
Diego Biurrun
f81be06cf6
cavs: Change type of stride parameters to ptrdiff_t
...
ptrdiff_t is the correct type for array strides and similar.
9 years ago
Diego Biurrun
802727b538
vp8: Update some assembly comments left unchanged in bd66f073fe
9 years ago
Diego Biurrun
d9d26a3674
vp56: Change type of stride parameters to ptrdiff_t
...
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
9 years ago
Diego Biurrun
6892df9294
vp3: Change type of stride parameters to ptrdiff_t
...
This avoids SIMD-optimized functions having to sign-extend their
stride argument manually to be able to do pointer arithmetic.
Also adjust parameter names to be "stride" everywhere.
9 years ago
Diego Biurrun
e2b9993558
simple_idct: x86: Drop disabled IDCT implementation
...
This gem has been disabled since 2001.
9 years ago
Ronald S. Bultje
9790b44a89
vp9mc/x86: sse2 MC assembly.
...
Also a slight change to the ssse3 code, which prevents a theoretical
overflow in the sharp filter.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
9 years ago
James Almer
67922b4ee4
vp9mc/x86: add AVX and AVX2 MC
...
Roughly 25% faster MC than ssse3 for blocksizes 32 and 64.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
9 years ago
Clément Bœsch
3cda179f18
vp9mc/x86: rename ff_* to ff_vp9_*
...
Signed-off-by: Anton Khirnov <anton@khirnov.net>
9 years ago
James Almer
8be8444d01
vp9mc/x86: rename ff_avg[48]_sse to ff_avg[48]_mmxext
...
pavgb is an SSE integer instruction, so the mmxext flag is enough.
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
9 years ago
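For reference, the operation that pavgb applies to each pair of unsigned bytes is a rounding average. A scalar sketch follows; the function name is hypothetical.

```c
#include <stdint.h>

/* Rounding average of two unsigned bytes: (a + b + 1) >> 1.
 * The operands are promoted to int, so the sum cannot overflow. */
static uint8_t avg_round_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}
```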
Clément Bœsch
6ab642d69d
vp9mc/x86: simplify a few inits.
...
Signed-off-by: Anton Khirnov <anton@khirnov.net>
9 years ago
Ronald S. Bultje
3a09494939
vp9mc/x86: add 16px functions (64bit only).
...
Signed-off-by: Anton Khirnov <anton@khirnov.net>
9 years ago
Anton Khirnov
89466de4ae
vp9/x86: rename vp9dsp to vp9mc
...
It only contains the MC SIMD, other SIMD will go into different files.
9 years ago
Christophe Gisquet
3c504bc359
x86: deduplicate some constants
...
Signed-off-by: Anton Khirnov <anton@khirnov.net>
9 years ago
Diego Biurrun
d06dfaa5cb
x86: huffyuv: Use EXTERNAL_SSSE3_FAST convenience macro where appropriate
9 years ago
Diego Biurrun
4efab89332
x86: Use *_FAST/*_SLOW CPU feature detection macros where appropriate
9 years ago
Diego Biurrun
0a39c9ac0b
x86: hpeldsp: Don't check for bitexact flag when initializing VP3-specific code
...
That code is only ever initialized with that flag set.
9 years ago
Diego Biurrun
95c1df929b
x86: hpeldsp: Drop unused function parameters
9 years ago
Diego Biurrun
c3e83ad3b7
x86: hpeldsp: Use EXTERNAL_SSE2_FAST where appropriate
9 years ago
Diego Biurrun
1dfc3cf89d
x86: hpeldsp: Split off VP3-specific bits into a separate file
9 years ago
James Almer
fca3c3b619
hevc: Add AVX2 DC IDCT
...
Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>.
Integrated into Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
9 years ago
Clément Bœsch
4a081f224e
libavcodec: fix constness in clobber test avcodec_open2() wrappers
...
Signed-off-by: Martin Storsjö <martin@martin.st>
9 years ago
Anton Khirnov
9df889a5f1
h264: rename h264.[ch] to h264dec.[ch]
...
This is more consistent with the naming of other decoders.
9 years ago
Martin Storsjö
f1a9eee41c
x86: Add missing movsxd for the int stride parameter
...
Signed-off-by: Martin Storsjö <martin@martin.st>
9 years ago
Diego Biurrun
1e9c5bf4c1
asm: FF_-prefix internal macros used in inline assembly
...
These macros conflict with system macros on Solaris, producing
truckloads of warnings about macro redefinition.
9 years ago
Diego Biurrun
dc40a70c57
Drop unnecessary libavutil/x86/asm.h #includes
9 years ago
Diego Biurrun
a6a750c7ef
tests: Move all test programs to a subdirectory
9 years ago
Vittorio Giovara
41ed7ab45f
cosmetics: Fix spelling mistakes
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
9 years ago
Diego Biurrun
01621202aa
build: miscellaneous cosmetics
...
Restore alphabetical order in lists, break overly long lines, do some
prettyprinting, add some explanatory section comments, group parts
together that belong together logically.
9 years ago
Diego Biurrun
1a094af638
fft: Split MDCT bits off from FFT
9 years ago
Diego Biurrun
73ff983e8d
fft: x86: cosmetics: Drop silly comments, add comment, whitespace
9 years ago
Diego Biurrun
257b30af8e
x86: hevc: Fix linking with both yasm and optimizations disabled
...
Some optimized functions reference optimized symbols, so the functions
must be explicitly disabled when those symbols are unavailable.
9 years ago
Diego Biurrun
15a24614ae
build: Add vc1dsp component for more fine-grained dependencies
9 years ago
Luca Barbato
e280fe1329
v210: Use separate sample_factors
...
The 10-bit and 8-bit functions can now be implemented to process
a different number of samples.
While at it, simplify the code a little.
9 years ago
James Darnley
15ec7aa417
v210: Add avx2 version of the 10-bit line encoder
...
Around 25% faster than the ssse3 version.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
9 years ago