Henrik Gramner
a344e5d094
x86: bswapdsp: Don't treat 32-bit integers as 64-bit
...
The upper halves are not guaranteed to be zero in x86-64.
Also use `test` instead of `and` when the result isn't used for anything other
than as a branch condition; this allows some register moves to be eliminated.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
10 years ago
Vittorio Giovara
d42191c78b
configure: Factor out vp8dsp module
10 years ago
Vittorio Giovara
5cb4bdb2a0
configure: Factor out rv34dsp module
10 years ago
Michael Niedermayer
b8c438e762
videodsp: assert that linesize is larger than width
...
Suggested-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Andreas Cadhalpun
28efeb6502
doc: avoid incorrect phrase 'allows to'
...
Also fix typo found by Lou Logan:
Sacrifying -> Sacrificing
Reviewed-by: Lou Logan <lou@lrcd.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
10 years ago
James Almer
9f815bc2c2
avcodec/jpeg2000dsp: add ff_rct_int_{sse2,avx2}
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
James Almer
7912a6830d
avcodec/jpeg2000dsp: add ff_ict_float_{sse,avx}
...
Original intrinsics version by Nicolas Bertrand.
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
Vittorio Giovara
b7a4127a45
h264_qpel: Use the correct header
10 years ago
Michael Niedermayer
5e87080f2c
h264_weight: Fix SSSE3 biweight code with weights of 128
...
CC: libav-stable@libav.org
Sample-Id: test_bref.mp4
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
10 years ago
Michael Niedermayer
e100966575
avcodec/x86/h264_weight: handle weight1=128
...
Fix ticket4596
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
c16e99e3b3
x86: check for AV_CPU_FLAG_AVXSLOW where useful
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
d68c05380c
x86: check for AV_CPU_FLAG_AVXSLOW where useful
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
10 years ago
Michael Niedermayer
e4610300de
x86: cavs: Remove an unneeded scratch buffer
...
Simplifies the code and makes it build on certain compilers
running out of registers on x86.
CC: libav-stable@libav.org
Reported-By: mudler
10 years ago
Timothy Gu
2b388e6dde
Revert "Move struc FFTContext below SECTION_RODATA"
...
This reverts commit 599888a480.
The commit does not silence the warning on ELF-based systems, and will be
fixed in the subsequent commit.
Conflicts:
libavcodec/x86/fft_mmx.asm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Vittorio Giovara
848e86f74d
mpegvideo: Drop flags and flags2
...
They are just duplicates of AVCodecContext members so use those instead.
10 years ago
Carl Eugen Hoyos
e609cfd697
lavc/flac: Fix encoding and decoding with high lpc.
...
Based on an analysis by trac user lvqcl.
Fixes ticket #4421, reported by Chase Walker.
10 years ago
Ronald S. Bultje
d32d0593f1
vp9: disable more pmulhrsw optimizations in idct16/32.
...
For idct16, only when called from an adst16x16 variant, so impact is
minor. For idct32, for all, so relatively major impact.
10 years ago
Ronald S. Bultje
96d30c3495
vp9: disable all pmulhrsw in 8/16 iadst x86 optimizations.
...
They all overflow in various samples that are considered valid input.
10 years ago
Michael Niedermayer
cc77bb09e4
avcodec/x86/vp9dsp_init: Fix mix of declaration and statement
...
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Ronald S. Bultje
b224b165cb
vp9: add keyframe profile 2/3 support.
10 years ago
Michael Niedermayer
6ef3426d90
avcodec/x86/deinterlace: use INIT_MMX like other asm code does too
10 years ago
Michael Niedermayer
dfc0708e23
avcodec/x86/dct-test: Use uint8_t for idct_simple_mmx_perm
...
The table contains no element outside the unsigned 8-bit range.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
270e647adc
avcodec/x86/dct-test: Make static table const
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Ronald S. Bultje
3de13d5212
vp9: remove another optimization branch in iadst16 which causes overflows.
...
See sample vp90-2-14-resize-fp-tiles-16-8.webm from the vp9 test vector
set to reproduce the issue.
10 years ago
Ronald S. Bultje
d02d04a18f
vp9: remove one optimization branch in iadst16 which causes overflows.
...
See sample vp90-2-14-resize-fp-tiles-16-8-4-2-1.webm from the vp9 test
vector set which reproduces the issue. This probably costs a few cycles,
but I don't think there's an easy way to work around that.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
0245abc7c1
avcodec/x86/hpeldsp_init: Put CONFIG_* first in if()
...
This is more consistent and may fix a build failure
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
6b940b8c99
x86/xvididct: add some yasm guards
...
Should fix compilation on compilers with less-than-ideal dead code elimination
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
James Almer
b0fea4ad7e
x86/xvididct: remove obsolete function prototypes
...
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
Luca Barbato
48aef27f52
x86: Put COPY3_IF_LT under HAVE_6REGS
...
It uses 6 registers; this unbreaks building on hardened x86 systems.
Bug-Id: gentoo/541930
CC: libav-stable@libav.org
10 years ago
Michael Niedermayer
d79f7bf0d6
avcodec/x86/cavsdsp: remove incorrect LOCAL_ALIGN tmp
...
This is faster and simpler as well
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
e8374d7202
x86/proresdsp: remove ff_prores_idct_put_10_sse4
...
It's exactly the same as the sse2 version.
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
James Almer
bdd179c8cb
x86/proresdsp: remove unused macro
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
Christophe Gisquet
238db7cc56
x86: lavc: use LOCAL_ALIGNED instead of DECLARE_ALIGNED
...
The latter may yield incorrect code for on-stack variables.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Christophe Gisquet
15ce160183
x86: xvid_idct: SSE2 merged add version
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Christophe Gisquet
decd5193e1
x86: xvid_idct: merged idct_put SSE2 versions
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Christophe Gisquet
8200575d84
x86: dct-test: evaluate prores idct avx version
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Christophe Gisquet
4eb4451be1
x86: dct-test: fix compilation for prores
...
When the decoder is deactivated, the x86-optimized versions are
not compiled, resulting in a link error.
The C version is unaffected, as it is part of the idctdsp
subsystem.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Christophe Gisquet
c3bf52713a
x86: xvid_idct: port MMX iDCT to yasm
...
Also reduce the table duplication with SSE2 code, remove duplicated
macro parameters.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Christophe Gisquet
2999bd7da2
x86: xvid_idct: port SSE2 iDCT to yasm
...
The main difference consists of properly renaming labels, and
letting yasm select the gprs for skipping 1D transforms.
Previous-version-reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
5c8f747085
x86/hevc_sao: use unaligned movs for sao_{band,filter} with width 8
...
Suggested-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
Anton Khirnov
71f1ad37d8
lavc: do not compile fmtconvert unconditionally
...
Only ac3dec and dcadec use it.
10 years ago
Anton Khirnov
d74a8cb7e4
fmtconvert: drop unused functions
10 years ago
Michael Niedermayer
23a90768a8
avcodec/v210dec: Add ff prefix to v210_x86_init()
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
0e699676f9
avcodec/snow: mark dwt init as av_cold
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Carl Eugen Hoyos
36a6fb989b
hevc_deblock: Fix compilation with nasm
...
CC: libav-stable@libav.org
Bug-Id: 795
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
10 years ago
Michael Niedermayer
03f39fbb2a
avcodec/x86/mlpdsp_init: Simplify mlp_filter_channel_x86()
...
Based on patch by Francisco Blas Izquierdo Riera
Commit message partly taken from carl:
Fixes a compilation error in mlpdsp_init.c with -fstack-check and some gcc
compilers (I reproduced the issue with gcc 4.7.3) by simplifying the code.
See also https://bugs.gentoo.org/show_bug.cgi?id=471756
$ make libavcodec/x86/mlpdsp_init.o
libavcodec/x86/mlpdsp_init.c: In function ‘mlp_filter_channel_x86’:
libavcodec/x86/mlpdsp_init.c:142:5: error: can’t find a register in
class ‘GENERAL_REGS’ while reloading ‘asm’
libavcodec/x86/mlpdsp_init.c:142:5: error: ‘asm’ operand has impossible
constraints
4551 -> 4509 dezicycles
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Christophe Gisquet
398f531915
x86: hevc_mc: fewer xmm regs used in epel h/v
...
11 xmm regs seem to be required only for avx2.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Christophe Gisquet
89cb4995fa
x86: hevc_mc: save 1 gpr in epel filter loading
...
The 3*stride value stored in r3src can be loaded much later,
so use r3src instead of a dedicated gpr when possible.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
03adafb318
x86/g722dsp: add ff_g722_apply_qmf_sse2
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
Christophe Gisquet
b533949813
x86: hevc: remove a parameter to WP internals
...
The second stride is always the internal buffer one, MAX_PB_SIZE (times 2 to
get the value in bytes).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago