James Almer
70d685a77f
x86: use the new helper macros where useful
...
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
9 years ago
Ganesh Ajjanagadde
5989add4ab
lavu/x86/lls: add fma3 optimizations for update_lls
...
This improves accuracy (very slightly) and speed for processors having
fma3.
Sample benchmark (fate flac-16-lpc-cholesky, Haswell):
old:
5993610 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips
5951528 decicycles in ff_lpc_calc_coefs, 128 runs, 0 skips
new:
5252410 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips
5232869 decicycles in ff_lpc_calc_coefs, 128 runs, 0 skips
Tested with FATE and --disable-fma3, also examined contents of
lavu/lls-test.
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
9 years ago
James Almer
c16e99e3b3
x86: check for AV_CPU_FLAG_AVXSLOW where useful
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
d68c05380c
x86: check for AV_CPU_FLAG_AVXSLOW where useful
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
10 years ago
Michael Niedermayer
579a0fdc21
avutil/lls: Make unchanged function arguments const
...
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
70b8668fb5
drop LLS1, rename LLS2 to LLS
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
c3814ab654
rename new lls code to lls2 to avoid conflict with the old which has a different ABI
...
also remove failed attempt at a compatibility layer, the code simply cannot work
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
bbe66ef912
avutil: rename lls to lls2
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
a478e99a60
avutil/x86: reenable ff_update_lls_avx()
...
The bug has been fixed in c8b920a9b7 by Loren Merritt
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
a6e46ed51a
Revert "avutil/x86: disable ff_evaluate_lls_sse2() for 32bit"
...
This reverts commit 247425241c.
12 years ago
Michael Niedermayer
247425241c
avutil/x86: disable ff_evaluate_lls_sse2() for 32bit
...
It just segfaults on 32bit, thus it's disabled until someone fixes it.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
a285079bc7
lls.asm: disable ff_update_lls_avx
...
The code doesn't build with yasm from Ubuntu 12.04
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Loren Merritt
b545179fdf
x86: lpc: simd av_evaluate_lls
...
1.5x-1.8x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Loren Merritt
502ab21af0
x86: lpc: simd av_update_lls
...
4x-6x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Diego Biurrun
c9f933b5b6
Add av_cold attributes to arch-specific init functions
12 years ago
Michael Niedermayer
e16bac7b33
videodsp: Fix project name
...
These are all part of the DSP utils split out from FFmpeg
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Ronald S. Bultje
8c53d39e7f
lavc: introduce VideoDSPContext
...
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Mans Rullgard
b692d246ea
vp8: arm: separate ARMv6 functions from NEON
...
This is a preparation for complete ARMv6 optimisations.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Mans Rullgard
d526c5338d
ARM: allow runtime masking of CPU features
...
This allows masking CPU features with the -cpuflags avconv option
which is useful for testing different optimisations without rebuilding.
Signed-off-by: Mans Rullgard <mans@mansr.com>
13 years ago
Michael Niedermayer
c266eb1928
arm: Fix 10l typo
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Ronald S. Bultje
bd66f073fe
vp8: change int stride to ptrdiff_t stride.
...
On 64bit platforms with 32bit int, this means we won't have to sign-
extend the integer anymore.
13 years ago
Diego Biurrun
32f3c541bc
doxygen: Do not include license boilerplates in Doxygen comment blocks.
13 years ago
Ronald S. Bultje
a5dfeb612e
VP8: armv6 optimizations.
...
From 52.503s (~40fps) to 27.973s (~80fps) decoding of 480p sintel
trailer, i.e. a ~2x speedup overall, on a Nexus S.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Mans Rullgard
2912e87a6c
Replace FFmpeg with Libav in licence headers
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
ef15d71c1f
VP8: ARM NEON optimisations for dsp functions
...
This adds NEON optimised versions of all functions in VP8DSPContext.
Based on initial work by Rob Clark.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit a1c1d3c003)
14 years ago
Mans Rullgard
a1c1d3c003
VP8: ARM NEON optimisations for dsp functions
...
This adds NEON optimised versions of all functions in VP8DSPContext.
Based on initial work by Rob Clark.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago