James Almer
a4f876a1a2
x86/vf_fspp: move pxor in store slice functions out of the loop
...
m7 is not overwritten, so we only need to clear it once.
Found by Christophe Gisquet.
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
James Almer
466e32bf25
x86/vf_fspp: port inline asm to yasm
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
James Almer
b94e85453e
avfilter/vf_fspp: add missing inline asm guards
10 years ago
Arwa Arif
bdc4db0ee3
lavfi: port mp=fspp to a native libavfilter filter
...
Signed-off-by: Stefano Sabatini <stefasab@gmail.com>
10 years ago
Michael Niedermayer
6706a2986c
avfilter/vf_spp: Fix overflow in 8bit store slice
...
Fixes regression with
ffplay -f lavfi -i testsrc=640x480 -vf format=gray,boxblur=20:10,geq="'mod(lum(X,Y),16)*15'",boxblur=10,geq="'abs(mod(lum(X,Y),15)-7)*32'",spp=4:40
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
838aa08d75
avfilter/vf_spp: support 10bit per sample
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
30d2ac4bf9
avfilter/vf_spp: change temporary to unsigned
...
More consistent with uspp and allows for future 10bit support
Found-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Kieran Kunhya
96fda42a8f
vf_interlace: get rid of useless loads
...
Signed-off-by: Anton Khirnov <anton@khirnov.net>
10 years ago
Michael Niedermayer
ca59b5b6ec
avfilter/x86/vf_interlace: remove redundant instructions
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
ca5c3ff909
vf_interlace: x86: improve asm performance
...
4775 decicycles -> 3688 decicycles
10 years ago
Michael Niedermayer
05e4b25e9b
avfilter/x86/vf_interlace: rewrite asm
...
4775 decicycles -> 3688 decicycles
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
fb3eb57369
avfilter/tinterlace: add Support for ff_lowpass_line_avx() & ff_lowpass_line_sse2()
...
Based-on: 2e1704059a by Kieran Kunhya
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Kieran Kunhya
2e1704059a
vf_interlace: Add SIMD for lowpass filter
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
10 years ago
James Almer
864f9326fb
x86/vf_noise: move asm code to a separate file
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
Pascal Massimino
649b7a9946
av_filter/x86/idet: use HADDD where appropriate
...
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
Pascal Massimino
e3fd6a3a4e
av_filter/x86/idet: MMX/SSE2 implementation of 16bits filter_line()
...
tested on http://ps-auxw.de/10bit-h264-sample/10bit-eldorado.mkv
MMX: ~30% faster decoding overall
SSE2: ~40% faster
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Darnley
db8970d7b6
vfi/x86/vf_idet: fix incorrect use of paddq
...
paddq is an SSE2 instruction so it cannot be used for MMX.
This was probably just a typo because the sums are dwords anyway.
Reviewed-by: Pascal Massimino <pascal.massimino@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Pascal Massimino
161fc0f463
avfilter/x86/idet: fix license header (GPL -> LGPL)
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
skal
406a9ccffe
avfilter/vf_idet: MMX/MMXEXT/SSE2 implementation of idet's filter_line()
...
integration by Neil Birkbeck, with help from Vitor Sessak.
core SSE2 loop by Skal (pascal.massimino@gmail.com)
Reviewed-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Andreas Cadhalpun
39a6e02fd4
fix spelling errors
...
Reviewed-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
ddea3b7106
x86/yadif-10: remove duplicate ABS macro
...
And use the x86util ones instead, which are optimized for mmxext/sse2.
About a 1% increase in performance on pre-SSSE3 processors.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
a348f4befe
avfilter/x86/vf_pullup: fix "invalid combination of opcode and operands" with nasm
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
b8255a4c70
avfilter/x86/vf_pullup: fix old typo
...
This makes C and MMX match; no change to fate, as the differences were
apparently not sufficient to show up in fate.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
6dffc8f5aa
avfilter/vf_pullup: use ptrdiff_t as stride argument for dsp functions
...
This should avoid issues on x86_64
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Christophe Gisquet
9107612818
x86util: add and use RSHIFT/LSHIFT macros
...
Those macros take a byte number as shift argument, as this argument
differs between MMX and SSE2 instructions.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Diego Biurrun
01c5779f56
x86: Drop some unnecessary YASM ifdefs
...
Dead code elimination is enough to avoid undefined references in these cases.
11 years ago
Robert Krüger
194ef56ba7
Change license of yadif from GPL to LGPL
...
Signed-off-by: Robert Krüger <krueger@lesspain.de>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Robert Krüger
4a38eeec38
Revert "Revert "vf_yadif: move x86 init code to x86/yadif.c""
...
This reverts commit 975110a85e.
Signed-off-by: Robert Krüger <krueger@lesspain.de>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Robert Krüger
d8e763fda7
vf_yadif: Relicense from GPL to LGPL
...
All copyright holders have agreed to the relicensing.
11 years ago
Michael Niedermayer
975110a85e
Revert "vf_yadif: move x86 init code to x86/yadif.c"
...
This reverts commit a87b17f328.
This reduces the amount of non-LGPL code, making a relicensing to LGPL
easier.
Conflicts:
libavfilter/vf_yadif.c
libavfilter/x86/yadif.c
libavfilter/x86/yadif_template.c
libavfilter/yadif.h
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Clément Bœsch
969329fe11
Revert "Merge commit 'ed1a11ed52bbd1f15bb9b0416d69b7924bee3191'"
...
This reverts commit fc5fe4804f, reversing
changes made to ffe3350098.
The factoring is broken; it's not calling the ssse3 code anymore, and
calling the mmx2 code with bad alignment. It also broke some FATE
instances.
Conflicts:
libavfilter/x86/vf_gradfun_init.c
11 years ago
Michael Niedermayer
c6125f5e1c
avfilter/x86/vf_gradfun_init: fix some consts & related warnings
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Diego Biurrun
ed1a11ed52
gradfun: x86: Factor out common code for some gradfun_filter_line() variants
11 years ago
Diego Biurrun
ee80cf741a
avfilter: x86: K&R formatting cosmetics
11 years ago
Michael Niedermayer
a826efb55a
avfilter/x86/vf_gradfun_init: fix const and related warnings
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Daniel Kang
0e73049416
avfilter: x86: Port gradfun filter optimizations to yasm
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
11 years ago
Diego Biurrun
f6633c55a3
avfilter: Fix typo in Loren's email address
11 years ago
Paul B Mahol
112017e990
avfilter/x86/vf_pullup: try to fix build on x64
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
11 years ago
Paul B Mahol
9c774459a9
avfilter: port pullup filter from libmpcodecs
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
11 years ago
Diego Biurrun
3ac7fa81b2
Consistently use "cpu_flags" as variable/parameter name for CPU flags
12 years ago
Clément Bœsch
a2c547ffec
lavfi: add spp filter.
12 years ago
James Darnley
b0ef0ae776
yadif: restore speed of the C filtering code
...
Always use the special filter for the first and last 3 columns (only).
Changes made in 64ed397 slowed the filter to just under 3/4 of what it
was. This commit restores the speed while maintaining identical output.
For reference, on my Athlon64:
1733222 decicycles in old
2358563 decicycles in new
1727558 decicycles in this
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
Diego Biurrun
6e9f8d6a7d
x86: vf_yadif: Remove stray dsputil_mmx #include
12 years ago
Diego Biurrun
093804a93c
avfilter: Add av_cold attributes to init/uninit functions
12 years ago
Diego Biurrun
c1ad70c3cb
x86: Move some conditional code around to avoid unused variable warnings
12 years ago
Clément Bœsch
1ae44c87c9
lavfi/gradfun: remove rounding to match C and SSE code.
...
There is no noticeable benefit for such precision.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
Clément Bœsch
38a2f88d39
lavfi/gradfun: fix dithering in MMX code.
...
Current dithering only uses the first 4 instead of the whole 8 random values.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
Clément Bœsch
2d66fc543b
lavfi/gradfun: fix rounding in MMX code.
...
Current code divides before increasing precision.
Also reduce upper bound for strength from 255 to 64. This will prevent
an overflow in the SSSE3 and MMX filter_line code: delta is expressed as
a u16 being shifted by 2 to the left. If it overflows, having a
strength not above 64 will make sure that m is set to 0 (making the
m*m*delta >> 14 expression void).
A value above 64 should not make any sense unless gradfun is used as
a blur filter.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
James Darnley
c9a51c29fc
yadif: remove an 'm' from the LOAD macro definition
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
James Darnley
1d3b14cac2
yadif: remove repeated check on width
...
The filter already checks that width (and height) are greater than 3.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago