James Almer
d897d4c12d
x86/vf_w3fdif: use aligned loads in w3fdif_complex_high
...
Found-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
9 years ago
James Almer
224a529b44
x86/vf_w3fdif: use aligned loads in w3fdif_simple_high
...
Found-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
9 years ago
James Almer
e8903fbf8e
x86/vf_w3fdif: simplify w3fdif_simple_high
...
Signed-off-by: James Almer <jamrial@gmail.com>
9 years ago
James Almer
d2bf2d094e
x86/vf_w3fdif: move pxor outside the loop in w3fdif_complex_low
...
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
9 years ago
Paul B Mahol
c3d312bb7f
avfilter/x86/vf_w3fdif: add colons after labels
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
Paul B Mahol
5740dc27e1
avfilter/vf_w3fdif: add x86 SIMD
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
Andreas Cadhalpun
8d6625642d
doc: fix spelling errors
...
Reviewed-by: Lou Logan <lou@lrcd.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
9 years ago
Paul B Mahol
624a1a0e69
avfilter/x86/vf_blend.asm: hardmix: do same with two pxor instructions less
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
Paul B Mahol
e999210cec
avfilter/x86/vf_blend.asm: 11th register is used, update functions
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
Paul B Mahol
0948ba3204
avfilter/x86/vf_blend.asm: add hardmix and phoenix sse2 SIMD
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
Paul B Mahol
ac74e857a2
avfilter/vf_stereo3d: add x86 SIMD for anaglyph outputs
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
Michael Niedermayer
fd9a528523
avfilter/vf_blend: Fix argument types, fix segfault in asm
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
9 years ago
Paul B Mahol
9762554dd0
avfilter/vf_blend: add x86 SIMD for some modes
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
Paul B Mahol
160556c9ad
avfilter/vf_maskedmerge: add SIMD for maskedmerge with 8 bit depth input
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
Paul B Mahol
0701ff2c32
avfilter/x86/vf_psnr.asm: fix typo
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
Hendrik Leppkes
5d8e836d0e
Replace all remaining occurrences of step/depth_minus1 and offset_plus1
9 years ago
Ronald S. Bultje
ad45121d56
options: mark av_get_{int,double,q} as deprecated.
...
Convert last users to av_opt_get_*() counterparts.
9 years ago
Henrik Gramner
ab43beefab
x86inc: Drop SECTION_TEXT macro
...
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
9 years ago
Henrik Gramner
f0b7882ceb
x86inc: Drop SECTION_TEXT macro
...
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
9 years ago
James Almer
d9e10af547
x86/vf_interlace: add missing colon to labels
...
Silences warnings with Nasm
Signed-off-by: James Almer <jamrial@gmail.com>
9 years ago
James Almer
e3851169ee
x86/vf_ssim: add ff_ssim_4x4_line_xop
...
~20% faster than ssse3. Also enabled for x86_32
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
9 years ago
James Almer
e1778fb657
x86/vf_ssim: fix some instruction comments
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
9 years ago
Paul B Mahol
eea08efc0d
avfilter/x86/vf_psnr.asm: split one line of license text into two
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
9 years ago
James Darnley
bff7242608
avfilter/vf_removegrain: add x86 and x86_64 SSE2 functions
...
Speed of all modes increased by a factor between 7.4 and 19.8 largely depending
on whether bytes are unpacked into words. Modes 2, 3, and 4 have been sped up
by a factor of 43 (thanks quick sort!)
All modes are available on x86_64 but only modes 1, 10, 11, 12, 13, 14, 19, 20,
21, and 22 are available on x86 due to the number of SIMD registers used.
With a contribution from James Almer <jamrial@gmail.com>
9 years ago
Ronald S. Bultje
ae4c9ddebc
vf_psnr: sse2 optimizations for sum-squared-error.
...
The internal line accumulator for 16bit can overflow, so I changed that
from int to uint64_t in the C code. The matching assembly looks a little
weird but output looks correct.
(avx2 should be trivial to add later.)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
9 years ago
Ronald S. Bultje
dfc58584b4
vf_ssim: x86 simd for ssim_4x4xN and ssim_endN.
...
Both are 2-2.5x faster than their C counterpart.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
9 years ago
James Almer
c16e99e3b3
x86: check for AV_CPU_FLAG_AVXSLOW where useful
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
d68c05380c
x86: check for AV_CPU_FLAG_AVXSLOW where useful
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
10 years ago
Michael Niedermayer
52fc3e372f
avfilter/x86/vf_hqdn3d: Fix register types
...
Fixes Ticket4301
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
5bc2c39527
avfilter/x86/vf_fspp: Fix invalid combination of opcode and operands
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
a6f9a5d0f6
avfilter/x86/vf_fspp: Fix loop condition for column_fidct()
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
f5b3257c50
avfilter/vf_eq: mark src as const
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
530bf8ece6
avfilter/vf_eq: Fix clipping code
...
Found-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Arwa Arif
4c38e960d0
avfilter: Port mp=eq/eq2 to lavfi
...
Code adapted from James Darnley's port
Some fixes from Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
James Almer
da02ee127a
x86/vf_pp7: port dctB_mmx to yasm
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
Arwa Arif
a299cd5ab3
lavfi: port mp=pp7 to libavfilter
...
The only difference with mp=pp7 is that default mode is "medium", as stated
in the MPlayer docs, rather than "hard".
Signed-off-by: Stefano Sabatini <stefasab@gmail.com>
10 years ago
James Almer
a4f876a1a2
x86/vf_fspp: move pxor in store slice functions out of the loop
...
m7 is not overwritten, so we only need to clear it once.
Found by Christophe Gisquet.
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
James Almer
466e32bf25
x86/vf_fspp: port inline asm to yasm
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
James Almer
b94e85453e
avfilter/vf_fspp: add missing inline asm guards
10 years ago
Arwa Arif
bdc4db0ee3
lavfi: port mp=fspp to a native libavfilter filter
...
Signed-off-by: Stefano Sabatini <stefasab@gmail.com>
10 years ago
Michael Niedermayer
6706a2986c
avfilter/vf_spp: Fix overflow in 8bit store slice
...
Fixes regression with
ffplay -f lavfi -i testsrc=640x480 -vf format=gray,boxblur=20:10,geq="'mod(lum(X,Y),16)*15'",boxblur=10,geq="'abs(mod(lum(X,Y),15)-7)*32'",spp=4:40
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
838aa08d75
avfilter/vf_spp: support 10bit per sample
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
30d2ac4bf9
avfilter/vf_spp: change temporary to unsigned
...
More consistent with uspp and allows for future 10bit support
Found-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Kieran Kunhya
96fda42a8f
vf_interlace: get rid of useless loads
...
Signed-off-by: Anton Khirnov <anton@khirnov.net>
10 years ago
Michael Niedermayer
ca59b5b6ec
avfilter/x86/vf_interlace: remove redundant instructions
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
ca5c3ff909
vf_interlace: x86: improve asm performance
...
4775 decicycles -> 3688 decicycles
10 years ago
Michael Niedermayer
05e4b25e9b
avfilter/x86/vf_interlace: rewrite asm
...
4775 decicycles -> 3688 decicycles
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Michael Niedermayer
fb3eb57369
avfilter/tinterlace: add support for ff_lowpass_line_avx() & ff_lowpass_line_sse2()
...
Based-on: 2e1704059a
by Kieran Kunhya
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
10 years ago
Kieran Kunhya
2e1704059a
vf_interlace: Add SIMD for lowpass filter
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
10 years ago
James Almer
864f9326fb
x86/vf_noise: move asm code to a separate file
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago