Paul B Mahol
8e1354c95d
avfilter/x86/vf_v360_init: add missing cases
5 years ago
Paul B Mahol
e4809e12ea
avfilter/vf_v360: add SIMD for lagrange9 interpolation
5 years ago
Martin Storsjö
0815a22dcc
vf_ssim: Fix loading doubles to float registers on i386
This fixes the tests filter-refcmp-ssim-yuv and filter-refcmp-ssim-rgb
on i386 after breaking in fcc0424c93.
Signed-off-by: Martin Storsjö <martin@martin.st>
5 years ago
Paul B Mahol
fcc0424c93
avfilter/vf_ssim: improve precision
Use doubles for accumulating floats.
5 years ago
Paul B Mahol
3bf28d40e5
avfilter/vf_v360: change remaps to int16_t type
5 years ago
Marton Balint
1f8e43938b
avfilter/x86/vf_interlace: always use unaligned movs
Fixes crashes in command lines such as:
ffmpeg -f lavfi -i testsrc2=704x576:r=50,interlace,pad=720:576:8 -f null none
Related to ticket #6491.
Signed-off-by: Marton Balint <cus@passwd.hu>
5 years ago
Paul B Mahol
ac0f5f4c17
avfilter/vf_maskedclamp: add x86 SIMD
5 years ago
James Almer
738bc3e742
x86/vf_transpose: make ff_transpose_8x8_16_sse2 work on x86_32
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
5 years ago
James Almer
27bae5aaca
x86/vf_transpose: fix cpuflags check
Signed-off-by: James Almer <jamrial@gmail.com>
5 years ago
Paul B Mahol
ccd9bca15a
avfilter/vf_transpose: add x86 SIMD
5 years ago
Paul B Mahol
f7f4691f9f
avfilter/x86/vf_atadenoise: fix comment
5 years ago
Paul B Mahol
0ae6fb276b
avfilter/x86/vf_atadenoise: add SIMD for serial too
5 years ago
Paul B Mahol
71e33c6e01
avfilter/vf_atadenoise: add option to use additional algorithm
5 years ago
Paul B Mahol
295d99b439
avfilter/vf_atadenoise: add x86 SIMD
5 years ago
Paul B Mahol
64a805883d
avfilter/vf_gblur: fix heap-buffer overflow
Fixes #8282
5 years ago
Andreas Rheinhardt
361fb42e1e
avcodec/filter: Remove extra '; ' outside of functions
They are not allowed outside of functions. Fixes the warning
"ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]"
when compiling with GCC and -pedantic.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
5 years ago
James Almer
1dbd3c6116
avfilter/vf_eq: fix compilation with x86 asm disabled
Signed-off-by: James Almer <jamrial@gmail.com>
5 years ago
Ting Fu
4f589d668e
avfilter/x86/vf_eq: add SSE2 version
Signed-off-by: Ting Fu <ting.fu@intel.com>
5 years ago
Ting Fu
6aff2042d6
avfilter/x86/vf_eq: Change inline assembly into nasm code
Signed-off-by: Ting Fu <ting.fu@intel.com>
5 years ago
Paul B Mahol
921eb21b1d
avfilter/x86/vf_v360: add most of >8 depth asm
5 years ago
James Almer
4857688732
x86/vf_v360: use a faster horizontal add in remap4_8bit_line_avx2
Signed-off-by: James Almer <jamrial@gmail.com>
5 years ago
James Almer
2200cf1aca
x86/vf_v360: make remap{1,2}_8bit_line_avx2 work on x86_32
Signed-off-by: James Almer <jamrial@gmail.com>
5 years ago
Paul B Mahol
058bbf48c6
avfilter/vf_v360: x86 SIMD for interpolations
5 years ago
Ruiling Song
98e419cbf5
avfilter/vf_convolution: add x86 SIMD for filter_3x3()
Tested using a simple command (apply edge enhance):
./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \
-vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:5:1:1:1:0:128:128:128" \
-an -vframes 1000 -f null /dev/null
The fps increased from 151 to 270 on my local machine.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
5 years ago
James Almer
b8f1542dcb
avfilter/vf_gblur: add missing preprocessor check
Fixes compilation on x86_32
Signed-off-by: James Almer <jamrial@gmail.com>
6 years ago
Ruiling Song
83f9da7768
avfilter/vf_gblur: add x86 SIMD optimizations
The horizontal pass gets a ~2x speedup with the patch
when run single-threaded.
Tested overall performance using these commands (AVX2 enabled):
./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
For single thread, the fps improves from 43 to 60, about 40%.
For multi-thread, the fps improves from 110 to 130, about 20%.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
6 years ago
Paul B Mahol
dcae5ba322
avfilter: add anlmdn filter x86 SIMD optimizations
6 years ago
James Almer
ef67af31ff
x86/af_afir: use three operand form for some instructions
Fixes compilation with old yasm versions.
Signed-off-by: James Almer <jamrial@gmail.com>
6 years ago
James Almer
5402c1886b
x86/af_afir: add ff_fcmul_add_avx()
fcmul_add_c: 1228.8
fcmul_add_sse3: 334.3
fcmul_add_avx: 186.3
Tested on a Core i5 4460 @ 3.2GHz
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
6 years ago
James Almer
82043dfd2e
avfilter/af_afir: split off fcmul_add into a DSP context
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
6 years ago
James Almer
9b5bd665e1
x86/af_afir: fix processing the last element
ff_fcmul_add_sse3() is now identical to the C version.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
6 years ago
James Almer
3913d6f734
x86/scene_sad: fix link errors when HAVE_X86ASM is not defined
Reviewed-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: James Almer <jamrial@gmail.com>
6 years ago
Paul B Mahol
c98a32e4ad
avfilter/vf_blend: add 10bit support
6 years ago
Philip Langdale
1096614c42
avfilter/vf_bwdif: Use common yadif frame management logic
After adding field type management to the common yadif logic, we can
remove the duplicate copy of that logic from bwdif.
6 years ago
Marton Balint
6c2a7a8e9a
avfilter/vf_framerate: factorize SAD functions which compute SAD for a whole frame
Also add SIMD which works on lines because it is faster than calculating it on
8x8 blocks using pixelutils.
Signed-off-by: Marton Balint <cus@passwd.hu>
6 years ago
Paul B Mahol
0f0d468fbc
avfilter/vf_overlay: exclude nv12/nv21 formats from x86 asm check
They are yet to be supported.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
7 years ago
Paul B Mahol
6d7c63588c
avfilter/vf_overlay: add x86 SIMD
Specifically for the yuv444, yuv422 and yuv420 formats when the main stream has no alpha and
alpha is straight.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
7 years ago
Vasile Toncu
9c01cdb94e
avfilter/vf_interlace: remove duplicate code with same functionality
7 years ago
Martin Vignali
f3df42e81d
avfilter/x86/vf_blend : add SIMD for 16 bit version of
grainextract
grainmerge
average
extremity
negation
7 years ago
Martin Vignali
8eb0bb1108
avfilter/x86/vf_blend : reorganize DIFFERENCE macro to reduce line duplication between 8bit and 16 bit version
7 years ago
Martin Vignali
53a03b5c8c
avfilter/x86/vf_blend : add 16 bit version for BLEND_SIMPLE, phoenix, difference for SSE and AVX2 (x86_64)
7 years ago
Martin Vignali
6c6c9d14a8
avfilter/x86/vf_blend : indent
7 years ago
Martin Vignali
7590d58b61
avfilter/x86/vf_blend : reorganize init in order to add 16 bit version
7 years ago
Martin Vignali
3a230ce5fa
avfilter/x86/vf_blend : add AVX2 version for each func except divide
and optimize average, grainextract, multiply, screen, grain merge
7 years ago
Marton Balint
4d95c6d5d7
avfilter/vf_framerate: add SIMD functions for frame blending
Blend function speedups on x86_64 Core i5 4460:
ffmpeg -f lavfi -i allyuv -vf framerate=60:threads=1 -f null none
C: 447548411 decicycles in Blend, 2048 runs, 0 skips
SSSE3: 130020087 decicycles in Blend, 2048 runs, 0 skips
AVX2: 128508221 decicycles in Blend, 2048 runs, 0 skips
ffmpeg -f lavfi -i allyuv -vf format=yuv420p12,framerate=60:threads=1 -f null none
C: 228932745 decicycles in Blend, 2048 runs, 0 skips
SSE4: 123357781 decicycles in Blend, 2048 runs, 0 skips
AVX2: 121215353 decicycles in Blend, 2048 runs, 0 skips
Signed-off-by: Marton Balint <cus@passwd.hu>
7 years ago
Martin Vignali
b94cd55155
avfilter/x86/vf_interlace : add AVX2 version
7 years ago
James Almer
8e0e4384b0
Revert "avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16"
This reverts commits 1a5865b6dc and 8fb1d63d91.
They made fate interlace tests fail when AVX2 was used.
Signed-off-by: James Almer <jamrial@gmail.com>
7 years ago
Martin Vignali
3df6e61dad
avfilter/x86/vf_hflip : indent
based on patch by Paul B Mahol
7 years ago
Martin Vignali
f181648176
avfilter/x86/vf_hflip : add avx2 version for hflip_byte and hflip_short
7 years ago
Martin Vignali
a4a4179e83
avfilter/x86/vf_hflip : merge hflip byte and hflip short to one macro
7 years ago