The horizontal pass get ~2x performance with the patch
under single thread.
Tested overall performance using the command(avx2 enabled):
./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
For single thread, the fps improves from 43 to 60, about 40%.
For multi-thread, the fps improves from 110 to 130, about 20%.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
READ has already been undefined at this point; it is obviously intended
to undef WRITE.
Furthermore, leb128 (in cbs_av1) was undefined too often and
inconsistently.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Matroska EBML IDs can be only four bytes long maximally, so it is
natural to use uint32_t for them. By doing this and rearranging the
elements of the MatroskaLevel1Element structure, one can reduce the size
of said structure.
Notice that this field is not read via the generic reading process for
EBML_UINT, so one is not forced to use an uint64_t for it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This error message is outdated since d31fb1a9.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Pass correct pointer to av_log() and update some error/warning message,
it's will help the debugging
Reviewed-by: Steven Liu <lq@chinaffmpeg.org>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Remove the rain in the input image/video by applying the derain
methods based on convolutional neural networks. Training scripts
as well as scripts for model generation are provided in the
repository at https://github.com/XueweiMeng/derain_filter.git.
Signed-off-by: Xuewei Meng <xwmeng96@gmail.com>
Fixes: signed integer overflow: -2147483648 - 12 cannot be represented in type 'int'
Fixes: 14732/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DXV_fuzzer-5735273129836544
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 16384 * 196607 cannot be represented in type 'int'
Fixes: 14810/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5091232683917312
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This should reduce the amount of timeout issues overall
Fixes: Timeout (34->10sec)
Fixes: 14682/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMV2_fuzzer-5728608414334976
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The default is to dump extradata to keyframes, not all frames.
Also improve the description of the relevant AVOption.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This way the clearing can be skipped in case of some errors.
Fixes: Timeout (11sec -> 344ms)
Fixes: 14670/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PAF_VIDEO_fuzzer-5769534503387136
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2052526848 + 147237888 cannot be represented in type 'int'
Fixes: 14441/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ARBC_fuzzer-5717632944177152
Fixes: 14453/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ARBC_fuzzer-5739679254577152
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Its unclear if these cases have any relevance in real files
Fixes: shift exponent -2 is negative
Fixes: 14489/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5681941631729664
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This removes the use of the nonstandard combined structures, which
generated some warnings with clang and will cause alignment problems
with some parameter buffer types.
We perfer the coding style like:
/* some stuff */
if (error) {
/* error handling */
return -(errorcode);
}
/* normal actions */
do_something()
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
benchmarking with a simple command:
ffmpeg -i 1080p.mp4 -vf unsharp=la=3:ca=3 -an -f null /dev/null
with the patch, the fps increase from 50 to 120 on my local machine (i7-6770HQ).
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Used the command for 1080p h264 clip as follow:
a). ffmpeg -i input -vf lutyuv="u=128:v=128" -f null /dev/null
b). ffmpeg -i input -vf lutrgb="g=0:b=0" -f null /dev/null
after enabled the slice threading, the fps change from:
a). 144fps to 258fps (lutyuv)
b). 94fps to 153fps (lutrgb)
in Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Add slice threading support, use the command like:
./ffmpeg -i input -vf colorlevels -f null /dev/null
with 1080p h264 clip, the fps from 39 fps to 79 fps
in the local(Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Tries to find a device backed by the i915 kernel driver and loads the iHD
VAAPI driver to use with it. This reduces confusion on machines with
multiple DRM devices and removes the surprising requirement to set the
LIBVA_DRIVER_NAME environment variable to use libmfx at all.
Opening the device via X11 (DRI2/DRI3) rather than opening a DRM render
node directly is only useful if you intend to use the legacy X11 interop
functions. That's never true for the ffmpeg utility, and a library user
who does want this will likely provide their own display instance rather
than making a new one here.
For example: -init_hw_device vaapi:/dev/dri/renderD128,driver=foo
This may be more convenient that using the environment variable, and allows
loading different drivers for different devices in the same process.
Iterate over available render devices and pick the first one which looks
usable. Adds an option to specify the name of the kernel driver associated
with the desired device, so that it is possible to select a specific type
of device in a multiple-device system without knowing the card numbering.
For example: -init_hw_device vaapi:,kernel_driver=amdgpu will select only
devices using the "amdgpu" driver (as used with recent AMD graphics cards).
Kernel driver selection requires libdrm to work.
The implementation will use some default in this case. The empty string
is not a meaningful device for any existing hardware type, and indeed
OpenCL treats it identically to no device already to work around the lack
of this setting on the command line.