Fixes: Null pointer dereference
Fixes: sample1.dng
Found-by: South East <8billion.people@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
xHE-AAC is a profile where some frames depend on other key frames, named IPF.
By setting the codec as Intra Only, all frames output by decoders and all
packets output by encoders/demuxers will be unconditionally flaged as
keyframes, which is incorrect.
Should fix ticket #11272.
Signed-off-by: James Almer <jamrial@gmail.com>
For example, with
./ffmpeg -operating_rate 400 -hwaccel mediacodec -i test.mp4 -an \
-c:v h264_mediacodec -operating_rate 400 -b:v 5M -f null -
The transcoding speed is 254 FPS.
Without -operating_rate on dec/enc, the speed is 148 FPS.
With -operating_rate on decoder only, the speed is 239 FPS.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The codec wants to know whether the usecase is realtime playback
or full-speed transcoding, or playback at a higher speed. The codec
runs faster when operating_rate higher than framerate.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Unlike the software FFv1 encoder, none of our buffers are allocated by
FFmpeg, which supports at most 4GiB large allocations.
For really large sizes, the maximum size of the buffer can exceed 4GiB,
which the software encoder optimistically tries to allocate as 4GiB
in the hopes that the encoder will compress to under that amount.
We can just let Vulkan allocate us a larger buffer, and switch to
64-bit offsets.
As of 459a1512f1,
the code is unrolled to process two rows per iteration.
The output cursor thus needs to be incremented by twice the
stride, which is taken care of with SH1ADD. However the original
ADD from the original implemetation was incorrectly left over.
This commit implements a standard, compliant, version 3 and version 4
FFv1 encoder, entirely in Vulkan. The encoder is written in standard
GLSL and requires a Vulkan 1.3 supporting GPU with the BDA extension.
The encoder can use any amount of slices, but nominally, should use
32x32 slices (1024 in total) to maximize parallelism.
All features are supported, as well as all pixel formats.
This includes:
- Rice
- Range coding with a custom quantization table
- PCM encoding
CRC calculation is also massively parallelized on the GPU.
Encoding of unaligned dimensions on subsampled data requires
version 4, or requires oversizing the image to 64-pixel alignment
and cropping out the padding via container flags.
Performance-wise, this makes 1080p real-time screen capture possible
at 60fps on even modest GPUs.
After the branch, the expected SEW/LMUL ratio is 1 byte/vector.
So we have to set the same ratio before branching (QEMU does not care,
but real hardware does).
PIXFUNC macro is unused since d29a9c2aa6.
Signed-off-by: Kyosuke Kawakami <kawakami150708@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The add_dirac_obmc8_mmx function was the only MMX function left. This
patch migrates it to SSE2.
Here are the checkasm benchmark results:
diracdsp.add_dirac_obmc_8_c: 2299.1 ( 1.00x)
diracdsp.add_dirac_obmc_8_mmx: 237.6 ( 9.68x)
diracdsp.add_dirac_obmc_8_sse2: 109.1 (21.07x)
Signed-off-by: Kyosuke Kawakami <kawakami150708@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The JPEG XL parser has an entropy decoder inside, which supports LZ77
length-distance pairs. If the first symbol from the entropy stream is an
LZ77 pair, the bitstream is invalid, so we should abort immediately rather
than attempt to read it anyway (which would read from the uninitialized
starting window).
Reported-by: Kacper Michajłow <kasper93@gmail.com>
Found-by: ossfuzz
Fixes: 368725676/clusterfuzz-testcase-minimized-fuzzer_protocol_file-6022251122589696-cut
Fixes: 42537758/clusterfuzz-testcase-minimized-fuzzer_protocol_file-5818969469026304-cut
Signed-off-by: Leo Izen <leo.izen@gmail.com>