the default number of batch_size is 1
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
In order to fine-control referencing schemes in VP9 encoding, there
is a need to use VP9E_SET_SVC_REF_FRAME_CONFIG method. This commit
provides a way to use the API through frame metadata.
DAV files may contain a variable length padding in between chunks
filled with 0xff bytes. The current skipping logic is incorrect as it
may skip over DHAV chunks not appearing sequentially in the file.
We now look for the 'DHAV' tag using a byte-by-byte search in order
to handle such situations. Also the dhav->last_good_pos field will
not be updated while skipping unrecognized data.
No longer used by anything.
Unfortunately the old FFT_FLOAT/FFT_FIXED_32 is left as-is. It's
simply too much work for code meant to be all removed anyway.
In either encoder, its impossible for the coefficients to go past 25 bits
right after the MDCT. Our MDCT is numerically stable.
For the floating point encoder, in case a NaN is contained, lrintf() will
raise a floating point exception during the conversion.
The AC3 encoder used to be a separate library called "Aften", which
got merged into libavcodec (literally, SVN commits and all).
The merge preserved as much features from the library as possible.
The code had two versions - a fixed point version and a floating
point version. FFmpeg had floating point DSP code used by other
codecs, the AC3 decoder including, so the floating-point DSP was
simply replaced with FFmpeg's own functions.
However, FFmpeg had no fixed-point audio code at that point. So
the encoder brought along its own fixed-point DSP functions,
including a fixed-point MDCT.
The fixed-point MDCT itself is trivially just a float MDCT with a
different type and each multiply being a fixed-point multiply.
So over time, it got refactored, and the FFT used for all other codecs
was templated.
Due to design decisions at the time, the fixed-point version of the
encoder operates at 16-bits of precision. Although convenient, this,
even at the time, was inadequate and inefficient. The encoder is noisy,
does not produce output comparable to the float encoder, and even
rings at higher frequencies due to the badly approximated winow function.
Enter MIPS (owned by Imagination Technologies at the time). They wanted
quick fixed-point decoding on their FPUless cores. So they contributed
patches to template the AC3 decoder so it had both a fixed-point
and a floating-point version. They also did the same for the AAC decoder.
They however, used 32-bit samples. Not 16-bits. And we did not have
32-bit fixed-point DSP functions, including an MDCT. But instead of
templating our MDCT to output 3 versions (float, 32-bit fixed and 16-bit fixed),
they simply copy-pasted their own MDCT into ours, and completely
ifdeffed our own MDCT code out if a 32-bit fixed point MDCT was selected.
This is also the status quo nowadays - 2 separate MDCTs, one which
produces floating point and 16-bit fixed point versions, and one
sort-of integrated which produces 32-bit MDCT.
MIPS weren't all that interested in encoding, so they left the encoder
as-is, and they didn't care much about the ifdeffery, mess or quality - it's
not their problem.
So the MDCT/FFT code has always been a thorn in anyone looking to clean up
code's eye.
Backstory over. Internally AC3 operates on 25-bit fixed-point coefficients.
So for the floating point version, the encoder simply runs the float MDCT,
and converts the resulting coefficients to 25-bit fixed-point, as AC3 is inherently
a fixed-point codec. For the fixed-point version, the input is 16-bit samples,
so to maximize precision the frame samples are analyzed and the highest set
bit is detected via ac3_max_msb_abs_int16(), and the coefficients are then
scaled up via ac3_lshift_int16(), so the input for the FFT is always at least 14 bits,
computed in normalize_samples(). After FFT, the coefficients are scaled up to 25 bits.
This patch simply changes the encoder to accept 32-bit samples, reusing
the already well-optimized 32-bit MDCT code, allowing us to clean up and drop
a large part of a very messy code of ours, as well as prepare for the future lavu/tx
conversion. The coefficients are simply scaled down to 25 bits during windowing,
skipping 2 separate scalings, as the hacks to extend precision are simply no longer
necessary. There's no point in running the MDCT always at 32 bits when you're
going to drop 6 bits off anyway, the headroom is plenty, and the MDCT rounds
properly.
This also makes the encoder even slightly more accurate over the float version,
as there's no coefficient conversion step necessary.
SIZE SAVINGS:
ARM32:
HARDCODED TABLES:
BASE - 10709590
DROP DSP - 10702872 - diff: -6.56KiB
DROP MDCT - 10667932 - diff: -34.12KiB - both: -40.68KiB
DROP FFT - 10336652 - diff: -323.52KiB - all: -364.20KiB
SOFTCODED TABLES:
BASE - 9685096
DROP DSP - 9678378 - diff: -6.56KiB
DROP MDCT - 9643466 - diff: -34.09KiB - both: -40.65KiB
DROP FFT - 9573918 - diff: -67.92KiB - all: -108.57KiB
ARM64:
HARDCODED TABLES:
BASE - 14641112
DROP DSP - 14633806 - diff: -7.13KiB
DROP MDCT - 14604812 - diff: -28.31KiB - both: -35.45KiB
DROP FFT - 14286826 - diff: -310.53KiB - all: -345.98KiB
SOFTCODED TABLES:
BASE - 13636238
DROP DSP - 13628932 - diff: -7.13KiB
DROP MDCT - 13599866 - diff: -28.38KiB - both: -35.52KiB
DROP FFT - 13542080 - diff: -56.43KiB - all: -91.95KiB
x86:
HARDCODED TABLES:
BASE - 12367336
DROP DSP - 12354698 - diff: -12.34KiB
DROP MDCT - 12331024 - diff: -23.12KiB - both: -35.46KiB
DROP FFT - 12029788 - diff: -294.18KiB - all: -329.64KiB
SOFTCODED TABLES:
BASE - 11358094
DROP DSP - 11345456 - diff: -12.34KiB
DROP MDCT - 11321742 - diff: -23.16KiB - both: -35.50KiB
DROP FFT - 11276946 - diff: -43.75KiB - all: -79.25KiB
PERFORMANCE (10min random s32le):
ARM32 - before - 39.9x - 0m15.046s
ARM32 - after - 28.2x - 0m21.525s
Speed: -30%
ARM64 - before - 36.1x - 0m16.637s
ARM64 - after - 36.0x - 0m16.727s
Speed: -0.5%
x86 - before - 184x - 0m3.277s
x86 - after - 190x - 0m3.187s
Speed: +3%
As we get a new set of objects each frame anyway, we
do not gain anything by keeping the modifier constant.
This helps with capturing when switching your setup a
bit, e.g. from ingame to desktop or from X11 to wayland.
Signed-off-by: Mark Thompson <sw@jkqxz.net>
The kernel defaults to initializing the field to 0 when modifiers
are not used and this happens to be linear. If we end up actually
passing the modifier to a driver, tiling issues happen.
So if the kernel doesn't return a modifier set it explicitly to
INVALID. That way later processing knows there is no explicit
modifier.
Signed-off-by: Mark Thompson <sw@jkqxz.net>
This patch adds support for arbitrary-point FFTs and all even MDCT
transforms.
Odd MDCTs are not supported yet as they're based on the DCT-II and DCT-III
and they're very niche.
With this we can now write tests.
Two tests check the opposite pointer before using it. If only one of these
is set to a valid pointer, one of these functions will crash, the other will
ignore the pointer.
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: division by zero
Fixes: 26451/clusterfuzz-testcase-minimized-ffmpeg_dem_VIVO_fuzzer-4756955832516608
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: shift exponent 64 is too large for 64-bit type 'unsigned long long'
Fixes: 26497/clusterfuzz-testcase-minimized-ffmpeg_dem_AVI_fuzzer-5690188355076096
Fixes: 26903/clusterfuzz-testcase-minimized-ffmpeg_dem_LUODAT_fuzzer-5641466929741824
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
CBS doesn't change its contents in any way whatsoever internally, and most
users already set it to a const array.
Signed-off-by: James Almer <jamrial@gmail.com>
These fields were added to support -merge_pmt_versions, but the mpegts demuxer
is also keeping track its programs internally, so that should be a better place
to handle it.
Also it is not a very good idea to keep fields like program_num or
pmt_stream_idx in an AVStream, because a single stream can be part of multiple
programs, multiple PMTs, so the stream attributes can refer to any program the
stream is part of.
Since they are not part of public API, lets simply remove them, or rather
replace them with placeholders for ABI compatibility with libavdevice.
Signed-off-by: Marton Balint <cus@passwd.hu>
Also make sure we are checking the old state of the streams because otherwise
some streams might already have the newly parsed stream identifiers which
corrupts matching.
Fixes streams having the same identifier mixed up on pmt version change.
Fixes ticket #9006.
Signed-off-by: Marton Balint <cus@passwd.hu>
Otherwise there can be a small period when the programs only contain the PMT
pid.
Also make sure skip_clear only affects AVProgram clear, and that pmt_pid is
always kept as the first entry of the PID list of the programs. Also reject
PMTs for programs on the wrong PID.
Signed-off-by: Marton Balint <cus@passwd.hu>
PID 0 was removed from the pid list when then PMT was parsed, it is better
to explictly avoid it from being discarded instead of keeing it in the list of
every program.
Signed-off-by: Marton Balint <cus@passwd.hu>
av_new_program returns the existing program if that already exists, in that
case it makes no sense to overwrite existing attributes.
Signed-off-by: Marton Balint <cus@passwd.hu>
INT32_MAX (2147483647) isn't exactly representable by a floating point
value, with the closest being 2147483648.0. So when rescaling a value
of 1.0, this could overflow when casting the 64-bit value returned from
lrintf() into 32 bits.
Unfortunately the properties of integer overflows don't match up well
with how a Fourier Transform operates. So clip the value before
casting to a 32-bit int.
Should be noted we don't have overflows with the table values we're
currently using. However, converting a Kaiser-Bessel window function
with a length of 256 and a parameter of 5.0 to fixed point did create
overflows. So this is more of insurance to save debugging time
in case something changes in the future.
The macro is only used during init, so it being a little slower is
not a problem.
Commit bdd31feec9 changed the SBC decoder to only set the output
sample format on init, instead of setting it explicitly on each frame,
which is correct. But the SBC parser overrides the sample format to S16,
which triggers a crash when combining the parser and the decoder.
Fix the issue by not setting the sample format anymore in the parser,
which is wrong.
Signed-off-by: James Almer <jamrial@gmail.com>
The function is not used anywhere else and is causing mingw-w64 clang
builds to fail with
ffmpeg-git/libavdevice/decklink_dec.cpp:792:5: error: no previous prototype for function 'get_bmd_timecode' [-Werror,-Wmissing-prototypes]
int get_bmd_timecode(AVFormatContext *avctx, AVTimecode *tc, AVRational frame_rate, BMDTimecodeFormat tc_format, IDeckLinkVideoInputFrame *videoFrame)
Signed-off-by: Christopher Degawa <ccom@randomderp.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Currently skip_samples is set to start_pad if sample_time is lesser or
equal to 0. This can cause issues if the stream starts with packets that
have negative pts. Calling avformat_seek_file() with ts set to 0 on such
streams makes the mov demuxer return the right corresponding packets
(near the 0 timestamp) but set skip_samples to start_pad which is
incorrect as the audio decoder will discard the returned samples
according to skip_samples from the first packet it receives (which has
its timestamp near 0).
For example, considering the following audio stream with start_pad=1344:
[PKT pts=-1344] [PKT pts=-320] [PKT pts=704] [PKT pts=1728] [...]
Calling avformat_seek_file() with ts=0 makes the next call to
av_read_frame() return the packet with pts=-320 and a skip samples
side data set to 1344 (start_pad). This makes the audio decoder
incorrectly discard (1344 - 320) samples.
This commit makes the move demuxer adjust skip_samples according to the
stream start_pad, seek timestamp and first sample timestamp.
The above example will now result in av_read_frame() still returning the
packet with pts=-320 but with a skip samples side data set to 320
(src_pad - (seek_timestamp - first_timestamp)). This makes the audio
decoder only discard 320 samples (from pts=-320 to pts=0).
Signed-off-by: Marton Balint <cus@passwd.hu>
Runtime checks for whether the encoder is fixed-point or not are
unnecessary here as this is a template; furthermore, there is no
fixed-point EAC-3 encoder, so some checks for whether one is in EAC-3
mode can be omitted when doing fixed-point encoding.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
ff_eac3_exponent_init() set values twice when initializing a static
table; ergo the initialization code must not run concurrently with
a running EAC-3 encoder. Yet this code is executed every time an EAC-3
encoder is initialized. So use ff_thread_once() for this and also for a
similar initialization performed for all AC-3 encoders to make them all
init-threadsafe.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Fixes: left shift of negative value -25824
Fixes: 27754/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_XMA2_fuzzer-5760255962906624
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 9223372036854775807 + 32768 cannot be represented in type 'long'
Fixes: 27744/clusterfuzz-testcase-minimized-ffmpeg_dem_DHAV_fuzzer-5179319491756032
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Also do it for FFT_FLOAT only, as this is the only combination for which
it can be set.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>