x4 - x25 faster.
check_deps() recursively enables/disables components, and its loop is
iterated nearly 6000 times. It's particularly slow in bash - currently
consuming more than 50% of configure runtime, and about 20% with other
shells.
This commit applies a few local optimizations, most effective first:
- Use $1 $2 ... instead of pushvar/popvar, and same at enable_deep*
- Abort early in one notable case - empty deps, to avoid a costly no-op.
- Smaller changes which do add up:
  - Handle ${cfg}_checking locally instead of via enable[d]/disable
  - ${cfg}_checking: test done before inprogress - x2 faster in 50%+
  - one eval instead of several at the empty-deps early abort path.
- The "actual work" part is unmodified - just its surroundings.
Biggest speedups (relative and absolute) are observed with bash.
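
As a rough illustration of the first two items (not the actual configure
code), the sketch below keeps per-call state in the positional parameters,
which POSIX sh saves and restores around every function call, so a recursive
walk needs no manual push/pop of global variables. It also returns early when
an item has no deps. The dependency table, walk() and $order are hypothetical
names used only for this example.

```shell
# Hypothetical dependency table: item "x" lists its deps in ${x}_deps.
a_deps="b c"
b_deps="c"
c_deps=""

order=""   # items in the order they were completed

walk() {
    # Already handled: nothing to do.
    case " $order " in *" $1 "*) return 0 ;; esac

    # Reload the positional parameters as: item dep1 dep2 ...
    # They are private to this call, so recursion cannot clobber them.
    eval "set -- \"\$1\" \$${1}_deps"

    # Early exit for the common empty-deps case: skip the loop entirely.
    if [ $# -gt 1 ]; then
        for dep in "$@"; do
            [ "$dep" = "$1" ] || walk "$dep"   # skip the item itself
        done
    fi
    order="$order $1"   # $1 is still this call's item after recursing
}

walk a
echo "order:$order"
```

The loop variable dep is global and is overwritten by nested calls, which is
harmless here because it is only read before each recursive call.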
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Tested-by: Helmut K. C. Tessarek <tessarek@evermeet.cx>
Tested-by: Dave Yeo <daveryeo@telus.net>
Tested-by: Reino Wijnsma <rwijnsma@xs4all.nl>
Signed-off-by: James Almer <jamrial@gmail.com>
x4 - x10 faster.
Inside print_enabled_components, the filter_list case invokes sed
about 350 times to parse the same source file and extract different
info for each arg. This is never instant, and on systems where fork is
slow (notably MSYS2/Cygwin on Windows) it takes many seconds.
Change it to use sed once on the source file and set env vars with the
parse results, then use these results inside the loop.
Additionally, the cases of indev_list and outdev_list are very
infrequent, but nevertheless they're faster, and arguably cleaner, with
shell parameter substitutions than with command substitutions.
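
For illustration only, here is the general difference between the two
approaches. The device name and the _demuxer suffix are example values, not
necessarily what configure uses.

```shell
name=alsa_indev

# Command substitution: forks a subshell and runs an external sed.
slow=$(echo "$name" | sed 's/_indev$//')_demuxer

# Parameter substitution: handled inside the shell, no fork.
fast=${name%_indev}_demuxer

echo "$slow $fast"
```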
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Tested-by: Helmut K. C. Tessarek <tessarek@evermeet.cx>
Tested-by: Dave Yeo <daveryeo@telus.net>
Tested-by: Reino Wijnsma <rwijnsma@xs4all.nl>
Signed-off-by: James Almer <jamrial@gmail.com>
x50 - x200 faster.
Currently configure spends 50-70% of its runtime inside a single
function: flatten_extralibs[_wrapper] - which does string processing.
During its run, nearly 20K command substitutions (subshells) are used,
including its callees unique() and resolve(), which is the reason
for its lengthy run.
This commit avoids all subshells during its execution, speeding it up
by about two orders of magnitude, and reducing the overall configure
runtime by 50-70%.
resolve() is rewritten to avoid subshells, and in unique() and
flatten_extralibs() we "inline" the filter[_out] functionality.
Note that logically, "unique" functionality has more than one possible
output (depending on which of the recurring items is kept). As it
turns out, other parts expect the last recurring item to be kept
(which was the original behavior of unique()). This patch preserves
its output order.
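
A subshell-free "keep the last recurring item" pass can be sketched as
follows. This is an independent illustration, not the patched unique(): the
function name unique_last and the variable unique_result are hypothetical, and
items are assumed to contain no whitespace or glob characters.

```shell
unique_last() {
    # Build the reversed list so that "last occurrence" becomes "first seen".
    rev=
    for tok in "$@"; do
        rev="$tok $rev"
    done

    out=
    for tok in $rev; do
        case " $out " in
            *" $tok "*) ;;            # a later copy was already kept
            *) out="$tok $out" ;;     # prepend to restore original order
        esac
    done
    unique_result=${out% }            # drop the trailing space
}

unique_last -la -lb -la -lc -lb
echo "$unique_result"
```

Membership is tested with a case pattern and the result is returned in a
variable, so no command substitution is needed anywhere in the loop.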
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Tested-by: Helmut K. C. Tessarek <tessarek@evermeet.cx>
Tested-by: Dave Yeo <daveryeo@telus.net>
Tested-by: Reino Wijnsma <rwijnsma@xs4all.nl>
Signed-off-by: James Almer <jamrial@gmail.com>
Encoder frame_number may be double-counted if some frames are cached and then flushed.
Take the qsv encoder as an example (some frames are cached first for asynchronism):
./ffmpeg -loglevel verbose -hwaccel qsv -c:v h264_qsv -i in.mp4 -vframes 100 -c:v h264_qsv out.mp4
The frame_number passed to the encoder is double-counted and larger than the accurate value.
Libx264 encoding with B-frames can also reproduce it.
Signed-off-by: Zhong Li <zhong.li@intel.com>
Fixes: signed integer overflow: -19818 + -2147483648 cannot be represented in type 'int'
Fixes: 9545/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SNOW_fuzzer-4928769537081344
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
ISMV lacks any sort of edit list support, and tfxd is
effectively the PTS of the fragment for most intents and purposes.
Thus, if b-frames are requested without negative CTS offsets you
end up with N frames' worth of delay (tfxd PTS plus the CTS offset
of the first sample). Negative CTS offsets enable the first sample
to have CTS=DTS, and thus a/v desync due to b-frame reorder delay
is avoided.
Since libopus 1.2, packets of sizes 80ms, 100ms and 120ms are allowed.
Fixes assertion failures when trying to mux such streams.
Signed-off-by: James Almer <jamrial@gmail.com>
Packets of sizes 80ms, 100ms and 120ms are allowed since libopus 1.2
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This reverts commit 7e0df5910e.
"complete frames" containers, even if they don't need to assemble
packets, still depended on this code for proper packet duration and
timestamp generation.
This field is a uint16_t, see docs:
http://opus-codec.org/docs/opus_in_isobmff.html#4.3.2
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Remove the redundant av_init_packet() after av_packet_unref().
av_packet_unref() already calls av_init_packet() and resets the packet size.
Signed-off-by: Jun Zhao <mypopydev@gmail.com>
If there is a saio/saiz in clear content, we shouldn't create the
encryption index if we don't already have one. Otherwise it will
confuse the cenc_filter.
The changed method is also used for senc atoms, but they should not
appear in clear content.
Found by Chromium's ClusterFuzz: https://crbug.com/873432
Signed-off-by: Jacob Trimble <modmaker@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fix the warning:
libavcodec/libkvazaar.c:210:27: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type [-Wincompatible-pointer-types]
frame->data, frame->linesize,
^~~~~
In file included from libavcodec/libkvazaar.c:31:0:
./libavutil/imgutils.h:119:6: note: expected ‘const uint8_t ** {aka const unsigned char **}’ but argument is of type ‘uint8_t * const* {aka unsigned char * const*}’
void av_image_copy(uint8_t *dst_data[4], int dst_linesizes[4],
Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Fix the build warning "ISO C90 forbids mixed declarations and code".
Reviewed-by: Steven Liu <lq@chinaffmpeg.org>
Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Currently floats are converted to 16-bit uint in the input part.
Using the src depth (32 bits) in hScale16To19 and hScale16to15
makes an invalid shift for the data, so for float input shift the
value as is done for 16 bpc uint.
Also fix a memory leak issue, per James's comments.
V2: use a local pict_type since coded_frame is deprecated.
Signed-off-by: Zhong Li <zhong.li@intel.com>