From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)
x86inc can automatically determine whether to use REP_RET rather than
REP in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessary, despite the return being nowhere near a
branch.
The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.
In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
If it's unsupported or invalid, then there's no point trying to rebuild it
using a value that may have been derived from the same layout to begin with.
Move the checks before the attempts at copying the layout while at it.
Fixes ticket #9908.
Signed-off-by: James Almer <jamrial@gmail.com>
This is more spec-compliant because it does not rely
on dead-code elimination by the compiler. Especially
MSVC has problems with this, as can be seen in
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/296373.html
or
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/297022.html
This commit does not eliminate every instance where we rely
on dead code elimination: It only tackles branching to
the initialization of arch-specific dsp code, not e.g. all
uses of CONFIG_ and HAVE_ checks. But maybe it is already
enough to compile FFmpeg with MSVC with whole-programm-optimizations
enabled (if one does not disable too many components).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). So given that the only systems which benefit
from the MMXEXT resamplers (which are overridden by SSE2)
are truely ancient 32bit x86s they are removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2) for x64. So given that the only systems that
benefit from these functions are truely ancient 32bit x86s
they are removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2) for x64. So given that the only systems that
benefit from these functions are truely ancient 32bit x86s
they are removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is a x86-32 MMXEXT implementation for resampling
planar 16bit data. multiple_resample() therefore calls
emms_c() if it thinks that this needed. And this is bad:
1. It is a maintenance nightmare because changes to the
x86 resample DSP code would necessitate changes to the check
whether to call emms_c().
2. The return value of av_get_cpu_flags() does not tell
whether the MMX DSP functions are in use, as they could
have been overridden by av_force_cpu_flags().
3. The MMX DSP functions will never be overridden in case of
an x86-32 build with --disable-sse2. In this scenario lots of
resampling tests (like swr-resample_exact_lin_async-s16p-8000-48000)
fail because the cpuflags indicate that SSE2 is available
(presuming that the test is run on a CPU with SSE2).
4. The check includes a call to av_get_cpu_flags(). This is not
optimized away for arches other than x86-32.
5. The check takes about as much time as emms_c() itself,
making it pointless.
This commit therefore removes the check and calls emms_c()
unconditionally (it is a no-op for non-x86).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This avoids having to rebuild big files every time FFMPEG_VERSION
changes (which it does with every commit).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This avoids unnecessary churn and build breakage for users, by
making sure the whole version.h is included like it has been so far,
while keeping the benefit of not needing to rebuild most files in
the ffmpeg tree on minor/micro bumps.
Signed-off-by: Martin Storsjö <martin@martin.st>
Also bump the minor versions of all libraries, to signify the
API change of splitting the version.h headers and adding the
new version_major.h header.
Signed-off-by: Martin Storsjö <martin@martin.st>
This is done a second time for 5.0 because master was
merged into 5.0 so that it contains the recent DOVI additions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In case of shared builds, some object files containing tables
are currently duplicated into other libraries: log2_tab.c,
golomb.c, reverse.c. The check for whether this is duplicated
is simply whether CONFIG_SHARED is true. Yet this is crude:
E.g. libavdevice includes reverse.c for shared builds, but only
needs it for the decklink input device, which given that decklink
is not enabled by default will be unused in most libavdevice.so.
This commit changes this by making it more explicit about what
to duplicate from other libraries. To do this, two new Makefile
variables were added: SHLIBOBJS and STLIBOBJS. SHLIBOBJS contains
the objects that are duplicated from other libraries in case of
shared builds; STLIBOBJS contains stuff that a library has to
provide for other libraries in case of static builds. These new
variables provide a way to enable/disable with a finer granularity
than just whether shared builds are enabled or not. E.g. lavd's
Makefile now contains: SHLIBOBJS-$(CONFIG_DECKLINK_INDEV) += reverse.o
Another example is provided by the golomb tables. These are provided
by lavc for static builds, even if one uses a build configuration
that makes only lavf use them. Therefore lavc's Makefile contains
STLIBOBJS-$(CONFIG_MXF_MUXER) += golomb.o, whereas lavf's Makefile
has a corresponding SHLIBOBJS-$(CONFIG_MXF_MUXER) += golomb_tab.o.
E.g. in case the MXF muxer is the only component needing these tables
only libavformat.so will contain them for shared builds; currently
libavcodec.so does so, too.
(There is currently a CONFIG_EXTRA group for golomb. But actually
one would need two groups (golomb_avcodec and golomb_avformat) in
order to know when and where to include these tables. Therefore
this commit uses a Makefile-based approach for this and stops
using these groups for the users in libavformat.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If memory allocation fails, ERROR(ENOMEM) '-12' will be returned.
When resample() is done first, the negative size param would cause buffer-overflow and SEGV in swri_rematrix().
When swri_rematrix() is run first, resample() would not cause an error but Err num as a wrong parameter passing.
Err num should be returned immediately. And remove an unneeded term from an assert.
coredump info:
#0 0x499517 in posix_memalign (/home/r1/ffmpeg/ffmpeg_4.4.1+0x499517)
#1 0x6c1f0b4 in av_malloc /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavutil/mem.c:86:9
#2 0x6c208fe in av_mallocz /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavutil/mem.c:239:17
#3 0x6c207ad in av_mallocz_array /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavutil/mem.c:195:12
#4 0x654b2e5 in swri_realloc_audio /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libswresample/swresample.c:418:14
#5 0x654f9a1 in swr_convert_internal /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libswresample/swresample.c:601:17
#6 0x654d2c0 in swr_convert /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libswresample/swresample.c:766:19
#7 0x186cf56 in flush_frame /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/af_aresample.c:251:13
#8 0x186a454 in request_frame /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/af_aresample.c:288:20
#9 0x787d9c in ff_request_frame_to_filter /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/avfilter.c:459:15
#10 0x7877f1 in forward_status_change /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/avfilter.c:1257:19
#11 0x77ed7e in ff_filter_activate_default /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/avfilter.c:1288:20
#12 0x77e4e1 in ff_filter_activate /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/avfilter.c:1441:11
#13 0x793b3f in ff_filter_graph_run_once /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/avfiltergraph.c:1403:12
#14 0x7a7bee in get_frame_internal /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/buffersink.c:131:19
#15 0x7a7287 in av_buffersink_get_frame_flags /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/buffersink.c:142:12
#16 0x792888 in avfilter_graph_request_oldest /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/libavfilter/avfiltergraph.c:1356:17
#17 0x5d07df in transcode_from_filter /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/fftools/ffmpeg.c:4639:11
#18 0x59e557 in transcode_step /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/fftools/ffmpeg.c:4729:20
#19 0x593970 in transcode /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/fftools/ffmpeg.c:4805:15
#20 0x58f7a4 in main /home/r1/ffmpeg/ffmpeg-4.4.1/build/src/fftools/ffmpeg.c:5010:9
#21 0x7f6fd2dee0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
SUMMARY: AddressSanitizer: negative-size-param (/home/r1/ffmpeg/ffmpeg_4.4.1+0x497e67) in __asan_memcpy
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
SWR_CH_MAX is internal only and the arrays are therefore not required
to have that many elements (and they typically don't do it). So remove
this potentially confusing hint.
(Newer versions of GCC emit -Warray-parameter= warnings for this,
because the definition with explicit size differs from the declaration
(which leaves the size unspecified); this is IMO a false-positive,
because definition and declaration didn't conflict, but anyway it is
fixed by this commit.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead only include libavutil/version.h; including avutil.h is a
remnant from the time in which the version was in it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These inclusions are not necessary, as cpu.h is already included
wherever it is needed (via direct inclusion or via the arch-specific
headers).
Also remove other unnecessary cpu.h inclusions from ordinary
non-headers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is not used here at all; instead, add it where it is used without
including it or any of the arch-specific CPU headers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Some files currently rely on libavutil/cpu.h to include it for them;
yet said file won't use include it any more after the currently
deprecated functions are removed, so include attributes.h directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is as far as 22.2 follows the same channel order as
WaveFormatExtensible's channel mask (and the AV_CH_* defines).
After LFE2 the side channels would follow, but that offset of
one stops us from utilizing them without further tweaks.
This change was verified by using swresample to downmix to 5.1,
and then feeding that to WASAPI.
Only this sub-set of channels actually follows the bit mask order
in the official 22.2 channel mapping. Additionally, the 5.1 channels
are there for backwards compatibility with the previous system.
This enables the utilization of 22.2 content until a proper down/up
matrix is added into swresample.