This allows users to build FFmpeg against Intel oneVPL. oneVPL 2.6
is the minimum version required when building the Intel oneVPL code.
The configure script fails if both libmfx and libvpl are enabled.
It is recommended to use oneVPL for new work, even on currently available
hardware [1].
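The two build constraints above can be sketched in C as follows. This is
only an illustration: the real check lives in the configure script, and the
struct and function names (build_config, check_qsv_config) are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical summary of the requested build options. */
struct build_config {
    bool enable_libmfx;   /* --enable-libmfx was requested */
    bool enable_libvpl;   /* libvpl was requested */
    int  vpl_major;       /* detected oneVPL version, major part */
    int  vpl_minor;       /* detected oneVPL version, minor part */
};

/* Returns 0 if the combination is acceptable, a negative value otherwise. */
static int check_qsv_config(const struct build_config *cfg)
{
    /* libmfx and libvpl are mutually exclusive. */
    if (cfg->enable_libmfx && cfg->enable_libvpl)
        return -1;

    /* oneVPL 2.6 is the minimum version for the oneVPL code. */
    if (cfg->enable_libvpl &&
        (cfg->vpl_major < 2 ||
         (cfg->vpl_major == 2 && cfg->vpl_minor < 6)))
        return -2;

    return 0;
}
```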
Note that the preferred child device type for libvpl on Windows is d3d11va.
The commands below use d3d11va if it is available on Windows:
$ ffmpeg -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -qsv_device 0 -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -init_hw_device qsv=qsv:hw_any -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -init_hw_device qsv=qsv:hw_any,child_device=0 -hwaccel qsv -c:v h264_qsv ...
Users may use the child_device_type option to set the child device type to
dxva2, or derive a qsv device from a dxva2 device:
$ ffmpeg -init_hw_device qsv=qsv:hw_any,child_device=0,child_device_type=dxva2 -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -init_hw_device dxva2=d3d9:0 -init_hw_device qsv=qsv@d3d9 -hwaccel qsv -c:v h264_qsv ...
[1] https://www.intel.com/content/www/us/en/develop/documentation/upgrading-from-msdk-to-onevpl/top.html
If a qsv hwdevice is available, use the mfxLoader handle in the qsv hwdevice
to create the mfx session. Otherwise create the mfx session with a new
mfxLoader handle.
This is in preparation for oneVPL support
In oneVPL, MFXLoad() and MFXCreateSession() are required to create a
workable mfx session [1].
Add config filters for D3D9/D3D11 session (galinart)
The default device is changed to d3d11va for oneVPL when both d3d11va
and dxva2 are enabled on Microsoft Windows
This is in preparation for oneVPL support
[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/programming_guide/VPL_prg_session.html#onevpl-dispatcher
Co-authored-by: galinart <artem.galin@intel.com>
Signed-off-by: galinart <artem.galin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The following Cflags have been added to libmfx.pc, so the mfx/ prefix is no
longer needed when including mfx headers in FFmpeg.
Cflags: -I${includedir} -I${includedir}/mfx
Some old versions of libmfx have the following Cflags in libmfx.pc
Cflags: -I${includedir}
We may add -I${includedir}/mfx to CFLAGS when running 'configure
--enable-libmfx' for old versions of libmfx; if so, mfx headers can also be
included without the mfx/ prefix.
If libmfx comes without pkg-config support, a small change to the
environment settings (e.g. adding -I/opt/intel/mediasdk/include/mfx
instead of -I/opt/intel/mediasdk/include to CFLAGS) lets the build
find the mfx headers without the mfx/ prefix.
After applying this change, we won't need to change #include for mfx
headers when mfx headers are installed under a new directory.
This is in preparation for oneVPL support (mfx headers in oneVPL are
installed under vpl directory)
The VP9 data structures in mfxvp9.h are wrapped by MFX_VERSION_NEXT,
which means they have never been used in a public release. MFX_CODEC_VP9
and the other VP9 definitions were actually added in mfxstructures.h.
In addition, mfxdefs.h is included in mfxvp9.h, so we may use the check
in this patch for MFX_CODEC_VP9.
This is in preparation for oneVPL support, because mfxvp9.h has been
removed from oneVPL [1].
[1]: https://github.com/oneapi-src/oneVPL
Intel's oneVPL is the successor to MediaSDK, but it removed some obsolete
features of MediaSDK [1]. Some early versions of oneVPL still use libmfx
as the library name [2]. However, some of these obsolete features, including
OPAQUE memory, multi-frame encode, user plugins and the LA_EXT rate control
mode, are enabled in FFmpeg's QSV code, so users cannot use --enable-libmfx
to enable QSV when building against an early version of the oneVPL SDK. To
ensure that users build FFmpeg against a suitable version of libmfx, this
patch adds a check for version < 2.0 and a warning message about the
obsolete features in use.
[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/VPL_intel_media_sdk.html
[2] https://github.com/oneapi-src/oneVPL
It is the proper place to set it, directly besides mb_width and
mb_stride. The reason for doing it the way it is done now seems
to be that the code does not create more slice contexts than necessary
(i.e. not more than one per row), so that this number needs to be
known before setting the number of slices. But this can always be
arranged by just moving the code that sets the number of slices.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These fields are only ever set by the encoder for the current picture
and for no other picture, so only one set of these values needs to
exist. Move them to MpegEncContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Poisoning returned buffers is based around the implicit assumption
that the contents of said buffers are transient. Yet this is not true
for the buffer pools used by the various hardware contexts which store
important state in there that needs to be preserved.
Furthermore, the current code is also based on the assumption
that the complete buffer pointed to by AVBuffer->data coincides with
AVBufferRef->data; yet an implementation might store some data of its
own before the actual user-visible data (accessible via AVBufferRef)
which would be broken by the current code.
(This is of course yet more proof that the AVBuffer API is not the right
tool for the hardware contexts.)
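The second problem can be illustrated with a small sketch. Everything here
is hypothetical (the poison value, the header size and the pooled_alloc
layout are assumptions for illustration, not the AVBuffer implementation):
an allocation keeps private data in front of the user-visible region, and
poisoning from the allocation base clobbers that private data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON_BYTE 0x2a   /* hypothetical poison value */
#define PRIVATE_HDR 16     /* hypothetical size of the private header */
#define USER_SIZE   64     /* hypothetical size of the user-visible data */

/* Hypothetical allocation: private header followed by user-visible data. */
struct pooled_alloc {
    uint8_t storage[PRIVATE_HDR + USER_SIZE];
};

/* Start of the whole allocation (what the owning buffer points to). */
static uint8_t *alloc_base(struct pooled_alloc *a)
{
    return a->storage;
}

/* Start of the region handed to the user (what the reference points to). */
static uint8_t *user_data(struct pooled_alloc *a)
{
    return a->storage + PRIVATE_HDR;
}

/* Poisons user_size bytes starting at the allocation base. This is only
 * correct if base and user data coincide; here it overwrites the private
 * header and leaves the tail of the user region untouched. */
static void poison_from_base(struct pooled_alloc *a, size_t user_size)
{
    memset(alloc_base(a), POISON_BYTE, user_size);
}
```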
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all the buffers that are made writable, three are always allocated
and the other four are allocated iff any one of them is allocated;
so one can replace the seven checks for existence with one.
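A rough sketch of the simplification, using hypothetical names
(picture_bufs, make_writable and make_tables_writable are stand-ins, not
the actual mpegvideo code): the three mandatory buffers need no check, and
a single check on one optional buffer covers the whole group of four.

```c
#include <stddef.h>

/* Hypothetical picture with seven buffers: three that always exist and
 * four that are allocated together or not at all. */
struct picture_bufs {
    void *always[3];
    void *optional[4];
};

/* Hypothetical stand-in for the real "make writable" step; it only
 * counts how often it is called. */
static int make_writable(void **buf, int *calls)
{
    (void)buf;
    (*calls)++;
    return 0;
}

static int make_tables_writable(struct picture_bufs *p, int *calls)
{
    /* Always allocated: no existence check needed. */
    for (int i = 0; i < 3; i++)
        make_writable(&p->always[i], calls);

    /* Allocated iff any one of them is: one check covers all four. */
    if (p->optional[0])
        for (int i = 0; i < 4; i++)
            make_writable(&p->optional[i], calls);

    return 0;
}
```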
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, ff_wmv2_decode_secondary_picture_header() only
set the mb_type array for non I-pictures, so that the decoding
process uses the earlier values of this array; this affects
the output of the wmv8-x8intra FATE-test (which this patch
therefore updates). These earlier values were set when decoding
earlier frames or when the buffer was initially zero-allocated.
A consequence of this is that the output of this test would be
random if ff_find_unused_picture() were to select the unused picture
to return at random. Furthermore, decoding from a keyframe onwards
depends upon the earlier state of the decoder.
This patch therefore zeroes said array when decoding an I picture.
(It is not claimed that zero is the right value to fill the array with.
I just don't know.)
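The idea can be sketched as follows. The types and names below (dec_ctx,
pict_type, reset_mb_type_for_intra) are simplified, hypothetical stand-ins
and not the actual decoder structures.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum pict_type { PICT_I, PICT_P };   /* hypothetical, simplified */

/* Hypothetical decoder state: one mb_type entry per macroblock. */
struct dec_ctx {
    uint32_t *mb_type;
    int mb_stride;
    int mb_height;
};

/* For I pictures, clear mb_type so that decoding no longer depends on
 * whatever an earlier frame left in the array. Other picture types are
 * left alone because their mb_type entries get set during decoding. */
static void reset_mb_type_for_intra(struct dec_ctx *s, enum pict_type type)
{
    if (type == PICT_I)
        memset(s->mb_type, 0,
               sizeof(*s->mb_type) * (size_t)s->mb_stride * (size_t)s->mb_height);
}
```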
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Include two values for it: a default one that keeps the current behavior,
where the frame event generated by the primary input has a timestamp
equal to or higher than that of the frames in the secondary input, and a new
one where the secondary input frame is the one whose timestamp is closest in
absolute terms to that of the frame event.
Addresses ticket #9689, where the new optional behavior produces better frame
synchronization.
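The difference between the two modes can be sketched like this. It is a
simplified model, not the framesync code: the enum and function names are
hypothetical, the secondary timestamps are assumed to be sorted in
ascending order, and ties are resolved in favor of the earlier frame.

```c
#include <stddef.h>
#include <stdint.h>

enum ts_mode { TS_DEFAULT, TS_NEAREST };   /* hypothetical names */

/* Returns the index of the secondary frame paired with primary_ts,
 * or -1 if no secondary frame qualifies. */
static int pick_secondary(const int64_t *secondary_ts, size_t n,
                          int64_t primary_ts, enum ts_mode mode)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (secondary_ts[i] <= primary_ts) {
            /* Default rule: latest frame not after the primary event. */
            best = (int)i;
        } else {
            /* First frame after the primary event: in nearest mode,
             * take it if it is strictly closer than the current pick. */
            if (mode == TS_NEAREST &&
                (best < 0 ||
                 secondary_ts[i] - primary_ts <
                 primary_ts - secondary_ts[best]))
                best = (int)i;
            break;
        }
    }
    return best;
}
```

With secondary timestamps 0, 40, 80 and a primary event at 70, the default
mode picks the frame at 40, while the nearest mode picks the frame at 80.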
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: James Almer <jamrial@gmail.com>
Store the item ids of all the items found in the file and
process the primary item at the end of the meta box. This patch
does not change any behavior. It sets up the code for parsing
the alpha channel (and possibly images with 'grid') in follow-up
patches.
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: James Zern <jzern@google.com>
These tables are only used by encoders and only for the current picture;
ergo they need not be put into the picture at all, but rather into
the encoder's context. They also don't need to be refcounted,
because there is only one owner.
In contrast to this, the earlier code refcounts them which
incurs unnecessary overhead. These references are not unreferenced
in ff_mpeg_unref_picture() (they are kept in order to have something
like a buffer pool), so that several buffers are kept at the same
time, although only one is needed, thereby wasting memory.
The code also propagates references to other pictures not part of
the pictures array (namely the copy of the current/next/last picture
in the MpegEncContext which get references of their own). These
references are not unreferenced in ff_mpeg_unref_picture() (the
buffers are probably kept in order to have something like a pool),
yet if the current picture is a B-frame, it gets unreferenced
at the end of ff_mpv_encode_picture() and its slot in the picture
array will therefore be reused the next time; but the copy of the
current picture also still has its references and therefore
these buffers will be duplicated in order to make them writable
in the next call to ff_mpv_encode_picture(). This is of course
unnecessary.
Finally, ff_find_unused_picture() is supposed to just return
any unused picture and the code is supposed to work with it;
yet for the vsynth*-mpeg4-adap tests the result depends upon
the content of these buffers; given that this patchset
changes the content of these buffers (the initial content is now
the state of these buffers after encoding the last frame;
before this patch the buffers used came from the last picture
that occupied the same slot in the picture array) their ref-files
needed to be changed. This points to a bug somewhere (if one removes
the initialization, one gets uninitialized reads in
adaptive_quantization in ratecontrol.c).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Sufficiently recent Intel hardware is able to do encoding of 8bit 4:4:4
content in HEVC and VP9. The main requirement here is that the frames
must be provided in the AYUV format.
Enabling support is done by adding the appropriate encoding profiles
and noting that AYUV is officially a four channel format with alpha so
we must state that we expect all four channels.
vaapi_decode_find_best_format currently does not set the
VA_SURFACE_ATTRIB_SETTABLE flag on the pixel format attribute that it
returns.
Without this flag, the attribute will be ignored by vaCreateSurfaces,
meaning that the driver's default logic for picking a pixel format will
kick in.
So far, this hasn't produced visible problems, but when trying to
decode 4:4:4 content, at least on Intel, the driver will pick the
444P planar format, even though the decoder can only return the AYUV
packed format.
The hwcontext_vaapi code that sets surface attributes when picking
formats does not have this bug.
Applications may use their own logic for finding the best format, and
so may not hit this bug; e.g. mpv is unaffected.
The present default value of 0 will render the overlay video invisible.
A default of 1.0 is consistent with most common use cases.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Directly branch into the special 64-point deinterleave
subroutine rather than going through the general deinterleave.
64-point transform timings on Zen 3:
Before:
1974 decicycles in av_tx (fft),16776864 runs, 352 skips
After:
1956 decicycles in av_tx (fft),16775378 runs, 1838 skips
This codepath is enabled by default on arm, if the linux perf API
is available, unless disabled with --disable-linux-perf.
Signed-off-by: Martin Storsjö <martin@martin.st>
-stream_loop is currently handled by destroying the demuxer thread,
seeking, then recreating it anew. This is very messy and conflicts with
the future goal of moving each major ffmpeg component into its own
thread.
Handle -stream_loop directly in the demuxer thread. Looping requires the
demuxer to know the duration of the file, which takes into account the
duration of the last decoded audio frame (if any). Use a thread message
queue to communicate this information from the main thread to the
demuxer thread.
This avoids a potential race with the demuxer adding new streams. It is
also more efficient, since we no longer do inter-thread transfers of
packets that will be just discarded.
This undocumented feature runtime-enables dumping input packets. I can
think of no reasonable real-world use case that cannot also be
accomplished in a different way. Keeping this functionality would
interfere with the following commit moving it to the input thread (then
setting the variable would require locking or atomics, which would be
unnecessarily complicated for a feature that probably nobody uses).