The test /libavutil/tests/hwdevice checks that when deriving a device
from a source device and then deriving back to the type of the source
device, the result is matching the original source device, i.e. the
derivation mechanism doesn't create a new device in this case.
Previously, this test was usually passed, but only due to two different
kind of flaws:
1. The test covers only a single level of derivation (and back)
It derives device Y from device X and then Y back to the type of X and
checks whether the result matches X.
What it doesn't check for, are longer chains of derivation like:
CUDA1 > OpenCL2 > CUDA3 and then back to OpenCL4
In that case, the second derivation returns the first device (CUDA3 ==
CUDA1), but when deriving OpenCL4, hwcontext.c was creating a new
OpenCL4 context instead of returning OpenCL2, because there was no link
from CUDA1 to OpenCL2 (only backwards from OpenCL2 to CUDA1)
If the test would check for two levels of derivation, it would have
failed.
This patch fixes those (yet untested) cases by introducing forward
references (derived_device) in addition to the existing back references
(source_device).
2. hwcontext_qsv didn't properly set the source_device
In case of QSV, hwcontext_qsv creates a source context internally
(vaapi, dxva2 or d3d11va) without calling av_hwdevice_ctx_create_derived
and without setting source_device.
This way, the hwcontext test ran successful, but what practically
happened, was that - for example - deriving vaapi from qsv didn't return
the original underlying vaapi device and a new one was created instead:
Exactly what the test is intended to detect and prevent. It just
couldn't do so, because the original device was hidden (= not set as the
source_device of the QSV device).
This patch properly makes these setting and fixes all derivation
scenarios.
(at a later stage, /libavutil/tests/hwdevice should be extended to check
longer derivation chains as well)
Reviewed-by: Lynne <dev@lynne.ee>
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Tested-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: softworkz <softworkz@hotmail.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Fix#7830
When we upload a frame that is not padded as MSDK requires, we create a
new AVFrame to copy data. The frame's padding data is uninitialized so
it brings run to run problem. For example, If we run the following
command serveral times we will get different outputs.
ffmpeg -init_hw_device qsv=qsv:hw -qsv_device /dev/dri/renderD128 \
-filter_hw_device qsv -f rawvideo -s 192x200 -pix_fmt p010 \
-i 192x200_P010.yuv -vf "format=nv12,hwupload=extra_hw_frames=16" \
-c:v hevc_qsv output.265
According to https://github.com/Intel-Media-SDK/MediaSDK/blob/master/doc/mediasdk-man.md#encoding-procedures
"Note: It is the application's responsibility to fill pixels outside
of crop window when it is smaller than frame to be encoded. Especially
in cases when crops are not aligned to minimum coding block size (16
for AVC, 8 for HEVC and VP9)"
I add a function to fill padding area with border pixel to fix this
run2run problem, and also move the new AVFrame to global structure
to reduce redundant allocation operation to increase preformance.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The data stored in data[3] in VAAPI AVFrame is VASurfaceID while
the data stored in pair->first is the pointer of VASurfaceID, so
we need to do cast to make following commandline works:
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi -i input.264 \
-vf "hwmap=derive_device=qsv,format=qsv" -c:v h264_qsv output.264
Signed-off-by: nyanmisaka <nst799610810@gmail.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
It has already been checked immediately before that said
AVDictionaryEntry exists; checking again is redundant.
Furthermore, av_hwdevice_find_type_by_name() requires its argument
to be non-NULL, so adding a codepath that automatically calls it
with that parameter is nonsense. The same goes for the argument
corresponding to %s.
Fixes Coverity issue 1491394.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This av_buffer_create() does nothing but leak an AVBuffer and an
AVBufferRef (except on allocation error).
Fixes Coverity issue 1491393.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Command below failed.
ffmpeg -v verbose -init_hw_device vaapi=va:/dev/dri/renderD128
-init_hw_device qsv=qs@va -hwaccel qsv -hwaccel_device qs
-filter_hw_device va -c:v h264_qsv
-i 1080P.264 -vf "hwmap,format=vaapi" -c:v h264_vaapi output.264
Cause: Assign pair->first directly to data[3] in vaapi frame.
pair->first is *VASurfaceID while data[3] in vaapi frame is
VASurfaceID. I fix this line of code. Now the command above works.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Microsoft VideoProcessor requires texture with D3DUSAGE_RENDERTARGET flag as output.
There is no way to allocate array of textures with D3D11_BIND_RENDER_TARGET flag
and .ArraySize > 2 by ID3D11Device_CreateTexture2D due to the Microsoft limitation.
Adding AVD3D11FrameDescriptors array to store array of single textures
instead of texture with multiple slices resolves this.
Signed-off-by: Artem Galin <artem.galin@intel.com>
UPD: Rebase of last patch set over current master and use DX9 as default device type.
Makes selection of dxva2/DX9 device type by default as before with explicit d3d11va/DX11 usage to cover more HW configurations.
Added warning message to expect changing default device type in the future.
Fixes TGL / AV1 decode as requires DX11 with explicit DX11 type
selection.
Add headless/multi adapter support and fixes:
https://trac.ffmpeg.org/ticket/7511https://trac.ffmpeg.org/ticket/6827http://ffmpeg.org/pipermail/ffmpeg-trac/2017-November/041901.htmlhttps://trac.ffmpeg.org/ticket/7933338fbcd5bbhttps://github.com/jellyfin/jellyfin/issues/2626#issuecomment-602153952
Any other fixes are welcome including OpenCL interop patch since I don't have proper setup to validate this use case
Decoding, encoding, transcoding have been validated.
child_device_type option is responsible for d3d11va/dxva2 device selection
Usage examples:
DirectX 11:
-init_hw_device qsv:hw,child_device_type=d3d11va
-init_hw_device qsv:hw,child_device_type=d3d11va,child_device=0
OR
-init_hw_device d3d11va=dx -init_hw_device qsv@dx
DirectX 9 is still supported but requires explicit selection:
-init_hw_device qsv:hw,child_device_type=dxva2
OR
-init_hw_device dxva2=dx -init_hw_device qsv@dx
Signed-off-by: Artem Galin <artem.galin@intel.com>
This enables usage of non-powered/headless GPU, better HDR support.
Pool of resources is allocated as one texture with array of slices.
Signed-off-by: Artem Galin <artem.galin@intel.com>
This allows for users who derive devices to set options for the
new device context they derive.
The main use case of this is to allow users to enable extensions
(such as surface drawing extensions) in Vulkan while deriving from
the device their frames are on. That way, users don't need to write
any initialization code themselves, since the Vulkan spec invalidates
mixing instances, physical devices and active devices.
Apart from Vulkan, other hwcontexts ignore the opts argument since they
don't support options at all (or in VAAPI and OpenCL's case, options are
currently only used for device selection, which device_derive overrides).
Enables HEVC Range Extension decoding support (Linux) for 4:2:2 8/10 bit
on ICL+ (gen11 +) platform.
Restricted to linux only for now.
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Tries to find a device backed by the i915 kernel driver and loads the iHD
VAAPI driver to use with it. This reduces confusion on machines with
multiple DRM devices and removes the surprising requirement to set the
LIBVA_DRIVER_NAME environment variable to use libmfx at all.
Fix the aligned check in hwupload, input surface should be 16 aligned
too.
Partly fix#7830.
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Zhong Li <zhong.li@intel.com>
Libmfx requires 16 bytes aligned input/output for uploading.
Currently only output is 16 byte aligned and assigning same width/height to
input with smaller buffer size actually, thus definitely will cause segment fault.
Can reproduce with any 1080p nv12 rawvideo input:
ffmpeg -init_hw_device qsv=qsv:hw -hwaccel qsv -filter_hw_device qsv -f rawvideo -pix_fmt nv12 -s:v 1920x1080
-i 1080p_nv12.yuv -vf 'format=nv12,hwupload=extra_hw_frames=16,hwdownload,format=nv12' -an -y out_nv12.yuv
It can fix#7418
Signed-off-by: Zhong Li <zhong.li@intel.com>
RGB32(AV_PIX_FMT_BGRA on intel platforms) format may be used as overlay with alpha blending.
So add AV_PIX_FMT_BGRA format support.
One example of alpha blending overlay: ffmpeg -hwaccel qsv -c:v h264_qsv -i BA1_Sony_D.jsv
-filter_complex 'movie=lena-rgba.png,hwupload=extra_hw_frames=16[a];[0:v][a]overlay_qsv=x=10:y=10'
-c:v h264_qsv -y out.mp4
Rename RGB32 to be BGRA to make it clearer as Mark Thompson's suggestion.
V2: Add P010 format support else will introduce HEVC 10bit encoding regression.
Thanks for LinJie's discovery.
Signed-off-by: Zhong Li <zhong.li@intel.com>
Verified-by: Fu, Linjie <linjie.fu@intel.com>
Variable 'ret' hasn't been initialized,thus introducing a random
hwupload failure regression due to qsv session uninitialized.
Signed-off-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Removing unused VPP sessions by initializing only when used in order to help
reduce CPU utilization.
Thanks to Maxym for the guidance.
Signed-off-by: Joe Olivas <joseph.k.olivas@intel.com>
Signed-off-by: Maxym Dmytrychenko <maxim.d33@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Per MediaSDK documentation, it requires width/height to 16 alignment.
Without this patch, hwupload pipeline may fail if 16 alignment is
not met. Although this patch also apply 16 alignment to qsv encoder/decoder,
it will not bring any side-effect to them as they are already aligned.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The PicStruct is required by MediaSDK, so give a default value.
hwupload does not work without this.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
It is benefit to diagnose issues related to different libmfx version.
Signed-off-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Fixes build warning of "variable 's' is declared but not used"
Signed-off-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
The initialisation should be common. For libmfx, it was previously
happening in the derivation function and this moves it out. For VAAPI,
it fixes some failures when deriving from a DRM device because this
initialisation did not run.
Uploading/downloading data through VPP may not work for some formats, in
that case we can still try to call av_hwframe_transfer_data() on the
child context.
Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>
Certain pixel formats (e.g. P8) might not be supported for
download/upload through VPP operations, but can still be used otherwise.
Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>
When using GPU surfaces with QSV, one needs to supply a frame allocator,
which will be invoked to pass surface pools to libmfx.
For encoding, this allocator gets invoked not only for the pool of input
frames, but also for a separate pool of (apparently) reconstructed frames
and another pool of MFX_FOURCC_P8, which on Windows needs to return
D3DFMT_P8 D3D surfaces. Those are probably used to store the encoded
bitstream on the GPU.
Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>