This commit cleans up and refactors the mess of private state upon
private state that used to be.
Now, FFHWBaseEncodePicture is fully initialized upon call-time,
and, most importantly, this lets APIs which require initialization
data for frames (VkImageViews) to initialize this for both the
input image, and the reconstruction (DPB) image.
Signed-off-by: Tong Wu <wutong1208@outlook.com>
This patch is to make FFHWBaseEncodeContext a standalone component
and avoid getting FFHWBaseEncodeContext from avctx->priv_data.
This patch also removes some unnecessary AVCodecContext arguments.
For receive_packet call, a small wrapper is introduced.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Move receive_packet function to base. This requires adding *alloc,
*issue, *output, *free as hardware callbacks. HWBaseEncodePicture is
introduced as the base layer structure. The related parameters in
VAAPIEncodeContext are also extracted to HWBaseEncodeContext. Then DPB
management logic can be fully extracted to base layer as-is.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Since VAAPI and future D3D12VA implementation may share some common parameters,
a base layer encode context is introduced as vaapi context's base.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Makes it robust against adding fields before it, which will be useful in
following commits.
Majority of the patch generated by the following Coccinelle script:
@@
typedef AVOption;
identifier arr_name;
initializer list il;
initializer list[8] il1;
expression tail;
@@
AVOption arr_name[] = { il, { il1,
- tail
+ .unit = tail
}, ... };
with some manual changes, as the script:
* has trouble with options defined inside macros
* sometimes does not handle options under an #else branch
* sometimes swallows whitespace
Up until now, the VAAPI encoder uses fake data with the
AVBuffer-API: The data pointer does not point to real memory,
but is instead just a VABufferID converted to a pointer.
This has probably been copied from the VAAPI-hwcontext-API
(which presumably does it to avoid allocations).
This commit changes this without causing additional allocations
by switching to the RefStruct-pool API. This also fixes an
unchecked av_buffer_ref().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To support more reference frames from different directions.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
These defines are also used in other contexts than just AVCodecContext
ones, e.g. in libavformat. Furthermore, given that these defines are
public, the AV-prefix is the right one, so deprecate (and not just move)
the FF-macros.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Add support for max frame size:
- max_frame_size (bytes) to indicate the max allowed size for frame.
Control each encoded frame size into target limitation size by adjusting
whole frame's average QP value. The driver will use multi passes to
adjust average QP setp by step to achieve the target, and the result
may not strictly guaranteed. Frame size may exceed target alone with
using the maximum average QP value. The failure always happens on the
intra(especially the first intra frame of a new GOP) frames or set
max_frame_size with a very small number.
example cmdline:
ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -f rawvideo \
-v verbose -s:v 352x288 -i ./input.yuv -vf format=nv12,hwupload \
-c:v h264_vaapi -profile:v main -g 30 -rc_mode VBR -b:v 500k \
-bf 3 -max_frame_size 40000 -vframes 100 -y ./max_frame_size.h264
Max frame size was enabled since VA-API version (0, 33, 0), but query
is available since (1, 5, 0). It will be passed as a parameter in picParam
and should be set for each frame.
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
The block size can be dependent on the profile and entrypoint selected.
It defaults to 16x16, with codecs able to override this choice with their
own function.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Use GPB frames to replace regular P/B frames if backend driver does not
support it.
- GPB:
Generalized P and B picture. Regular P/B frames replaced by B
frames with previous-predict only, L0 == L1. Normal B frames
still have 2 different ref_lists and allow bi-prediction
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Fix: #7706. After commit 5fdcf85bbf, vaapi encoder's performance
decrease. The reason is that vaRenderPicture() and vaSyncBuffer() are
called at the same time (vaRenderPicture() always followed by a
vaSyncBuffer()). Now I changed them to be called in a asynchronous way,
which will make better use of hardware.
Async_depth is added to increase encoder's performance. The frames that
are sent to hardware are stored in a fifo. Encoder will sync output
after async fifo is full.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Add vaSyncBuffer to VAAPI encoder. Old version API vaSyncSurface wait
surface to complete. When surface is used for multiple operation, it
waits all operations to finish. vaSyncBuffer only wait one channel to
finish.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Dimensions are normally specified as width x height, and this will match
the same option to libaom-av1.
Remove the indirection through the private context at the same time.
Add functions to initialize tile slice structure and make tile slice:
- vaapi_encode_init_tile_slice_structure
- vaapi_encode_make_tile_slice
Tile slice is not allowed to cross the boundary of a tile due to
the constraints of media-driver. Currently adding support for one
slice per tile.
N x N tile encoding is supposed to be supported with the the
capability of ARBITRARY_MACROBLOCKS slice structures.
N X 1 tile encoding should also work in ARBITRARY_ROWS slice
structure.
Signed-off-by: Linjie Fu <linjie.justin.fu@gmail.com>
This commit follows the same logic as 061a0c14bb, but for the encode API: The
new public encoding API will no longer be a wrapper around the old deprecated
one, and the internal API used by the encoders now consists of a single
receive_packet() callback that pulls frames as required.
amf encoders adapted by James Almer
librav1e encoder adapted by James Almer
nvidia encoders adapted by James Almer
MediaFoundation encoders adapted by James Almer
vaapi encoders adapted by Linjie Fu
v4l2_m2m encoders adapted by Andriy Gelman
Signed-off-by: James Almer <jamrial@gmail.com>
This removes the use of the nonstandard combined structures, which
generated some warnings with clang and will cause alignment problems
with some parameter buffer types.
This attaches the logic of picking the mode of for the next picture to
the output, which simplifies some choices by removing the concept of
the picture for which input is not yet available. At the same time,
we allow more complex reference structures and track more reference
metadata (particularly the contents of the DPB) for use in the
codec-specific code.
It also adds flags to explicitly track the available features of the
different codecs. The new structure also allows open-GOP support, so
that is now available for codecs which can do it.
This adds common code to query driver support and set appropriate
address/size information for each slice. It only supports rectangular
slices for now, since that is the most common use-case.
Add a larger warning more clearly explaining the consequences of missing
packed header support in the driver. Also only write the extradata if the
user actually requests it via the GLOBAL_HEADER flag.
Choose what types of reference frames will be used based on what types
are available, and make the intra-only mode explicit (GOP size one,
which must be used for MJPEG).
Query which modes are supported and select between VBR and CBR based
on that - this removes all of the codec-specific rate control mode
selection code.
Previously there was one fixed choice for each codec (e.g. H.265 -> Main
profile), and using anything else then required an explicit option from
the user. This changes to selecting the profile based on the input format
and the set of profiles actually supported by the driver (e.g. P010 input
will choose Main 10 profile for H.265 if the driver supports it).
The entrypoint and render target format are also chosen dynamically in the
same way, removing those explicit selections from the per-codec code.
This removes the arbitrary limit on the allowed number of slices and
parameter buffers.
From ffmpeg commit e4a6eb70f4.
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Change the slice/parameter buffers to be allocated dynamically.
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Use AVCodecContext.compression_level rather than a private option,
replacing the H.264-specific quality option (which stays only for
compatibility).
This now works with the H.265 encoder in the i965 driver, as well as
the existing cases with the H.264 encoder.
(cherry picked from commit 19388a7200)
Use AVCodecContext.compression_level rather than a private option,
replacing the H.264-specific quality option (which stays only for
compatibility).
This now works with the H.265 encoder in the i965 driver, as well as
the existing cases with the H.264 encoder.
Only do this when building for a recent VAAPI version - initial
driver implementations were confused about the interpretation of the
framerate field, but hopefully this will be consistent everywhere
once 0.40.0 is released.
(cherry picked from commit ff35aa8ca4)