Add support for max frame size:
- max_frame_size (bytes) to indicate the max allowed size for frame.
Control each encoded frame size into target limitation size by adjusting
whole frame's average QP value. The driver will use multi passes to
adjust average QP setp by step to achieve the target, and the result
may not strictly guaranteed. Frame size may exceed target alone with
using the maximum average QP value. The failure always happens on the
intra(especially the first intra frame of a new GOP) frames or set
max_frame_size with a very small number.
example cmdline:
ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -f rawvideo \
-v verbose -s:v 352x288 -i ./input.yuv -vf format=nv12,hwupload \
-c:v h264_vaapi -profile:v main -g 30 -rc_mode VBR -b:v 500k \
-bf 3 -max_frame_size 40000 -vframes 100 -y ./max_frame_size.h264
Max frame size was enabled since VA-API version (0, 33, 0), but query
is available since (1, 5, 0). It will be passed as a parameter in picParam
and should be set for each frame.
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
The block size can be dependent on the profile and entrypoint selected.
It defaults to 16x16, with codecs able to override this choice with their
own function.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Use GPB frames to replace regular P/B frames if backend driver does not
support it.
- GPB:
Generalized P and B picture. Regular P/B frames replaced by B
frames with previous-predict only, L0 == L1. Normal B frames
still have 2 different ref_lists and allow bi-prediction
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Fix: #7706. After commit 5fdcf85bbf, vaapi encoder's performance
decrease. The reason is that vaRenderPicture() and vaSyncBuffer() are
called at the same time (vaRenderPicture() always followed by a
vaSyncBuffer()). Now I changed them to be called in a asynchronous way,
which will make better use of hardware.
Async_depth is added to increase encoder's performance. The frames that
are sent to hardware are stored in a fifo. Encoder will sync output
after async fifo is full.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Add vaSyncBuffer to VAAPI encoder. Old version API vaSyncSurface wait
surface to complete. When surface is used for multiple operation, it
waits all operations to finish. vaSyncBuffer only wait one channel to
finish.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Dimensions are normally specified as width x height, and this will match
the same option to libaom-av1.
Remove the indirection through the private context at the same time.
Add functions to initialize tile slice structure and make tile slice:
- vaapi_encode_init_tile_slice_structure
- vaapi_encode_make_tile_slice
Tile slice is not allowed to cross the boundary of a tile due to
the constraints of media-driver. Currently adding support for one
slice per tile.
N x N tile encoding is supposed to be supported with the the
capability of ARBITRARY_MACROBLOCKS slice structures.
N X 1 tile encoding should also work in ARBITRARY_ROWS slice
structure.
Signed-off-by: Linjie Fu <linjie.justin.fu@gmail.com>
This commit follows the same logic as 061a0c14bb, but for the encode API: The
new public encoding API will no longer be a wrapper around the old deprecated
one, and the internal API used by the encoders now consists of a single
receive_packet() callback that pulls frames as required.
amf encoders adapted by James Almer
librav1e encoder adapted by James Almer
nvidia encoders adapted by James Almer
MediaFoundation encoders adapted by James Almer
vaapi encoders adapted by Linjie Fu
v4l2_m2m encoders adapted by Andriy Gelman
Signed-off-by: James Almer <jamrial@gmail.com>
This removes the use of the nonstandard combined structures, which
generated some warnings with clang and will cause alignment problems
with some parameter buffer types.
This attaches the logic of picking the mode of for the next picture to
the output, which simplifies some choices by removing the concept of
the picture for which input is not yet available. At the same time,
we allow more complex reference structures and track more reference
metadata (particularly the contents of the DPB) for use in the
codec-specific code.
It also adds flags to explicitly track the available features of the
different codecs. The new structure also allows open-GOP support, so
that is now available for codecs which can do it.
This adds common code to query driver support and set appropriate
address/size information for each slice. It only supports rectangular
slices for now, since that is the most common use-case.
Add a larger warning more clearly explaining the consequences of missing
packed header support in the driver. Also only write the extradata if the
user actually requests it via the GLOBAL_HEADER flag.
Choose what types of reference frames will be used based on what types
are available, and make the intra-only mode explicit (GOP size one,
which must be used for MJPEG).
Query which modes are supported and select between VBR and CBR based
on that - this removes all of the codec-specific rate control mode
selection code.
Previously there was one fixed choice for each codec (e.g. H.265 -> Main
profile), and using anything else then required an explicit option from
the user. This changes to selecting the profile based on the input format
and the set of profiles actually supported by the driver (e.g. P010 input
will choose Main 10 profile for H.265 if the driver supports it).
The entrypoint and render target format are also chosen dynamically in the
same way, removing those explicit selections from the per-codec code.
This removes the arbitrary limit on the allowed number of slices and
parameter buffers.
From ffmpeg commit e4a6eb70f4.
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Change the slice/parameter buffers to be allocated dynamically.
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Use AVCodecContext.compression_level rather than a private option,
replacing the H.264-specific quality option (which stays only for
compatibility).
This now works with the H.265 encoder in the i965 driver, as well as
the existing cases with the H.264 encoder.
(cherry picked from commit 19388a7200)
Use AVCodecContext.compression_level rather than a private option,
replacing the H.264-specific quality option (which stays only for
compatibility).
This now works with the H.265 encoder in the i965 driver, as well as
the existing cases with the H.264 encoder.
Only do this when building for a recent VAAPI version - initial
driver implementations were confused about the interpretation of the
framerate field, but hopefully this will be consistent everywhere
once 0.40.0 is released.
(cherry picked from commit ff35aa8ca4)
This change makes the configured GOP size be respected exactly -
previously the value could be exceeded slightly due to flaws in the
frame type selection logic.
(cherry picked from commit 37fab0661a)
Only do this when building for a recent VAAPI version - initial
driver implementations were confused about the interpretation of the
framerate field, but hopefully this will be consistent everywhere
once 0.40.0 is released.
This change makes the configured GOP size be respected exactly -
previously the value could be exceeded slightly due to flaws in the
frame type selection logic.
This allows better checking of capabilities and will make it easier
to add more functionality later.
It also commonises some duplicated code around rate control setup
and adds more comments explaining the internals.
(cherry picked from commit 80a5d05108)
This allows better checking of capabilities and will make it easier
to add more functionality later.
It also commonises some duplicated code around rate control setup
and adds more comments explaining the internals.
Previously we would allocate a new one for every frame. This instead
maintains an AVBufferPool of them to use as-needed.
Also makes the maximum size of an output buffer adapt to the frame
size - the fixed upper bound was a bit too easy to hit when encoding
large pictures at high quality.