This patch aims to reduce the number of input/output surfaces
NVENC allocates per session. Previous default sets allocated surfaces to 32
(unless there is user specified param or lookahead involved). Having large
number of surfaces consumes extra video memory (esp for higher resolution
encoding). The patch changes the surfaces calculation for default, B-frames,
lookahead scenario respectively.
The other change involves surface selection. Previously, if a session
allocates x surfaces, only x-1 surfaces are used (due to combination
of output delay and lock toggle logic). To prevent unused surfaces,
changing surface rotation to using predefined fifo.
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
As Nvidia has put the most recent Video Codec SDK behind a double
registration wall, of which one needs manual approval of a lenghty
application, bundling this header saves everyone trying to use NVENC
from that headache.
The header is still MIT licensed and thus fine to bundle with ffmpeg.
Not bundling this header would get ffmpeg stuck at SDK v6, which is
still freely available, holding back future development of the NVENC
encoder.
Use explicit nvenc capability checks instead to determine usable devices
instead of SM versions.
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
The code needs only a few definitions from cuda.h, so define them
directly when CUDA is not enabled. CUDA is still required for accepting
HW frames as input.
Based on the code by Timo Rothenpieler <timo@rothenpieler.org>.
When there is a non-zero decoding delay due to reordering, the first dts
should be lower than the first pts (since the first packet fed to the
decoder does not produce any output).
Use the same scheme used in mpegvideo_enc (which comes from x264
originally) -- wait for first two timestamps and extrapolate linearly to
the past to produce the first dts value.