GSoC'22
libavfilter/vf_chromakey_cuda.cu:the CUDA kernel for the filter
libavfilter/vf_chromakey_cuda.c: the C side that calls the kernel and gets user input
libavfilter/allfilters.c: added the filter to it
libavfilter/Makefile: added the filter to it
cuda/cuda_runtime.h: added two math CUDA functions that are used in the filter
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
cuda_runtime.h as well as dynlink_loader.h used nonstandard inclusion
guards with an AV_ prefix, although these files are not in an libav*/
path. So change the inclusion guards and adapt the ref file of the
source fate test accordingly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
This avoids using the CUDA SDK at all; instead, we provide a minimal
reimplementation of the basic functionality that lavfi actually uses.
It generates very similar code to what NVCC produces.
The header contains no implementation code derived from the SDK.
The function and type declarations are derived from the SDK only to the
extent required to build a compatible implementation. This is generally
accepted to qualify as fair use.
Because this option does not require the proprietary SDK, it does not require
the "--enable-nonfree" flag in configure.
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
External headers are no longer welcome in the ffmpeg codebase because they
increase the maintenance burden. However, in the NVidia case the vanilla
headers need some modifications to be usable in ffmpeg therefore we still
provide them, but in a separate repository.
The external headers can be found at
https://git.videolan.org/?p=ffmpeg/nv-codec-headers.git
Fate-source is updated because of the deleted files, and dynlink_loader.h
license headers were updated with the standard FFmpeg headers.
Signed-off-by: Marton Balint <cus@passwd.hu>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
Windows nvcc + cl.exe produce a .ctx file with CR+LF newlines which
need to be stripped to work with gcc.
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
This raises the required minimum NVIDIA display driver versions:
NVIDIA Linux display driver 378.13 or newer
NVIDIA Windows display driver 378.66 or newer
The nvidia 375.xx driver introduces support for P016 output surfaces,
for 10bit and 12bit HEVC content (it's also the first driver to support
hardware decoding of 12bit content).
The cuvid api, as far as I can tell, only declares one output format
that they appear to refer to as P016 in the driver strings. Of course,
10bit content in P016 is identical to P010, and it is useful for
compatibility purposes to declare the format to be P010 to work with
other components that only know how to consume P010 (and to avoid
triggering swscale conversions that are lossy when they shouldn't be).
For simplicity, this change does not maintain the previous ability
to output dithered NV12 for 10/12 bit input video - the user will need
to update their driver to decode such videos.
We need to remove the dynlink fanciness and replace it with normal
function prototypes and update the include paths and configure logic.
We don't need to explicitly check for PICPARMS now - they're going
to be there.
For unknown reasons, the only accurately descriptive version of
cuviddec.h is in the Video SDK - the one in CUDA 7.5 lacks vp8
PICPARAMS and the vp9 struct definition is inaccurate. The CUDA 8 RC
includes an ancient version of this file from many many years go.
However, the one in the Video SDK is modified to work through a
dynamic link mechanism which we don't really want to use, so the
next change will modify the files to just declare functions in
the normal way.
I've split the changes so it's clear to see what changed between
the original files and ones that work for us.