Many places are using their own custom code for handling overflow
around timestamps or other int64_t values. There are enough of these
now that having some common saturated math functions seems sound.
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
On windows and darwin (and modern android), the x18 register is reserved
and shouldn't be modified by user code, while it is freely available on
linux. Strictly avoid it, to keep the assembly code portable.
This would have helped catch the issue fixed in 872790b1f9
immediately.
Signed-off-by: Martin Storsjö <martin@martin.st>
Only warn instead. API users can find out which extensions were unavailable
by using the enabled_inst_extensions and enabled_dev_extensions fields.
This eliminates having to trial-and-error to find which extensions were missing.
Due to our AVHWDevice infrastructure, where API users are offered a way
to derive contexts rather than always create new one, our filterchains,
being supported by a single hardware device context, can grow to considerable
size.
Hence, in such situations, using the maximum amount of queues the device offers
can be benefitial to eliminating bottlenecks where queue submissions on the
same family have to wait for the previous one to finish.
This is intended to replace the deprecated the AV_FRAME_DATA_QP_TABLE*
API and extend it to a wider range of codecs.
In the future, it may also be extended to support other encoding
parameters such as motion vectors.
Additional changes by Anton Khirnov <anton@khirnov.net> with suggestions
by Lynne <dev@lynne.ee>.
Signed-off-by: Juan De León <juandl@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This reverts commit 97b526c192.
It broke the API, and assumed no other APIs used multiple semaphores.
This also disallowed certain optimizations to happen.
Dealing with APIs that give or expect single semaphores is easier when
we use per-image semaphores.
The specs note that images should be in the GENERAL layout when exporting
for maximum compatibility.
CUDA exported images are handled differently, and the queue is the same,
so we don't need to do that there.
As it turns out, we were already assuming and treating all images as if they had
concurrent access mode. This just changes the flag to CONCURRENT, which has less
restrictions than EXCLUSIVE, and fixed validation messages on machines with
multiple queues.
The validation layer didn't pick this up because the machine I was testing on
had only a single queue.
This solves a huge oversight - it lets users reliably use their own
AVVulkanDeviceContext. Otherwise, the extensions supplied and enabled
are not discoverable by anything outside of hwcontext_vulkan.
Also clarifies that any user-supplied VkInstance must be at least 1.1.
Also documents all options supported by the hwdevice.
This lets users enable all extensions they need without writing their own
instance initialization code.
Fixes problems when non-rational options were set using rational expressions,
causing rounding errors and the option range limits not to be enforced
properly.
ffmpeg -f lavfi -i "sine=r=96000/2"
This caused an assertion failure with assert level 2.
Signed-off-by: Marton Balint <cus@passwd.hu>
bump minor version for DOVI sidedata, because added the dovi_meta.h
as lavu API part. Also update APIchanges.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
We derive the destination buffer stride from the input stride,
which meant if the image was flipped with a negative stride,
we'd be FFALIGNING a negative number which ends up being huge,
thus making the Vulkan buffer allocation fail and the whole
image transfer fail.
Only found out about this as OpenGL compositors can copy an entire
image with a single call if its flipped, rather than iterate over
each line.
Do not limit the array allocation functions and av_calloc() to allocations
of INT_MAX, instead depend on max_alloc_size like av_malloc().
Allows a workaround for ticket #7140.
The idea was to allow separate planes to be filtered independently, however,
in hindsight, literaly nothing uses separate per-plane semaphores and it
would only work when each plane is backed by separate device memory.
If one calls av_opt_set() with an incorrect string to set the value of
an option of type AV_OPT_TYPE_VIDEO_RATE, the given string is used in a
log message via %s. This also happens when the string is actually a
nullpointer in which case using it for %s is forbidden.
This commit changes this by erroring out early in case of a nullpointer.
This also fixes a warning from GCC 9.2:
"‘%s’ directive argument is null [-Wformat-overflow=]"
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
By itself, this allows 6-point, 10-point and 30-point transforms.
When the 9-point transform is added it allows for 18-point FFT,
and also for a 36-point MDCT (used by MP3).
This matches the inclusion of the other hwcontext_<hwaccel>.h headers.
The skipping of the header depending on build flags is already present.
Signed-off-by: Daniel Playfair Cal: <daniel.playfair.cal@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The specifications are very vague about who has ownership, and in this case,
Vulkan takes ownership of all DMABUF FDs passed to it, causing errors
to occur if someone gave us images for mapping which were meant to be kept.
The old behavior worked with one-way VAAPI and DMABUF imports, but was broken
with clients like wlroots' dmabuf-capture.
There was a recent change in Intel's driver that triggered a driver-internal
error if the semaphore given to the command buffer wasn't initialized.
Given that the specifications require the semaphore to be initialized,
this is within spec. Unlike what's causing it in the first place, which is
that there are no ways to extract/import dma sync objects from DMABUFs,
so we must leave our semaphores bare.
If we are given a non-render node, try to find the matching render node and
fail if that isn't possible.
libva will not accept a non-render device which is not DRM master, because
it requires legacy DRM authentication to succeed in that case:
<https://github.com/intel/libva/blob/master/va/drm/va_drm.c#L68-L75>. This
is annoying for kmsgrab because in most recording situations DRM master is
already held by something else (such as a windowing system), leading to
device derivation not working and forcing the user to create the target
VAAPI device separately.
VA_RT_FORMAT describes the desired sampling format for surface.
When creating surface, VA_RT_FORMAT will be used firstly to choose
the expected fourcc/media_format for the surface. And the fourcc
will be revised by the value of VASurfaceAttribPixelFormat.
Add vaapi_format_map support for new pixel_format Y210.
This is fundamental for both VA-API and QSV.
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Required minimal changes to the code so made sense to implement.
FFT and MDCT tested, the output of both was properly rounded.
Fun fact: the non-power-of-two fixed-point FFT and MDCT are the fastest ever
non-power-of-two fixed-point FFT and MDCT written.
This can replace the power of two integer MDCTs in aac and ac3 if the
MIPS optimizations are ported across.
Unfortunately the ac3 encoder uses a 16-bit fixed point forward transform,
unlike the encoder which uses a 32bit inverse transform, so some modifications
might be required there.
The 3-point FFT is somewhat less accurate than it otherwise could be,
having minor rounding errors with bigger transforms. However, this
could be improved later, and the way its currently written is the way one
would write assembly for it.
Similar rounding errors can also be found throughout the power of two FFTs
as well, though those are more difficult to correct.
Despite this, the integer transforms are more than accurate enough.
Compared to ad-hoc if(printed) ... code this allows the user to disable
it by adjusting the log level
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>