This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VTFramesContext as one no longer has to
go through AVHWFramesInternal.
Tested-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to QSVFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to QSVDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to D3D11VAFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to DXVA2FramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use AVHWFramesContext.hwctx instead.
This simplifies access to VDPAUFramesContext as one no longer has
to go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VDPAUDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This makes the code much simpler (especially for adding support
for other instruction set extensions), avoids needing inline
assembly for this feature, and generally is more of the canonical
way to do this.
The CPU feature detection was added in
493fcde50a, using HWCAP_CPUID.
The argument for using that, was that HWCAP_CPUID was added much
earlier in the kernel (in Linux v4.11), while the HWCAP flags for
individual features always come later. This allows detecting support
for new CPU extensions before the kernel exposes information about
them via hwcap flags.
However in practice, there's probably quite little advantage in this.
E.g. HWCAP2_I8MM was added in Linux v5.10 - long after HWCAP_CPUID,
but there's probably very little practical cases where one would
run a kernel older than that on a CPU that supports those instructions.
Additionally, we provide our own definitions of the flag values to
check (as they are fixed constants anyway), with names not conflicting
with the ones from system headers. This reduces the number of ifdefs
needed, and allows detecting those features even if building with
userland headers that are lacking the definitions of those flags.
Also, slightly older versions of QEMU, e.g. 6.2 in Ubuntu 22.04,
do expose support for these features via HWCAP flags, but the
emulated cpuid registers are missing the bits for exposing e.g. I8MM.
(This issue is fixed in later versions of QEMU though.)
Signed-off-by: Martin Storsjö <martin@martin.st>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to OpenCLFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to OpenCLDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To do so, concatenate all the names together to one big string
name1\0name2\0....lastname\0\0. This avoids the pointer in
the FunctionLoadInfo structure and thereby moves vk_load_info
into .rodata (and makes it smaller by 888B).
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Saves 16B per entry here (four of these 16 bytes are padding);
leads to 1776 B of savings in each file that uses
ff_vk_load_functions().
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are three possible names for the functions requested;
they only differ in an extension: "", "EXT" or "KHR".
Yet vk_load_info contained pointers to all these strings.
This is wasteful and this commit changes it to avoid
the latter two strings. This saves 6353B of strings,
1776 B of .data.rel.ro as well as 5328 B due to the removed
relocations (corresponding to 2 * 111 removed pointers)
in lavc/vulkan_decode.o alone (ff_vk_load_functions()
is inlined in lavfi/vulkan_filter.c, lavu/hwcontext_vulkan.c
and lavc_vulkan_decode.c, so the savings are three times
this for shared builds; for static builds, the number may
be smaller depending upon whether strings are deduplicated).
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanFramesPriv as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanDevicePriv as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use AVHWFramesContext.hwctx instead.
This simplifies accesses to VDPAUFramesContext as one no longer has
to go through AVHWFramesInternal.
Tested-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Correct the names of the format-specific headers (not hwframe_*.h)
and clarify that the user shall ignore this field if there is no
public context associated with it.
In particular, this allows to use this field for the private context
alone if there is no public context. This can't break conforming
API users, because they always have to live with the possibility
that a new public context has been introduced.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Without ff_vk_exec_discard_deps which is called by ff_vk_exec_wait,
the reference count of hwframe context cannot reach zero due to
circular reference created by ff_vk_exec_add_dep_frame.
Fix#10873
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>