Fixes: CID1551694 Use after free (false positive based on assuming that out == in and one is freed and one used)
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1598548 Logically dead code
Sponsored-by: Sovereign Tech Fund
Reviewed-by: "Xiang, Haihao" <haihao.xiang@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Due to hysterical raisins, most RISC-V Linux distributions target a
RV64GC baseline excluding the Bit-manipulation ISA extensions, most
notably:
- Zba: address generation extension and
- Zbb: basic bit manipulation extension.
Most CPUs that would make sense to run FFmpeg on support Zba and Zbb
(including the current FATE runner), so it makes sense to optimise for
them. In fact a large chunk of existing assembler optimisations relies
on Zba and/or Zbb.
Since we cannot patch shared library code, the next best thing is to
carry a flag initialised at load-time and check it on need basis.
This results in 3 instructions overhead on isolated use, e.g.:
1: AUIPC rd, %pcrel_hi(ff_rv_zbb_supported)
LBU rd, %pcrel_lo(1b)(rd)
BEQZ rd, non_Zbb_fallback_code
// Zbb code here
The C compiler will typically load the flag ahead of time to reducing
latency, and can also keep it around if Zbb is used multiple times in a
single optimisation scope. For this to work, the flag symbol must be
hidden; otherwise the optimisation degrades with a GOT look-up to
support interposition:
1: AUIPC rd, GOT_OFFSET_HI
LD rd, GOT_OFFSET_LO(rd)
LBU rd, (rd)
BEQZ rd, non_Zbb_fallback_code
// Zbb code here
This patch adds code to provision the flag in libraries using bit
manipulation functions from libavutil: byte-swap, bit-weight and
counting leading or trailing zeroes.
Add xpu device support to libtorch backend.
To enable xpu support you need to add
"-Wl,--no-as-needed -lintel-ext-pt-gpu -Wl,--as-needed" to
"--extra-libs" when configure ffmpeg.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Actually, the jaccard distance is defined as D = 1 - intersect / union.
Additionally, the distance value is compared against a constant that
must be between 0 and 1, which is not the case here. Both facts together
has led to the fact, that the function always returned a matching course
signature. To leave the constant intact and to avoid floating point
computation, this commit multiplies with 1 << 16 making the constant
effectively 9000 / (1<<16) =~ 0.14.
Reported-by: Sachin Tilloo <sachin.tilloo@gmail.com>
Reviewed-by: Sachin Tilloo <sachin.tilloo@gmail.com>
Tested-by: Sachin Tilloo <sachin.tilloo@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The result might not fit into 32bit if an image has gigantic
dimensions and one of the planes has a dominant value
(particularly so if said value is big).
Fixes Coverity issues #1598399, #1598401, #1598402, #1598403, #1598404.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For code such as 'model->model = ov_model' is confusing. We can
just drop the member variable and use cast to get the subclass.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
It will be freed again by ff_dnn_uninit.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
It will be freed again by ff_dnn_uninit.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
The earlier code distinguished between a partial reset
(yae_clear()) and a complete reset (yae_release_buffers()
which also releases the buffers); this separation existed
to avoid allocations, as buffers were reallocated on reconfigs.
Yet it is pointless since a5704659e3,
so simply use yae_release_buffers() everywhere.
Reviewed-by: Pavel Koshevoy <pkoshevoy@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>