jpeg2000_decode_tile() (which is run concurrently by several threads
when using slice threading) currently modifies some joint values
before doing its actual work. This is a data race that happens to work
because all threads set the same values; but it is nevertheless
undefined behaviour.
Fix this by performing said preparatory work in the main thread instead.
This fixes the vsynth(1|2|_lena)-jpeg2000(-97)? FATE-tests when using
TSAN and slice threading.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When AV_CODEC_EXPORT_DATA_FILM_GRAIN is present, AV1 decoder should
disable film grain application and export the corresponding side data
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
For vaapi if the init_pool_size is not zero, the pool size is fixed.
This means max surfaces is init_pool_size, but when mapping vaapi
frame to qsv frame, the init_pool_size < nb_surface. The cause is that
vaapi_decode_make_config() config the init_pool_size and it is called
twice. The first time is to init frame_context and the second time is to
init codec. On the second time the init_pool_size is changed to original
value so the init_pool_size is lower than the reall size because
pool_size used to initialize frame_context need to plus thread_count and
3 (guarantee 4 base work surfaces). Now add code to make sure
init_pool_size is only set once. Now the following commandline works:
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi -i input.264 \
-vf "hwmap=derive_device=qsv,format=qsv" \
-c:v h264_qsv output.264
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Makes Bulldozer prefer AVX functions rather than AVX2,
which are 64% slower:
AVX: 117653 decicycles in av_tx (fft), 1048535 runs, 41 skips
AVX2: 193385 decicycles in av_tx (fft), 1048561 runs, 15 skips
The only difference between both is that vgatherdpd is used in
the former. We don't want to mark them with the new SLOW_GATHER
flag however, since gathers are still faster on Haswell/Zen 2/3
than plain loads.
If a codelet initializes 2 subtransforms, and the second one fails,
the failure would free all subcontexts.
Instead, if there are subcontexts still left, don't free the array.
If all initializations fail, the init() function will return,
and reset_ctx() from the previous step will clean up all contained
subtransforms.
Fix CID: 1497864
The control flow should return ENOSYS if nb_cd_matches is 0 at before
and the ret equal AVERROR(ENOMEM) or goto end label, so remove the last
control flow if (ret >= 0) before end label.
Signed-off-by: Steven Liu <liuqi05@kuaishou.com>
Add intra refresh support to hevc_qsv as well.
Add an new intra refresh type: "horizontal", and an new param
ref_cycle_dist. This param specify the distance between the
beginnings of the intra-refresh cycles in frames.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Add b_strategy option to hevc_qsv. By enabling this option, encoder can
use b frames as reference.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
This broke builds with --disable-mmx, which also disabled assembly
entirely, but ARCH_X86 was still true, so the init file tried to find
assembly that didn't exist.
Instead of checking for architecture, check if external x86 assembly
is enabled.
E.g. the inclusion of parser.h comes from a time when
the parser used a H264Context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is only needed by h264_cabac.c and h264_cavlc.c.
Also fix up the other headers while at it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The only thing that is actually used directly from there is the
PART_NOT_AVAILABLE constant, which can be moved to h264pred.h.
Otherwise it only depends on other indirectly included headers.