Perform av_clip_int16(val) _after_ copying the value to last_dc.
This change ensures that clipping is applied only within the context of
the current block, preventing the propagation of clipped values to
subsequent DC components.
Related commits: c28f648b19 and dffae122d0
Related ticket: 4683
This avoids hardcoding any implementation-specific limitiations as
part of the API, and allows for future expandability.
This also allows API users to more conveniently convert the
values into floats without hardcoding specific conversion constants.
The API was committed a few days ago, so changing this field now
is within the realms of acceptable.
This allows ending up with a normal, non-fragmented file when
the file is finished, while keeping the file readable if writing
is aborted abruptly at any point. (Normally when writing a
mov/mp4 file, the unfinished file is completely useless unless it
is finished properly.)
This results in a file where the mdat atom contains (and hides)
all the moof atoms that were part of the fragmented file structure
initially.
Signed-off-by: Martin Storsjö <martin@martin.st>
ffprobe is meant to generate parseable output, and if a field is present, it
should be printed even if it has a default value.
Signed-off-by: James Almer <jamrial@gmail.com>
These tables are supposed to contain the number of bits needed
to encode a given (run, level) pair. Yet the number of bits
for pairs needing the escape code was wrong (it only contained
the escape code and not the bits needed for run and level).
Furthermore, H.261 (a format with explicit end-of-block codes)
does not work well together with the RLTable API from rl.c:
The EOB code is the first one in ff_h261_rl_tcoeff's VLC table
and has a run value of zero. Therefore the result of get_rl_index()
is off by one for run == 0 and level values with explicit
(run, level) pair.
Fixing this necessitated changing the ref files of the
vsynth*-h261-trellis tests. Both filesizes as well as PSNR
decreased. If one used a qscale value of 11 for this test,
one would have received files with about the same size as
before this patch (with qscale 12), but with better PSNR.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Not every function will be set, so zero the context
to initialize everything.
This also allows to remove an initialization in dvenc.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
MECmpContext.ildct_cmp is an array of function pointers that
are not set by ff_me_cmp_init(), but that are set by users
to one of the other arrays via ff_set_cmp().
Remove these pointers from MECmpContext and add pointers
for the actually used functions to its users. (The DV encoder
already did so.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
MECmpContext has several arrays of function pointers that
are not set by ff_me_cmp_init(), but that are set by users
to one of the other arrays via ff_set_cmp().
One of these other users is mpegvideo_enc; it is the only user
of MECmpContext.frame_skip_cmp and it only uses one of these
function pointers at all.
This commit therefore moves this function pointer to MpegEncContext;
and removes the array from MECmpContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
MECmpContext has several arrays of function pointers that
are not set by ff_me_cmp_init(), but that are set by users
to one of the other arrays via ff_set_cmp().
One of these other users is the motion estimation API.
It uses MECmpContext.(me_pre|me|me_sub|mb)_cmp. It is
basically the only user of these arrays.
This commit therefore moves these arrays to MotionEstContext;
this has the additional advantage of making motion_est.c
more independent from MpegEncContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Before After
-------------------------------------------------
make fate-vvc CPU Time (No ASM) 131.52s 134.83s
libavcodec/vvc/* Line Coverage 95.3% 96.9%
inter_template.c Line Coverage 74.3% 88.2%
inter.c Line Coverage 85.3% 99.2%
Signed-off-by: Frank Plowman <post@frankplowman.com>
Otherwise a bunch of SEI units that should not be in hvcC will be included,
and generate different output with builds where extract_extradata_bsf is not
present.
Signed-off-by: James Almer <jamrial@gmail.com>
The check should be >= 0, not > 0. The check itself is redundant
since uninit only being called after init is success.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Also make the iso_channel_position table consistent with what the AAC decoder
uses in avcodec/aac/aacdec_usac.c.
Fate changes are caused by the change of how 7.1 layout is mapped, previously
it included Side Surround channels, now it includes the Surround channels.
Signed-off-by: Marton Balint <cus@passwd.hu>
Besides being write only it had the wrong type:
An uint8_t is definitely not enough to store the size
of these buffers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Due to hysterical raisins, most RISC-V Linux distributions target a
RV64GC baseline excluding the Bit-manipulation ISA extensions, most
notably:
- Zba: address generation extension and
- Zbb: basic bit manipulation extension.
Most CPUs that would make sense to run FFmpeg on support Zba and Zbb
(including the current FATE runner), so it makes sense to optimise for
them. In fact a large chunk of existing assembler optimisations relies
on Zba and/or Zbb.
Since we cannot patch shared library code, the next best thing is to
carry a flag initialised at load-time and check it on need basis.
This results in 3 instructions overhead on isolated use, e.g.:
1: AUIPC rd, %pcrel_hi(ff_rv_zbb_supported)
LBU rd, %pcrel_lo(1b)(rd)
BEQZ rd, non_Zbb_fallback_code
// Zbb code here
The C compiler will typically load the flag ahead of time to reducing
latency, and can also keep it around if Zbb is used multiple times in a
single optimisation scope. For this to work, the flag symbol must be
hidden; otherwise the optimisation degrades with a GOT look-up to
support interposition:
1: AUIPC rd, GOT_OFFSET_HI
LD rd, GOT_OFFSET_LO(rd)
LBU rd, (rd)
BEQZ rd, non_Zbb_fallback_code
// Zbb code here
This patch adds code to provision the flag in libraries using bit
manipulation functions from libavutil: byte-swap, bit-weight and
counting leading or trailing zeroes.
Instead of an ad-hoc scheme. Also, combine skipping RASL frames with
skip_frame handling - current code seems flawed as it only executes for
the first slice of a RASL frame and unnecessarily unsets is_decoded,
which should not be set at this point anyway..
Some RASL frames in fate-hevc-afd-tc-sei that were previously discarded
are now output.
B0 is defined by system header, see f0f596dbc6 for ref.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
src is apparently not guaranteed to be >8 byte aligned, but align to 16
nonetheless as the x86 asm will do unaligned loads anyway.
dst is guaranteed to be 32 byte aligned for the Y plane, but 16 byte for UV.
Signed-off-by: James Almer <jamrial@gmail.com>
The MMXEXT versions of the rgb2rgb functions tested here
always emit emms on their own. Therefore one can use
a stricter test to ensure that it stays that way.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no benefit in using it: The fast path of copying
is not taken because of misalignment; furthermore we are
only dealing with a few byte here anyway, so simply copy
the bytes manually, avoiding the dependency on bitstream.c
in lavf (which also contains a function that is completely
unused in lavf).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A 360 video specific tool
see https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9503377
passed files:
DMVR_A_Huawei_3.bit
WRAP_D_InterDigital_4.bit
WRAP_A_InterDigital_4.bit
WRAP_B_InterDigital_4.bit
WRAP_C_InterDigital_4.bit
ERP_A_MediaTek_3.bit
The OS may silently fix (emulate) unaligned hardware access exceptions.
This is extremely slow and code should be fixed not to rely on unaligned
access on affected hardware. Accordingly this requests that the OS
disable emulation and instead throw Bus error, which will be caught by
checkasm's signal handler.
This has no effects if the hardware supports unaligned access in
hardware, since no exceptions are generated. prctl() will fail safe in
that case.
The line width 8 is supposed to test corner case, while the
performance doesn't matter. Width 1080 is also a case of
unaligned to 16.
Width 1920 meant for benchmark (together with --runs options).
Signed-off-by: James Almer <jamrial@gmail.com>
From Benjamin Bross:
> for ALF where functions are in increments of 4 while 8 should be sufficient according to the spec.
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>