Rework the code a bit to speed up the 10-bit bitpacked decoding
routine. This is probably about as fast as I can get it without
switching to assembly language.
Demonstrable with:
./ffmpeg -f lavfi -i "smptehdbars=size=3840x2160" -c bitpacked -f image2 -frames:v 1 source.yuv
./ffmpeg -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le out.yuv
On my development system, it went from 80ms for a 2160p frame
down to 20ms (i.e. a 4X speedup). Good enough for now, I hope...
Comments from Marton:
Originally on my system better performance could be achieved by simply
switching to the cached bitstream reader, but for Devin it was slower than
his direct byte operations.
I changed the order of writing output from u/y/v/y to u/v/y/y, and that made
the code faster than the cached bitstream reader on my system as well.
TIMER measurement of the decode loop on Ryzen 5 3600 with command line:
./ffmpeg -stream_loop 256 -threads 1 -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le -f null none -loglevel error
Before: 823204127 decicycles in YUV, 256 runs, 0 skips
After: 315070524 decicycles in YUV, 256 runs, 0 skips
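
For illustration only, the direct byte operations and the u/v/y/y write
order could look roughly like the sketch below. It assumes an RFC 4175
style layout in which every 5 bytes hold four 10-bit samples in the order
Cb, Y0, Cr, Y1, most significant bit first. The function name
unpack_uyvy10 and its signature are hypothetical; this is not the actual
decoder code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the FFmpeg decoder: unpack `groups` 5-byte
 * groups (Cb, Y0, Cr, Y1, 10 bits each, MSB first) into planar output. */
static void unpack_uyvy10(const uint8_t *src, size_t groups,
                          uint16_t *y, uint16_t *u, uint16_t *v)
{
    for (size_t i = 0; i < groups; i++, src += 5) {
        /* Both chroma samples are written first, then both luma samples. */
        *u++ = (uint16_t)((src[0] << 2) | (src[1] >> 6));
        *v++ = (uint16_t)(((src[2] & 0x0F) << 6) | (src[3] >> 2));
        *y++ = (uint16_t)(((src[1] & 0x3F) << 4) | (src[2] >> 4));
        *y++ = (uint16_t)(((src[3] & 0x03) << 8) | src[4]);
    }
}
```

Whether this ordering is faster than u/y/v/y depends on the compiler and
the CPU, so it should be measured rather than assumed.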
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Some encoders (e.g., libx264) dump their encoder configuration as a
user data unregistered SEI message. This option tries to print it as
ASCII characters when possible.
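
A minimal sketch of what "print as ASCII when possible" could mean is
given below. The helper payload_to_ascii is hypothetical, and replacing
non-printable bytes with '.' is an assumption made for the example; the
actual option may handle such bytes differently.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy payload bytes into `out`, keeping printable
 * ASCII and replacing everything else with '.'. Returns the number of
 * characters written, not counting the terminating NUL. */
static size_t payload_to_ascii(const uint8_t *data, size_t size,
                               char *out, size_t out_size)
{
    size_t n = 0;

    if (!out_size)
        return 0;
    while (n < size && n + 1 < out_size) {
        out[n] = isprint(data[n]) ? (char)data[n] : '.';
        n++;
    }
    out[n] = '\0';
    return n;
}
```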
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The specification doesn't mention that clusters cannot have alphabet
sizes greater than 1 << bundle->log_alphabet_size, but the reference
implementation rejects these entropy streams as invalid, so we should
too. Failing to do so can overflow a stack variable that should be
large enough otherwise.
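
The kind of check described above can be sketched as follows. The
function name check_alphabet_size and the -1 return value are
placeholders for this example; they are not taken from the parser.

```c
#include <stdint.h>

/* Hypothetical validation helper: returns 0 if the cluster's alphabet
 * size fits within 1 << log_alphabet_size, and -1 otherwise. */
static int check_alphabet_size(uint32_t alphabet_size, int log_alphabet_size)
{
    /* Guard the shift itself against out-of-range exponents. */
    if (log_alphabet_size < 0 || log_alphabet_size > 31)
        return -1;
    if (alphabet_size > (UINT32_C(1) << log_alphabet_size))
        return -1;
    return 0;
}
```

Rejecting the stream before any table is filled keeps a fixed-size stack
buffer sized for 1 << log_alphabet_size entries from being overrun.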
Fixes #10738.
Found-by: Zeng Yunxiang and Li Zeyuan
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Fixes the build. It's a requirement when utilizing PIE.
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If a sequence of JXL images is encapsulated in a container that has PTS
information, we should use the PTS information from the container. At
this time there is no container that does this, but if JPEG XL support
is ever added to NUT, AVTransport, or some other container, this commit
should allow the PTS information those containers provide to work as
expected.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
These pixel formats have always been supported by libjxl, but at the
time this plugin was written, they were not in FFmpeg yet. Now that
they are in FFmpeg, we should support them.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
FFmpeg doesn't support tv-range RGB throughout most of its pipeline, so
we should keep the warning. However, in case something does support it
we should at least keep it tagged properly. Additionally, the encoder
writes this tag if the space is tagged as such so this makes a round
trip work as it should.
Also, PNG doesn't support nonzero matrices but we only warn and ignore
in that case, so we have no reason to error out for illegal cICP ranges
either (i.e. greater than 1).
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Kacper Michajłow <kasper93@gmail.com>
Unnecessary since acf63d5350adeae551d412db699f8ca03f7e76b9;
also avoids relocations.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For memcpy and memcmp, we need to multiply by the element size,
otherwise we're copying and comparing only a fraction of the buffer.
For decorrelate_sr, the buffer p1 is the one that is mutated;
copy and check p1 instead of p2.
For decorrelate_sm, both buffers are mutated, so copy and check
both of them.
For decorrelate_sm, the memcpy initialization of p1 and p1_2 was
reversed - p1 is filled with randomize, but then memcpy copies from
p1_2 to p1. As p1_2 is uninitialized at this point, clang concluded
that the copy was bogus and omitted it entirely, triggering failures
in this test on x86 (where there was an existing assembly implementation
to test).
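
To illustrate the first point: memcpy and memcmp take byte counts, not
element counts. The sketch below assumes int32_t sample buffers; the
helper names copy_samples and samples_equal are hypothetical and do not
appear in the test.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `len` elements. Passing `len` alone would copy only a quarter of
 * an int32_t buffer, since memcpy counts bytes. */
static void copy_samples(int32_t *dst, const int32_t *src, size_t len)
{
    memcpy(dst, src, len * sizeof(*dst));
}

/* Compare `len` elements; returns 1 if the buffers match. */
static int samples_equal(const int32_t *a, const int32_t *b, size_t len)
{
    return !memcmp(a, b, len * sizeof(*a));
}
```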
Signed-off-by: Martin Storsjö <martin@martin.st>
* filter subtitle/data options out of main, video and audio sections
* add filters that were missing entirely from the subtitle section
* add a missing section for advanced subtitle options
So they don't clutter the standard help output.
-loglevel is marked because there is no need to show two options (-v and
-loglevel) that do the same thing.
Currently it requires every single OPT_SPEC option to be accompanied by
an array of alternate names for this option. The vast majority of
options have no alternate names, resulting in a large number of
unnecessary single-element arrays that merely contain the option name.
Extend the option parsing API to allow marking options as having
alternate names, or as being the canonical name for some existing
alternatives. Use this new information to avoid the need for
the abovementioned unnecessary single-element arrays.
This option flag only carries nontrivial information for options that
call a function, in all other cases its presence can be inferred from
the option type (bool options do not have arguments, all other types do)
and is thus nothing but useless clutter.
Change the option parsing code to infer its value when it can, and drop
the flag from options where it's not needed.
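
The inference rule can be sketched as below. All identifiers here
(SketchOptKind, SketchOptDef, SKETCH_FUNC_ARG, sketch_opt_has_arg) are
made up for the example and are not the names used by the option parser.

```c
/* Hypothetical option kinds, for illustration only. */
enum SketchOptKind {
    SKETCH_OPT_FUNC,
    SKETCH_OPT_BOOL,
    SKETCH_OPT_STRING,
    SKETCH_OPT_INT,
};

/* Flag that only carries information for function-type options. */
#define SKETCH_FUNC_ARG (1 << 0)

struct SketchOptDef {
    const char        *name;
    enum SketchOptKind kind;
    int                flags;
};

/* Returns 1 if the option consumes an argument, 0 otherwise. */
static int sketch_opt_has_arg(const struct SketchOptDef *o)
{
    if (o->kind == SKETCH_OPT_BOOL)
        return 0;                              /* bools never take one */
    if (o->kind == SKETCH_OPT_FUNC)
        return !!(o->flags & SKETCH_FUNC_ARG); /* only case needing the flag */
    return 1;                                  /* all other types always do */
}
```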
It causes those options to be parsed as either
* -autofoo 0/1 (with an argument)
* -noautofoo (without an argument)
This is unnecessary, confusing, and against the documentation; these are
also the only two bool options that take an argument.
This should not affect the users, as these options are on by default,
and are supposed to be used as -nofoo per the documentation.
Otherwise an uninitialized stack value would be copied to FPSConvContext.
As it's then never used, it tends not to be a problem in practice,
however it is UB and some compilers warn about it.
The check for UWP mode was duplicated from right above, in
d54127c41a.
Also, instead of several lines with "enabled uwp && ...", make one
"if enabled uwp; then" block.
Signed-off-by: Martin Storsjö <martin@martin.st>
This frame will be freed in the next line.
Reviewed-by: Zhao Zhili <quinkblack@foxmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The current pack_output function pointer is a property of the decoder,
rather than a constant method provided by the DSP code. Indeed, except
for an unused initialisation, the field is never used in DSP code.