Requires a new upstream function to test not for *import* support on a
given output pixel format, but also whether we can render to it.
Fixes: https://github.com/haasn/libplacebo/issues/173
That feature is overkill for a constant pointer to AVFilterLink which
can be stored in AVCodecContext.opaque (indirectly, because the link is
not allocated yet at the time the codec is opened).
This also avoids leaking non-NULL AVFrame.opaque to callers.
Add an optional filter_line3 to the available optimisations.
filter_line3 is equivalent to filter_line, memcpy, filter_line
filter_line shares quite a number of loads and some calculations in
common with its next iteration and testing shows that using aarch64
neon filter_line3s performance is 30% better than two filter_lines
and a memcpy.
Adds a test for vf_bwdif filter_line3 to checkasm
Rounds job start lines down to a multiple of 4. This means that if
filter_line3 exists then filter_line will not sometimes be called
once at the end of a slice depending on thread count. The final slice
may do up to 3 extra lines but filter_edge is faster than filter_line
so it is unlikely to create any noticable thread load variation.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
Exports C filter_line needed for tail fixup of neon code
Adds neon for filter_line
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
Adds clip and spatial macros for aarch64 neon
Exports C filter_edge needed for tail fixup of neon code
Adds neon for filter_edge
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
Adds an outline for aarch neon functions
Adds common macros and consts for aarch64 neon
Exports C filter_intra needed for tail fixup of neon code
Adds neon for filter_intra
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
The HDR metadata should be removed after HDR to SDR conversion,
otherwise the output frame still has HDR side data.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
This makes the filter output match that of the C version.
It was left intentionally while we figured out if it was better
or not, and while it makes certain samples better, it makes static
samples jump around slightly.
The discrepancy between the definition and the declaration
in allfilters.c is actually UB.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The old logic was trying to be excessively clever in "deducing" that the
user wanted to stretch/scale the image when ow/oh differed from iw/ih
aspect ratio. But this is almost surely unintended except in
pathological cases, and in those cases users should simply disable
normalize_sar and do all the stretching/scaling logic themselves. This
is especially important in multi-input mode, where the canvas may be
vastly different from the input dimensions of any stream. Also, passing
through input 0 SAR in multi-input mode is arbitrary and nearly useless,
so again force output SAR to 1:1 here.
Use the gcd of all input timebases to ensure PTS accuracy. For the
framerate, just pick the highest of all the inputs, under the assumption
that we will render frames with approximately this frequency. Of course,
this is not 100% accurate, in particular if the input frames are badly
misaligned. But this field is informational to begin with.
Importantly, it covers the "common" case of combining high FPS and low
FPS streams with aligned frames.