postProcess in postprocess_template.c copies a PPContext
to the stack, works with this copy and then copies
it back again. Said local copy uses a hardcoded alignment
of eight, although PPContext has alignment 32 since
cbe27006ce
(this commit was in anticipation of AVX2 code that never landed).
This leads to misalignment in the filter-(pp|pp1|pp2|pp3|qp)
FATE-tests which UBSan complains about. So avoid the local copy.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
postprocess.c currently has C, MMX, MMXEXT, 3DNow as well as
SSE2 versions of its internal functions. But given that only
ancient 32-bit x86 CPUs don't support SSE2, the MMX, MMXEXT
and 3DNow versions are obsolete and are therefore removed by
this commit. This saves about 56KB here.
(The SSE2 version in particular is not really complete,
so that it often falls back to MMXEXT (which means that
there were some identical (apart from the name) MMXEXT
and SSE2 functions; this duplication no longer exists
with this commit.)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
as well as includes of libavutil/timer.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This avoids problems if %4 is the stack pointer
the constraints do not allow %4 to be the stack pointer but gcc 9 may
no longer support specifying such constraints
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Also pulled QP initialization out of inner loop, which removed some redundent code.
Added some dummy fields to PPContext to allow current code to work while
changing the rest of the postprocessing code to support the arrays.
I also increased alignment requirements for some fields in the PPContext struct to
support future avx2 code.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Prefetching functions are defined in postprocess_template using the
RENAME macro so that prefetching is used when available. For x86
targets inline asm is used and the functions are non-empty only for
cpus where prefetching is available. For non x86 targets the gcc bultin
prefetch is used if it is available, otherwise no prefetching is done.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
There's still an if, as QP needs to be modified if isColor=0, but it
still removes a unecessary branch.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This refactoring simplifies the usage of the template: define the
profile and include the template is all that is required. It should now
be easier to add more instruction sets.
The HAVE_* flags are changed with TEMPLATE_PP_* setting to avoid messing
them up.
See the top comment in postprocess_template.c for details.
This library does not fit into Libav as a whole and its code is just a
maintenance burden. Furthermore it is now available as an external project,
which completely obviates any reason to keep it around.
URL: http://git.videolan.org/?p=libpostproc.git
This fixes compilation failures related to START_TIMER/STOP_TIMER macros and
-Werror=declaration-after-statement. START_TIMER declares variables and thus
may not be placed after statements outside of a new block.
This moves declarations without initialisers or with constant
initialisers to the start of a block, and adds do {} while(0)
around some macros, thus allowing declarations within them.
Signed-off-by: Mans Rullgard <mans@mansr.com>