The randomize_buffer() implementation assures that "most of the time",
we'll do a good mix of wide16/wide8/hev/regular/no filters for complete
code coverage. However, this is not mathematically assured because that
would make the code either much more complex, or much less random.
Some fixes and improvements by Rodger Combs <rodger.combs@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This fixes valgrind warnings about conditional jumps based on
uninitialized data (even though the uninitialized data only ever
was compared with a direct copy of the same uninitialized data).
Signed-off-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When targeting COFF (windows), clang doesn't support this
directive (while binutils supports it for all targets).
Signed-off-by: Martin Storsjö <martin@martin.st>
These bits are set by exceptions in NEON instructions.
Also print the differing bits when FPSCR is clobbered,
and use bic instead of lsl, for clearing the topmost bits.
Signed-off-by: Martin Storsjö <martin@martin.st>
Each const block needs to be terminated by one endconst
invocation so either call endconst after each, or just
declare plain labels to the later strings.
This fixes errors such as this, on some binutils versions:
checkasm.S:38: Error: Macro `endconst' was already defined
Signed-off-by: Martin Storsjö <martin@martin.st>
The stack used by checkasm_checked_call_vfp was a multiple of 4 when the
checked function is called. AAPCS requires a double word (8 byte)
aligned stack public interfaces. Since both calls are public interfaces
the stack is misaligned when the checked is called.
Might fix the SIGBUS error in the armv7-linux-clang-3.7 fate config.
This avoids listing the same feature multiple times in the
test output. Previously the output contained something like this:
SSE2:
- hevc_mc.qpel [OK]
- hevc_mc.epel [OK]
- hevc_mc.unweighted_pred [OK]
- hevc_mc.qpel [OK]
- hevc_mc.epel [OK]
- hevc_mc.unweighted_pred [OK]
Signed-off-by: Martin Storsjö <martin@martin.st>
This avoids the risk of accidentally clobbering such variables outside
of the macro if the same variables are used there.
Signed-off-by: Martin Storsjö <martin@martin.st>
This fixes valgrind warnings about conditional jumps based on
uninitialized data (even though the uninitialized data only ever
was compared with a direct copy of the same uninitialized data).
Signed-off-by: Martin Storsjö <martin@martin.st>
The functions may not clean up properly after using MMX
registers. For the normal testing calls, the checkasm_checked_call
functions will do the cleanup (and check that functions that
should clean up do it as well), but when benchmarking functions
that don't clean up, we don't currently properly clean up at all.
This causes issues if a benchmarked function is followed by testing
of a function that is supposed to not clobber the MMX/FPU state but
doesn't touch it at all.
Signed-off-by: Martin Storsjö <martin@martin.st>
Restore alphabetical order in lists, break overly long lines, do some
prettyprinting, add some explanatory section comments, group parts
together that belong together logically.
Also bench a smaller buffer. This drastically reduces --bench runtime
and reports smaller, more readable numbers.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they get can confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.
Currently only implemented for ELF.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they get can confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.
Currently only implemented for ELF.
Use two separate functions, depending on whether VFP/NEON is available.
This is set to require armv5te - it uses blx, which is only available
since armv5t, but we don't have a separate configure item for that.
(It also uses ldrd, which requires armv5te, but this could be avoided
if necessary.)
Signed-off-by: Martin Storsjö <martin@martin.st>
Check the full FPU tag word instead of only the lower half and simplify
the comparison.
Use upper-case function base name as macro name to instantiate both
checked_call variants.