The vector dequantization has a test in a loop preventing effective SIMD
implementation. By moving it out of the loop, this loop can be DSPized.
Therefore, modify the current DSP implementation. In particular, the
DSP implementation no longer has to handle null loop sizes.
The decode_hf implementations have following timings:
For x86 Arrandale:
C SSE SSE2 SSE4
win32: 260 162 119 104
win64: 242 N/A 89 72
The arm NEON optimizations follow in a later patch as external asm. The
now unused check for the y modifier in arm inline asm is removed from
configure.
The current configure fails when static libbluray is compiled with libxml2
support.
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Framerate is now a sane rational instead of an integer, and
inputDepth is changed to what it actually is.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Framerate is now a sane rational instead of an integer, and
inputDepth is changed to what it actually is.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Moving cpunop from the HAVE_LIST to the ARCH_EXT_LIST_X86 has the side
effect of enabling it. The semantics of the check have to be changed
from enable if successful to disable if unsuccessful. This was missing
in 2b0bb69997 causing build errors with
nasm.
This fixes cpunop detection and unbreaks NASM assembly
Signed-off-by: Dave Yeo <daveryeo@telus.net>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The DCBZL instruction is not available for the e500v1 and e500v2
architectures, but may still be recognized by the toolchain, so we need to
remove the test for it explicitly for these architectures.
References: PowerPC™ e500 Core Family Reference Manual (Freescale)
Found-by: Ståle Kristoffersen <staalebk@ifi.uio.no>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Enable compilation on machines with an old libfdk-aac.
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
7.1(wide) and 7.1(wide-side) channel layouts are supported in fdk_aac since october 2013 (commit fa3eba1644)
Signed-off-by: Jean First <jeanfirst@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes dependency file generation with gas-preprocessor.pl and clang.
Flags copied from GCC and tested with Apple's clang from Xcode 5 and
5.1 and clang 3.2, 3.3, 3.4 on Linux.
NEON and VFP are currently mandatory for all ARMv8 profiles. Both are
handled as extensions as far as cpuflags are concerned. This is
consistent with handling x86_64 which always has SSE2, but still
handles it as an extension.
Stack is always 16 byte aligned and clz, 64bit operations and unaligned
memory access are fast in aarch64 mode on ARMv8.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>