The "CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.
Signed-off-by: Martin Storsjö <martin@martin.st>
The new name is more descriptive and will allow defining a separate
public prefix for externally visible library symbols.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
It appears that something goes wrong in old nasm versions when the
%+ operator is used in the last argument of a macro invocation and
this argument is tested with %ifdef within the macro. This patch
rearranges the macro arguments such that the %+ operator is never
used in the last argument.
nasm does not support 'CPU foonop' directives. This adds a configure
test for the directive and uses it only if supported.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Currently there is a wild mix of 3dn2/3dnow2/3dnowext. Switching to
"3dnowext", which is a more common name of the CPU flag, as reported
e.g. by the Linux kernel, unifies this.
This allows us to unconditionally set the cglobal num_args
parameter to a bigger value, thus making writing yasm code
even easier than before.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Add support for all x86-64 registers
Prefer caller-saved register over callee-saved on WIN64
Support up to 15 function arguments
Also (by Ronald S. Bultje)
Fix up our asm to work with new x86inc.asm.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
This sets __OUTPUT_FORMAT__ to win64 instead of win32, even though both
(through -m amd64) produce 64-bit binary code.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Functions using INIT_MMX may still access XMM registers through direct
means (xmm0-15). Therefore, they still need to be marked for clobber
so they can be properly saved/restored.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
We keep INIT_AVX (for backwards compatibility). 3arg AVX ops with
a memory arg can only have it in src2, whereas SSE emulation of
3arg prefers to have it in src1 (i.e. the mov). So, if the op is
symmetric and the wrong one is memory, swap them.
Modify the asm accordingly.
GLOBAL is now no longoer necessary for PIC-compliant loads.
Originally committed as revision 23739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2.2x faster than C on conroe, 3.6x on penryn.
4-6% faster huffyuv decoding if using left or plane mode and yuv
Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk
Use the new x86inc features to support 64-bit Windows on all non-x264 nasm
assembly code as well.
Patch by John Adcock, dscaler.johnad AT googlemail DOT com.
Win64 changes originally by Anton Mitrofanov.
x86util changes mostly by Holger Lubitz.
Originally committed as revision 19580 to svn://svn.ffmpeg.org/ffmpeg/trunk
It contains optimizations that are not specific to i386 and
libavutil uses this naming scheme already.
Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk