Jason Garrett-Glaser
a3fabc6cb3
x86: more AVX2 framework
...
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Jason Garrett-Glaser
c6908d6b4b
x86inc: FMA3/4 Support
...
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Derek Buitenhuis
206895708e
x86inc: Remove our FMA4 support
...
This is so we can sync to x264's version of FMA4 support.
This partially reverts commit 79687079a9.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Henrik Gramner
c108ba0175
x86inc: Use VEX-encoded instructions in AVX functions
...
Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4
functions for all instructions that exist in a VEX-encoded
version.
This change makes it easier to extend existing code to use AVX2.
Also add support for AVX emulation of a few instructions that
were missing before.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Henrik Gramner
ad7d7d4f6a
x86inc: Remove .rodata kludges
...
The Mach-O bug was fixed in yasm 0.8.0 and we don't
support versions that old anymore.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Henrik Gramner
3e2fa991db
x86inc: remove misaligned cpu flag
...
Prevents a crash if the misaligned exception mask bit is
cleared for some reason.
Misaligned SSE functions are only used on AMD Phenom CPUs
and the benefit is minuscule. They also require modifying
the MXCSR control register and by removing those functions
we can get rid of that complexity altogether.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Jason Garrett-Glaser
7115566541
x86inc: various minor backports from x264
...
Small backports that sneaked into other asm commits in x264.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Derek Buitenhuis
47f9d7ce54
x86inc: Check for __OUTPUT_FORMAT__ having a value of "x64"
...
This is also a valid value for WIN64.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Henrik Gramner
bbe4a6db44
x86inc: Utilize the shadow space on 64-bit Windows
...
Store XMM6 and XMM7 in the shadow space in functions that
clobber them. This way we don't have to adjust the stack
pointer as often, reducing the number of instructions as
well as code size.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Loren Merritt
3fb78e99a0
x86inc: create xm# and ym#, analogous to m#
...
For when we want to mix simd sizes within one function.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Loren Merritt
49ebe3f9fe
x86inc: fix some corner cases of SWAP
...
SWAP with >=3 named (rather than numbered) args
PERMUTE followed by SWAP with 2 named args
used to produce the wrong permutation
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Henrik Gramner
63f0d62310
x86inc: Use SSE instead of SSE2 for copying data
...
Reduces code size because movaps/movups is one byte
shorter than movdqa/movdqu.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Henrik Gramner
ad76e6e7e1
x86inc: Set ELF hidden visibility for global constants
...
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Loren Merritt
25cb0c1a1e
x86inc: activate REP_RET automatically
...
Now RET checks whether it immediately follows a branch, so the
programmer doesn't have to keep track of that condition. REP_RET
is still needed manually when it's a branch target, but that's
much rarer.
The implementation involves lots of spurious labels, but that's OK
because we strip them.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Niv Sardi
49ba6e56bd
lavu/parseutils: add more resolutions
...
See http://en.wikipedia.org/wiki/Graphics_display_resolution
Signed-off-by: Niv Sardi <xaiki@evilgiggle.com>
Signed-off-by: Stefano Sabatini <stefasab@gmail.com>
11 years ago
Luca Barbato
4272bb6ef1
doxy: Document avlog
...
Provide some information for every function and add a group.
11 years ago
Stefano Sabatini
515e651f56
lavu/opt: fix doxy for av_opt_get* functions about return value
...
Success code must be >= 0 and not == 0, consistent with the
implementation.
11 years ago
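The return-value convention described in this commit can be illustrated with a minimal sketch. This is not the libavutil implementation: `get_option` and `option_or_default` are hypothetical names, and the positive success value is an assumption chosen only to show why a `== 0` check would be wrong.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical getter: returns a value >= 0 on success, negative on failure. */
int get_option(const char *name, int *out)
{
    assert(name != NULL && out != NULL);
    if (strcmp(name, "threads") == 0) {
        *out = 4;
        return 1;   /* success, but deliberately not 0 */
    }
    return -1;      /* unknown option */
}

/* Correct check: only a negative return value is treated as an error. */
int option_or_default(const char *name, int fallback)
{
    int value;
    int ret = get_option(name, &value);
    return ret < 0 ? fallback : value;
}
```

A caller that tested `ret != 0` instead would discard the valid value whenever the getter reports success with a positive code.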
Stefano Sabatini
719b4eef5d
lavu/common: add warning to GET_UTF8 doxy
...
Should prevent wrong uses, or at least decrease their chance.
11 years ago
Diego Biurrun
80fefbed62
x86: cpu: Restore some explanatory comments removed in 7160bb7
11 years ago
Diego Biurrun
5ce04c14dd
Use correct Doxygen syntax
11 years ago
Ronald S. Bultje
c07ac8d467
VP9 MC (ssse3) optimizations.
...
Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.
11 years ago
Anton Khirnov
38e15df148
avframe: note that linesize is not the usable data size
11 years ago
Michael Niedermayer
a454dec19a
pixdesc: fix NV20* descriptors
...
They were inconsistent (overlapping fields and wrong sizes)
Signed-off-by: Anton Khirnov <anton@khirnov.net>
11 years ago
Michael Niedermayer
8310bccc91
avutil/pixdesc: try to fix NV20* descriptors
...
They were inconsistent (overlapping fields and wrong sizes)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Kieran Kunhya
90ca5a9b5f
Add interleaved 4:2:2 8/10-bit formats
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Alex Smith
08fa828b3f
avutil: Fix compilation with inline asm disabled on mingw
...
Because of -Werror=implicit-function-declaration the build will fail.
Signed-off-by: Martin Storsjö <martin@martin.st>
11 years ago
Kieran Kunhya
e208e6d209
lavu: Add interleaved 4:2:2 8/10-bit formats
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
11 years ago
Alex Smith
66c2f200b6
lavu/attributes: Don't define av_restrict
...
This is always defined in config.h.
Original patch by Derek Buitenhuis.
11 years ago
Michael Niedermayer
0506f3fa38
avutil/cpu: remove duplicate include
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Martin Storsjö
67e285ceca
mem: Handle av_reallocp(..., 0) properly
...
Previously this did a double free (and returned an error).
Reported-by: Justin Ruggles
Signed-off-by: Martin Storsjö <martin@martin.st>
11 years ago
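A minimal sketch of how a reallocp-style helper can treat a zero size without freeing twice. It uses only the C standard library; `reallocp_sketch` is a hypothetical name and the code is not the actual av_reallocp() source. The likely hazard is that `realloc(p, 0)` has implementation-defined behavior and may free the block and return NULL, which a naive caller would mistake for failure and free again.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper; ptr is the address of the pointer to resize. */
int reallocp_sketch(void *ptr, size_t size)
{
    void *old, *resized;

    assert(ptr != NULL);
    memcpy(&old, ptr, sizeof(old));   /* read the caller's pointer */

    if (size == 0) {
        /* Zero size: free exactly once, clear the pointer, report success. */
        free(old);
        old = NULL;
        memcpy(ptr, &old, sizeof(old));
        return 0;
    }

    resized = realloc(old, size);
    if (!resized) {
        /* Failure: realloc left the old block intact, so release it here. */
        free(old);
        old = NULL;
        memcpy(ptr, &old, sizeof(old));
        return -ENOMEM;
    }

    memcpy(ptr, &resized, sizeof(resized));
    return 0;
}
```

Handling the zero-size case before calling `realloc` keeps the error path and the "free on request" path separate, so each block is released in exactly one place.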
Alex Smith
33b88f2a4a
msvc/icl: Use __declspec(noinline)
...
Signed-off-by: Martin Storsjö <martin@martin.st>
11 years ago
Alex Smith
09f2581dc5
msvc/icl: Use __declspec(deprecated)
...
Prior to this on msvc/icl there was no handling of deprecated functions
and the deprecated warning was disabled.
After enabling there are a number of warnings relating to the CRT and
the use of the non-secure versions of several functions. Defining
_CRT_SECURE_NO_WARNINGS silences these warnings.
Signed-off-by: Martin Storsjö <martin@martin.st>
11 years ago
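A minimal sketch of a portable deprecation marker along the lines this commit describes. The macro name `sketch_deprecated` and both functions are hypothetical, and the real attribute macro in the tree may be structured differently.

```c
/* Hypothetical macro name; picks the compiler-specific deprecation marker. */
#if defined(_MSC_VER)
#    define sketch_deprecated __declspec(deprecated)
#elif defined(__GNUC__) || defined(__clang__)
#    define sketch_deprecated __attribute__((deprecated))
#else
#    define sketch_deprecated
#endif

/* Callers of old_sum() get a deprecation warning on supporting compilers. */
sketch_deprecated int old_sum(int a, int b);

int new_sum(int a, int b) { return a + b; }
int old_sum(int a, int b) { return new_sum(a, b); }
```

As the commit message notes, enabling deprecation warnings on msvc/icl also surfaces CRT warnings about non-secure functions, which defining `_CRT_SECURE_NO_WARNINGS` silences.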
Michael Niedermayer
1225b67fc9
avutil/frame: suppress "comparison of unsigned expression < 0 is always false" warning
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Michael Niedermayer
9c8aeacf82
avutil: add av_get_colorspace_name()
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Lenny Wang
29664fab0c
OpenCL: convert meaningless "device id" output to "device name"
...
Approved-by: Wei Gao <highgod0401@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Luca Barbato
3feb3d6ce4
mem: Introduce av_reallocp
11 years ago
Diego Biurrun
9997a812e7
mem: Document the non-compatibility of av_realloc() and av_malloc()
12 years ago
Michael Niedermayer
f3ba91a3f1
avutil/pixdesc: don't try to use av_read_image_line() with bayer formats
...
It currently has undefined behavior, as it's not supported.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
a25585bb50
avutil/pixdesc: Prevent minor array overread in ff_check_pixfmt_descriptors()
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
James Almer
bbcaf25d4d
lavu/sha512: Fully unroll the transform function loops
...
crypto_bench SHA-512 results using an AMD Athlon X2 7750+, mingw32-w64 GCC 4.7.3 x86_64
Before:
lavu SHA-512 size: 1048576 runs: 1024 time: 12.737 +- 0.147
After:
lavu SHA-512 size: 1048576 runs: 1024 time: 11.670 +- 0.173
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
James Almer
7e4fe5162a
lavu/sha: Fully unroll the transform function loops
...
crypto_bench SHA-1 and SHA-256 results using an AMD Athlon X2 7750+, mingw32-w64 GCC 4.7.3 x86_64
Before:
lavu SHA-1 size: 1048576 runs: 1024 time: 9.012 +- 0.162
lavu SHA-256 size: 1048576 runs: 1024 time: 19.625 +- 0.173
After:
lavu SHA-1 size: 1048576 runs: 1024 time: 7.948 +- 0.154
lavu SHA-256 size: 1048576 runs: 1024 time: 17.841 +- 0.170
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Diego Biurrun
a0b901a348
Drop pointless directory name prefixes from #includes in the current dir
12 years ago
James Almer
8702a94e49
lavu/ripemd: Add a size optimized version of the transform functions
...
When compiling with --enable-small, ripemd.o will weigh a few kilobytes less than
it used to before the previous commit.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
James Almer
452ac2aaec
lavu/ripemd: Fully unroll the transform function loops
...
crypto_bench RIPEMD-160 results using an AMD Athlon X2 7750+, mingw32-w64 GCC 4.8.1 x86_64
Before:
lavu RIPEMD-160 size: 1048576 runs: 1024 time: 12.342 +- 0.199
After:
lavu RIPEMD-160 size: 1048576 runs: 1024 time: 10.143 +- 0.192
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Kirill Gavrilov
0f48acf29b
lavu: provide msvc implementation of attribute_deprecated
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Paul B Mahol
6508bd4aa3
pixfmt: add native GBRAP16 format
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
12 years ago
Diego Biurrun
c3e6e8f06c
mem: Do not check unsigned values for negative size
12 years ago
Diego Biurrun
b634b36fce
mem: Improve documentation wording and spelling
12 years ago
Nicolas George
d5b58f678d
tools: add benchmark for crypto functions.
...
The benchmark tests the speed of the following algorithms:
MD5, SHA-1, SHA-256, SHA-512, RIPEMD-160, AES-128.
It can optionally be built to perform the same benchmark on
other crypto libraries, for comparison purposes.
The supported libraries are:
- crypto: OpenSSL's libcrypto;
- gcrypt: GnuTLS's libgcrypt;
- tomcrypt: LibTomCrypt
To enable them, use this syntax:
make VERSUS=crypto+gcrypt+tomcrypt tools/crypto_bench
They do not need to have been enabled in configure.
12 years ago
Luca Barbato
b4ec7a5fee
mem: Document the av_realloc family of functions properly
...
realloc() does not accept pointers from memalign().
12 years ago