This results in more alignment for pixel formats that have "odd" pixel
sizes like RGB24. It makes access through SIMD easier
Works around Issue described in Ticket1031
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The description if for the function, not the group.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Mostly a copy&paste from other hash functions, with changes
where required.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Also replace custom tests for MD5 with those published in RFC 2202
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Includes HMAC-SHA-224, HMAC-SHA-256, HMAC-SHA-384, and HMAC-SHA-512.
Tested using test vectors from https://tools.ietf.org/html/rfc4231
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Currently, standard tables like AV_CRC_32_IEEE and such are being generated (or
provided in case the user compiles with hardcoded tables) with only 257 elements.
We're missing a considerable boost in performance by not making them with a size
of 1024 elements instead.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This matches the other eabi attribute in the same file. This is
required in order to build for arm/hardfloat with other object
file formats than ELF.
Signed-off-by: Martin Storsjö <martin@martin.st>
The old check would fail on huge but not infinite values
and the later code could then fail to handle them correctly in
some cases.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes compilation when the macros are empty and the label above ends up
containing no statement. Also makes usage of these macro consistent
(some already have a semi colon, some others don't).
Fixes Ticket #2603
4-operation form is preferred over 3-operation because it breaks a long
dependency chain, thus allowing a superscalar processor to execute more
operations in parallel.
The idea was taken from: http://www.zorinaq.com/papers/md5-amd64.html
AMD Athlon(tm) II X3 450 Processor, x86_64
$ for i in $(seq 1 4); do ./avutil_md5_test2; done
size: 1048576 runs: 1024 time: 5.821 +- 0.019
size: 1048576 runs: 1024 time: 5.822 +- 0.019
size: 1048576 runs: 1024 time: 5.841 +- 0.018
size: 1048576 runs: 1024 time: 5.821 +- 0.018
$ for i in $(seq 1 4); do ./avutil_md5_test2; done
size: 1048576 runs: 1024 time: 5.646 +- 0.019
size: 1048576 runs: 1024 time: 5.646 +- 0.018
size: 1048576 runs: 1024 time: 5.642 +- 0.019
size: 1048576 runs: 1024 time: 5.641 +- 0.019
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Where necessary use memcpy instead.
Thanks to Giorgio Vazzana [mywing81 gmail] for
spotting this loop as the cause for the bad
performance.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Initialize it with UINT32_MAX and xor the result with UINT32_MAX
as well.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>