* qatar/master:
lavc: Move ff_cropTbl and ff_zigzag_direct from dsputil to mathtables
Conflicts:
libavcodec/mathtables.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '610b18e2e3d8ef5eca3e78f33a0625689b8d2bb9':
x86: qpel: Move fullpel and l2 functions to a separate file
bfin: Make vp3 functions static
Conflicts:
libavcodec/bfin/vp3_bfin.c
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'aa8d89536d35af0a0c8d8bac2b452ffe7b82cae5':
bfin: Don't use the vp3 idct functions if bitexact behaviour is expected
Conflicts:
libavcodec/bfin/vp3_bfin.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
These are widely used throughout libavcodec, nothing dsputil-specific.
Change ff_cropTbl to a statically initialized table, to avoid
initializing it with a function call.
Signed-off-by: Martin Storsjö <martin@martin.st>
This way, they can be shared between mpeg4qpel and h264qpel without
requiring either one to be compiled unconditionally.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes the vp3 decoder less dependent on dsputil, and will aid
in making it (eventually) dsputil-independent.
Signed-off-by: Martin Storsjö <martin@martin.st>
In the non-bitexact mode, vp3 currently decodes to the same
frame crcs as before 28f9ab702 (and the output visually looks
correct).
Signed-off-by: Martin Storsjö <martin@martin.st>
From 312 to 89/68 (sse/sse2) cycles on Arrandale and Win64.
Sandybridge: 68/47 cycles.
Having a loop counter is a 7 cycle gain.
Unrolling is another 7 cycle gain.
Working in reverse scan is another 6 cycles.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Text subtitles packets are not 0-terminated (and if they are,
it is handled by the recoding process since 0 is a valid
Unicode code point). The terminating 0 would overwrite the
last payload octet.
OTOH, packets must be 0-padded.
Fix a problem reported in trac ticket #2431.
This patch can be controversial, by assuming floats are IEEE-754 and
particular behaviour of the FPU will get in the way.
Timing on Arrandale and Win32 (thus, x87 FPU is used in the reference).
sbr_qmf_pre_shuffle_c: 115 to 76
sbr_neg_odd_64_c: 84 to 55
sbr_qmf_post_shuffle_c: 112 to 83
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Timing on Arrandale:
C SSE
Win32: 57 44
Win64: 47 38
Unrolling and not storing mask both save some cycles.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
The gcov/lcov are a common toolchain for visualizing code coverage with
the GNU/Toolchain. The documentation and implementation of this
integration was heavily inspired from the blog entry by Mike Melanson:
http://multimedia.cx/eggs/using-lcov-with-ffmpeg/
Timing on Arrandale:
C SSE
Win32: 57 44
Win64: 47 38
Unrolling and not storing mask both save some cycles.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>