Always use the special filter for the first and last 3 columns (only).
Changes made in 64ed397 slowed the filter to just under 3/4 of what it
was. This commit restores the speed while maintaining identical output.
For reference, on my Athlon64:
1733222 decicycles in old
2358563 decicycles in new
1727558 decicycles in this
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Current code divides before increasing precision.
Also reduce upper bound for strength from 255 to 64. This will prevent
an overflow in the SSSE3 and MMX filter_line code: delta is expressed as
an u16 being shifted by 2 to the left. If it overflows, having a
strength not above 64 will make sure that m is set to 0 (making the
m*m*delta >> 14 expression void).
A value above 64 should not make any sense unless gradfun is used as
a blur filter.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Manually load registers to avoid using 8 registers on x86_32 with
compilers that do not align the stack (e.g. MSVC).
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Under some circumstances, suncc will use a single register for the
address of all memory operands, inserting lea instructions loading
the correct address prior to each memory operand being used in the
code. In the yadif code, the branch in the asm block bypasses such
an lea instruction, causing an incorrect address to be used in the
following load.
This patch replaces the tmpX arrays with a single array and uses a
register operand to hold its address. Although this prevents using
offsets from the stack pointer to access these locations, the code
still builds as 32-bit PIC even with old compilers.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
Patch by Nolan L nol888 <=> gmail >=< com.
See thread:
Subject: [FFmpeg-devel] [PATCH] Port gradfun to libavfilter (GCI)
Date: Mon, 29 Nov 2010 07:18:14 -0500
Originally committed as revision 25942 to svn://svn.ffmpeg.org/ffmpeg/trunk
yadif.c:226: error: can't find a register in class 'GENERAL_REGS' while reloading 'asm'
yadif.c:220: error: 'asm' operand has impossible constraints
Patch by Alexander Strange <astrange ithinksw com>.
Originally committed as revision 25251 to svn://svn.ffmpeg.org/ffmpeg/trunk