This function is always called with a non-negative argument, so
those special cases are not needed. In the places the argument
might be zero, the return value for a zero argument does not matter
since it would then be used to scale an array full of zeros.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Note that the symbols used to run the hardware decoder in asynchronous mode
have been marked deprecated and will be dropped at a future version bump.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
To access data at multiple fixed offsets from a base address, this
code uses a single "m" operand and code of the form "32%0", relying on
the memory operand instantiation having no displacement, giving a final
result of the form "32(%rax)". If the compiler uses a register and
displacement, e.g. "64(%rax)", the end result becomes "3264(%rax)",
which obviously does not work.
Replacing the "m" operands with "r" operands allows safe addition of a
displacement. In theory, multiple memory operands could use a shared
base register with different index registers, "(%rax,%rbx)", potentially
making more efficient use of registers. In the cases at hand, no such
sharing is possible since the addresses involved are entirely unrelated.
After this change, the code somewhat rudely accesses memory without
using a corresponding memory operand, which in some cases can lead to
unwanted "optimisations" of surrounding code. However, the original
code also accesses memory not covered by a memory operand, so this is
not adding any defect not already present. It is also hightly unlikely
that any such optimisations could be performed here since the memory
locations in questions are not accessed elsewhere in the same functions.
This fixes crashes with suncc.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Under some circumstances, suncc will use a single register for the
address of all memory operands, inserting lea instructions loading
the correct address prior to each memory operand being used in the
code. In the yadif code, the branch in the asm block bypasses such
an lea instruction, causing an incorrect address to be used in the
following load.
This patch replaces the tmpX arrays with a single array and uses a
register operand to hold its address. Although this prevents using
offsets from the stack pointer to access these locations, the code
still builds as 32-bit PIC even with old compilers.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This fixes two issues preventing suncc from building this code.
The undocumented 'a' operand modifier, causing gcc to omit a $ in
front of immediate operands (as required in addresses), is not
supported by suncc. Luckily, the also undocumented 'c' modifer
has the same effect and is supported.
On some asm statements with a large number of operands, suncc for no
obvious reason fails to correctly substitute some of the operands.
Fortunately, some of the operands in these statements are plain
numbers which can be inserted directly into the code block instead
of passed as operands.
With these changes, the code builds correctly with both gcc and
suncc.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This code contains a C array of addresses of labels defined in
inline asm. To do this, the names must be declared as external
in C. The declared type does not matter since only the address is
used, and for some reason, the author of the code used the 'void'
type despite taking the address of a void expression being invalid.
Changing the type to char, a reasonable choice since the alignment
of the code labels cannot be known or guaranteed, eliminates gcc
warnings and allows building with suncc.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Although a reasonable compiler will probably optimise out the
actual store and load, this operation still implies a truncation
to 16 bits which the compiler will probably not realise is not
necessary here.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Writing the scaled excitation to a scratch buffer (borrowing the
'audio' array) instead of modifying it in place avoids the need
to save and restore the unscaled values.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Use saturating addition functions instead of 64-bit intermediates
and separate clipping. This is much faster when dedicated
instructions are available.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Firstly, nothing in this function can overflow 32 bits so the use
of a 64-bit type is completely unnecessary. Secondly, the scale
is either a power of two or 0x7fff. Doing separate loops for these
cases avoids using multiplications. Finally, since only the number
of bits, not the actual value, of the maximum value is needed, the
bitwise or of all the values serves the purpose while being faster.
It is worth noting that even if overflow could happen, it was not
handled correctly anyway.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The operands in both cases are 16-bit so cannot overflow a 32-bit
destination. In gain_scale() the inputs are reduced to 14-bit,
so even the shift cannot overflow.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Adding instead of subtracting the products in the loop allows the
compiler to generate more efficient multiply-accumulate instructions
when 16-bit multiply-subtract is not available. ARM has only
multiply-accumulate for 16-bit operands. In general, if only one
variant exists, it is usually accumulate rather than subtract.
In the same spirit, using the dedicated saturation function enables
use of any special optimised versions of this.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Fixed-point audio codecs often use saturating arithmetic, and
special instructions for these operations are common.
Signed-off-by: Mans Rullgard <mans@mansr.com>
These are normally initialized to AV_NOPTS_VALUE at the start
of avformat_find_stream_info, but if a new stream is found while
this function is running (e.g. like in mpegts), the newly added
AVStreams didn't have these values properly initalized, leading
to avformat_find_stream_info terminating too soon (when the
first timestamps are far from 0).
Signed-off-by: Martin Storsjö <martin@martin.st>
Some files' embedded art seems to have the mimetype 'image/JPG' instead
of 'image/jpg'. Libav fails to parse those because it matches
case-sensitively.
Use av_strncasecmp() to fix this behaviour.
Signed-off-by: Mohammad Alsaleh <msal@tormail.org>
Signed-off-by: Anton Khirnov <anton@khirnov.net>