Overall almost 4% faster, idct_add down from 350 to 85 cycles, idct_dc_add
down from 83 to 30 cycles.
squash: rv34 idct rearrange partial register loads
This allows audio encoders to optionally take an AVFrame as input and write
encoded output to an AVPacket.
This also adds AVCodec.encode2() which will also be usable by video and
subtitle encoders once support is implemented in the public functions.
Extract processing of intra 16x16 blocks from intra macroblock
processing.
Also implement a function performing inverse transform and block
reconstruction for DC-only blocks in 1 pass instead of 2.
Split inter/intra macroblock handling code. This will allow further
optimizations such as performing inverse transform and block reconstruction
in a single pass as well as specialize code.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
Do not fail audio decoding with avcodec_decode_audio3 if user has set a
custom get_buffer. Strictly speaking, this was never allowed by the API,
but it seems that some software packages did so anyways. In order to
unbreak applications (cf. http://bugs.debian.org/655890), this change
clarifies the API and overrides the custom get_buffer() with the defaults.
This change is inspired by a similar
commit (c3846e3eba) in FFmpeg.
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
Reference decoder clips data before shifting it to final range and also
forces 32-bit lossy mode to be actually 24-bit lossy mode in order to be
able to perform proper clipping.
max_b_frames is initialized to -1 for libx264, to allow
distinguishing between an explicit user set 0 and a default not
touched 0 (see bb73cda2).
If max_b_frames is left as -1, this affects dts generation (where
expressions like max_b_frames != 0 are used), so make sure it is
left at the default 0 after the libx264 init function returns.
This avoids unnecessarily producing dts != pts when using
profile=baseline.
Signed-off-by: Martin Storsjö <martin@martin.st>
The alignment directive must obviously precede the label.
This was never noticed in ARM mode since the location is
already aligned there.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Due to apprent bugs in the GNU assembler and/or linker, relocations
can be incorrectly processed if the alignment of a Thumb instruction
is changed in the output file compared to the input object.
This fixes crashes in h264 decoding with Thumb enabled. No effect in
ARM mode since everything is 4-byte aligned there.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Fixes problems where rgbToRgbWrapper() is called even though it doesn't
support this particular conversion (e.g. converting from RGB444 to
anything). Thirdly, fixes issues where rgbToRgbWrapper() is called for
non-native endiannness conversions (e.g. RGB555BE on a LE system).
Fourthly, fixes crashes when converting from e.g. monowhite to
monowhite, which calls planarCopyWrapper() and overwrites/reads because
n_bytes != n_pixels.
This fixes standalone compilation of some decoders with --disable-optimizations.
cabac.h defines some inline functions that use symbols from cabac.c. Without
optimizations these inline functions are not eliminated and linking fails with
references to non-existing symbols.
Splitting the inline functions off into their own header and only #including
it in the places where the inline functions are used allows #including cabac.h
from anywhere without ill effects.
This isn't used in practice anywhere within libav at the moment,
but change it for consistency until it is removed.
URL_RDONLY/WRONLY were fixed in commit 5b81e29593 (after the
values that actually were used were changed at the major bump,
in commit cbea3ac8), but this flag was unintentionally left unfixed.
Signed-off-by: Martin Storsjö <martin@martin.st>
The sporadic threading errors during fate-rv30 were caused by calling
ff_thread_await_progress with mb row -1 as argument. That returns
immediately since progress is initialized to -1. Not yet computed
motion vectors from the reference could be used for the first
macroblocks.