The first X96 channel set can have more channels than core, causing X96
decoding to be skipped. Clear the number of decoded X96 channels to zero
in this rudimentary case.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Umair Khan <omerjerk@gmail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
I cannot see any point whatsoever to use
double here instead of float, the results
are likely identical in all cases..
Using float allows for much more
efficient use of SIMD.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Fixes compilation of fft with hardcoded tables
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
This was a leftover from before the slices were encoded in parallel.
Since the put_bits context is initialized per slice aligning it
aferwards is pointless.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit solves most of the crashes and issues with the encoder and
the bitrate setting. Now the encoder will always allocate the absolute
lowest amount of memory regardless of what the bitrate has been set to.
Therefore if a user inputs a very low bitrate the encoder will use the
maximum possible quantization (basically zero out all coefficients),
allocate a packet and encode it. There is no coupling between the
bitrate and the allocation size and so no crashes because the buffer
isn't large enough.
The maximum quantizer was raised to the size of the table now to both
keep the overshoot at ridiculous bitrates low and to improve quality
with higher bit depths (since the coefficients grow larger per transform
quantizing them to the same relative level requires larger quantization
indices).
Since the quantization index start follows the previous quantization
index for that slice, the quantization step was reduced to a static 1
to improve performance. Previously with quant/5 the step was usually
set to 0 upon start (and was later clipped to 1), that isn't a big change.
As the step size increases so does the amount of bits leftover and so
the redistribution algorithm has to iterate more and thus waste more
time.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
These macros were added in OS X 10.11, and the file compiles without warnings
on both 10.10 and 10.11 with them removed.
Thanks to mark4o on IRC for pointing out the failure and testing the patch.
Autodetected by default. Encode using -codec:v h264_videotoolbox.
Signed-off-by: Rick Kern <kernrj@gmail.com>
Signed-off-by: wm4 <nfxjfg@googlemail.com>
It makes no sense whatsoever to do this at each function call; we
already have a table for this.
Yields a 2x improvement in find_min_book (x86-64, Haswell+GCC):
ffmpeg -i sin.flac -acodec aac -y sin.aac
find_min_book
old
605 decicycles in find_min_book, 8388453 runs, 155 skips.9x
606 decicycles in find_min_book,16776912 runs, 304 skips.9x
607 decicycles in find_min_book,33553819 runs, 613 skips.2x
607 decicycles in find_min_book,67107668 runs, 1196 skips.3x
607 decicycles in find_min_book,134215360 runs, 2368 skips3x
new
359 decicycles in find_min_book, 8388552 runs, 56 skips.3x
360 decicycles in find_min_book,16777112 runs, 104 skips.1x
361 decicycles in find_min_book,33554218 runs, 214 skips.4x
361 decicycles in find_min_book,67108381 runs, 483 skips.5x
361 decicycles in find_min_book,134216725 runs, 1003 skips5x
and more importantly a non-negligible speedup (~ 8%) to overall AAC encoding:
old:
ffmpeg -i sin.flac -acodec aac -strict -2 -y sin_new.aac 6.82s user 0.03s system 104% cpu 6.565 total
new:
ffmpeg -i sin.flac -acodec aac -strict -2 -y sin_old.aac 6.24s user 0.03s system 104% cpu 5.993 total
This also improves accuracy of the expression by ~ 2 ulp in some cases.
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanag@gmail.com>
This fixes a data race warning by ThreadSanitizer.
FrameThreadContext.die is read by all the worker threads but is not
protected by any mutex. Move it to PerThreadContext so that each worker
thread reads its own copy of |die|, which can then be protected with
PerThreadContext.mutex.
Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This was a regression introduced by commit e7345abe05 which
enabled full use of the allocated packet but due to the overhead of
using field coding the buffer was too small and triggered warnings and
crashes.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The fact that now all quantization indices costs are cached justifies
storing 20 more integers in a structure already allocated on heap.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit redistributes the leftover bytes amongst the top 150 slices
in terms of size (in the hopes that they'll be the ones pretty bitrate
starved).
A more perceptual method would probably need to cut bits off from slices
which don't need much, but that'll be implemented later.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Previously a global average was used. Using the previous quantizer
resulted in a fairly significant speedup as slice size selection settled
down quicker.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
16 bits were definitely not enough and caused artifacts to appear on
images at barely compressed images.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Check that the required plane pointers and only
those are set up.
Currently does not enforce anything for the palette
pointer of pseudopal formats as I am unsure about the
requirements.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
We do neither document nor check such a requirement
and for application-provided get_buffer2 they could
contain the result of a malloc(0) or whatever value
they had previously.
This fixes a use-after-free in e.g. MPlayer:
https://trac.mplayerhq.hu/ticket/2262
We might want to consider changing the (documented)
API in addition though.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Reported as https://trac.mplayerhq.hu/ticket/2264 but have
not been able to reproduce with FFmpeg-only.
I have no idea what coded_height is used for here exactly,
so this might not be the best fix.
Fixes the following chain of events:
ff_mss12_decode_init sets coded_height while not setting height.
ff_mpv_decode_init then copies coded_height into MpegEncContext height.
This is then used by init_context_frame to allocate the data structures.
However the wmv9rects are validated/initialized based on avctx->height, not
avctx->coded_height.
Thus the decode_wmv9 function will try to decode a larger video that we
allocated data structures for, causing out-of-bounds writes.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>