The new API requires an extra array member at the very end,
which old API users did not do.
This disables in-place RDFT transforms and instead
does the transform out of place by copying once, there shouldn't
be a significant loss of speed as our in-place FFT requires a reorder
which is likely more expensive in the majority of cases to do.
These inline implementations of AV_COPY64, AV_SWAP64 and AV_ZERO64
are known to clobber the FPU state - which has to be restored
with the 'emms' instruction afterwards.
This was known and signaled with the FF_COPY_SWAP_ZERO_USES_MMX
define, which calling code seems to have been supposed to check,
in order to call emms_c() after using them. See
0b1972d409,
29c4c0886d and
df215e5758 for history on earlier
fixes in the same area.
However, new code can use these AV_*64() macros without knowing
about the need to call emms_c().
Just get rid of these dangerous inline assembly snippets; this
doesn't make any difference for 64 bit architectures anyway.
Signed-off-by: Martin Storsjö <martin@martin.st>
The previous assumption that DXV needs to be aligned to 16x16 was
erroneous. 4x4 works just as well, and FATE decoder tests pass for all
texture formats.
On the encoder side, we should reject input that isn't 4x4 aligned,
like the HAP encoder does, and stop aligning to 16x16. This both solves
the uninitialized reads causing current FATE tests to fail and produces
smaller encoded outputs.
With regard to correctness, I've checked the decoding path by encoding a
real-world sample with git master, and decoding it with
ffmpeg -i dxt1-master.mov -c:v rawvideo -f framecrc -
The results are exactly the same between master and this patch.
On the encoding side, I've encoded a real-world sample with both master
and this patch, and decoded both versions with
ffmpeg -i dxt1-{master,patch}.mov -c:v rawvideo -f framecrc -
Under this patch, results for both inputs are exactly the same.
In other words, the extra padding gained by 16x16 alignment over 4x4
alignment has no impact on decoded video.
Signed-off-by: Connor Worley <connorbworley@gmail.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
SDL supports only these three matrices. Actually, it only supports these
three combinations: BT.601+JPEG, BT.601+MPEG, BT.709+MPEG, but we have
no way to restrict the specific *combination* of YUV range and YUV
colorspace with the current filter design.
See-Also: https://trac.ffmpeg.org/ticket/10839
Instead of an incorrect conversion result, trying to play a YCgCo file
with ffplay will simply error out with a "No conversion possible" error.
An oversight in my previous series. This omission slipped under the
radar because fftools/ffmpeg_filter.c did not use these options, instead
preferring to insert an explicit format filter.
FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs,
which then end up heap-allocated.
By declaring any variable in a struct, or tree of structs, to be 32 byte
aligned, it allows the compiler to safely assume the entire struct
itself is also 32 byte aligned.
This might make the compiler emit code which straight up crashes or
misbehaves in other ways, and at least in one instances is now
documented to actually do (see ticket 10549 on trac).
The issue there is that an unrelated variable in SingleChannelElement is
declared to have an alignment of 32 bytes. So if the compiler does a copy
in decode_cpe() with avx instructions, but ffmpeg is built with
--disable-avx, this results in a crash, since the memory is only 16 byte
aligned.
Mind you, even if the compiler does not emit avx instructions, the code
is still invalid and could misbehave. It just happens not to. Declaring
any variable in a struct with a 32 byte alignment promises 32 byte
alignment of the whole struct to the compiler.
This patch limits the maximum alignment to the maximum possible simd
alignment according to configure.
While not perfect, it at the very least gets rid of a lot of UB, by
matching up the maximum DECLARE_ALIGNED value with the alignment of heap
allocations done by lavu.
Forgot to do this with the previous commit.
Actually makes the assembly being used.
Still the fastest FFT in the world, 15% faster than FFTW on the
largest available size.
The demuxer opens an internal parser instance in read_timestamp(), which
requires a codec context. There is no need for it to access the FFStream
one which is used for other purposes, it can allocate its own internal
one.
This check has survived the transition to AVCodecParameters, but is no
longer relevant after it, since the codec context is no longer updated
or accessed at all from the demuxer.
It does not use the AVFormatContext at all.
Reviewed-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Resetting the counter of used elements is enough as nothing is
ever read from the currently unused elements.
Reviewed-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The rcwt muxer uses several counters for how much data
it has already cached: One byte counter and one counter
for how many complete blocks (of three bytes each).
These counters can become inconsistent when the muxer is
fed incomplete blocks as the muxer presumes that it is
about to write a new block at the start of each write_packet
call. E.g. sending 65535*3+1 1-byte packets (with data[0] e.g. 0x03)
will trigger an out-of-bounds write.
This patch fixes this by processing the data in complete blocks
only. This also allows to simplify the code, e.g. to remove one of
the counters.
Reviewed-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are not intended for decoders (for which there is the get_format
callback in case the user has a choice).
Also note that the list was wrong for MPEG4, because it did not contain
the high bit depth pixel formats used for studio profiles.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are not intended for decoders (for which there is the get_format
callback in case the user has a choice).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is currently called once in the codecs' init function
and once when (re)initializing the VC-1 decode context
(which happens upon frame size changes as well as before
decoding the first frame). The first one is unnecessary
now that vc1_decode_frame() no longer requires avctx->hwaccel
to be already set for hwaccel to work properly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
VC-1 uses a 0x03 escaping scheme like H.26x and our decoder
unescapes data for this purpose, but hardware accelerations
just want the data as-is and therefore get fed the original
data. The pointers to the actual data are only setcorrectly
if avctx->hwaccel is set (after all, they are only used in
this case).
There are two problems with this: The first is that the branch
is pointless; the second is that it is harmful, because
a hardware acceleration may be added after the packet has been
parsed (in case there is a reconfiguration e.g. due to frame
size changes) in which case decoding the first few frames
won't work.
So delete these branches.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is entirely unnecessary to use it given that all decoders
here share the same set of supported pixel formats. So just
hardcode this list.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AVCodec.pix_fmts is only intended for encoders (decoders use
the get_format callback to let the user choose a pix fmt).
So remove them for the decoders for which this is possible
without further complications; keep them for now in the codecs
that actually use them (by passing avctx->codec->pix_fmts to
ff_get_formatt()).
Also notice that some of these lists were wrong; e.g.
317b7b06fd added support for YUV444P16
for cuviddec, but forgot to add it to pix_fmts.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This bug causes the DXT5 decoder to produce incorrect block texture data.
After the fix, textures are visually correct and match data decoded by
Resolume Alley (extracted with Nvida Nsight for comparison). Current FATE DXT5
samples did not cover this case.
Signed-off-by: Connor Worley <connorbworley@gmail.com>
DXV files seem to misnomer DXT5 and really encode DXT4 with
premultiplied alpha. At least, this is what Resolume alley does.
To check, encode some input with alpha as "Normal Quality, With Alpha"
in Alley, then decode the output with this change -- results are true
to the original input compared to git-master.
Signed-off-by: Connor Worley <connorbworley@gmail.com>
The runtime doesn't set the frame type to MFX_FRAMETYPE_IDR on the
returned mfx bitstream for a keyframe, it set the frame type to
MFX_FRAMETYPE_I only. This patch added workaround for VP9 keyframe to
make the coded stream seekable.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The VC1_D2010 profile, also known as VC1_VLD2010, has the same functionality
and specification as the VC1_D profile. Support for this profile serves only
as a positive indication that the accelerator has been designed with awareness
of the modifications specified in the August 2010 version of this specification.
Hardware accelerator drivers that expose support for this profile must not
also expose the previously specified VC1_D GUID, unless the accelerator works
properly with existing software decoders that use VC1_D and that do not incorporate
the corrections added to the August 2010 version of this specification.
As a result, we could give VC1_VLD2010 a higher priority and initialize
it first.
Signed-off-by: Aleksoid <Aleksoid1978@mail.ru>
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Ensure all hwaccels that allocate a buffer use NVDECContext->bitstream_internal
instead. Otherwise, if FFHWAccel->end_frame() isn't called before
FFHWAccel->uninit(), an attempt to free a stale pointer to memory not owned by
the hwaccel could take place.
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: James Almer <jamrial@gmail.com>
This can be achieved by using AV_OPT_TYPE_FLAGS instead of
AV_OPT_TYPE_STRING. It also avoids strcmp() when accessing
the option.
Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It uses the int64_t instead of the double member.
(This code can currently not be reached: av_opt_get() calls
av_opt_find2() with NULL as unit in which case AV_OPT_TYPE_CONST
options are never returned, leading av_opt_get() to always
return AVERROR_OPTION_NOT_FOUND when searching for AV_OPT_TYPE_CONST*.
For the same reason the code read_number() will never be called
from get_number() when searching for an option of type
AV_OPT_TYPE_CONST. The other callers of read_number() also only
call it with types other than AV_OPT_TYPE_CONST.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We don't need to include fifo.h, because we don't need AVFifo
as a complete type. Also add the other used headers directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Besides being extremly simple this also avoids including
ff_ccfifo_ccdetected() unnecessarily (it is only used by decklink).
This is possible because this is not avpriv, but duplicated into
lavd if necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The muxer's AVCodecContext is currently used for exactly one thing:
To store a time base in it that has been derived via heuristics
in avformat_transfer_internal_stream_timing_info(); said time base
can then be read back via av_stream_get_codec_timebase().
But one does not need a whole AVCodecContext for that, a simple
AVRational is enough.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For muxers, the internal AVCodecContext is basically unused
except in avformat_transfer_internal_stream_timing_info()
(which sets time_base and ticks_per_frame) and
av_stream_get_codec_timebase() (a getter for time_base).
This makes ticks_per_frame write-only, so don't set it.
Also remove an always-false check for the AVCodecContext's
codec_tag.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>