The current design, where
- proper init is called for the first per-thread context
- first thread's private data is copied into private data for all the
other threads
- a "fixup" function is called for all the other threads to e.g.
allocate dynamically allocated data
is very fragile and hard to follow, so it is abandoned. Instead, the
same init function is used to init each per-thread context. Where
necessary, AVCodecInternal.is_copy can be used to differentiate between
the first thread and the other ones (e.g. for decoding the extradata
just once).
This commit uses smaller types for some static const arrays to reduce
their size in case the entries can be represented in the smaller type.
The biggest savings came from inv_map_table in vp9.c.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Normally the aic decoder finds the proper slice combination (multiple of
some number less than 32) but in case of odd width, it resorts to the
default values, which were actually swapped.
The number of slices is modified to account for such odd width cases.
CC: libav-stable@libav.org
For some unclear reason Apple decided to use the same scan tables for luma and
chroma in the progressive mode while using different ones for luma in the
interlaced mode.