Ronald S. Bultje
7c62891efe
vp9lpf/x86: save one register in SIGN_ADD/SUB.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Ronald S. Bultje
c6375a83d1
vp9lpf/x86: store unpacked intermediates for filter6/14 on stack.
filter16 goes from 508 to 482 (h) or 346 to 314 (v) cycles; filter88
goes from 240 to 238 (h) or 174 to 165 (v) cycles, measured on TOS.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Ronald S. Bultje
4ce8ba72f9
vp9lpf/x86: move variable assigned inside macro branch.
The value is not used outside the branch.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Ronald S. Bultje
e4961035b2
vp9lpf/x86: simplify ABSSUM_CMP by inverting the comparison meaning.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Ronald S. Bultje
683da2788e
vp9lpf/x86: remove unused register from ABSSUB_CMP macro.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Ronald S. Bultje
6e74e9636b
vp9lpf/x86: slightly simplify 44/48/84/88 h stores.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Ronald S. Bultje
6411c328a2
vp9lpf/x86: make cglobal statement more conservative in register allocation.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Ronald S. Bultje
a6e288d624
vp9lpf/x86: save one register in loopfilter surface coverage.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Clément Bœsch
0ed21bdc9e
vp9lpf/x86: add ff_vp9_loop_filter_[vh]_44_16_{sse2,ssse3,avx}.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Clément Bœsch
f2e3d706a1
vp9lpf/x86: add ff_vp9_loop_filter_h_{48,84}_16_{sse2,ssse3,avx}().
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
James Almer
92d47550ea
vp9lpf/x86: add an SSE2 version of vp9_loop_filter_[vh]_88_16
Similar gains to the SSSE3 version once again.
Additional improvements by Clément Bœsch <u@pkh.me>.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Clément Bœsch
6bea478158
vp9lpf/x86: add ff_vp9_loop_filter_[vh]_88_16_{ssse3,avx}.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
James Almer
1f451eed60
vp9lpf/x86: add ff_vp9_loop_filter_[vh]_16_16_sse2().
Performance gains similar to the SSSE3 version.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Clément Bœsch
a692724c58
vp9lpf/x86: add x86 SSSE3/AVX SIMD for vp9_loop_filter_[vh]_16_16.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Ronald S. Bultje
a451324ddd
vp9: ignore reference segmentation map if error_resilience flag is set.
Fixes ffvp9_fails_where_libvpx.succeeds.webm.
Bug-Id: ffmpeg/3849.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
8 years ago
Carl Eugen Hoyos
c19830aa2c
rscc: Support palette format
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
8 years ago
Vittorio Giovara
b8d5070db6
avcodec: Document AV_PKT_DATA_PALETTE side data type
8 years ago
Mark Thompson
5a5df90d9c
vaapi_h265: Add main 10 encode support
8 years ago
Mark Thompson
b8cac1e830
vaapi_h265: Fix buffering parameters
A decoder may need this to be set correctly to output frames in the
right order.
8 years ago
Mark Thompson
fc30a90898
vaapi_h265: Fix slice header writing
This was not observed earlier because the only syntax element which
it normally misses with the current setup is slice_qp_delta, but that
is always going to be zero (in IDR frames QP isn't varied on the
slice) which will always exp-golomb code as a single 1 bit. The
immediately following part is the byte alignment, which is always a 1
bit followed by 0s which are ignored, so as long as the bitstream is
never aligned at that point we will never notice because the only
difference is that an ignored bit is a 1 instead of a 0.
8 years ago
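The slice-header fix above rests on how Exp-Golomb coding treats zero: a signed value of 0 (such as slice_qp_delta in that scenario) is coded as a single 1 bit. The following is a minimal sketch of the ue(v)/se(v) mapping, not the encoder's actual bitstream writer; the function names and the string-of-bits output are hypothetical and chosen only to make the codes visible.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the unsigned Exp-Golomb code ue(v) for `value`
 * into `out` as a NUL-terminated string of '0'/'1' characters.
 * Returns the number of bits written, or 0 if `out` is too small.
 * Assumes value < UINT32_MAX so that value + 1 fits comfortably. */
static size_t ue_golomb_bits(uint32_t value, char *out, size_t out_size)
{
    uint64_t code = (uint64_t)value + 1;   /* ue(v) transmits value + 1 */
    int nbits = 0;
    for (uint64_t t = code; t; t >>= 1)
        nbits++;                           /* length of code in binary */

    size_t total = 2 * (size_t)nbits - 1;  /* nbits-1 zeros, then the code */
    if (total + 1 > out_size)
        return 0;

    size_t pos = 0;
    for (int i = 0; i < nbits - 1; i++)
        out[pos++] = '0';                  /* leading-zero prefix */
    for (int i = nbits - 1; i >= 0; i--)
        out[pos++] = ((code >> i) & 1) ? '1' : '0';
    out[pos] = '\0';
    return total;
}

/* Hypothetical helper: signed Exp-Golomb se(v).
 * Positive k maps to 2k-1, zero and negative k map to -2k.
 * Assumes value > INT32_MIN. */
static size_t se_golomb_bits(int32_t value, char *out, size_t out_size)
{
    uint32_t mapped = value > 0
        ? 2u * (uint32_t)value - 1u
        : 2u * (uint32_t)(-(int64_t)value);
    return ue_golomb_bits(mapped, out, out_size);
}
```

With this mapping, se(0) yields the one-bit string "1", which matches the first bit of the byte-alignment pattern described in the commit message, so a writer that skipped the element would produce the same leading bit.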
Mark Thompson
ec17ab381e
vaapi_h264: Write bitstream restriction fields
8 years ago
Mark Thompson
17a0f9481c
vaapi_h264: Fix CFR mode with frame_rate set in AVCodecContext
8 years ago
Mark Thompson
314b421dd8
vaapi_encode: Decide on GOP setup before initialising sequence parameters
This was always too late; several fields related to it have been incorrectly
zero since the encoder was added.
8 years ago
Anton Khirnov
59c7022740
pthread_frame: use atomics for frame progress
8 years ago
Anton Khirnov
64a31b2854
pthread_frame: use atomics for PerThreadContext.state
8 years ago
Anton Khirnov
db2733256d
pthread_frame: use a thread-safe way for signalling threads to die
Current code uses a plain int in a racy way, which is UB.
8 years ago
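The pthread_frame commits above replace plain ints shared between threads with atomics. As a rough illustration of the idea only, here is a sketch using C11 <stdatomic.h>; the WorkerContext type, its fields, and the function names are hypothetical and are not the actual PerThreadContext layout, which also involves mutexes and condition variables.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-thread context, loosely modelled on the idea of a
 * "die" flag plus a state counter that other threads may read. */
typedef struct WorkerContext {
    atomic_int die;      /* set to 1 by the owner to request shutdown */
    atomic_int state;    /* progress counter, readable from other threads */
    int max_iterations;  /* safety bound so this sketch always terminates */
} WorkerContext;

static void worker_init(WorkerContext *w, int max_iterations)
{
    atomic_init(&w->die, 0);
    atomic_init(&w->state, 0);
    w->max_iterations = max_iterations;
}

/* Called by the owning thread. The release store pairs with the acquire
 * load in worker_main(), so no data race occurs on the flag. */
static void worker_request_die(WorkerContext *w)
{
    atomic_store_explicit(&w->die, 1, memory_order_release);
}

/* Thread entry point with a pthread-compatible signature. */
static void *worker_main(void *arg)
{
    WorkerContext *w = arg;
    for (int i = 0; i < w->max_iterations; i++) {
        if (atomic_load_explicit(&w->die, memory_order_acquire))
            break;                       /* owner asked us to stop */
        atomic_fetch_add_explicit(&w->state, 1, memory_order_release);
    }
    return NULL;
}
```

In a real program worker_main would be passed to pthread_create() and the owner would call worker_request_die() before pthread_join(). With a plain int in place of atomic_int, the concurrent read and write would be a data race, which the C11 memory model defines as undefined behaviour.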
Anton Khirnov
8385ba53f1
mmaldec: convert to stdatomic
8 years ago
Luca Barbato
b015872c0d
huffyuvdsp: Enable the altivec code for PPC little-endian as well
Confirmed to work by checkasm.
8 years ago
Luca Barbato
1d25a86902
huffyuvdsp: Reenable PPC optimizations
8 years ago
Anton Khirnov
5bf2454e7c
h264dec: support broken files with mp4 extradata/annex b data
Bug-Id: 966
8 years ago
Justin Ruggles
b57e38f52c
ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm
Adds a wrapper function for downmixing which detects channel count changes
and updates the selected downmix function accordingly.
Simplification and porting to current x86inc infrastructure by Diego Biurrun.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
8 years ago
Justin Ruggles
a9ba59591e
ac3dsp: Add some special-case handling for the C downmix function
This is about 200% faster for in-decoder downmixing of 5.0 and 5.1 content.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
8 years ago
Justin Ruggles
43717469f9
ac3dsp: Reverse matrix in/out order in downmix()
Also use (float **) instead of (float (*)[2]). This matches the matrix
layout in libavresample so we can reuse assembly code between the two.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
8 years ago
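The downmix() change above swaps the matrix index order and passes it as (float **). The sketch below shows one plausible reading, assuming the new layout is matrix[out][in] (the old inner dimension of 2 presumably being the stereo output); the function name, the MAX_OUT_CH limit, and the in-place output convention are assumptions for illustration, not the actual ac3dsp code.

```c
#include <stddef.h>

#define MAX_OUT_CH 8  /* hypothetical upper bound for this sketch */

/* Hypothetical generic downmix: samples[ch][i] holds planar input,
 * matrix[out][in] holds the mixing coefficients (assumed layout).
 * The result overwrites the first out_ch channels of `samples`. */
static void downmix_sketch(float **samples, float **matrix,
                           int out_ch, int in_ch, int len)
{
    float acc[MAX_OUT_CH];

    if (out_ch > MAX_OUT_CH)
        return;

    for (int i = 0; i < len; i++) {
        /* Compute every output first so in-place writes cannot
         * clobber inputs that are still needed for this sample. */
        for (int o = 0; o < out_ch; o++) {
            acc[o] = 0.0f;
            for (int c = 0; c < in_ch; c++)
                acc[o] += samples[c][i] * matrix[o][c];
        }
        for (int o = 0; o < out_ch; o++)
            samples[o][i] = acc[o];
    }
}
```

Using pointer-to-pointer arguments means each matrix row can live anywhere in memory, which is what lets two libraries with the same row layout share one assembly routine.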
Hendrik Leppkes
8d1267932c
x86/h264_weight: use appropriate register size for weight parameters
This fixes decoding corruption on 64 bit windows.
Signed-off-by: Martin Storsjö <martin@martin.st>
8 years ago
Diego Biurrun
2caa93b813
mpegaudiodsp: Change type of array stride parameters to ptrdiff_t
This avoids SIMD-optimized functions having to sign-extend their
stride argument manually to be able to do pointer arithmetic.
8 years ago
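The stride commits in this series share one rationale: a stride is added to a pointer, so it should already be pointer-sized and signed. On a 64-bit target an int stride arrives as a 32-bit value and has to be sign-extended before the addition, whereas a ptrdiff_t can be used directly. The sketch below shows the C-side convention; copy_rows is a hypothetical helper, not a function from the codebase.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy `height` rows of `width` bytes.
 * Strides are ptrdiff_t, so they may be negative (bottom-up images)
 * and can be added to a pointer without any widening step. */
static void copy_rows(uint8_t *dst, ptrdiff_t dst_stride,
                      const uint8_t *src, ptrdiff_t src_stride,
                      size_t width, int height)
{
    for (int y = 0; y < height; y++) {
        /* y is converted to ptrdiff_t by the usual arithmetic
         * conversions, so the offset is computed at pointer width. */
        memcpy(dst + y * dst_stride, src + y * src_stride, width);
    }
}
```

For example, passing a pointer to the last source row together with a negative src_stride copies the rows in reverse order, which flips the image vertically.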
Diego Biurrun
15b4f494fc
mss*: Change type of array stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
8 years ago
Diego Biurrun
a339e919ca
ea: Change type of array stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
8 years ago
Diego Biurrun
ba479f3daa
hevc: Change type of array stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
8 years ago
Diego Biurrun
e4a94d8b36
h264chroma: Change type of stride parameters to ptrdiff_t
This avoids SIMD-optimized functions having to sign-extend their
stride argument manually to be able to do pointer arithmetic.
8 years ago
Diego Biurrun
2ec9fa5ec6
idct: Change type of array stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
8 years ago
Diego Biurrun
b2939a7527
blockdsp: Change type of array stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
8 years ago
Diego Biurrun
3281d823cd
intrax8: Change type of array stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
Also rename all such parameters to "stride" for consistency.
8 years ago
Diego Biurrun
92c5755a18
hpeldsp: arm: Update comments left behind in 25841dfe80
8 years ago
Diego Biurrun
009adfd4fb
x86: fpel: Remove unnecessary sign extend
8 years ago
Mark Thompson
956a54129d
vaapi_h264: Set max_num_ref_frames to 1 when not using B frames
8 years ago
Mark Thompson
086e4b58b5
vaapi_encode: Sync to input surface rather than output
While outwardly bizarre, this change makes the behaviour consistent
with other VAAPI encoders which sync to the encode /input/ picture in
order to wait for /output/ from the encoder. It is not harmful on
i965 (because synchronisation already happens in vaRenderPicture(),
so it has no effect there), and it allows the encoder to work on
mesa/gallium which assumes this behaviour.
8 years ago
Mark Thompson
892bbbcdc1
vaapi_encode: Check packed header capabilities
This improves behaviour with drivers which do not support packed
headers, such as AMD VCE on mesa/gallium.
8 years ago
Mark Thompson
80a5d05108
vaapi_encode: Refactor initialisation
This allows better checking of capabilities and will make it easier
to add more functionality later.
It also commonises some duplicated code around rate control setup
and adds more comments explaining the internals.
8 years ago
Anton Khirnov
7bf8db4db6
tdsc: use the new decoding API
8 years ago
Anton Khirnov
de2ae3c1fa
lavc: add clobber tests for the new encoding/decoding API
8 years ago