Martin Storsjö
cdb1665f70
aarch64: Make transpose_4x4H do a regular transpose
Previously, ff_h264_idct_add_neon (originally in the arm version) used
a non-regular transpose in order to be able to use more instructions
that deal with registers as 128 bit register pairs. The aarch64
translation doesn't do it to the same extent, but brought along the
same structure since it was a straight translation.
This reshuffles ff_h264_idct_add_neon, bringing it closer to
the C implementation, making the transpose_4x4H macro do a regular
transpose, usable for other algorithms as well.
Previously, the third and fourth output from transpose_4x4H were
swapped, and prior to cc29d96d5a, the same inputs as well. In
addition to just swapping the outputs, also renumber the intermediate
registers for better readability (making the register order match
transpose_4x8B).
This runs with the same number of cycles as before.
Signed-off-by: Martin Storsjö <martin@martin.st>
9 years ago
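To make the difference described in the commit message concrete, here is a minimal scalar sketch in C. It models only the data movement, not the NEON macro itself. The function names transpose_4x4_regular and transpose_4x4_swapped are hypothetical and do not appear in the source tree. The "swapped" variant assumes the earlier behaviour was an ordinary transpose with the third and fourth outputs exchanged, which is how the message describes it.

```c
#include <stdint.h>

/* Hypothetical scalar model of a regular 4x4 transpose of 16-bit
 * elements: output row c holds input column c. */
static void transpose_4x4_regular(const int16_t in[4][4], int16_t out[4][4])
{
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            out[c][r] = in[r][c];
}

/* Hypothetical model of the earlier, non-regular behaviour as the commit
 * message describes it: the same transpose, but with the third and
 * fourth outputs exchanged. */
static void transpose_4x4_swapped(const int16_t in[4][4], int16_t out[4][4])
{
    /* Input column index -> output row index; columns 2 and 3 trade places. */
    static const int row_map[4] = { 0, 1, 3, 2 };

    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            out[row_map[c]][r] = in[r][c];
}
```

A caller written against the swapped variant has to compensate for the exchanged outputs. With the regular variant the outputs come back in natural order, which is what makes such a macro reusable by other algorithms.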
Makefile                  | fft: Split MDCT bits off from FFT | 9 years ago
asm-offsets.h             | arm64: port synth_filter_float_neon from arm | 9 years ago
cabac.h                   | aarch64: get_cabac inline asm | 11 years ago
dcadsp_init.c             | dca: remove unused decode_hf function and quant_d tables | 9 years ago
dcadsp_neon.S             | dca: remove unused decode_hf function and quant_d tables | 9 years ago
fft_init_aarch64.c        | fft: Split MDCT bits off from FFT | 9 years ago
fft_neon.S                | aarch64: Use .data.rel.ro for const data with relocations | 10 years ago
fmtconvert_init.c         | arm64: int32_to_float_fmul neon asm | 9 years ago
fmtconvert_neon.S         | arm64: int32_to_float_fmul neon asm | 9 years ago
h264chroma_init_aarch64.c | aarch64: h264 chroma motion compensation NEON optimizations | 11 years ago
h264cmc_neon.S            | h264: avoid using uninitialized memory in NEON chroma mc | 11 years ago
h264dsp_init_aarch64.c    | aarch64: h264 (bi)weight NEON optimizations | 11 years ago
h264dsp_neon.S            | aarch64: h264 (bi)weight NEON optimizations | 11 years ago
h264idct_neon.S           | aarch64: Make transpose_4x4H do a regular transpose | 9 years ago
h264pred_init.c           | h264: aarch64: intra prediction optimisations | 10 years ago
h264pred_neon.S           | h264: aarch64: intra prediction optimisations | 10 years ago
h264qpel_init_aarch64.c   | arm64: constify src in h264qpel dsp function definitions | 10 years ago
h264qpel_neon.S           | aarch64: h264 qpel NEON optimizations | 11 years ago
hpeldsp_init_aarch64.c    | aarch64: hpeldsp NEON optimizations | 11 years ago
hpeldsp_neon.S            | aarch64: hpeldsp NEON optimizations | 11 years ago
imdct15_init.c            | opus: Factor out imdct15 into a standalone component | 10 years ago
imdct15_neon.S            | opus: Factor out imdct15 into a standalone component | 10 years ago
mdct_init.c               | fft: Split MDCT bits off from FFT | 9 years ago
mdct_neon.S               | aarch64: NEON float (i)MDCT | 11 years ago
mpegaudiodsp_init.c       | aarch64: NEON fixed/floating point MPADSP apply_window | 11 years ago
mpegaudiodsp_neon.S       | aarch64: add ',' between assembler macro arguments where missing | 11 years ago
neon.S                    | aarch64: Make transpose_4x4H do a regular transpose | 9 years ago
neontest.c                | aarch64: port neon clobber test from arm | 11 years ago
rv40dsp_init_aarch64.c    | aarch64: h264 chroma motion compensation NEON optimizations | 11 years ago
synth_filter_neon.S       | arm64: port synth_filter_float_neon from arm | 9 years ago
vc1dsp_init_aarch64.c     | aarch64: h264 chroma motion compensation NEON optimizations | 11 years ago
videodsp.S                | aarch64: implement videodsp.prefetch | 11 years ago
videodsp_init.c           | aarch64: implement videodsp.prefetch | 11 years ago
vorbisdsp_init.c          | aarch64: NEON vorbis_inverse_coupling | 11 years ago
vorbisdsp_neon.S          | aarch64: NEON vorbis_inverse_coupling | 11 years ago