Reimar Döffinger
38cd829dce
aarch64: Implement stack spilling in a consistent way.
...
Currently it is done in several different ways, which
might cause needless dependencies or, in the case of
tx_float_neon.S, is incorrect.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2 years ago
J. Dekker
3c694967f8
lavc/aarch64: hevc_sao reschedule slightly
...
Signed-off-by: J. Dekker <jdek@itanimul.li>
3 years ago
J. Dekker
2e832be322
lavc/aarch64: add hevc sao edge 8x8
...
bench on AWS Graviton:
hevc_sao_edge_8x8_8_c: 516.0
hevc_sao_edge_8x8_8_neon: 81.0
Signed-off-by: J. Dekker <jdek@itanimul.li>
3 years ago
J. Dekker
92f67e4017
lavc/aarch64: add hevc sao edge 16x16
...
bench on AWS Graviton:
hevc_sao_edge_16x16_8_c: 1857.0
hevc_sao_edge_16x16_8_neon: 211.0
hevc_sao_edge_32x32_8_c: 7802.2
hevc_sao_edge_32x32_8_neon: 808.2
hevc_sao_edge_48x48_8_c: 16764.2
hevc_sao_edge_48x48_8_neon: 1796.5
hevc_sao_edge_64x64_8_c: 32647.5
hevc_sao_edge_64x64_8_neon: 3118.5
Signed-off-by: J. Dekker <jdek@itanimul.li>
3 years ago
J. Dekker
d957ee34a6
lavc/aarch64: fix hevc sao band filter
...
The SAO band filter can be called with non-multiples of 8; we round up
to the nearest multiple of 8 to account for this.
Signed-off-by: J. Dekker <jdek@itanimul.li>
3 years ago
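The rounding described in the commit above can be illustrated with a minimal C sketch. This is not the NEON code from the commit: the helper name `round_up_to_8` is hypothetical, and the log does not say which dimension is rounded, so the argument is treated as a generic non-negative size.

```c
#include <assert.h>

/* Hypothetical helper: round a non-negative size up to the next
 * multiple of 8. Adding 7 and clearing the low three bits works
 * because 8 is a power of two. */
static int round_up_to_8(int n)
{
    assert(n >= 0);
    return (n + 7) & ~7;
}
```

With this helper, a size of 12 would be processed as 16, while a size that is already a multiple of 8 is left unchanged.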
Martin Storsjö
16fba44b4d
Revert "lavc/aarch64: add hevc sao edge 16x16"
...
This reverts commit a9214a2ca3, as it breaks fate-hevc.
Signed-off-by: Martin Storsjö <martin@martin.st>
3 years ago
Martin Storsjö
df48b1d06f
Revert "lavc/aarch64: add hevc sao edge 8x8"
...
This reverts commit c97ffc1a77, as it breaks fate-hevc.
Signed-off-by: Martin Storsjö <martin@martin.st>
3 years ago
Martin Storsjö
cafed377eb
Revert "lavc/aarch64: add hevc sao band 8x8 tiling"
...
This reverts commit f63f9be37c, as it breaks fate-hevc.
Signed-off-by: Martin Storsjö <martin@martin.st>
3 years ago
J. Dekker
f63f9be37c
lavc/aarch64: add hevc sao band 8x8 tiling
...
bench on AWS Graviton:
hevc_sao_band_8x8_8_c: 317.5
hevc_sao_band_8x8_8_neon: 97.5
hevc_sao_band_16x16_8_c: 1115.0
hevc_sao_band_16x16_8_neon: 322.7
hevc_sao_band_32x32_8_c: 4599.2
hevc_sao_band_32x32_8_neon: 1246.2
hevc_sao_band_48x48_8_c: 10021.7
hevc_sao_band_48x48_8_neon: 2740.5
hevc_sao_band_64x64_8_c: 17635.0
hevc_sao_band_64x64_8_neon: 4875.7
Signed-off-by: J. Dekker <jdek@itanimul.li>
3 years ago
J. Dekker
89a2ed4a8b
lavc/aarch64: clean-up sao band 8x8 function formatting
...
Signed-off-by: J. Dekker <jdek@itanimul.li>
3 years ago
J. Dekker
c97ffc1a77
lavc/aarch64: add hevc sao edge 8x8
...
bench on AWS Graviton:
hevc_sao_edge_8x8_8_c: 516.0
hevc_sao_edge_8x8_8_neon: 81.0
Signed-off-by: J. Dekker <jdek@itanimul.li>
3 years ago
J. Dekker
a9214a2ca3
lavc/aarch64: add hevc sao edge 16x16
...
bench on AWS Graviton:
hevc_sao_edge_16x16_8_c: 1857.0
hevc_sao_edge_16x16_8_neon: 211.0
hevc_sao_edge_32x32_8_c: 7802.2
hevc_sao_edge_32x32_8_neon: 808.2
hevc_sao_edge_48x48_8_c: 16764.2
hevc_sao_edge_48x48_8_neon: 1796.5
hevc_sao_edge_64x64_8_c: 32647.5
hevc_sao_edge_64x64_8_neon: 3118.5
Signed-off-by: J. Dekker <jdek@itanimul.li>
3 years ago
Josh Dekker
7ac41e0db2
lavc/aarch64: add HEVC sao_band NEON
...
Only works for 8x8.
Signed-off-by: Josh Dekker <josh@itanimul.li>
4 years ago