mirror of https://github.com/FFmpeg/FFmpeg.git
Tag:
Branch:
Tree:
a5c0ed2122
master
oldabi
release/0.10
release/0.11
release/0.5
release/0.6
release/0.7
release/0.8
release/0.9
release/1.0
release/1.1
release/1.2
release/2.0
release/2.1
release/2.2
release/2.3
release/2.4
release/2.5
release/2.6
release/2.7
release/2.8
release/3.0
release/3.1
release/3.2
release/3.3
release/3.4
release/4.0
release/4.1
release/4.2
release/4.3
release/4.4
release/5.0
release/5.1
release/6.0
release/6.1
release/7.0
release/7.1
N
ffmpeg-0.6.3
n0.10
n0.10.1
n0.10.10
n0.10.11
n0.10.12
n0.10.13
n0.10.14
n0.10.15
n0.10.16
n0.10.2
n0.10.3
n0.10.4
n0.10.5
n0.10.6
n0.10.7
n0.10.8
n0.10.9
n0.11
n0.11-dev
n0.11.1
n0.11.2
n0.11.3
n0.11.4
n0.11.5
n0.12-dev
n0.5.10
n0.5.11
n0.5.12
n0.5.13
n0.5.14
n0.5.15
n0.5.5
n0.5.6
n0.5.7
n0.5.8
n0.5.9
n0.6.4
n0.6.5
n0.6.6
n0.6.7
n0.7.1
n0.7.10
n0.7.11
n0.7.12
n0.7.13
n0.7.14
n0.7.15
n0.7.16
n0.7.17
n0.7.2
n0.7.3
n0.7.4
n0.7.5
n0.7.6
n0.7.7
n0.7.8
n0.7.9
n0.8
n0.8.1
n0.8.10
n0.8.11
n0.8.12
n0.8.13
n0.8.14
n0.8.15
n0.8.2
n0.8.3
n0.8.4
n0.8.5
n0.8.6
n0.8.7
n0.8.8
n0.8.9
n0.9
n0.9.1
n0.9.2
n0.9.3
n0.9.4
n1.0
n1.0.1
n1.0.10
n1.0.2
n1.0.3
n1.0.4
n1.0.5
n1.0.6
n1.0.7
n1.0.8
n1.0.9
n1.1
n1.1-dev
n1.1.1
n1.1.10
n1.1.11
n1.1.12
n1.1.13
n1.1.14
n1.1.15
n1.1.16
n1.1.2
n1.1.3
n1.1.4
n1.1.5
n1.1.6
n1.1.7
n1.1.8
n1.1.9
n1.2
n1.2-dev
n1.2.1
n1.2.10
n1.2.11
n1.2.12
n1.2.2
n1.2.3
n1.2.4
n1.2.5
n1.2.6
n1.2.7
n1.2.8
n1.2.9
n1.3-dev
n2.0
n2.0.1
n2.0.2
n2.0.3
n2.0.4
n2.0.5
n2.0.6
n2.0.7
n2.1
n2.1-dev
n2.1.1
n2.1.2
n2.1.3
n2.1.4
n2.1.5
n2.1.6
n2.1.7
n2.1.8
n2.2
n2.2-dev
n2.2-rc1
n2.2-rc2
n2.2.1
n2.2.10
n2.2.11
n2.2.12
n2.2.13
n2.2.14
n2.2.15
n2.2.16
n2.2.2
n2.2.3
n2.2.4
n2.2.5
n2.2.6
n2.2.7
n2.2.8
n2.2.9
n2.3
n2.3-dev
n2.3.1
n2.3.2
n2.3.3
n2.3.4
n2.3.5
n2.3.6
n2.4
n2.4-dev
n2.4.1
n2.4.10
n2.4.11
n2.4.12
n2.4.13
n2.4.14
n2.4.2
n2.4.3
n2.4.4
n2.4.5
n2.4.6
n2.4.7
n2.4.8
n2.4.9
n2.5
n2.5-dev
n2.5.1
n2.5.10
n2.5.11
n2.5.2
n2.5.3
n2.5.4
n2.5.5
n2.5.6
n2.5.7
n2.5.8
n2.5.9
n2.6
n2.6-dev
n2.6.1
n2.6.2
n2.6.3
n2.6.4
n2.6.5
n2.6.6
n2.6.7
n2.6.8
n2.6.9
n2.7
n2.7-dev
n2.7.1
n2.7.2
n2.7.3
n2.7.4
n2.7.5
n2.7.6
n2.7.7
n2.8
n2.8-dev
n2.8.1
n2.8.10
n2.8.11
n2.8.12
n2.8.13
n2.8.14
n2.8.15
n2.8.16
n2.8.17
n2.8.18
n2.8.19
n2.8.2
n2.8.20
n2.8.21
n2.8.22
n2.8.3
n2.8.4
n2.8.5
n2.8.6
n2.8.7
n2.8.8
n2.8.9
n2.9-dev
n3.0
n3.0.1
n3.0.10
n3.0.11
n3.0.12
n3.0.2
n3.0.3
n3.0.4
n3.0.5
n3.0.6
n3.0.7
n3.0.8
n3.0.9
n3.1
n3.1-dev
n3.1.1
n3.1.10
n3.1.11
n3.1.2
n3.1.3
n3.1.4
n3.1.5
n3.1.6
n3.1.7
n3.1.8
n3.1.9
n3.2
n3.2-dev
n3.2.1
n3.2.10
n3.2.11
n3.2.12
n3.2.13
n3.2.14
n3.2.15
n3.2.16
n3.2.17
n3.2.18
n3.2.19
n3.2.2
n3.2.3
n3.2.4
n3.2.5
n3.2.6
n3.2.7
n3.2.8
n3.2.9
n3.3
n3.3-dev
n3.3.1
n3.3.2
n3.3.3
n3.3.4
n3.3.5
n3.3.6
n3.3.7
n3.3.8
n3.3.9
n3.4
n3.4-dev
n3.4.1
n3.4.10
n3.4.11
n3.4.12
n3.4.13
n3.4.2
n3.4.3
n3.4.4
n3.4.5
n3.4.6
n3.4.7
n3.4.8
n3.4.9
n3.5-dev
n4.0
n4.0.1
n4.0.2
n4.0.3
n4.0.4
n4.0.5
n4.0.6
n4.1
n4.1-dev
n4.1.1
n4.1.10
n4.1.11
n4.1.2
n4.1.3
n4.1.4
n4.1.5
n4.1.6
n4.1.7
n4.1.8
n4.1.9
n4.2
n4.2-dev
n4.2.1
n4.2.10
n4.2.2
n4.2.3
n4.2.4
n4.2.5
n4.2.6
n4.2.7
n4.2.8
n4.2.9
n4.3
n4.3-dev
n4.3.1
n4.3.2
n4.3.3
n4.3.4
n4.3.5
n4.3.6
n4.3.7
n4.3.8
n4.4
n4.4-dev
n4.4.1
n4.4.2
n4.4.3
n4.4.4
n4.4.5
n4.5-dev
n5.0
n5.0.1
n5.0.2
n5.0.3
n5.1
n5.1-dev
n5.1.1
n5.1.2
n5.1.3
n5.1.4
n5.1.5
n5.1.6
n5.2-dev
n6.0
n6.0.1
n6.1
n6.1-dev
n6.1.1
n6.1.2
n6.2-dev
n7.0
n7.0.1
n7.0.2
n7.1
n7.1-dev
n7.2-dev
v0.5
v0.5.1
v0.5.2
v0.5.3
v0.6
v0.6.1
${ noResults }
18 Commits (a5c0ed2122a9efe06613e52f0ff3f1323bb10169)
Author | SHA1 | Message | Date |
---|---|---|---|
Zhao Zhili | 5988a2729b |
aarch64/vvc: Add dmvr
dmvr_8_12x20_c: 1.5 ( 1.00x) dmvr_8_12x20_neon: 0.2 ( 6.56x) dmvr_8_20x12_c: 1.0 ( 1.00x) dmvr_8_20x12_neon: 0.2 ( 4.33x) dmvr_8_20x20_c: 1.7 ( 1.00x) dmvr_8_20x20_neon: 0.5 ( 3.63x) dmvr_12_12x20_c: 2.2 ( 1.00x) dmvr_12_12x20_neon: 0.5 ( 4.68x) dmvr_12_20x12_c: 2.0 ( 1.00x) dmvr_12_20x12_neon: 0.5 ( 4.16x) dmvr_12_20x20_c: 3.7 ( 1.00x) dmvr_12_20x20_neon: 0.7 ( 5.14x) Signed-off-by: Zhao Zhili <zhilizhao@tencent.com> |
4 months ago |
Zhao Zhili | bcd65ebd8f |
aarch64/vvc: Add dmvr_hv
dmvr_hv_8_12x20_c: 8.0 ( 1.00x) dmvr_hv_8_12x20_neon: 1.2 ( 6.62x) dmvr_hv_8_20x12_c: 8.0 ( 1.00x) dmvr_hv_8_20x12_neon: 0.9 ( 8.37x) dmvr_hv_8_20x20_c: 12.9 ( 1.00x) dmvr_hv_8_20x20_neon: 1.7 ( 7.62x) dmvr_hv_10_12x20_c: 7.0 ( 1.00x) dmvr_hv_10_12x20_neon: 1.7 ( 4.09x) dmvr_hv_10_20x12_c: 7.0 ( 1.00x) dmvr_hv_10_20x12_neon: 1.7 ( 4.09x) dmvr_hv_10_20x20_c: 11.2 ( 1.00x) dmvr_hv_10_20x20_neon: 2.7 ( 4.15x) dmvr_hv_12_12x20_c: 6.5 ( 1.00x) dmvr_hv_12_12x20_neon: 1.7 ( 3.79x) dmvr_hv_12_20x12_c: 6.5 ( 1.00x) dmvr_hv_12_20x12_neon: 1.7 ( 3.79x) dmvr_hv_12_20x20_c: 10.2 ( 1.00x) dmvr_hv_12_20x20_neon: 2.2 ( 4.64x) Signed-off-by: Zhao Zhili <zhilizhao@tencent.com> |
4 months ago |
Zhao Zhili | 0ba9e8d0d4 |
aarch64/vvc: Add w_avg
w_avg_8_2x2_c: 0.0 ( 0.00x) w_avg_8_2x2_neon: 0.0 ( 0.00x) w_avg_8_4x4_c: 0.2 ( 1.00x) w_avg_8_4x4_neon: 0.0 ( 0.00x) w_avg_8_8x8_c: 1.2 ( 1.00x) w_avg_8_8x8_neon: 0.2 ( 5.00x) w_avg_8_16x16_c: 4.2 ( 1.00x) w_avg_8_16x16_neon: 0.8 ( 5.67x) w_avg_8_32x32_c: 16.2 ( 1.00x) w_avg_8_32x32_neon: 2.5 ( 6.50x) w_avg_8_64x64_c: 64.5 ( 1.00x) w_avg_8_64x64_neon: 9.0 ( 7.17x) w_avg_8_128x128_c: 269.5 ( 1.00x) w_avg_8_128x128_neon: 35.5 ( 7.59x) w_avg_10_2x2_c: 0.2 ( 1.00x) w_avg_10_2x2_neon: 0.2 ( 1.00x) w_avg_10_4x4_c: 0.2 ( 1.00x) w_avg_10_4x4_neon: 0.2 ( 1.00x) w_avg_10_8x8_c: 1.0 ( 1.00x) w_avg_10_8x8_neon: 0.2 ( 4.00x) w_avg_10_16x16_c: 4.2 ( 1.00x) w_avg_10_16x16_neon: 0.8 ( 5.67x) w_avg_10_32x32_c: 16.2 ( 1.00x) w_avg_10_32x32_neon: 2.5 ( 6.50x) w_avg_10_64x64_c: 66.2 ( 1.00x) w_avg_10_64x64_neon: 10.0 ( 6.62x) w_avg_10_128x128_c: 277.8 ( 1.00x) w_avg_10_128x128_neon: 39.8 ( 6.99x) w_avg_12_2x2_c: 0.0 ( 0.00x) w_avg_12_2x2_neon: 0.2 ( 0.00x) w_avg_12_4x4_c: 0.2 ( 1.00x) w_avg_12_4x4_neon: 0.0 ( 0.00x) w_avg_12_8x8_c: 1.2 ( 1.00x) w_avg_12_8x8_neon: 0.5 ( 2.50x) w_avg_12_16x16_c: 4.8 ( 1.00x) w_avg_12_16x16_neon: 0.8 ( 6.33x) w_avg_12_32x32_c: 17.0 ( 1.00x) w_avg_12_32x32_neon: 2.8 ( 6.18x) w_avg_12_64x64_c: 64.0 ( 1.00x) w_avg_12_64x64_neon: 10.0 ( 6.40x) w_avg_12_128x128_c: 269.2 ( 1.00x) w_avg_12_128x128_neon: 42.0 ( 6.41x) Signed-off-by: Zhao Zhili <zhilizhao@tencent.com> |
4 months ago |
Zhao Zhili | 3f84d1d1fb |
aarch64/vvc: Add avg
avg_8_2x2_c: 0.2 ( 1.00x) avg_8_2x2_neon: 0.2 ( 1.00x) avg_8_4x4_c: 0.2 ( 1.00x) avg_8_4x4_neon: 0.2 ( 1.00x) avg_8_8x8_c: 0.9 ( 1.00x) avg_8_8x8_neon: 0.2 ( 5.29x) avg_8_16x16_c: 3.7 ( 1.00x) avg_8_16x16_neon: 0.7 ( 5.44x) avg_8_32x32_c: 14.9 ( 1.00x) avg_8_32x32_neon: 1.7 ( 8.91x) avg_8_64x64_c: 59.7 ( 1.00x) avg_8_64x64_neon: 6.9 ( 8.62x) avg_8_128x128_c: 254.7 ( 1.00x) avg_8_128x128_neon: 26.9 ( 9.46x) avg_10_2x2_c: 0.2 ( 1.00x) avg_10_2x2_neon: 0.2 ( 1.00x) avg_10_4x4_c: 0.2 ( 1.00x) avg_10_4x4_neon: 0.2 ( 1.00x) avg_10_8x8_c: 0.9 ( 1.00x) avg_10_8x8_neon: 0.2 ( 5.29x) avg_10_16x16_c: 3.4 ( 1.00x) avg_10_16x16_neon: 0.4 ( 8.06x) avg_10_32x32_c: 13.9 ( 1.00x) avg_10_32x32_neon: 1.9 ( 7.23x) avg_10_64x64_c: 54.2 ( 1.00x) avg_10_64x64_neon: 8.4 ( 6.43x) avg_10_128x128_c: 232.4 ( 1.00x) avg_10_128x128_neon: 30.9 ( 7.52x) avg_12_2x2_c: 0.0 ( 0.00x) avg_12_2x2_neon: 0.2 ( 0.00x) avg_12_4x4_c: 0.4 ( 1.00x) avg_12_4x4_neon: 0.2 ( 2.43x) avg_12_8x8_c: 0.7 ( 1.00x) avg_12_8x8_neon: 0.2 ( 3.86x) avg_12_16x16_c: 3.7 ( 1.00x) avg_12_16x16_neon: 0.4 ( 8.65x) avg_12_32x32_c: 13.7 ( 1.00x) avg_12_32x32_neon: 2.2 ( 6.29x) avg_12_64x64_c: 53.9 ( 1.00x) avg_12_64x64_neon: 7.7 ( 7.03x) avg_12_128x128_c: 270.9 ( 1.00x) avg_12_128x128_neon: 30.4 ( 8.90x) |
5 months ago |
Zhao Zhili | 1be5a2374f |
aarch64/vvc: Add put_epel_hv
On Apple M1: put_chroma_hv_8_4x4_c: 1.7 ( 1.00x) put_chroma_hv_8_4x4_neon: 0.2 ( 7.67x) put_chroma_hv_8_8x8_c: 5.5 ( 1.00x) put_chroma_hv_8_8x8_neon: 0.5 (11.53x) put_chroma_hv_8_16x16_c: 18.5 ( 1.00x) put_chroma_hv_8_16x16_neon: 1.5 (12.53x) put_chroma_hv_8_32x32_c: 72.5 ( 1.00x) put_chroma_hv_8_32x32_neon: 4.7 (15.34x) put_chroma_hv_8_64x64_c: 274.0 ( 1.00x) put_chroma_hv_8_64x64_neon: 18.5 (14.83x) put_chroma_hv_8_128x128_c: 1058.7 ( 1.00x) put_chroma_hv_8_128x128_neon: 75.2 (14.07x) On Android Pixel 8 Pro: put_chroma_hv_8_4x4_c: 1.2 ( 1.00x) put_chroma_hv_8_4x4_neon: 0.0 ( 0.00x) put_chroma_hv_8_4x4_i8mm: 0.2 ( 5.00x) put_chroma_hv_8_8x8_c: 4.0 ( 1.00x) put_chroma_hv_8_8x8_neon: 0.5 ( 8.00x) put_chroma_hv_8_8x8_i8mm: 0.5 ( 8.00x) put_chroma_hv_8_16x16_c: 15.2 ( 1.00x) put_chroma_hv_8_16x16_neon: 2.5 ( 6.10x) put_chroma_hv_8_16x16_i8mm: 2.2 ( 6.78x) put_chroma_hv_8_32x32_c: 61.0 ( 1.00x) put_chroma_hv_8_32x32_neon: 9.8 ( 6.26x) put_chroma_hv_8_32x32_i8mm: 8.5 ( 7.18x) put_chroma_hv_8_64x64_c: 229.5 ( 1.00x) put_chroma_hv_8_64x64_neon: 38.5 ( 5.96x) put_chroma_hv_8_64x64_i8mm: 34.0 ( 6.75x) put_chroma_hv_8_128x128_c: 919.8 ( 1.00x) put_chroma_hv_8_128x128_neon: 154.5 ( 5.95x) put_chroma_hv_8_128x128_i8mm: 140.0 ( 6.57x) |
5 months ago |
Zhao Zhili | 0dcf204e5d |
aarch64/vvc: Add put_epel_h i8mm
put_chroma_h_8_4x4_c: 0.4 ( 1.00x) put_chroma_h_8_4x4_neon: 0.0 ( 0.00x) put_chroma_h_8_4x4_i8mm: 0.1 ( 2.67x) put_chroma_h_8_8x8_c: 1.6 ( 1.00x) put_chroma_h_8_8x8_neon: 0.1 (11.00x) put_chroma_h_8_8x8_i8mm: 0.1 (11.00x) put_chroma_h_8_16x16_c: 6.9 ( 1.00x) put_chroma_h_8_16x16_neon: 1.1 ( 6.00x) put_chroma_h_8_16x16_i8mm: 0.7 (10.62x) put_chroma_h_8_32x32_c: 27.6 ( 1.00x) put_chroma_h_8_32x32_neon: 4.7 ( 5.95x) put_chroma_h_8_32x32_i8mm: 4.4 ( 6.28x) put_chroma_h_8_64x64_c: 116.2 ( 1.00x) put_chroma_h_8_64x64_neon: 19.1 ( 6.07x) put_chroma_h_8_64x64_i8mm: 17.1 ( 6.77x) put_chroma_h_8_128x128_c: 466.6 ( 1.00x) put_chroma_h_8_128x128_neon: 81.4 ( 5.73x) put_chroma_h_8_128x128_i8mm: 71.7 ( 6.51x) |
5 months ago |
Zhao Zhili | 41a1885f7a |
aarch64/vvc: Add put_epel_h
put_chroma_h_8_4x4_c: 0.2 ( 1.00x) put_chroma_h_8_4x4_neon: 0.2 ( 1.00x) put_chroma_h_8_8x8_c: 0.8 ( 1.00x) put_chroma_h_8_8x8_neon: 0.2 ( 3.00x) put_chroma_h_8_16x16_c: 3.8 ( 1.00x) put_chroma_h_8_16x16_neon: 0.8 ( 5.00x) put_chroma_h_8_32x32_c: 12.5 ( 1.00x) put_chroma_h_8_32x32_neon: 2.2 ( 5.56x) put_chroma_h_8_64x64_c: 47.0 ( 1.00x) put_chroma_h_8_64x64_neon: 8.8 ( 5.37x) put_chroma_h_8_128x128_c: 200.2 ( 1.00x) put_chroma_h_8_128x128_neon: 31.8 ( 6.31x) |
5 months ago |
Zhao Zhili | 260e1b4b62 |
aarch64/vvc: Add sad
sad_8x16_c: 0.8 ( 1.00x) sad_8x16_neon: 0.2 ( 3.00x) sad_16x8_c: 0.5 ( 1.00x) sad_16x8_neon: 0.2 ( 2.00x) sad_16x16_c: 1.5 ( 1.00x) sad_16x16_neon: 0.2 ( 6.00x) |
5 months ago |
Zhao Zhili | 5ac6925803 |
aarch64/vvc: Add put_qpel_hv
With Apple M1 (no i8mm): put_luma_hv_8_4x4_c: 2.2 ( 1.00x) put_luma_hv_8_4x4_neon: 0.8 ( 3.00x) put_luma_hv_8_8x8_c: 7.0 ( 1.00x) put_luma_hv_8_8x8_neon: 0.8 ( 9.33x) put_luma_hv_8_16x16_c: 22.8 ( 1.00x) put_luma_hv_8_16x16_neon: 2.5 ( 9.10x) put_luma_hv_8_32x32_c: 84.8 ( 1.00x) put_luma_hv_8_32x32_neon: 9.5 ( 8.92x) put_luma_hv_8_64x64_c: 333.0 ( 1.00x) put_luma_hv_8_64x64_neon: 35.5 ( 9.38x) put_luma_hv_8_128x128_c: 1294.5 ( 1.00x) put_luma_hv_8_128x128_neon: 137.8 ( 9.40x) With Pixel 8 Pro: put_luma_hv_8_4x4_c: 5.0 ( 1.00x) put_luma_hv_8_4x4_neon: 0.8 ( 6.67x) put_luma_hv_8_4x4_i8mm: 0.2 (20.00x) put_luma_hv_8_8x8_c: 13.2 ( 1.00x) put_luma_hv_8_8x8_neon: 1.2 (10.60x) put_luma_hv_8_8x8_i8mm: 1.2 (10.60x) put_luma_hv_8_16x16_c: 44.2 ( 1.00x) put_luma_hv_8_16x16_neon: 4.5 ( 9.83x) put_luma_hv_8_16x16_i8mm: 4.2 (10.41x) put_luma_hv_8_32x32_c: 160.8 ( 1.00x) put_luma_hv_8_32x32_neon: 17.5 ( 9.19x) put_luma_hv_8_32x32_i8mm: 16.0 (10.05x) put_luma_hv_8_64x64_c: 611.2 ( 1.00x) put_luma_hv_8_64x64_neon: 68.0 ( 8.99x) put_luma_hv_8_64x64_i8mm: 62.2 ( 9.82x) put_luma_hv_8_128x128_c: 2384.8 ( 1.00x) put_luma_hv_8_128x128_neon: 268.8 ( 8.87x) put_luma_hv_8_128x128_i8mm: 245.8 ( 9.70x) |
5 months ago |
Zhao Zhili | a0b52afd32 |
aarch64/vvc: Add put_qpel_vx
put_luma_v_8_4x4_c: 1.0 ( 1.00x) put_luma_v_8_4x4_neon: 0.0 ( 0.00x) put_luma_v_8_8x8_c: 3.5 ( 1.00x) put_luma_v_8_8x8_neon: 0.5 ( 7.00x) put_luma_v_8_16x16_c: 13.8 ( 1.00x) put_luma_v_8_16x16_neon: 1.2 (11.00x) put_luma_v_8_32x32_c: 54.2 ( 1.00x) put_luma_v_8_32x32_neon: 5.0 (10.85x) put_luma_v_8_64x64_c: 217.5 ( 1.00x) put_luma_v_8_64x64_neon: 18.8 (11.60x) put_luma_v_8_128x128_c: 886.2 ( 1.00x) put_luma_v_8_128x128_neon: 74.0 (11.98x) |
5 months ago |
Zhao Zhili | 9f6c8eb412 |
aarch64/vvc: Add put_qpel_hx i8mm
Benchmark on Android pixel 8 with -fno-vectorize put_luma_h_8_4x4_c: 0.2 ( 1.00x) put_luma_h_8_4x4_neon: 0.2 ( 1.00x) put_luma_h_8_4x4_i8mm: 0.0 ( 0.00x) put_luma_h_8_8x8_c: 1.5 ( 1.00x) put_luma_h_8_8x8_neon: 0.5 ( 3.00x) put_luma_h_8_8x8_i8mm: 0.5 ( 3.00x) put_luma_h_8_16x16_c: 6.2 ( 1.00x) put_luma_h_8_16x16_neon: 2.0 ( 3.12x) put_luma_h_8_16x16_i8mm: 1.5 ( 4.17x) put_luma_h_8_32x32_c: 25.5 ( 1.00x) put_luma_h_8_32x32_neon: 9.0 ( 2.83x) put_luma_h_8_32x32_i8mm: 6.8 ( 3.78x) put_luma_h_8_64x64_c: 99.8 ( 1.00x) put_luma_h_8_64x64_neon: 35.2 ( 2.83x) put_luma_h_8_64x64_i8mm: 27.2 ( 3.66x) put_luma_h_8_128x128_c: 422.0 ( 1.00x) put_luma_h_8_128x128_neon: 138.5 ( 3.05x) put_luma_h_8_128x128_i8mm: 109.2 ( 3.86x) |
5 months ago |
Zhao Zhili | 25448d1716 |
aarch64/vvc: Add put_pel/put_pel_uni/put_pel_uni_w
put_luma_pixels_8_4x4_c: 0.2 ( 1.00x) put_luma_pixels_8_4x4_neon: 0.2 ( 1.00x) put_luma_pixels_8_8x8_c: 0.7 ( 1.00x) put_luma_pixels_8_8x8_neon: 0.2 ( 3.22x) put_luma_pixels_8_16x16_c: 2.2 ( 1.00x) put_luma_pixels_8_16x16_neon: 0.2 ( 9.89x) put_luma_pixels_8_32x32_c: 8.2 ( 1.00x) put_luma_pixels_8_32x32_neon: 1.2 ( 6.71x) put_luma_pixels_8_64x64_c: 33.7 ( 1.00x) put_luma_pixels_8_64x64_neon: 2.5 (13.63x) put_luma_pixels_8_128x128_c: 145.5 ( 1.00x) put_luma_pixels_8_128x128_neon: 10.2 (14.23x) put_uni_pixels_luma_8_4x4_c: 0.5 ( 1.00x) put_uni_pixels_luma_8_4x4_neon: 0.0 ( 0.00x) put_uni_pixels_luma_8_8x8_c: 0.5 ( 1.00x) put_uni_pixels_luma_8_8x8_neon: 0.2 ( 2.11x) put_uni_pixels_luma_8_16x16_c: 1.2 ( 1.00x) put_uni_pixels_luma_8_16x16_neon: 0.2 ( 5.44x) put_uni_pixels_luma_8_32x32_c: 3.0 ( 1.00x) put_uni_pixels_luma_8_32x32_neon: 0.5 ( 6.26x) put_uni_pixels_luma_8_64x64_c: 3.0 ( 1.00x) put_uni_pixels_luma_8_64x64_neon: 1.7 ( 1.72x) put_uni_pixels_luma_8_128x128_c: 6.5 ( 1.00x) put_uni_pixels_luma_8_128x128_neon: 6.5 ( 1.00x) |
5 months ago |
Zhao Zhili | 20f2bf5530 |
aarch64/vvc: Add put_qpel_h_* and put_qpel_uni_h_*
Just share hevc implementation. checkasm --test=vvc_mc --benchmark: put_luma_h_8_4x4_c: 0.2 ( 1.00x) put_luma_h_8_4x4_neon: 0.2 ( 1.00x) put_luma_h_8_8x8_c: 1.0 ( 1.00x) put_luma_h_8_8x8_neon: 0.2 ( 4.33x) put_luma_h_8_16x16_c: 3.2 ( 1.00x) put_luma_h_8_16x16_neon: 1.2 ( 2.63x) put_luma_h_8_32x32_c: 13.7 ( 1.00x) put_luma_h_8_32x32_neon: 4.0 ( 3.45x) put_luma_h_8_64x64_c: 48.2 ( 1.00x) put_luma_h_8_64x64_neon: 15.7 ( 3.07x) put_luma_h_8_128x128_c: 203.5 ( 1.00x) put_luma_h_8_128x128_neon: 62.0 ( 3.28x) put_uni_h_luma_8_4x4_c: 0.2 ( 1.00x) put_uni_h_luma_8_4x4_neon: 0.2 ( 1.00x) put_uni_h_luma_8_8x8_c: 1.5 ( 1.00x) put_uni_h_luma_8_8x8_neon: 0.2 ( 6.56x) put_uni_h_luma_8_16x16_c: 5.7 ( 1.00x) put_uni_h_luma_8_16x16_neon: 1.2 ( 4.67x) put_uni_h_luma_8_32x32_c: 24.0 ( 1.00x) put_uni_h_luma_8_32x32_neon: 4.7 ( 5.07x) put_uni_h_luma_8_64x64_c: 90.0 ( 1.00x) put_uni_h_luma_8_64x64_neon: 17.0 ( 5.30x) put_uni_h_luma_8_128x128_c: 357.7 ( 1.00x) put_uni_h_luma_8_128x128_neon: 67.5 ( 5.30x) |
5 months ago |
Zhao Zhili | 4c0372281b |
aarch64/vvc: Bind h26x/sao filter implementation to vvc
Reviewed-by: Martin Storsjö <martin@martin.st> |
5 months ago |
Martin Storsjö | 4acb9b7d10 |
aarch64: vvc: Fix unnecessary extra spaces
Signed-off-by: Martin Storsjö <martin@martin.st> |
6 months ago |
Martin Storsjö | 99598629e8 |
aarch64: vvc: Consistently use # for immediate constants
Signed-off-by: Martin Storsjö <martin@martin.st> |
6 months ago |
Martin Storsjö | 400843151d |
aarch64: vvc: Fix compilation of alf.S with MSVC 2022 17.7 and older
Use the "ldur" instruction explicitly, instead of having the assembler implicitly convert "ldr" instructions to "ldur". This fixes build errors like these: libavcodec\aarch64\vvc\alf.o.asm(1023) : error A2518: operand 2: Memory offset must be aligned ldr q22, [x3, #24] libavcodec\aarch64\vvc\alf.o.asm(1024) : error A2518: operand 2: Memory offset must be aligned ldr q24, [x2, #24] libavcodec\aarch64\vvc\alf.o.asm(1393) : error A2518: operand 2: Memory offset must be aligned ldr q22, [x3, #24] libavcodec\aarch64\vvc\alf.o.asm(1394) : error A2518: operand 2: Memory offset must be aligned ldr q24, [x2, #24] Signed-off-by: Martin Storsjö <martin@martin.st> |
6 months ago |
Zhao Zhili | 2d4ef304c9 |
avcodec/vvc: Add aarch64 neon optimization for ALF
vvc_alf_filter_chroma_4x4_8_c: 3.0 vvc_alf_filter_chroma_4x4_8_neon: 1.0 vvc_alf_filter_chroma_4x4_10_c: 2.7 vvc_alf_filter_chroma_4x4_10_neon: 1.0 vvc_alf_filter_chroma_4x4_12_c: 2.7 vvc_alf_filter_chroma_4x4_12_neon: 1.0 vvc_alf_filter_chroma_8x8_8_c: 10.2 vvc_alf_filter_chroma_8x8_8_neon: 3.0 vvc_alf_filter_chroma_8x8_10_c: 10.0 vvc_alf_filter_chroma_8x8_10_neon: 2.5 vvc_alf_filter_chroma_8x8_12_c: 10.0 vvc_alf_filter_chroma_8x8_12_neon: 2.5 vvc_alf_filter_chroma_16x16_8_c: 41.7 vvc_alf_filter_chroma_16x16_8_neon: 11.2 vvc_alf_filter_chroma_16x16_10_c: 39.0 vvc_alf_filter_chroma_16x16_10_neon: 10.0 vvc_alf_filter_chroma_16x16_12_c: 40.2 vvc_alf_filter_chroma_16x16_12_neon: 10.2 vvc_alf_filter_chroma_32x32_8_c: 162.0 vvc_alf_filter_chroma_32x32_8_neon: 45.0 vvc_alf_filter_chroma_32x32_10_c: 155.5 vvc_alf_filter_chroma_32x32_10_neon: 39.5 vvc_alf_filter_chroma_32x32_12_c: 155.5 vvc_alf_filter_chroma_32x32_12_neon: 40.0 vvc_alf_filter_chroma_64x64_8_c: 646.0 vvc_alf_filter_chroma_64x64_8_neon: 175.5 vvc_alf_filter_chroma_64x64_10_c: 708.2 vvc_alf_filter_chroma_64x64_10_neon: 166.7 vvc_alf_filter_chroma_64x64_12_c: 619.2 vvc_alf_filter_chroma_64x64_12_neon: 157.2 vvc_alf_filter_chroma_128x128_8_c: 2611.5 vvc_alf_filter_chroma_128x128_8_neon: 698.2 vvc_alf_filter_chroma_128x128_10_c: 2470.0 vvc_alf_filter_chroma_128x128_10_neon: 616.0 vvc_alf_filter_chroma_128x128_12_c: 2531.5 vvc_alf_filter_chroma_128x128_12_neon: 620.2 vvc_alf_filter_luma_8x8_8_c: 25.2 vvc_alf_filter_luma_8x8_8_neon: 4.2 vvc_alf_filter_luma_8x8_10_c: 18.5 vvc_alf_filter_luma_8x8_10_neon: 4.0 vvc_alf_filter_luma_8x8_12_c: 19.0 vvc_alf_filter_luma_8x8_12_neon: 4.0 vvc_alf_filter_luma_16x16_8_c: 106.5 vvc_alf_filter_luma_16x16_8_neon: 16.2 vvc_alf_filter_luma_16x16_10_c: 75.2 vvc_alf_filter_luma_16x16_10_neon: 14.7 vvc_alf_filter_luma_16x16_12_c: 79.7 vvc_alf_filter_luma_16x16_12_neon: 14.7 vvc_alf_filter_luma_32x32_8_c: 400.5 vvc_alf_filter_luma_32x32_8_neon: 63.2 vvc_alf_filter_luma_32x32_10_c: 299.2 vvc_alf_filter_luma_32x32_10_neon: 57.7 vvc_alf_filter_luma_32x32_12_c: 299.2 vvc_alf_filter_luma_32x32_12_neon: 57.7 vvc_alf_filter_luma_64x64_8_c: 1602.5 vvc_alf_filter_luma_64x64_8_neon: 251.7 vvc_alf_filter_luma_64x64_10_c: 1197.0 vvc_alf_filter_luma_64x64_10_neon: 235.5 vvc_alf_filter_luma_64x64_12_c: 1220.2 vvc_alf_filter_luma_64x64_12_neon: 235.7 vvc_alf_filter_luma_128x128_8_c: 6570.2 vvc_alf_filter_luma_128x128_8_neon: 1007.7 vvc_alf_filter_luma_128x128_10_c: 4822.7 vvc_alf_filter_luma_128x128_10_neon: 936.2 vvc_alf_filter_luma_128x128_12_c: 4791.2 vvc_alf_filter_luma_128x128_12_neon: 938.5 Signed-off-by: Zhao Zhili <zhilizhao@tencent.com> |
6 months ago |