mirror of https://github.com/FFmpeg/FFmpeg.git
5 Commits (378d1b06c350e09ac604566130b39126fd858478)
Author | SHA1 | Message | Date |
---|---|---|---|
Andreas Rheinhardt | 6106fb2b4c | avcodec/hevcdsp: Offset ff_hevc_.pel_filters to simplify addressing. Besides simplifying address computations (it saves 432B of .text in hevcdsp.o alone here) it also fixes undefined behaviour that occurs if mx or my are 0 (happens when the filters are unused) because they lead to an array index of -1 in the old code. This happens in the checkasm-hevc_pel FATE-test. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Reviewed-by: Nuo Mi <nuomi2021@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com> | 12 months ago |
jinbo | 9239081db3 | avcodec/hevc: Add asm opt for the following functions. tests/checkasm/checkasm (columns: C, LSX, LASX): put_hevc_qpel_uni_h4_8_c: 5.7 1.2; put_hevc_qpel_uni_h6_8_c: 12.2 2.7; put_hevc_qpel_uni_h8_8_c: 21.5 3.2; put_hevc_qpel_uni_h12_8_c: 47.2 9.2 7.2; put_hevc_qpel_uni_h16_8_c: 87.0 11.7 9.0; put_hevc_qpel_uni_h24_8_c: 188.2 27.5 21.0; put_hevc_qpel_uni_h32_8_c: 335.2 46.7 28.5; put_hevc_qpel_uni_h48_8_c: 772.5 104.5 65.2; put_hevc_qpel_uni_h64_8_c: 1383.2 142.2 109.0; put_hevc_epel_uni_w_v4_8_c: 5.0 1.5; put_hevc_epel_uni_w_v6_8_c: 10.7 3.5 2.5; put_hevc_epel_uni_w_v8_8_c: 18.2 3.7 3.0; put_hevc_epel_uni_w_v12_8_c: 40.2 10.7 7.5; put_hevc_epel_uni_w_v16_8_c: 70.2 13.0 9.2; put_hevc_epel_uni_w_v24_8_c: 158.2 30.2 22.5; put_hevc_epel_uni_w_v32_8_c: 281.0 52.0 36.5; put_hevc_epel_uni_w_v48_8_c: 631.7 116.7 82.7; put_hevc_epel_uni_w_v64_8_c: 1108.2 207.5 142.2; put_hevc_epel_uni_w_h4_8_c: 4.7 1.2; put_hevc_epel_uni_w_h6_8_c: 9.7 3.5 2.7; put_hevc_epel_uni_w_h8_8_c: 17.2 4.2 3.5; put_hevc_epel_uni_w_h12_8_c: 38.0 11.5 7.2; put_hevc_epel_uni_w_h16_8_c: 69.2 14.5 9.2; put_hevc_epel_uni_w_h24_8_c: 152.0 34.7 22.5; put_hevc_epel_uni_w_h32_8_c: 271.0 58.0 40.0; put_hevc_epel_uni_w_h48_8_c: 597.5 136.7 95.0; put_hevc_epel_uni_w_h64_8_c: 1074.0 252.2 168.0; put_hevc_epel_bi_h4_8_c: 4.5 0.7; put_hevc_epel_bi_h6_8_c: 9.0 1.5; put_hevc_epel_bi_h8_8_c: 15.2 1.7; put_hevc_epel_bi_h12_8_c: 33.5 4.2 3.7; put_hevc_epel_bi_h16_8_c: 59.7 5.2 4.7; put_hevc_epel_bi_h24_8_c: 132.2 11.0; put_hevc_epel_bi_h32_8_c: 232.7 20.2 13.2; put_hevc_epel_bi_h48_8_c: 521.7 45.2 31.2; put_hevc_epel_bi_h64_8_c: 949.0 71.5 51.0. After this patch, the performance of decoding H265 4K 30FPS 30Mbps on 3A6000 with 8 threads improves by 1fps (55fps-->56fps). Change-Id: I8cc1e41daa63ca478039bc55d1ee8934a7423f51 Reviewed-by: yinshiyou-hf@loongson.cn Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> | 1 year ago |
jinbo | 1f642b99af | avcodec/hevc: Add epel_uni_w_hv4/6/8/12/16/24/32/48/64 asm opt. tests/checkasm/checkasm (columns: C, LSX, LASX): put_hevc_epel_uni_w_hv4_8_c: 9.5 2.2; put_hevc_epel_uni_w_hv6_8_c: 18.5 5.0 3.7; put_hevc_epel_uni_w_hv8_8_c: 30.7 6.0 4.5; put_hevc_epel_uni_w_hv12_8_c: 63.7 14.0 10.7; put_hevc_epel_uni_w_hv16_8_c: 107.5 22.7 17.0; put_hevc_epel_uni_w_hv24_8_c: 236.7 50.2 31.7; put_hevc_epel_uni_w_hv32_8_c: 414.5 88.0 53.0; put_hevc_epel_uni_w_hv48_8_c: 917.5 197.7 118.5; put_hevc_epel_uni_w_hv64_8_c: 1617.0 349.5 203.0. After this patch, the performance of decoding H265 4K 30FPS 30Mbps on 3A6000 with 8 threads improves by 3fps (52fps-->55fps). Change-Id: If067e394cec4685c62193e7adb829ac93ba4804d Reviewed-by: yinshiyou-hf@loongson.cn Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> | 1 year ago |
jinbo | 6c6bf18ce8 | avcodec/hevc: Add qpel_uni_w_v\|h4/6/8/12/16/24/32/48/64 asm opt. tests/checkasm/checkasm (columns: C, LSX, LASX): put_hevc_qpel_uni_w_h4_8_c: 6.5 1.7 1.2; put_hevc_qpel_uni_w_h6_8_c: 14.5 4.5 3.7; put_hevc_qpel_uni_w_h8_8_c: 24.5 5.7 4.5; put_hevc_qpel_uni_w_h12_8_c: 54.7 17.5 12.0; put_hevc_qpel_uni_w_h16_8_c: 96.5 22.7 13.2; put_hevc_qpel_uni_w_h24_8_c: 216.0 51.2 33.2; put_hevc_qpel_uni_w_h32_8_c: 385.7 87.0 53.2; put_hevc_qpel_uni_w_h48_8_c: 860.5 192.0 113.2; put_hevc_qpel_uni_w_h64_8_c: 1531.0 334.2 200.0; put_hevc_qpel_uni_w_v4_8_c: 8.0 1.7; put_hevc_qpel_uni_w_v6_8_c: 17.2 4.5; put_hevc_qpel_uni_w_v8_8_c: 29.5 6.0 5.2; put_hevc_qpel_uni_w_v12_8_c: 65.2 16.0 11.7; put_hevc_qpel_uni_w_v16_8_c: 116.5 20.5 14.0; put_hevc_qpel_uni_w_v24_8_c: 259.2 48.5 37.2; put_hevc_qpel_uni_w_v32_8_c: 459.5 80.5 56.0; put_hevc_qpel_uni_w_v48_8_c: 1028.5 180.2 126.5; put_hevc_qpel_uni_w_v64_8_c: 1831.2 319.2 224.2. Speedup of decoding H265 4K 30FPS 30Mbps on 3A6000 with 8 threads is 4fps (48fps-->52fps). Change-Id: I1178848541d90083869225ba98a02e6aa8bb8c5a Reviewed-by: yinshiyou-hf@loongson.cn Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> | 1 year ago |
jinbo | a28eea2a27 | avcodec/hevc: Add pel_uni_w_pixels4/6/8/12/16/24/32/48/64 asm opt. tests/checkasm/checkasm (columns: C, LSX, LASX): put_hevc_pel_uni_w_pixels4_8_c: 2.7 1.0; put_hevc_pel_uni_w_pixels6_8_c: 6.2 2.0 1.5; put_hevc_pel_uni_w_pixels8_8_c: 10.7 2.5 1.7; put_hevc_pel_uni_w_pixels12_8_c: 23.0 5.5 5.0; put_hevc_pel_uni_w_pixels16_8_c: 41.0 8.2 5.0; put_hevc_pel_uni_w_pixels24_8_c: 91.0 19.7 13.2; put_hevc_pel_uni_w_pixels32_8_c: 161.7 32.5 16.2; put_hevc_pel_uni_w_pixels48_8_c: 354.5 73.7 43.0; put_hevc_pel_uni_w_pixels64_8_c: 641.5 130.0 64.2. Speedup of decoding H265 4K 30FPS 30Mbps on 3A6000 with 8 threads is 1fps (47fps-->48fps). Reviewed-by: yinshiyou-hf@loongson.cn Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> | 1 year ago |
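
The message of commit 6106fb2b4c describes replacing a filter lookup that could index a table at -1 (when mx or my is 0) with an offset table that is indexed directly. The following is a minimal sketch of that idea only, not the FFmpeg code: the names `demo_filters`, `demo_select_filter`, and `demo_apply` are hypothetical, the table is cut down to four rows, and the coefficient values are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, shortened filter table (not the FFmpeg one).
 * Row 0 is a dummy all-zero entry for the "no fractional offset" case,
 * so the table can be indexed with mx directly. In the older style the
 * table would start at the first real filter and be indexed with mx - 1,
 * which evaluates to index -1 (undefined behaviour) when mx == 0.
 */
static const int8_t demo_filters[4][4] = {
    {  0,  0,  0,  0 },   /* mx == 0: filter unused */
    { -2, 58, 10, -2 },   /* illustrative 4-tap coefficients */
    { -4, 54, 16, -2 },
    { -6, 46, 28, -4 },
};

/* Select a filter row; mx is used as the index without subtracting 1. */
static const int8_t *demo_select_filter(int mx)
{
    assert(mx >= 0 && mx < 4);   /* bounds of this demo table only */
    return demo_filters[mx];
}

/* Apply a 4-tap filter to four consecutive samples (no rounding or shift). */
static int demo_apply(const int8_t *f, const uint8_t *src)
{
    return f[0] * src[0] + f[1] * src[1] + f[2] * src[2] + f[3] * src[3];
}
```

With this layout, a caller can fetch the filter pointer unconditionally, even on paths where it is never used, because `demo_select_filter(0)` refers to a valid row instead of memory before the start of the table.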