yuanhecai
a87a52ed0b
avcodec/hevc: Add ff_hevc_idct_32x32_lasx asm opt
...
tests/checkasm/checkasm:
                        C       LSX     LASX
hevc_idct_32x32_8_c:    1243.0  211.7   101.7
Speedup of decoding H265 4K 30FPS 30Mbps on
3A6000 with 8 threads is 1fps (56fps-->57fps).
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
1 year ago
jinbo
a28eea2a27
avcodec/hevc: Add pel_uni_w_pixels4/6/8/12/16/24/32/48/64 asm opt
...
tests/checkasm/checkasm:            C       LSX     LASX
put_hevc_pel_uni_w_pixels4_8_c:     2.7     1.0
put_hevc_pel_uni_w_pixels6_8_c:     6.2     2.0     1.5
put_hevc_pel_uni_w_pixels8_8_c:     10.7    2.5     1.7
put_hevc_pel_uni_w_pixels12_8_c:    23.0    5.5     5.0
put_hevc_pel_uni_w_pixels16_8_c:    41.0    8.2     5.0
put_hevc_pel_uni_w_pixels24_8_c:    91.0    19.7    13.2
put_hevc_pel_uni_w_pixels32_8_c:    161.7   32.5    16.2
put_hevc_pel_uni_w_pixels48_8_c:    354.5   73.7    43.0
put_hevc_pel_uni_w_pixels64_8_c:    641.5   130.0   64.2
Speedup of decoding H265 4K 30FPS 30Mbps on 3A6000 with
8 threads is 1fps (47fps-->48fps).
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
1 year ago
jinbo
cfbdda607d
avcodec/hevc: Add add_residual_4/8/16/32 asm opt
...
After this patch, the performance of decoding H265 4K 30FPS 30Mbps
on 3A6000 with 8 threads improves by 2fps (45fps-->47fps).
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
1 year ago
yuanhecai
f6077cc666
avcodec/la: Add LSX optimization for h264 qpel.
...
./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 214fps
after: 274fps
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2 years ago
Lu Wang
8815a7719e
avcodec/la: Add LSX optimization for h264 chroma and intrapred.
...
./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 199fps
after: 214fps
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2 years ago
Hao Chen
7845b5ecd6
avcodec/la: Add LSX optimization for loop filter.
...
Replaced functions (LSX is sufficient for these functions):
ff_h264_v_lpf_chroma_8_lasx
ff_h264_h_lpf_chroma_8_lasx
ff_h264_v_lpf_chroma_intra_8_lasx
ff_h264_h_lpf_chroma_intra_8_lasx
ff_weight_h264_pixels4_8_lasx
ff_biweight_h264_pixels4_8_lasx
./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 161fps
after: 199fps
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2 years ago
Shiyou Yin
e1b6ecd20a
avcodec/la: add LSX optimization for h264 idct.
...
loongson_asm.S is a LoongArch asm optimization helper.
Added functions:
ff_h264_idct_add_8_lsx
ff_h264_idct8_add_8_lsx
ff_h264_idct_dc_add_8_lsx
ff_h264_idct8_dc_add_8_lsx
ff_h264_idct_add16_8_lsx
ff_h264_idct8_add4_8_lsx
ff_h264_idct_add8_8_lsx
ff_h264_idct_add8_422_8_lsx
ff_h264_idct_add16_intra_8_lsx
ff_h264_luma_dc_dequant_idct_8_lsx
Replaced functions (LSX is sufficient for these functions):
ff_h264_idct_add_lasx
ff_h264_idct4x4_addblk_dc_lasx
ff_h264_idct_add16_lasx
ff_h264_idct8_add4_lasx
ff_h264_idct_add8_lasx
ff_h264_idct_add8_422_lasx
ff_h264_idct_add16_intra_lasx
ff_h264_deq_idct_luma_dc_lasx
Renamed functions:
ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx
ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx
./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 155fps
after: 161fps
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2 years ago
Lu Wang
72604b10f4
avcodec: [loongarch] Optimize Hevc_mc_uni/w with LSX.
...
ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an
before: 182fps
after : 191fps
Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Hao Chen
a70a5b7c62
avcodec: [loongarch] Optimize Hevc_mc_bi with LSX.
...
ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an
before: 124fps
after : 182fps
Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Lu Wang
b6ceeee16b
avcodec: [loongarch] Optimize Hevc_idct/lpf with LSX.
...
ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an
before: 110fps
after : 124fps
Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Lu Wang
20194d573d
avcodec: [loongarch] Optimize Hevcdsp with LSX.
...
ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an
before: 94fps
after : 110fps
Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
gxw
8ca7d474c1
avcodec: [loongarch] Optimize prefetch with loongarch.
...
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:296fps
after :308fps
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Hao Chen
555b850bd5
avcodec: [loongarch] Optimize idctdsp with LASX.
...
./ffmpeg -i 8_mpeg4_1080p_24fps_12Mbps.avi -f rawvideo -y /dev/null -an
before:433fps
after :552fps
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Shiyou Yin
5d58355bf1
avcodec: [loongarch] Optimize hpeldsp with LASX.
...
./ffmpeg -i 8_mpeg4_1080p_24fps_12Mbps.avi -f rawvideo -y /dev/null -an
before:376fps
after :433fps
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Hao Chen
60ead5cd68
avcodec: [loongarch] Optimize vc1dsp with LASX.
...
./ffmpeg -i 11_wmv3_720p_24fps_7Mbps.wmv -f rawvideo -y /dev/null -an
before:131fps
after :229fps
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Jin Bo
fea299f876
avcodec: [loongarch] Optimize vp9_lpf/idct with LSX.
...
ffmpeg -i ../10_vp9_1080p_30fps_3Mbps.webm -f rawvideo -y /dev/null -an
before:294fps
after :567fps
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Hao Chen
2fd914e079
avcodec: [loongarch] Optimize vp9_mc/intra with LSX.
...
ffmpeg -i ../10_vp9_1080p_30fps_3Mbps.webm -f rawvideo -y /dev/null -an
before:170fps
after :294fps
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
yuanhecai
72bcbe216e
avcodec: [loongarch] Optimize vp8_lpf/mc with LSX.
...
./ffmpeg -i ../9_vp8_1080p_30fps_2Mbps.webm -f rawvideo -y /dev/null -an
before: 210fps
after : 585fps
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Hao Chen
df46d7cb49
avcodec: [loongarch] Optimize pred16x16_plane with LASX.
...
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:295fps
after :296fps
Change-Id: I281bc739f708d45f91fc3860150944c0b8a6a5ba
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Jin Bo
1ccc458960
avcodec: [loongarch] Optimize h264_deblock with LASX.
...
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:293fps
after :295fps
Change-Id: I5ff6cba4eaca0c4218c0c97b880ca500e35f9c87
Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Lu Wang
5ff58b77bb
avcodec: [loongarch] Optimize h264idct with LASX.
...
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:282fps
after :293fps
Change-Id: Ia8889935a6359630dd5dbb61263287f1cb24a0a4
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
gxw
3f294ec879
avcodec: [loongarch] Optimize h264dsp with LASX.
...
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:225fps
after :282fps
Change-Id: Ibe245827dcdfe8fc1541c6b172483151bfa9e642
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Shiyou Yin
cba7c0267d
avcodec: [loongarch] Optimize h264qpel with LASX.
...
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:183fps
after :225fps
Change-Id: I7c7d2f34cd82ef728aab5ce8f6bfb46dd81f0da4
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago
Shiyou Yin
6038a9eb92
avcodec: [loongarch] Optimize h264_chroma_mc with LASX.
...
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:170fps
after :183fps
Change-Id: I42ff23cc2dc7c32bd1b7e4274da9d9ec87065f20
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
3 years ago