Niklas Haas
2f77ecc6bc
avcodec/riscv: add h264 qpel
...
Benched on K230 for VLEN 128, SpaceMIT for VLEN 256. Variants for 4
width have no speedup for VLEN 256 vs VLEN 128 on available hardware,
so were disabled.
C RVV128 C RVV256
avg_h264_qpel_4_mc00_8 33.9 33.6 (1.01x)
avg_h264_qpel_4_mc01_8 218.8 89.1 (2.46x)
avg_h264_qpel_4_mc02_8 218.8 79.8 (2.74x)
avg_h264_qpel_4_mc03_8 218.8 89.1 (2.46x)
avg_h264_qpel_4_mc10_8 172.3 126.1 (1.37x)
avg_h264_qpel_4_mc11_8 339.1 190.8 (1.78x)
avg_h264_qpel_4_mc12_8 533.6 357.6 (1.49x)
avg_h264_qpel_4_mc13_8 348.4 190.8 (1.83x)
avg_h264_qpel_4_mc20_8 144.8 116.8 (1.24x)
avg_h264_qpel_4_mc21_8 478.1 385.6 (1.24x)
avg_h264_qpel_4_mc22_8 348.4 283.6 (1.23x)
avg_h264_qpel_4_mc23_8 478.1 394.6 (1.21x)
avg_h264_qpel_4_mc30_8 172.6 126.1 (1.37x)
avg_h264_qpel_4_mc31_8 339.4 191.1 (1.78x)
avg_h264_qpel_4_mc32_8 542.9 357.6 (1.52x)
avg_h264_qpel_4_mc33_8 339.4 191.1 (1.78x)
avg_h264_qpel_8_mc00_8 116.8 42.9 (2.72x) 123.6 50.6 (2.44x)
avg_h264_qpel_8_mc01_8 774.4 163.1 (4.75x) 779.8 165.1 (4.72x)
avg_h264_qpel_8_mc02_8 774.4 154.1 (5.03x) 779.8 144.3 (5.40x)
avg_h264_qpel_8_mc03_8 774.4 163.3 (4.74x) 779.8 165.3 (4.72x)
avg_h264_qpel_8_mc10_8 617.1 237.3 (2.60x) 613.1 227.6 (2.69x)
avg_h264_qpel_8_mc11_8 1209.3 376.4 (3.21x) 1206.8 363.1 (3.32x)
avg_h264_qpel_8_mc12_8 1913.3 598.6 (3.20x) 1894.3 561.1 (3.38x)
avg_h264_qpel_8_mc13_8 1218.6 376.4 (3.24x) 1217.1 363.1 (3.35x)
avg_h264_qpel_8_mc20_8 524.4 228.1 (2.30x) 519.3 227.6 (2.28x)
avg_h264_qpel_8_mc21_8 1709.6 681.9 (2.51x) 1707.1 644.3 (2.65x)
avg_h264_qpel_8_mc22_8 1274.3 459.6 (2.77x) 1279.8 436.1 (2.93x)
avg_h264_qpel_8_mc23_8 1700.3 672.6 (2.53x) 1706.8 644.6 (2.65x)
avg_h264_qpel_8_mc30_8 607.6 246.6 (2.46x) 623.6 238.1 (2.62x)
avg_h264_qpel_8_mc31_8 1209.6 376.4 (3.21x) 1206.8 363.1 (3.32x)
avg_h264_qpel_8_mc32_8 1904.1 607.9 (3.13x) 1894.3 571.3 (3.32x)
avg_h264_qpel_8_mc33_8 1209.6 376.1 (3.22x) 1206.8 363.1 (3.32x)
avg_h264_qpel_16_mc00_8 431.9 89.1 (4.85x) 436.1 71.3 (6.12x)
avg_h264_qpel_16_mc01_8 2894.6 376.1 (7.70x) 2842.3 300.6 (9.46x)
avg_h264_qpel_16_mc02_8 2987.3 348.4 (8.57x) 2967.3 290.1 (10.23x)
avg_h264_qpel_16_mc03_8 2885.3 376.4 (7.67x) 2842.3 300.6 (9.46x)
avg_h264_qpel_16_mc10_8 2404.1 524.4 (4.58x) 2404.8 456.8 (5.26x)
avg_h264_qpel_16_mc11_8 4709.4 811.6 (5.80x) 4675.6 706.8 (6.62x)
avg_h264_qpel_16_mc12_8 7477.9 1274.3 (5.87x) 7436.1 1061.1 (7.01x)
avg_h264_qpel_16_mc13_8 4718.6 820.6 (5.75x) 4655.1 706.8 (6.59x)
avg_h264_qpel_16_mc20_8 2052.1 487.1 (4.21x) 2071.3 446.3 (4.64x)
avg_h264_qpel_16_mc21_8 7440.6 1422.6 (5.23x) 6727.8 1217.3 (5.53x)
avg_h264_qpel_16_mc22_8 5051.9 950.4 (5.32x) 5071.6 790.3 (6.42x)
avg_h264_qpel_16_mc23_8 6764.9 1422.3 (4.76x) 6748.6 1217.3 (5.54x)
avg_h264_qpel_16_mc30_8 2413.1 524.4 (4.60x) 2415.1 467.3 (5.17x)
avg_h264_qpel_16_mc31_8 4681.6 839.1 (5.58x) 4675.6 727.6 (6.43x)
avg_h264_qpel_16_mc32_8 8579.6 1292.8 (6.64x) 7436.3 1071.3 (6.94x)
avg_h264_qpel_16_mc33_8 5375.9 829.9 (6.48x) 4665.3 717.3 (6.50x)
put_h264_qpel_4_mc00_8 24.4 24.4 (1.00x)
put_h264_qpel_4_mc01_8 987.4 79.8 (12.37x)
put_h264_qpel_4_mc02_8 190.8 79.8 (2.39x)
put_h264_qpel_4_mc03_8 209.6 89.1 (2.35x)
put_h264_qpel_4_mc10_8 163.3 117.1 (1.39x)
put_h264_qpel_4_mc11_8 339.4 181.6 (1.87x)
put_h264_qpel_4_mc12_8 533.6 348.4 (1.53x)
put_h264_qpel_4_mc13_8 339.4 190.8 (1.78x)
put_h264_qpel_4_mc20_8 126.3 116.8 (1.08x)
put_h264_qpel_4_mc21_8 468.9 376.1 (1.25x)
put_h264_qpel_4_mc22_8 330.1 274.4 (1.20x)
put_h264_qpel_4_mc23_8 468.9 376.1 (1.25x)
put_h264_qpel_4_mc30_8 163.3 126.3 (1.29x)
put_h264_qpel_4_mc31_8 339.1 191.1 (1.77x)
put_h264_qpel_4_mc32_8 533.6 348.4 (1.53x)
put_h264_qpel_4_mc33_8 339.4 181.8 (1.87x)
put_h264_qpel_8_mc00_8 98.6 33.6 (2.93x) 92.3 40.1 (2.30x)
put_h264_qpel_8_mc01_8 737.1 153.8 (4.79x) 738.1 144.3 (5.12x)
put_h264_qpel_8_mc02_8 663.1 135.3 (4.90x) 665.1 134.1 (4.96x)
put_h264_qpel_8_mc03_8 737.4 154.1 (4.79x) 1508.8 144.3 (10.46x)
put_h264_qpel_8_mc10_8 598.4 237.1 (2.52x) 592.3 227.6 (2.60x)
put_h264_qpel_8_mc11_8 1172.3 357.9 (3.28x) 1175.6 342.3 (3.43x)
put_h264_qpel_8_mc12_8 1867.1 589.1 (3.17x) 1863.1 561.1 (3.32x)
put_h264_qpel_8_mc13_8 1172.6 366.9 (3.20x) 1175.6 352.8 (3.33x)
put_h264_qpel_8_mc20_8 450.4 218.8 (2.06x) 446.3 206.8 (2.16x)
put_h264_qpel_8_mc21_8 1672.3 663.1 (2.52x) 1675.6 633.8 (2.64x)
put_h264_qpel_8_mc22_8 1144.6 1200.1 (0.95x) 1144.3 425.6 (2.69x)
put_h264_qpel_8_mc23_8 1672.6 672.4 (2.49x) 1665.3 634.1 (2.63x)
put_h264_qpel_8_mc30_8 598.6 237.3 (2.52x) 613.1 227.6 (2.69x)
put_h264_qpel_8_mc31_8 1172.3 376.1 (3.12x) 1175.6 352.6 (3.33x)
put_h264_qpel_8_mc32_8 1857.8 598.6 (3.10x) 1863.1 561.1 (3.32x)
put_h264_qpel_8_mc33_8 1172.3 376.1 (3.12x) 1175.6 352.8 (3.33x)
put_h264_qpel_16_mc00_8 320.6 61.4 (5.22x) 321.3 60.8 (5.28x)
put_h264_qpel_16_mc01_8 2774.3 339.1 (8.18x) 2759.1 279.8 (9.86x)
put_h264_qpel_16_mc02_8 2589.1 320.6 (8.08x) 2571.6 269.3 (9.55x)
put_h264_qpel_16_mc03_8 2774.3 339.4 (8.17x) 2738.1 290.1 (9.44x)
put_h264_qpel_16_mc10_8 2274.3 487.4 (4.67x) 2290.1 436.1 (5.25x)
put_h264_qpel_16_mc11_8 5237.1 792.9 (6.60x) 4529.8 685.8 (6.61x)
put_h264_qpel_16_mc12_8 7357.6 1255.8 (5.86x) 7352.8 1040.1 (7.07x)
put_h264_qpel_16_mc13_8 4579.9 792.9 (5.78x) 4571.6 686.1 (6.66x)
put_h264_qpel_16_mc20_8 1802.1 459.6 (3.92x) 1800.6 425.6 (4.23x)
put_h264_qpel_16_mc21_8 6644.6 2246.6 (2.96x) 6644.3 1196.6 (5.55x)
put_h264_qpel_16_mc22_8 4589.1 913.4 (5.02x) 4592.3 769.3 (5.97x)
put_h264_qpel_16_mc23_8 6644.6 1394.6 (4.76x) 6634.1 1196.6 (5.54x)
put_h264_qpel_16_mc30_8 2274.3 496.6 (4.58x) 2290.1 456.8 (5.01x)
put_h264_qpel_16_mc31_8 5255.6 802.1 (6.55x) 4550.8 706.8 (6.44x)
put_h264_qpel_16_mc32_8 7376.1 1265.1 (5.83x) 7352.8 1050.6 (7.00x)
put_h264_qpel_16_mc33_8 4579.9 802.1 (5.71x) 4561.1 696.3 (6.55x)
Signed-off-by: Niklas Haas <git@haasn.dev>
Signed-off-by: J. Dekker <jdek@itanimul.li>
5 months ago
Cameron Gutman
4ffd586e34
avcodec/amfenc: Add support for on-demand key frames
...
v2: Use forced_idr option instead of AV_FRAME_FLAG_KEY
Signed-off-by: Cameron Gutman <aicommander@gmail.com>
5 months ago
Cameron Gutman
e3ae57b0de
avcodec/amfenc: Update supported HEVC color ranges
...
We properly set AMF_VIDEO_ENCODER_HEVC_NOMINAL_RANGE since fb4dd4b6f4
.
Signed-off-by: Cameron Gutman <aicommander@gmail.com>
5 months ago
Cameron Gutman
4cbb997e15
avcodec/amfenc: Fix inverted loop filter option
...
The AMF HEVC encoder takes a bool option for whether deblocking filter
should be _disabled_ instead of whether it should _enabled_ like the
AMF H.264 encoder does. The logic was accidentally copied from H.264 to
HEVC without negating the bool value, so the deblocking filter was
actually disabled when AV_CODEC_FLAG_LOOP_FILTER was set.
Before this patch:
------------------
no flags set => deblocking filter on
flags +loop => deblocking filter off
flags -loop => deblocking filter on
After this patch:
-----------------
no flags set => deblocking filter on
flags +loop => deblocking filter on
flags -loop => deblocking filter off
Signed-off-by: Cameron Gutman <aicommander@gmail.com>
5 months ago
Lynne
81c6e6c9ee
vulkan_encode_h265: fix rate control VBV values
...
The values written were placeholder values.
5 months ago
Lynne
934be0ff50
vulkan_encode_h264: fix rate control VBV values
...
The values must be in milliseconds, not bytes.
5 months ago
Martin Storsjö
a3ec1f8c6c
aarch64: h26x: Fix the indentation of one function
...
Signed-off-by: Martin Storsjö <martin@martin.st>
5 months ago
Michael Niedermayer
a9c83e43f2
avcodec/ffv1enc: Fix >8bit context size
...
Fixes: Ticket5405
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
5 months ago
Michael Niedermayer
747e1a26af
avcodec/ffv1dec: Blow up if user asks for explosion
...
Fixes: Ticket8403
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
5 months ago
James Almer
feef692881
avcodec/hevc/sei: num_ref_displays can be up to 32
...
Signed-off-by: James Almer <jamrial@gmail.com>
6 months ago
James Almer
b8e74aa1cb
avcodec/cbs_h265: num_ref_displays can be up to 32
...
Signed-off-by: James Almer <jamrial@gmail.com>
6 months ago
James Almer
853c165386
avcodec/cbs_h265: fix valid range for {left,right}_view_id
...
view_id_len in VPS is 4 bits, so view_id values can be up to 15 bits long.
Signed-off-by: James Almer <jamrial@gmail.com>
6 months ago
Rémi Denis-Courmont
6611bf5484
lavc/h264dsp: optimise R-V V biweight for shorter heights
...
T-Head C908:
h264_biweight2_8_c: 313.7 ( 1.00x)
h264_biweight2_8_rvv_i32: before 239.5 ( 1.23x)
h264_biweight2_8_rvv_i32: after 72.7 ( 4.31x)
h264_biweight4_8_c: 582.0 ( 1.00x)
h264_biweight4_8_rvv_i32: before 471.0 ( 1.16x)
h264_biweight4_8_rvv_i32: after 91.5 ( 6.36x)
h264_biweight8_8_c: 1110.0 ( 1.00x)
h264_biweight8_8_rvv_i32: before 943.3 ( 1.10x)
h264_biweight8_8_rvv_i64: after 147.0 ( 7.55x)
SpacemiT X60:
h264_biweight2_8_c: 311.4 ( 1.00x)
h264_biweight2_8_rvv_i32: before 363.1 ( 0.83x)
h264_biweight2_8_rvv_i32: after 103.1 ( 3.02x)
h264_biweight4_8_c: 571.9 ( 1.00x)
h264_biweight4_8_rvv_i32: before 717.4 ( 0.78x)
h264_biweight4_8_rvv_i32: after 71.8 ( 7.96x)
h264_biweight8_8_c: 1103.1 ( 1.00x)
h264_biweight8_8_rvv_i32: before 1415.2 ( 0.76x)
h264_biweight8_8_rvv_i64: ater 92.8 (11.88x)
6 months ago
Rémi Denis-Courmont
459a1512f1
lavc/h264dsp: unroll R-V V weight16
...
As VLSE128.V does not exist, we have no other way to deal with latency.
T-Head C908:
h264_weight16_8_c: 989.4 ( 1.00x)
h264_weight16_8_rvv_i32: 193.2 ( 5.12x)
SpacemiT X60:
h264_weight16_8_c: 874.1 ( 1.00x)
h264_weight16_8_rvv_i32: 196.9 ( 4.44x)
6 months ago
Rémi Denis-Courmont
4936bb2508
lavc/h264dsp: optimise R-V V weight for shorter heights
...
The height is a power of two of up to 16 rows. The current code was
optimised for large sample counts.
T-Head C908:
h264_weight2_8_c: 211.7 ( 1.00x)
h264_weight2_8_rvv_i32: before 184.0 ( 1.15x)
h264_weight2_8_rvv_i32: after 54.2 ( 3.90x)
h264_weight4_8_c: 285.7 ( 1.00x)
h264_weight4_8_rvv_i32: before 341.2 ( 0.86x)
h264_weight4_8_rvv_i32: after 82.2 ( 3.47x)
h264_weight8_8_c: 498.7 ( 1.00x)
h264_weight8_8_rvv_i32: before 683.7 ( 0.73x)
h264_weight8_8_rvv_i64: after 128.5 ( 3.95x)
h264_weight16_8_c: 878.2 ( 1.00x)
h264_weight16_8_rvv_i32: unchanged 239.5 ( 3.67x)
SpacemiT X60:
h264_weight2_8_c: 207.2 ( 1.00x)
h264_weight2_8_rvv_i32: before 259.6 ( 0.80x)
h264_weight2_8_rvv_i32: after 82.2 ( 2.52x)
h264_weight4_8_c: 290.8 ( 1.00x)
h264_weight4_8_rvv_i32: before 509.6 ( 0.57x)
h264_weight4_8_rvv_i32: after 61.5 ( 4.73x)
h264_weight8_8_c: 498.8 ( 1.00x)
h264_weight8_8_rvv_i32: before 1019.8 ( 0.49x)
h264_weight8_8_rvv_i64: after 71.8 ( 6.95x)
h264_weight16_8_c: 874.0 ( 1.00x)
h264_weight16_8_rvv_i32: unchanged 249.0 ( 3.51x)
6 months ago
sunyuechi
ba7d0d5fc3
lavc/vvc_mc: R-V V avg w_avg
...
C908 X60
avg_8_2x2_c : 1.2 1.0
avg_8_2x2_rvv_i32 : 0.7 0.7
avg_8_2x4_c : 2.0 2.2
avg_8_2x4_rvv_i32 : 1.2 1.2
avg_8_2x8_c : 3.7 4.0
avg_8_2x8_rvv_i32 : 1.7 1.5
avg_8_2x16_c : 7.2 7.7
avg_8_2x16_rvv_i32 : 3.0 2.7
avg_8_2x32_c : 14.2 15.2
avg_8_2x32_rvv_i32 : 5.5 5.0
avg_8_2x64_c : 51.0 43.7
avg_8_2x64_rvv_i32 : 39.2 29.7
avg_8_2x128_c : 100.5 79.2
avg_8_2x128_rvv_i32 : 79.7 68.2
avg_8_4x2_c : 1.7 2.0
avg_8_4x2_rvv_i32 : 1.0 0.7
avg_8_4x4_c : 3.5 3.7
avg_8_4x4_rvv_i32 : 1.2 1.2
avg_8_4x8_c : 6.7 7.0
avg_8_4x8_rvv_i32 : 1.7 1.5
avg_8_4x16_c : 13.5 14.0
avg_8_4x16_rvv_i32 : 3.0 2.7
avg_8_4x32_c : 26.2 27.7
avg_8_4x32_rvv_i32 : 5.5 4.7
avg_8_4x64_c : 73.0 73.7
avg_8_4x64_rvv_i32 : 39.0 32.5
avg_8_4x128_c : 143.0 137.2
avg_8_4x128_rvv_i32 : 72.7 68.0
avg_8_8x2_c : 3.5 3.5
avg_8_8x2_rvv_i32 : 1.0 0.7
avg_8_8x4_c : 6.2 6.5
avg_8_8x4_rvv_i32 : 1.5 1.0
avg_8_8x8_c : 12.7 13.2
avg_8_8x8_rvv_i32 : 2.0 1.5
avg_8_8x16_c : 25.0 26.5
avg_8_8x16_rvv_i32 : 3.2 2.7
avg_8_8x32_c : 50.0 52.7
avg_8_8x32_rvv_i32 : 6.2 5.0
avg_8_8x64_c : 118.7 122.5
avg_8_8x64_rvv_i32 : 40.2 31.5
avg_8_8x128_c : 236.7 220.2
avg_8_8x128_rvv_i32 : 85.2 67.7
avg_8_16x2_c : 6.2 6.7
avg_8_16x2_rvv_i32 : 1.2 0.7
avg_8_16x4_c : 12.5 13.0
avg_8_16x4_rvv_i32 : 1.7 1.0
avg_8_16x8_c : 24.5 26.0
avg_8_16x8_rvv_i32 : 3.0 1.7
avg_8_16x16_c : 49.0 51.5
avg_8_16x16_rvv_i32 : 5.5 3.0
avg_8_16x32_c : 97.5 102.5
avg_8_16x32_rvv_i32 : 10.5 5.5
avg_8_16x64_c : 213.7 222.0
avg_8_16x64_rvv_i32 : 48.5 34.2
avg_8_16x128_c : 434.7 420.0
avg_8_16x128_rvv_i32 : 97.7 74.0
avg_8_32x2_c : 12.2 12.7
avg_8_32x2_rvv_i32 : 1.5 1.0
avg_8_32x4_c : 24.5 25.5
avg_8_32x4_rvv_i32 : 3.0 1.7
avg_8_32x8_c : 48.5 50.7
avg_8_32x8_rvv_i32 : 5.2 2.7
avg_8_32x16_c : 96.7 101.2
avg_8_32x16_rvv_i32 : 10.2 5.0
avg_8_32x32_c : 192.7 202.2
avg_8_32x32_rvv_i32 : 19.7 9.5
avg_8_32x64_c : 427.5 426.5
avg_8_32x64_rvv_i32 : 64.2 18.2
avg_8_32x128_c : 816.5 821.0
avg_8_32x128_rvv_i32 : 135.2 75.5
avg_8_64x2_c : 24.0 25.2
avg_8_64x2_rvv_i32 : 2.7 1.5
avg_8_64x4_c : 48.2 50.5
avg_8_64x4_rvv_i32 : 5.0 2.7
avg_8_64x8_c : 96.0 100.7
avg_8_64x8_rvv_i32 : 9.7 4.5
avg_8_64x16_c : 207.7 201.2
avg_8_64x16_rvv_i32 : 19.0 9.0
avg_8_64x32_c : 383.2 402.0
avg_8_64x32_rvv_i32 : 37.5 17.5
avg_8_64x64_c : 837.2 828.7
avg_8_64x64_rvv_i32 : 84.7 35.5
avg_8_64x128_c : 1640.7 1640.2
avg_8_64x128_rvv_i32 : 206.0 153.0
avg_8_128x2_c : 48.7 51.0
avg_8_128x2_rvv_i32 : 5.2 2.7
avg_8_128x4_c : 96.7 101.5
avg_8_128x4_rvv_i32 : 10.2 5.0
avg_8_128x8_c : 192.2 202.0
avg_8_128x8_rvv_i32 : 19.7 9.2
avg_8_128x16_c : 400.7 403.2
avg_8_128x16_rvv_i32 : 38.7 18.5
avg_8_128x32_c : 786.7 805.7
avg_8_128x32_rvv_i32 : 77.0 36.2
avg_8_128x64_c : 1615.5 1655.5
avg_8_128x64_rvv_i32 : 189.7 80.7
avg_8_128x128_c : 3182.0 3238.0
avg_8_128x128_rvv_i32 : 397.5 308.5
w_avg_8_2x2_c : 1.7 1.2
w_avg_8_2x2_rvv_i32 : 1.2 1.0
w_avg_8_2x4_c : 2.7 2.7
w_avg_8_2x4_rvv_i32 : 1.7 1.5
w_avg_8_2x8_c : 21.7 4.7
w_avg_8_2x8_rvv_i32 : 2.7 2.5
w_avg_8_2x16_c : 9.5 9.2
w_avg_8_2x16_rvv_i32 : 4.7 4.2
w_avg_8_2x32_c : 19.0 18.7
w_avg_8_2x32_rvv_i32 : 9.0 8.0
w_avg_8_2x64_c : 62.0 50.2
w_avg_8_2x64_rvv_i32 : 47.7 33.5
w_avg_8_2x128_c : 116.7 87.7
w_avg_8_2x128_rvv_i32 : 80.0 69.5
w_avg_8_4x2_c : 2.5 2.5
w_avg_8_4x2_rvv_i32 : 1.2 1.0
w_avg_8_4x4_c : 4.7 4.5
w_avg_8_4x4_rvv_i32 : 1.7 1.7
w_avg_8_4x8_c : 9.0 8.7
w_avg_8_4x8_rvv_i32 : 2.7 2.5
w_avg_8_4x16_c : 17.7 17.5
w_avg_8_4x16_rvv_i32 : 4.7 4.2
w_avg_8_4x32_c : 35.0 35.0
w_avg_8_4x32_rvv_i32 : 9.0 8.0
w_avg_8_4x64_c : 100.5 84.5
w_avg_8_4x64_rvv_i32 : 42.2 33.7
w_avg_8_4x128_c : 203.5 151.2
w_avg_8_4x128_rvv_i32 : 83.0 69.5
w_avg_8_8x2_c : 4.5 4.5
w_avg_8_8x2_rvv_i32 : 1.2 1.2
w_avg_8_8x4_c : 8.7 8.7
w_avg_8_8x4_rvv_i32 : 2.0 1.7
w_avg_8_8x8_c : 17.0 17.0
w_avg_8_8x8_rvv_i32 : 3.2 2.5
w_avg_8_8x16_c : 34.0 33.5
w_avg_8_8x16_rvv_i32 : 5.5 4.2
w_avg_8_8x32_c : 86.0 67.5
w_avg_8_8x32_rvv_i32 : 10.5 8.0
w_avg_8_8x64_c : 187.2 149.5
w_avg_8_8x64_rvv_i32 : 45.0 35.5
w_avg_8_8x128_c : 342.7 290.0
w_avg_8_8x128_rvv_i32 : 108.7 70.2
w_avg_8_16x2_c : 8.5 8.2
w_avg_8_16x2_rvv_i32 : 2.0 1.2
w_avg_8_16x4_c : 16.7 16.7
w_avg_8_16x4_rvv_i32 : 3.0 1.7
w_avg_8_16x8_c : 33.2 33.5
w_avg_8_16x8_rvv_i32 : 5.5 3.0
w_avg_8_16x16_c : 66.2 66.7
w_avg_8_16x16_rvv_i32 : 10.5 5.0
w_avg_8_16x32_c : 132.5 131.0
w_avg_8_16x32_rvv_i32 : 20.0 9.7
w_avg_8_16x64_c : 340.0 283.5
w_avg_8_16x64_rvv_i32 : 60.5 37.2
w_avg_8_16x128_c : 641.2 597.5
w_avg_8_16x128_rvv_i32 : 118.7 77.7
w_avg_8_32x2_c : 16.5 16.7
w_avg_8_32x2_rvv_i32 : 3.2 1.7
w_avg_8_32x4_c : 33.2 33.2
w_avg_8_32x4_rvv_i32 : 5.5 2.7
w_avg_8_32x8_c : 66.0 62.5
w_avg_8_32x8_rvv_i32 : 10.5 5.0
w_avg_8_32x16_c : 131.5 132.0
w_avg_8_32x16_rvv_i32 : 20.2 9.5
w_avg_8_32x32_c : 261.7 272.0
w_avg_8_32x32_rvv_i32 : 39.7 18.0
w_avg_8_32x64_c : 575.2 545.5
w_avg_8_32x64_rvv_i32 : 105.5 58.7
w_avg_8_32x128_c : 1154.2 1088.0
w_avg_8_32x128_rvv_i32 : 207.0 98.0
w_avg_8_64x2_c : 33.0 33.0
w_avg_8_64x2_rvv_i32 : 6.2 2.7
w_avg_8_64x4_c : 65.5 66.0
w_avg_8_64x4_rvv_i32 : 11.5 5.0
w_avg_8_64x8_c : 131.2 132.5
w_avg_8_64x8_rvv_i32 : 22.5 9.5
w_avg_8_64x16_c : 268.2 262.5
w_avg_8_64x16_rvv_i32 : 44.2 18.0
w_avg_8_64x32_c : 561.5 528.7
w_avg_8_64x32_rvv_i32 : 88.0 35.2
w_avg_8_64x64_c : 1136.2 1124.0
w_avg_8_64x64_rvv_i32 : 222.0 82.2
w_avg_8_64x128_c : 2345.0 2312.7
w_avg_8_64x128_rvv_i32 : 423.0 190.5
w_avg_8_128x2_c : 65.7 66.5
w_avg_8_128x2_rvv_i32 : 11.2 5.5
w_avg_8_128x4_c : 131.2 132.2
w_avg_8_128x4_rvv_i32 : 22.0 10.2
w_avg_8_128x8_c : 263.5 312.0
w_avg_8_128x8_rvv_i32 : 43.2 19.7
w_avg_8_128x16_c : 528.7 526.2
w_avg_8_128x16_rvv_i32 : 85.5 39.5
w_avg_8_128x32_c : 1067.7 1062.7
w_avg_8_128x32_rvv_i32 : 171.7 78.2
w_avg_8_128x64_c : 2234.7 2168.7
w_avg_8_128x64_rvv_i32 : 400.0 159.0
w_avg_8_128x128_c : 4752.5 4295.0
w_avg_8_128x128_rvv_i32 : 757.7 365.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
6 months ago
Michael Niedermayer
38e224c2ba
*/version.h: bump after release/7.1 branch
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
e1094ac45d
*/version.h: bump minor versions for release/7.1
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
56bef2fd58
avcodec/xan: Add basic input size check
...
Fixes: Timeout
Fixes: 71739/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_XAN_WC3_fuzzer-6170301405134848
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpe
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
03050a0d90
avcodec/vble: Allocate buffer later
...
Fixes: Timeout
Fixes: 71727/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VBLE_fuzzer-6126342574243840
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
5f471b500c
avcodec/sgirledec: Check input length
...
Fixes: Timeout
Fixes: 71712/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SGIRLE_fuzzer-5763700835811328
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
8367d7e184
avcodec/imm4: Check input size
...
Fixes: Timeout
Fixes: 71324/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_IMM4_fuzzer-5388489435185152
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
050b5e85cb
avcodec/svq3: Check for minimum size input
...
Fixes: Timeout
Fixes: 71295/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SVQ3_fuzzer-4999941125111808
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
c3a1cbbf5d
avcodec/eacmv: Check input size for intra frames
...
Fixes: Timeout
Fixes: 71135/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_EACMV_fuzzer-6251879028293632
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
74385dd496
avcodec/encode: Check bitrate
...
Fixes: -1.80923e+19 is outside the range of representable values of type 'long'
Fixes: 71103/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SNOW_fuzzer-6542773681979392
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
36924fa306
avcodec/aac/aacdec: use correct index in deallocation
...
Fixes: memleak
Fixes: 71084/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_LATM_fuzzer-5857751899635712
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
9d25b9665e
avcodec/cbs_h266_syntax_template: Check bit depth with range extension
...
Fixes: shift exponent 62 is too large for 32-bit type 'int'
Fixes: 71020/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VVC_fuzzer-6444916325023744
Fixes: 71285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VVC_fuzzer-4761971281428480
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
e9f588af95
avcodec/osq: use unsigned for decorrelation
...
Fixes: signed integer overflow: 1205469696 + 1901074655 cannot be represented in type 'int'
Fixes: 70773/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-5419594888577024
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
f27c8b04d3
avcodec/jfdctint_template: use unsigned z* in row_fdct()
...
Fixes: signed integer overflow: 856827136 + 2123580416 cannot be represented in type 'int'
Fixes: 70772/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PRORES_KS_fuzzer-5180569961431040
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
6 months ago
Michael Niedermayer
ad35eaf848
avcodec/osq: Treat sum = 0 as k = 0
...
We have no valid sample that triggers this so we do not know if this would decode
correctly, but -inf is not the correct k value
Fixes: Assertion n>=0 && n<=32 failed at libavcodec/get_bits.h:423
Fixes: -inf is outside the range of representable values of type 'int'
Fixes: 70709/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-6223623839350784
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
6 months ago
James Almer
aef221b22a
avcodec/hevc/refs: export Stereo 3D side data
...
Use the 3D Reference Displays Info SEI message to link a view_id with
an eye.
Signed-off-by: James Almer <jamrial@gmail.com>
6 months ago
Anton Khirnov
14746871e1
lavc/hevcdec: implement decoding MV-HEVC
...
At most two layers are supported.
Aspects of this work were sponsored by Vimeo and Meta.
6 months ago
Anton Khirnov
0fde9c609f
lavc/decode: merge stereo3d information from decoder with packet side data
...
The HEVC decoder will start setting stereoscopic view position (left or
right) based on 3D Reference Displays Info SEI message in future
commits. This information should be merged with container-derived
stereo3D side data.
6 months ago
Anton Khirnov
327080c088
lavc/decode: make sure side data mapping does not produce duplicates
...
Also, deduplicate the code performing the mapping.
6 months ago
Anton Khirnov
bcbe999077
lavc/decode: clear side data in reget_buffer()
...
Otherwise it may accumulate when e.g. global side data is repeatedly
copied to the frame with in each subsequent reget_buffer() call.
6 months ago
Anton Khirnov
e19551d165
lavc/decode: do not clear the frame discard flag in ff_decode_frame_props_from_pkt()
...
Only do it in reget_buffer().
The purpose of this clearing this flag is to prevent it for
unintentionally persisting across multiple invocations of this function
on one frame, however that is only a problem if the frame is not
unreffed between uses, which is only the case with reget_buffer().
In other cases the caller may legitimately want to set the discard flag
and should have the option of doing so.
6 months ago
Anton Khirnov
75914b5822
lavc/hevc/hevcdec: implement MV-HEVC inter-layer prediction
...
The per-frame reference picture set contains two more lists -
INTER_LAYER[01]. Assuming at most two layers, INTER_LAYER1 is always
empty, but is added anyway for completeness.
When inter-layer prediction is enabled, INTER_LAYER0 for the
second-layer frame will contain the base-layer frame from the same
access unit, if it exists.
The new lists are then used in per-slice reference picture set
construction as per F.8.3.4 "Decoding process for reference picture
lists construction".
6 months ago
Anton Khirnov
02a9435cb0
lavc/hevcdec: implement slice header parsing for nuh_layer_id>0
...
Cf. F.7.3.6.1 "General slice segment header syntax"
6 months ago
Anton Khirnov
a811ab74f0
lavc/hevc/parser: only split packets on NALUs with nuh_layer_id=0
...
A packet should contain a full access unit, which for multilayer video
should contain all the layers.
6 months ago
Anton Khirnov
52ce2d2a04
lavc/hevcdec/parse: process NALUs with nuh_layer_id>0
...
Otherwise parameter sets from extradata with nuh_layer_id>0 would be
ignored. Needed for upcoming MV-HEVC support.
6 months ago
Anton Khirnov
81e9afa6c2
lavc/hevc/ps: reindent
6 months ago
Anton Khirnov
7d245866b8
lavc/hevc/ps: implement SPS parsing for nuh_layer_id>0
...
Cf. F.7.3.2.2 "Sequence parameter set RBSP syntax", which extends normal
SPS parsing with special clauses depending on MultiLayerExtSpsFlag.
6 months ago
Anton Khirnov
4359467ad6
lavc/hevc/ps: drop a warning for sps_multilayer_extension_flag
...
SPS multilayer extension contains a single flag that we are free to
ignore, no reason to print a warning.
6 months ago
Niklas Haas
7351e067bc
lavc/hevc_ps: parse VPS extension
...
Only implementing what's needed for MV-HEVC with two views.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
6 months ago
James Almer
efa9d3deca
avcodec/hevc/sei: add support for 3D Reference Displays Information SEI
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
6 months ago
James Almer
660e7e6a0e
avcodec: add LCEVC decoding support via LCEVCdec
...
Signed-off-by: James Almer <jamrial@gmail.com>
6 months ago
James Almer
6147385393
avcodec: add an export_side_data flag to export picture enhancement layers
...
Signed-off-by: James Almer <jamrial@gmail.com>
6 months ago
James Almer
d250cc02e2
avcodec/hevc/refs: ensure LCEVC SEI payloads are exported as frame side data before get_buffer() calls
...
Signed-off-by: James Almer <jamrial@gmail.com>
6 months ago
James Almer
dbbf9a5ff7
avcodec/decode: split ProgressFrame allocator into two functions
...
Signed-off-by: James Almer <jamrial@gmail.com>
6 months ago
Víctor Manuel Jáquez Leal
2bcc124e1a
vulkan_encode: set the quality level in session parameters
...
While running this command
./ffmpeg_g -loglevel debug -hwaccel vulkan -init_hw_device vulkan=vk:0,debug=1 -hwaccel_output_format vulkan -i input.y4m -vf 'format=nv12,hwupload' -c:v h264_vulkan -quality 2 output.mp4 -y
It hit this validation error:
Validation Error: [ VUID-vkCmdEncodeVideoKHR-None-08318 ] Object 0: handle =
0x8f000000008f, type = VK_OBJECT_TYPE_VIDEO_SESSION_KHR; Object 1: handle =
0xfd00000000fd, type = VK_OBJECT_TYPE_VIDEO_SESSION_PARAMETERS_KHR;
| MessageID = 0x5dc3dd39
| vkCmdEncodeVideoKHR(): The currently configured encode quality level (2) for
VkVideoSessionKHR 0x8f000000008f[] does not match the encode quality level (0)
VkVideoSessionParametersKHR 0xfd00000000fd[] was created with. The Vulkan spec
states: The bound video session parameters object must have been created with
the currently set video encode quality level for the bound video session at the
time the command is executed on the
device (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-vkCmdEncodeVideoKHR-None-08318 )
This patch adds a new function helper for creating session parameters, which
also sets the quality level and it's called by the H.264 and H.265 Vulkan
encoders.
6 months ago