FFmpeg

Commit Graph

Author	SHA1	Message	Date
James Almer	17c3cc5bb6	swscale/x86/rgb_2_rgb: add missing wrap to ff_uyvytoyuv422_avx2 Fixes old yasm. Signed-off-by: James Almer <jamrial@gmail.com>	5 months ago
James Almer	03546f49a3	swscale/x86/rgb2rgb: add missing wrap for ff_uyvytoyuv422_avx2 Signed-off-by: James Almer <jamrial@gmail.com>	5 months ago
James Almer	287d139b77	checkasm/sw_rgb: fix alignment of buffers for rgb_to_yuv tests src is apparently not guaranteed to be >8 byte aligned, but align to 16 nonetheless as the x86 asm will do unaligned loads anyway. dst is guaranteed to be 32 byte aligned for the Y plane, but 16 byte for UV. Signed-off-by: James Almer <jamrial@gmail.com>	5 months ago
James Almer	e8cef5e152	swscale/x86/rgb2rgb: remove mmxext version of shuffle_bytes_2103 Signed-off-by: James Almer <jamrial@gmail.com>	5 months ago
James Almer	c578bb9864	swscale/x86/input: add AVX2 optimized uyvytoyuv422 uyvytoyuv422_c: 23991.8 uyvytoyuv422_sse2: 2817.8 uyvytoyuv422_avx: 2819.3 uyvytoyuv422_avx2: 1972.3 Signed-off-by: James Almer <jamrial@gmail.com>	5 months ago
James Almer	e9cfd53257	swscale/x86/input: add AVX2 optimized RGB32 to YUV functions abgr_to_uv_8_c: 43.3 abgr_to_uv_8_sse2: 14.3 abgr_to_uv_8_avx: 15.3 abgr_to_uv_8_avx2: 18.8 abgr_to_uv_128_c: 650.3 abgr_to_uv_128_sse2: 110.8 abgr_to_uv_128_avx: 112.3 abgr_to_uv_128_avx2: 64.8 abgr_to_uv_1080_c: 5456.3 abgr_to_uv_1080_sse2: 888.8 abgr_to_uv_1080_avx: 900.8 abgr_to_uv_1080_avx2: 518.3 abgr_to_uv_1920_c: 9692.3 abgr_to_uv_1920_sse2: 1593.8 abgr_to_uv_1920_avx: 1613.3 abgr_to_uv_1920_avx2: 864.8 abgr_to_y_8_c: 23.3 abgr_to_y_8_sse2: 12.8 abgr_to_y_8_avx: 13.3 abgr_to_y_8_avx2: 17.3 abgr_to_y_128_c: 308.3 abgr_to_y_128_sse2: 67.3 abgr_to_y_128_avx: 66.8 abgr_to_y_128_avx2: 44.8 abgr_to_y_1080_c: 2371.3 abgr_to_y_1080_sse2: 512.8 abgr_to_y_1080_avx: 505.8 abgr_to_y_1080_avx2: 314.3 abgr_to_y_1920_c: 4177.3 abgr_to_y_1920_sse2: 915.8 abgr_to_y_1920_avx: 926.8 abgr_to_y_1920_avx2: 519.3 bgra_to_uv_8_c: 37.3 bgra_to_uv_8_sse2: 13.3 bgra_to_uv_8_avx: 14.8 bgra_to_uv_8_avx2: 19.8 bgra_to_uv_128_c: 563.8 bgra_to_uv_128_sse2: 111.3 bgra_to_uv_128_avx: 112.3 bgra_to_uv_128_avx2: 64.8 bgra_to_uv_1080_c: 4691.8 bgra_to_uv_1080_sse2: 893.8 bgra_to_uv_1080_avx: 899.8 bgra_to_uv_1080_avx2: 517.8 bgra_to_uv_1920_c: 8332.8 bgra_to_uv_1920_sse2: 1590.8 bgra_to_uv_1920_avx: 1605.8 bgra_to_uv_1920_avx2: 867.3 bgra_to_y_8_c: 22.3 bgra_to_y_8_sse2: 12.8 bgra_to_y_8_avx: 12.8 bgra_to_y_8_avx2: 17.3 bgra_to_y_128_c: 291.3 bgra_to_y_128_sse2: 67.8 bgra_to_y_128_avx: 69.3 bgra_to_y_128_avx2: 45.3 bgra_to_y_1080_c: 2357.3 bgra_to_y_1080_sse2: 508.3 bgra_to_y_1080_avx: 518.3 bgra_to_y_1080_avx2: 399.8 bgra_to_y_1920_c: 4202.8 bgra_to_y_1920_sse2: 906.8 bgra_to_y_1920_avx: 907.3 bgra_to_y_1920_avx2: 526.3 Signed-off-by: James Almer <jamrial@gmail.com>	5 months ago
James Almer	d5fe99dc5f	swscale/x86/input: add AVX2 optimized RGB24 to YUV functions rgb24_to_uv_8_c: 39.3 rgb24_to_uv_8_sse2: 14.3 rgb24_to_uv_8_ssse3: 13.3 rgb24_to_uv_8_avx: 12.8 rgb24_to_uv_8_avx2: 14.3 rgb24_to_uv_128_c: 582.8 rgb24_to_uv_128_sse2: 127.3 rgb24_to_uv_128_ssse3: 107.3 rgb24_to_uv_128_avx: 111.3 rgb24_to_uv_128_avx2: 62.3 rgb24_to_uv_1080_c: 4981.3 rgb24_to_uv_1080_sse2: 1048.3 rgb24_to_uv_1080_ssse3: 876.8 rgb24_to_uv_1080_avx: 887.8 rgb24_to_uv_1080_avx2: 492.3 rgb24_to_uv_1280_c: 5906.8 rgb24_to_uv_1280_sse2: 1263.3 rgb24_to_uv_1280_ssse3: 1048.3 rgb24_to_uv_1280_avx: 1045.8 rgb24_to_uv_1280_avx2: 579.8 rgb24_to_uv_1920_c: 8665.3 rgb24_to_uv_1920_sse2: 1888.8 rgb24_to_uv_1920_ssse3: 1571.8 rgb24_to_uv_1920_avx: 1558.8 rgb24_to_uv_1920_avx2: 869.3 rgb24_to_y_8_c: 20.3 rgb24_to_y_8_sse2: 11.8 rgb24_to_y_8_ssse3: 10.3 rgb24_to_y_8_avx: 10.3 rgb24_to_y_8_avx2: 10.8 rgb24_to_y_128_c: 284.8 rgb24_to_y_128_sse2: 83.3 rgb24_to_y_128_ssse3: 66.8 rgb24_to_y_128_avx: 64.8 rgb24_to_y_128_avx2: 39.3 rgb24_to_y_1080_c: 2451.3 rgb24_to_y_1080_sse2: 696.3 rgb24_to_y_1080_ssse3: 516.8 rgb24_to_y_1080_avx: 518.8 rgb24_to_y_1080_avx2: 301.8 rgb24_to_y_1280_c: 2892.8 rgb24_to_y_1280_sse2: 816.8 rgb24_to_y_1280_ssse3: 623.3 rgb24_to_y_1280_avx: 616.3 rgb24_to_y_1280_avx2: 350.8 rgb24_to_y_1920_c: 4338.8 rgb24_to_y_1920_sse2: 1210.8 rgb24_to_y_1920_ssse3: 928.3 rgb24_to_y_1920_avx: 920.3 rgb24_to_y_1920_avx2: 534.8 Signed-off-by: James Almer <jamrial@gmail.com>	5 months ago
James Almer	6743c2fc6a	checkasm/sw_rgb: test rgb32/rgb32_1 to yuv Test all four pixel formats, but only bench the two native endian ones for a given target. Signed-off-by: James Almer <jamrial@gmail.com>	5 months ago
James Almer	91b9af0058	x86/aacencdsp: add AVX version of quantize_bands quant_bands_signed_c: 1928.0 quant_bands_signed_sse2: 406.0 quant_bands_signed_avx: 207.0 quant_bands_unsigned_c: 1702.0 quant_bands_unsigned_sse2: 404.0 quant_bands_unsigned_avx: 209.0 Signed-off-by: James Almer <jamrial@gmail.com>	5 months ago
Rémi Denis-Courmont	7a3369398f	sws/input: R-V V 32-bit RGB to halved UV T-Head C908: abgr_to_uv_half_8_c: 2.2 abgr_to_uv_half_8_rvv_i32: 3.5 abgr_to_uv_half_128_c: 44.0 abgr_to_uv_half_128_rvv_i32: 13.0 abgr_to_uv_half_1080_c: 245.0 abgr_to_uv_half_1080_rvv_i32: 107.2 abgr_to_uv_half_1920_c: 406.2 abgr_to_uv_half_1920_rvv_i32: 188.7 bgra_to_uv_half_8_c: 2.2 bgra_to_uv_half_8_rvv_i32: 3.5 bgra_to_uv_half_128_c: 26.5 bgra_to_uv_half_128_rvv_i32: 13.0 bgra_to_uv_half_1080_c: 219.7 bgra_to_uv_half_1080_rvv_i32: 107.0 bgra_to_uv_half_1920_c: 406.7 bgra_to_uv_half_1920_rvv_i32: 188.7 SpacemiT X60: abgr_to_uv_half_8_c: 2.2 abgr_to_uv_half_8_rvv_i32: 3.0 abgr_to_uv_half_128_c: 28.2 abgr_to_uv_half_128_rvv_i32: 5.7 abgr_to_uv_half_1080_c: 235.5 abgr_to_uv_half_1080_rvv_i32: 47.7 abgr_to_uv_half_1920_c: 418.2 abgr_to_uv_half_1920_rvv_i32: 84.0 bgra_to_uv_half_8_c: 2.0 bgra_to_uv_half_8_rvv_i32: 3.0 bgra_to_uv_half_128_c: 23.7 bgra_to_uv_half_128_rvv_i32: 5.7 bgra_to_uv_half_1080_c: 195.5 bgra_to_uv_half_1080_rvv_i32: 47.7 bgra_to_uv_half_1920_c: 346.5 bgra_to_uv_half_1920_rvv_i32: 84.0	5 months ago
Rémi Denis-Courmont	e2f069905e	sws/input: R-V V 32-bit RGB to UV	5 months ago
Rémi Denis-Courmont	f5555cb106	sws/input: R-V V 32-bit RGB to Y T-Head C908: abgr_to_y_8_c: 2.5 abgr_to_y_8_rvv_i32: 2.2 abgr_to_y_128_c: 37.0 abgr_to_y_128_rvv_i32: 8.5 abgr_to_y_1080_c: 327.0 abgr_to_y_1080_rvv_i32: 69.5 abgr_to_y_1920_c: 552.0 abgr_to_y_1920_rvv_i32: 122.2 bgra_to_y_8_c: 2.5 bgra_to_y_8_rvv_i32: 2.2 bgra_to_y_128_c: 37.2 bgra_to_y_128_rvv_i32: 8.5 bgra_to_y_1080_c: 310.2 bgra_to_y_1080_rvv_i32: 69.5 bgra_to_y_1920_c: 568.2 bgra_to_y_1920_rvv_i32: 122.5 SpacemiT X60: abgr_to_y_8_c: 2.5 abgr_to_y_8_rvv_i32: 2.0 abgr_to_y_128_c: 33.0 abgr_to_y_128_rvv_i32: 3.7 abgr_to_y_1080_c: 276.0 abgr_to_y_1080_rvv_i32: 31.5 abgr_to_y_1920_c: 493.7 abgr_to_y_1920_rvv_i32: 55.5 bgra_to_y_8_c: 2.2 bgra_to_y_8_rvv_i32: 2.0 bgra_to_y_128_c: 33.0 bgra_to_y_128_rvv_i32: 3.7 bgra_to_y_1080_c: 276.0 bgra_to_y_1080_rvv_i32: 31.5 bgra_to_y_1920_c: 490.7 bgra_to_y_1920_rvv_i32: 55.5	5 months ago
Andreas Rheinhardt	8b62fb231a	swscale/x86/rgb2rgb: Detemplatize Every function in rgb2rgb_template.c is only compiled exactly once; there is no overlap at all between the MMXEXT and the SSE2 functions, so detemplatize it. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	5421dee0e7	swscale/x86/rgb2rgb_template: Remove unused uyvytoyv12 Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	c1c35380a7	swscale/x86/rgb2rgb: Don't unnecessarily check for inline ASM The SSE2 and AVX versions of deinterleaveBytes are external ASM. Move them out of the inline ASM template. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	f7305eb3b3	swscale/x86/rgb2rgb_template: Remove unnecessary SFENCE The ff_nv12ToUV_* functions don't use non-temporal stores at all. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	fca796ac3b	tests/checkasm/sw_rgb: Be more strict about clobbering MMX state The MMXEXT versions of the rgb2rgb functions tested here always emit emms on their own. Therefore one can use a stricter test to ensure that it stays that way. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	3af6136669	avcodec/dnxhdenc: Simplify padding It is unnecessary to first pad to 32bits; the memset later will pad everything with zeroes anyway. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	b0e0b3c58a	avcodec/dnxhdenc: Move PutBitContext from ctx to stack Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	542abee213	avcodec/cbs_h266_syntax_template: Use correct format specifier H266RawSliceHeader.num_entry_points is an uint32_t. Fixes -Wformat warnings: https://fate.ffmpeg.org/log.cgi?slot=aarch64-osx-clang-1200.0.32.29&time=20240604151047&log=compile Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	8f199cfb5b	avformat/evc: Fix format specifiers Fixes -Wformat warnings; see e.g. https://fate.ffmpeg.org/log.cgi?slot=aarch64-osx-clang-1200.0.32.29&time=20240604151047&log=compile Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	5f31a4fd16	avformat/vvc: Don't use uint8_t iterators, fix shadowing Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	1c4362cce9	avformat/vvc: Fix comment Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	fa77dc8c44	avformat/vvc: Reindent after the previous commit Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	8b6c7e7cda	avformat/vvc: Fix crash on allocation failure, avoid allocations This is the VVC version of `8b5d155301`. (Hint: This ensures that the order of NALU arrays is OPI-VPS-SPS-PPS- Prefix-SEI-Suffix-SEI, regardless of the order in the original extradata. I hope this is right.) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	4482b3353d	avformat/vvc: Don't use ff_copy_bits() There is no benefit in using it: The fast path of copying is not taken because of misalignment; furthermore we are only dealing with a few bytes here anyway, so simply copy the bytes manually, avoiding the dependency on bitstream.c in lavf (which also contains a function that is completely unused in lavf). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	52fb49a8a3	avformat/vvc: Use put_bytes_output() The PutBitContext has just been flushed. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Andreas Rheinhardt	dd8fb0aaae	avcodec/hevc/Makefile: Move rules for lavc/* files to lavc/Makefile If any of these files (say A) would be changed in such a way that A acquires a new dependency on another file B, building B would need to be added to all the rules that lead to A being built. Yet currently the rules for several files are spread over the lavc Makefile and the Makefile of the lavc/hevc subdir, making it more likely to be forgotten. So move the rules for these files to the lavc/Makefile. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	5 months ago
Rémi Denis-Courmont	daac101e61	lavc/aacencdsp: fix rounding in R-V V quantize_bands We need to round toward zero here.	5 months ago
Rémi Denis-Courmont	658439934b	lavc/vp8dsp: R-V V vp8_idct_add T-Head C908 (cycles): vp8_idct_add_c: 312.2 vp8_idct_add_rvv_i32: 117.0	5 months ago
Rémi Denis-Courmont	e0f4d185f1	sws/input: R-V V rgb24ToUV_half and bgr24ToUV_half T-Head C908: rgb24_to_uv_half_4_c: 2.0 rgb24_to_uv_half_4_rvv_i32: 3.5 rgb24_to_uv_half_64_c: 27.0 rgb24_to_uv_half_64_rvv_i32: 12.5 rgb24_to_uv_half_540_c: 223.7 rgb24_to_uv_half_540_rvv_i32: 105.2 rgb24_to_uv_half_640_c: 265.5 rgb24_to_uv_half_640_rvv_i32: 123.7 rgb24_to_uv_half_960_c: 414.5 rgb24_to_uv_half_960_rvv_i32: 249.5 SpacemiT X60: rgb24_to_uv_half_4_c: 1.7 rgb24_to_uv_half_4_rvv_i32: 4.2 rgb24_to_uv_half_64_c: 24.0 rgb24_to_uv_half_64_rvv_i32: 8.7 rgb24_to_uv_half_540_c: 199.2 rgb24_to_uv_half_540_rvv_i32: 72.5 rgb24_to_uv_half_640_c: 235.7 rgb24_to_uv_half_640_rvv_i32: 85.2 rgb24_to_uv_half_960_c: 353.5 rgb24_to_uv_half_960_rvv_i32: 127.5	5 months ago
Rémi Denis-Courmont	3ef5867e4b	sws/input: R-V V rgb24ToUV and bgr24ToUV T-Head C908: rgb24_to_uv_8_c: 2.7 rgb24_to_uv_8_rvv_i32: 3.2 rgb24_to_uv_128_c: 41.0 rgb24_to_uv_128_rvv_i32: 12.7 rgb24_to_uv_1080_c: 342.5 rgb24_to_uv_1080_rvv_i32: 105.7 rgb24_to_uv_1280_c: 406.0 rgb24_to_uv_1280_rvv_i32: 124.2 rgb24_to_uv_1920_c: 626.0 rgb24_to_uv_1920_rvv_i32: 186.0 SpacemiT X60: rgb24_to_uv_8_c: 2.5 rgb24_to_uv_8_rvv_i32: 3.0 rgb24_to_uv_128_c: 36.5 rgb24_to_uv_128_rvv_i32: 5.7 rgb24_to_uv_1080_c: 304.2 rgb24_to_uv_1080_rvv_i32: 49.0 rgb24_to_uv_1280_c: 360.5 rgb24_to_uv_1280_rvv_i32: 57.5 rgb24_to_uv_1920_c: 540.7 rgb24_to_uv_1920_rvv_i32: 86.2	5 months ago
Rémi Denis-Courmont	79dfdac4db	sws/input: R-V V rgb24ToY & bgr24ToY T-Head C908: rgb24_to_y_8_c: 2.0 rgb24_to_y_8_rvv_i32: 2.7 rgb24_to_y_128_c: 26.2 rgb24_to_y_128_rvv_i32: 9.2 rgb24_to_y_1080_c: 219.5 rgb24_to_y_1080_rvv_i32: 76.2 rgb24_to_y_1280_c: 276.2 rgb24_to_y_1280_rvv_i32: 89.7 rgb24_to_y_1920_c: 389.7 rgb24_to_y_1920_rvv_i32: 134.2 SpacemiT X60: rgb24_to_y_8_c: 1.7 rgb24_to_y_8_rvv_i32: 2.2 rgb24_to_y_128_c: 23.2 rgb24_to_y_128_rvv_i32: 4.2 rgb24_to_y_1080_c: 195.0 rgb24_to_y_1080_rvv_i32: 33.7 rgb24_to_y_1280_c: 231.0 rgb24_to_y_1280_rvv_i32: 40.0 rgb24_to_y_1920_c: 346.2 rgb24_to_y_1920_rvv_i32: 59.7	5 months ago
Wenbin Chen	7560db937d	libavfi/dnn: enable LibTorch xpu device option support Add xpu device support to libtorch backend. To enable xpu support you need to add "-Wl,--no-as-needed -lintel-ext-pt-gpu -Wl,--as-needed" to "--extra-libs" when configuring ffmpeg. Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>	5 months ago
Nuo Mi	f68f40736f	avcodec/vvcdec: support mv wraparound A 360 video specific tool see https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9503377 passed files: DMVR_A_Huawei_3.bit WRAP_D_InterDigital_4.bit WRAP_A_InterDigital_4.bit WRAP_B_InterDigital_4.bit WRAP_C_InterDigital_4.bit ERP_A_MediaTek_3.bit	5 months ago
Nuo Mi	685174069f	avcodec/vvcdec: misc, reindent inter.c	5 months ago
Nuo Mi	a4013e748a	avcodec/vvcdec: refact out emulated_edge_no_wrap; prepare for reference wraparound	5 months ago
Nuo Mi	8abdf0a28e	avcodec/vvcdec: misc, move src offset inside emulated_edge	5 months ago
Nuo Mi	2d98786fee	avcodec/vvcdec: refact, remove emulated_edge_dmvr and emulated_edge_bilinear to simplify code	5 months ago
Lynne	714596bcbf	aacdec_usac: zero out alpha values for the current frame	5 months ago
Lynne	c2d459cb51	aacdec_usac: fix stereo alpha values for transients Typo. Also added comments and fixed the branch underneath.	5 months ago
Lynne	7223523335	aacdec_usac: use correct TNS values The standard slightly modified the maximum TNS bands allowed.	5 months ago
Lynne	9b41cc0430	aacdec_usac: do not round noise amplitude values Use floating point division instead of integer division.	5 months ago
Lynne	a18d0659f4	aacdec_usac: skip coeff decoding if the number to be decoded is 0 Yet another thing not mentioned in the spec.	5 months ago
Lynne	1ad9a4008b	aacdec_usac: decouple TNS active from TNS data present flag The issue was that in case of common TNS parameters, TNS was entirely skipped, as tns.present was set to 0.	5 months ago
Lynne	c0fdb0cdfd	aacdec_usac: do not continue parsing bitstream on core_mode == 1 Although LPD is not functional yet, the bitstream ends at that point.	5 months ago
Lynne	8ecaa64b9b	aacdec_usac: respect tns_on_lr flag This was left out, and due to av_unused, forgotten about.	5 months ago
Lynne	25b848a0bd	aacdec_usac: correctly set and use the layout map	5 months ago
Lynne	ae495b56ff	aacdec_usac: remove fallback for custom maps with invalid position Not needed as every possible index is mapped.	5 months ago
Lynne	91ab17e2fe	aacdec_usac: tag LFE channels as such in the channel map Missed.	5 months ago
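
The -Wformat fix in 542abee213 concerns printing a uint32_t field. The following is a minimal sketch of the portable pattern, not the code from that commit: format_entry_points() is a hypothetical helper that does not exist in FFmpeg, and only the field name num_entry_points and its uint32_t type are taken from the commit message.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of FFmpeg: formats a uint32_t value such as
 * H266RawSliceHeader.num_entry_points without triggering -Wformat. */
static int format_entry_points(char *buf, size_t size, uint32_t num_entry_points)
{
    assert(buf != NULL && size > 0);
    /* PRIu32 expands to the matching conversion for uint32_t on every target;
     * a hard-coded "%lu" or "%d" is only correct where the types happen to
     * coincide, which is why some compilers warn and others do not. */
    return snprintf(buf, size, "num_entry_points=%" PRIu32, num_entry_points);
}
```

Using the <inttypes.h> macros keeps the format string correct regardless of whether uint32_t is defined as unsigned int or unsigned long on a given platform.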

115708 Commits (a8f9d52c227841929959cd414398cfa426b6024e)