FFmpeg

Commit Graph

Author	SHA1	Message	Date
Lynne	157cd820ad	vulkan: remove pointless mutex locks This code was simply incorrect through and through. It did not protect what actually has to be protected in a multi-threaded setup. Perhaps it was used to silence threading errors? Either way, remove it, and document the correct way to use execution pools in a threaded environment.	3 months ago
Lynne	7239be07be	vulkan_decode: use a single execution pool Originally, the decoder had a single execution pool, with one execution context per thread. Execution pools were always intended to be thread-safe, as long as there were enough execution contexts in the pool to satisfy all threads. Due to synchronization issues, the threading part was removed at some point, and, for decoding, each thread had its own execution pool. Having a single execution pool per context is hacky, not to mention wasteful. Most importantly, we cannot associate single shaders across multiple execution pools for a single application. This means that we cannot use shaders to either apply film grain, or use this framework for software-defined decoders. The recent commits added threading capabilities back to the execution pool, and the number of contexts in each pool was increased. This was done with the assumption that the execution pool was singular, which it was not. This led to increased parallelism and number of frames in flight, which is taxing on memory. This commit finally restores proper threading behaviour. The validation layer has issues that are reported and addressed in the earlier commit.	3 months ago
Lynne	4ca2b86ed5	hwcontext_vulkan: disable validation layer threading warnings The layer is buggy currently: https://github.com/KhronosGroup/Vulkan-ValidationLayers/issues/9045	3 months ago
Lynne	18af3a1db2	hwcontext_vulkan: do not enable portability subset by default It doesn't make sense to, and could result in the implementation picking emulation layers.	3 months ago
Benjamin Cheng	bf9f921ef7	avcodec/hw_base_encode: restrict size of next_prev Some drivers are more strict about the size of the reference lists given (i.e. VAOn12 [1]). The next_prev list is used to handle multiple "L0" references in AV1 encode. Restrict the size of next_prev based on the value of ref_l0 when the GOP structure is initialized. [1] https://github.com/intel/cartwheel-ffmpeg/issues/278 v2: fix indentation issues	3 months ago
Nuo Mi	0a6388d1da	avcodec/hevcdec: remove hevc prefix for x86 asm files	3 months ago
Nuo Mi	8d27256a74	avcodec/vvcdec: remove vvc prefix for x86 and riscv	3 months ago
Peter Ross	350ebef112	avformat/iff: remove surplus if statement Fixes CID 1636854	3 months ago
Peter Ross	b2cba76d4f	avformat/riff: map 0069 twocc to ADPCM IMA XBOX decoder	3 months ago
Paul B Mahol	c3083b3266	avcodec: add ADPCM IMA XBOX decoder	3 months ago
Niklas Haas	095f8038fa	swscale/output: fix bilinear yuv2rgb chroma interpolation These functions were divided into two special cases; one assuming that uvalpha == 0, and the other assuming that uvalpha == 2048. This worked fine for simple 2x chroma upscaling but broke for e.g. yuv410p, non-centered chroma, or other special cases that involved non-aligned chroma filters. Fix it by instead dividing this check into two cases, a uvalpha==0 fast path and a uvalpha>0 general path. Instead of (A+B)/2 the general path now multiplies in the true uvalpha weight. I tried preserving the old fast path for the case of uvalpha == 2048, but this was significantly slower in practise versus having just one general path. However, we still need a uvalpha == 0 path for the unscaled case. Fixes: ticket #5083 Signed-off-by: Niklas Haas <git@haasn.dev> Sponsored-by: Sovereign Tech Fund	3 months ago
sunyuechi	6b31e42c47	lavc/riscv: vset macro for simplify if-else	3 months ago
Zhao Zhili	952508ae05	aarch64/vvc: Add apply_bdof Test on rpi 5 with gcc 12: apply_bdof_8_8x16_c: 7315.2 ( 1.00x) apply_bdof_8_8x16_neon: 1876.8 ( 3.90x) apply_bdof_8_16x8_c: 7170.5 ( 1.00x) apply_bdof_8_16x8_neon: 1752.8 ( 4.09x) apply_bdof_8_16x16_c: 14695.2 ( 1.00x) apply_bdof_8_16x16_neon: 3490.5 ( 4.21x) apply_bdof_10_8x16_c: 7371.5 ( 1.00x) apply_bdof_10_8x16_neon: 1863.8 ( 3.96x) apply_bdof_10_16x8_c: 7172.0 ( 1.00x) apply_bdof_10_16x8_neon: 1766.0 ( 4.06x) apply_bdof_10_16x16_c: 14551.5 ( 1.00x) apply_bdof_10_16x16_neon: 3576.0 ( 4.07x) apply_bdof_12_8x16_c: 7236.5 ( 1.00x) apply_bdof_12_8x16_neon: 1863.8 ( 3.88x) apply_bdof_12_16x8_c: 7316.5 ( 1.00x) apply_bdof_12_16x8_neon: 1758.8 ( 4.16x) apply_bdof_12_16x16_c: 14691.2 ( 1.00x) apply_bdof_12_16x16_neon: 3480.5 ( 4.22x)	3 months ago
Peter Ross	7aeae8d1ae	avcodec/Makefile: include aom_film_grain.o file for h264_sei component h264_sei depends on h2645_sei, which in turn depends on aom_film_grain for ff_aom_uninit_film_grain_params()	3 months ago
Peter Ross	6bf9252807	avformat/Makefile: include object files for image_vbn_pipe demuxer	3 months ago
Peter Ross	c90e0777da	avformat/iff: SndAnim decoding Fixes ticket #5553.	3 months ago
James Almer	4e2b9df48c	avformat/isom: use more of the existing channel layout bitmap defines Signed-off-by: James Almer <jamrial@gmail.com>	3 months ago
James Almer	76049d1c45	avformat/iamf_writer: fix setting num_samples_per_frame for OPUS As per section 3.11.1 of the IAMF spec, the sample rate used in Codec Config for Opus shall be 48kHz, regardless of the original sample rate used during encoding. Signed-off-by: James Almer <jamrial@gmail.com>	3 months ago
Dmitrii Ovchinnikov	95217872ad	avcodec/amfenc: B-Frame support for av1_amf encoder.	3 months ago
Dmitrii Ovchinnikov	c037eb8424	amfenc: Update the min version to 1.4.35.0 for AMF SDK.	3 months ago
Cameron Gutman	a40cbf9792	avcodec/amfenc: Implement async_depth option This option, which is also available on other FFmpeg hardware encoders, allows the user to trade throughput for reduced output latency. This is useful for ultra low latency applications like game streaming. Signed-off-by: Cameron Gutman <aicommander@gmail.com>	3 months ago
Peter Ross	494c961379	avformat/Makefile: add iso_writer golomb_tab from shared library dependency	3 months ago
Niklas Haas	b38f6f9990	tests/swscale: allow nonzero positive return codes from sws_scale_frame() See previous commit. Signed-off-by: Niklas Haas <git@haasn.dev> Sponsored-by: Sovereign Tech Fund	3 months ago
Niklas Haas	e05a1bb879	swscale: fix documentation of sws_scale_frame() Since its introduction, this function has claimed to return 0 on success, yet never actually did so (until the introduction of the new graph based API). It always returned the number of scaled lines, and continues to do so. To avoid confusion, but also avoid regressing possible clients that relied on the existing semantics, simply update the documentation to reflect the actual behavior. Remain ambiguous about the exact interpretation of the return value on account of the unfortunate difference in behavior between the legacy and new scaling APIs. Signed-off-by: Niklas Haas <git@haasn.dev> Sponsored-by: Sovereign Tech Fund	3 months ago
Niklas Haas	2df655bc2c	swscale/utils: fix sws_getCachedContext check This logic was inverted, but || was not replaced by &&. Fixes: `ed5dd67562` Fixes: ticket #11353 Signed-off-by: Niklas Haas <git@haasn.dev> Sponsored-by: Sovereign Tech Fund	3 months ago
Martin Storsjö	d1e37eb0cd	avutil/mem_internal: Don't include stdalign.h on MSVC It's currently actually not used in MSVC builds, since `6e49b86996`. Older versions of MSVC (or, in particular, older versions of UCRT) don't have stdalign.h; it's available since WinSDK 10.0.20348.0; such a new enough version has been installed by default only since MSVC 2022 17.4 and newer. With this change, ffmpeg can still be built with MSVC 2019 16.8 (v19.28). Signed-off-by: Martin Storsjö <martin@martin.st>	3 months ago
Martin Storsjö	2bb00ef59c	aarch64: vvc: Fix building the dmvr_hv assembly with older MSVC versions Explicitly use ldur for unaligned offsets; newer versions of armasm64 implicitly convert ldr to ldur as necessary, but older versions require it explicitly written out. This fixes these build errors: ffmpeg\libavcodec\aarch64\vvc\inter.o.asm(2039) : error A2518: operand 2: Memory offset must be aligned ldr s5, [x1, #1] ffmpeg\libavcodec\aarch64\vvc\inter.o.asm(2250) : error A2518: operand 2: Memory offset must be aligned ldr d7, [x1, #2] Signed-off-by: Martin Storsjö <martin@martin.st>	3 months ago
Peter Ross	8272d34377	configure: add iso_writer golomb dependency since commit `fce0622d0b`, libavformat/hevc.c depends on golomb vlc tables.	3 months ago
David Rosca	d0facac679	lavc/vaapi_encode_h265: Use surface alignment This is needed to correctly set conformance window crop with Mesa AMD. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	3 months ago
David Rosca	bcfbf2bac8	lavc/vaapi_encode: Query surface alignment It needs to create temporary config to query surface attribute. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	3 months ago
Bin Peng	72a3656e84	lavc/aarch64: Fix ff_pred16x16_plane_neon_10 Fix test failure on aarch64: ./tests/checkasm/checkasm --test=h264pred 367840 Signed-off-by: Peng Bin <pengbin@visionular.com> Signed-off-by: Martin Storsjö <martin@martin.st>	3 months ago
Bin Peng	decc9e643c	lavc/aarch64: Fix ff_pred8x8_plane_neon_10 Fix test failure on aarch64: ./tests/checkasm/checkasm --test=h264pred 479612 The mismatch between neon and C functions can also be reproduced using the following bitstream and command line. wget https://streams.videolan.org/ffmpeg/incoming/intra8x8pred_10bit.264 ./ffmpeg -cpuflags 0 -threads 1 -i intra8x8pred_10bit.264 -f framemd5 -y md5_ref ./ffmpeg -threads 1 -i intra8x8pred_10bit.264 -f framemd5 -y md5_neon Signed-off-by: Bin Peng <pengbin@visionular.com> Signed-off-by: Martin Storsjö <martin@martin.st>	3 months ago
Zhao Zhili	7b0bd6c4a7	avutil/vulkan_glslang: Fix build failure compile_only isn't available until 13.1.0. Let default initialization set it to zero, so the code works with version before and after 13.1.0.	3 months ago
Rémi Denis-Courmont	bd226fdd74	lavc/h264dsp: R-V V intra loop filter As with the inter loop filter, performance metrics seem to be biased in favour of the C implementation because checkasm inputs almost always fall in the no-op case. h264_h_loop_filter_chroma_intra_8bpp_c: 82.8 ( 1.00x) h264_h_loop_filter_chroma_intra_8bpp_rvv_i32: 72.6 ( 1.14x) h264_h_loop_filter_chroma_mbaff_intra_8bpp_c: 41.1 ( 1.00x) h264_h_loop_filter_chroma_mbaff_intra_8bpp_rvv_i32: 72.6 ( 0.57x) h264_h_loop_filter_luma_intra_8bpp_c: 166.1 ( 1.00x) h264_h_loop_filter_luma_intra_8bpp_rvv_i32: 395.4 ( 0.42x) h264_h_loop_filter_luma_mbaff_intra_8bpp_c: 93.3 ( 1.00x) h264_h_loop_filter_luma_mbaff_intra_8bpp_rvv_i32: 395.4 ( 0.24x) h264_v_loop_filter_chroma_intra_8bpp_c: 134.8 ( 1.00x) h264_v_loop_filter_chroma_intra_8bpp_rvv_i32: 51.6 ( 2.61x) h264_v_loop_filter_luma_intra_8bpp_c: 468.1 ( 1.00x) h264_v_loop_filter_luma_intra_8bpp_rvv_i32: 134.8 ( 3.47x)	3 months ago
sunyuechi	16d4945e9a	lavc/vvc_mc R-V V sad k230 banana_f3 sad_8x16_c: 387.7 ( 1.00x) 394.9 ( 1.00x) sad_8x16_rvv_i32: 109.7 ( 3.53x) 103.5 ( 3.82x) sad_16x8_c: 378.2 ( 1.00x) 384.7 ( 1.00x) sad_16x8_rvv_i32: 82.0 ( 4.61x) 61.7 ( 6.24x) sad_16x16_c: 748.7 ( 1.00x) 759.7 ( 1.00x) sad_16x16_rvv_i32: 128.5 ( 5.83x) 113.7 ( 6.68x)	3 months ago
sunyuechi	b3f7440298	lavc/hevc: R-V V put_pixels(pow2) k230 banana_f3 put_hevc_pel_pixels4_8_c: 61.6 ( 1.00x) 69.5 ( 1.00x) put_hevc_pel_pixels4_8_rvv_i32: 24.6 ( 2.50x) 28.0 ( 2.48x) put_hevc_pel_pixels8_8_c: 209.8 ( 1.00x) 215.5 ( 1.00x) put_hevc_pel_pixels8_8_rvv_i32: 52.6 ( 3.99x) 38.2 ( 5.64x) put_hevc_pel_pixels16_8_c: 839.4 ( 1.00x) 830.0 ( 1.00x) put_hevc_pel_pixels16_8_rvv_i32: 126.6 ( 6.63x) 90.5 ( 9.17x) put_hevc_pel_pixels32_8_c: 3246.6 ( 1.00x) 3246.7 ( 1.00x) put_hevc_pel_pixels32_8_rvv_i32: 311.6 (10.42x) 257.0 (12.63x) put_hevc_pel_pixels64_8_c: 12894.6 ( 1.00x) 12892.7 ( 1.00x) put_hevc_pel_pixels64_8_rvv_i32: 1135.8 (11.35x) 778.0 (16.57x)	3 months ago
sunyuechi	dad062c4f8	lavc/vvc_mc: R-V V put_pixels k230 banana_f3 put_chroma_pixels_8_4x4_c: 63.5 ( 1.00x) 59.2 ( 1.00x) put_chroma_pixels_8_4x4_rvv_i32: 26.5 ( 2.39x) 28.0 ( 2.12x) put_chroma_pixels_8_8x8_c: 211.8 ( 1.00x) 215.5 ( 1.00x) put_chroma_pixels_8_8x8_rvv_i32: 54.3 ( 3.90x) 48.8 ( 4.42x) put_chroma_pixels_8_16x16_c: 841.3 ( 1.00x) 830.0 ( 1.00x) put_chroma_pixels_8_16x16_rvv_i32: 137.5 ( 6.12x) 121.8 ( 6.82x) put_chroma_pixels_8_32x32_c: 3248.8 ( 1.00x) 3288.2 ( 1.00x) put_chroma_pixels_8_32x32_rvv_i32: 350.5 ( 9.27x) 288.5 (11.40x) put_chroma_pixels_8_64x64_c: 12998.3 ( 1.00x) 12976.2 ( 1.00x) put_chroma_pixels_8_64x64_rvv_i32: 1100.5 (11.81x) 924.0 (14.04x) put_chroma_pixels_8_128x128_c: 54284.0 ( 1.00x) 52654.5 ( 1.00x) put_chroma_pixels_8_128x128_rvv_i32: 7192.8 ( 7.55x) 2934.2 (17.94x) put_luma_pixels_8_4x4_c: 63.5 ( 1.00x) 69.5 ( 1.00x) put_luma_pixels_8_4x4_rvv_i32: 26.5 ( 2.39x) 28.0 ( 2.48x) put_luma_pixels_8_8x8_c: 211.5 ( 1.00x) 225.8 ( 1.00x) put_luma_pixels_8_8x8_rvv_i32: 54.3 ( 3.90x) 38.5 ( 5.86x) put_luma_pixels_8_16x16_c: 850.5 ( 1.00x) 830.0 ( 1.00x) put_luma_pixels_8_16x16_rvv_i32: 137.5 ( 6.18x) 100.8 ( 8.24x) put_luma_pixels_8_32x32_c: 3248.8 ( 1.00x) 3257.2 ( 1.00x) put_luma_pixels_8_32x32_rvv_i32: 341.3 ( 9.52x) 246.8 (13.20x) put_luma_pixels_8_64x64_c: 13007.5 ( 1.00x) 13038.8 ( 1.00x) put_luma_pixels_8_64x64_rvv_i32: 1119.0 (11.62x) 684.2 (19.06x) put_luma_pixels_8_128x128_c: 54219.3 ( 1.00x) 52060.8 ( 1.00x) put_luma_pixels_8_128x128_rvv_i32: 6813.5 ( 7.96x) 2548.8 (20.43x)	3 months ago
sunyuechi	9288196c0d	lavc/riscv: Move VVC macro to h26x	3 months ago
sunyuechi	89df9c4404	lavc/vvc_mc: R-V V dmvr k230 banana_f3 dmvr_8_12x20_c: 619.3 ( 1.00x) 624.1 ( 1.00x) dmvr_8_12x20_rvv_i32: 128.6 ( 4.82x) 103.4 ( 6.04x) dmvr_8_20x12_c: 610.0 ( 1.00x) 665.6 ( 1.00x) dmvr_8_20x12_rvv_i32: 137.6 ( 4.44x) 92.9 ( 7.17x) dmvr_8_20x20_c: 1008.0 ( 1.00x) 1082.7 ( 1.00x) dmvr_8_20x20_rvv_i32: 221.1 ( 4.56x) 155.4 ( 6.97x) dmvr_h_8_12x20_c: 2008.0 ( 1.00x) 2009.7 ( 1.00x) dmvr_h_8_12x20_rvv_i32: 239.6 ( 8.38x) 186.7 (10.77x) dmvr_h_8_20x12_c: 1989.5 ( 1.00x) 2009.4 ( 1.00x) dmvr_h_8_20x12_rvv_i32: 230.3 ( 8.64x) 155.4 (12.93x) dmvr_h_8_20x20_c: 3304.1 ( 1.00x) 3342.9 ( 1.00x) dmvr_h_8_20x20_rvv_i32: 378.3 ( 8.73x) 248.9 (13.43x) dmvr_hv_8_12x20_c: 3609.8 ( 1.00x) 3603.4 ( 1.00x) dmvr_hv_8_12x20_rvv_i32: 369.1 ( 9.78x) 322.1 (11.19x) dmvr_hv_8_20x12_c: 3628.3 ( 1.00x) 3624.2 ( 1.00x) dmvr_hv_8_20x12_rvv_i32: 322.8 (11.24x) 238.7 (15.19x) dmvr_hv_8_20x20_c: 5933.8 ( 1.00x) 5936.6 ( 1.00x) dmvr_hv_8_20x20_rvv_i32: 526.5 (11.27x) 374.1 (15.87x) dmvr_v_8_12x20_c: 2156.3 ( 1.00x) 2155.4 ( 1.00x) dmvr_v_8_12x20_rvv_i32: 239.6 ( 9.00x) 176.2 (12.24x) dmvr_v_8_20x12_c: 2137.6 ( 1.00x) 2165.9 ( 1.00x) dmvr_v_8_20x12_rvv_i32: 230.3 ( 9.28x) 155.2 (13.96x) dmvr_v_8_20x20_c: 4183.8 ( 1.00x) 3592.9 ( 1.00x) dmvr_v_8_20x20_rvv_i32: 369.3 (11.33x) 249.2 (14.42x)	3 months ago
sunyuechi	b86766d610	Update R-V V vvc_mc vset to support more lengths	3 months ago
Tong Wu	715a35dadb	d3d12va_encode_hevc: use base to init VPS/SPS/PPS This commit uses hw_base_encode_h265 to generate the VPS/SPS/PPS. Signed-off-by: Tong Wu <wutong1208@outlook.com>	3 months ago
Niklas Haas	ce457bfccd	swscale/slice: fix init of 32 bpc planes In input.c and output.c and many other places, swscale follows the rule of using 15-bit intermediate if output bpc is <= 8, and 19-bit (inside int32_t) intermediate otherwise. See e.g. the comments on hyScale() on swscale_internal.h. These are also the coefficients that yuv2gbrpf32_full_X_c() is using. In contrast to this, the plane init code in slice.c (function fill_ones) is assuming that we use 35-bit intermediates (inside 64-bit integers) for this case, seemingly added by commit `b4967fc71c` with no further justification. This causes a mismatch whenever the implicitly initialized plane contents leak out to the output, e.g. when converting from grayscale to RGB. Fixes: ticket #10716 Signed-off-by: Niklas Haas <git@haasn.dev> Sponsored-by: Sovereign Tech Fund	3 months ago
Anton Khirnov	d2096679d5	compat/w32pthreads: change pthread_t into pointer to malloced struct pthread_t is currently defined as a struct, which gets placed into caller's memory and filled by pthread_create() (which accepts a pthread_t*). The problem with this approach is that pthread_join() accepts pthread_t itself rather than a pointer to it, so it gets a _copy_ of this structure. This causes non-deterministic failures of pthread_join() to produce the correct return value - depending on whether the thread already finished before pthread_join() is called (and thus the copy contains the correct value), or not (then it contains 0). Change the definition of pthread_t into a pointer to a struct, that gets malloced by pthread_create() and freed by pthread_join(). Fixes random failures of fate-ffmpeg-error-rate-fail on Windows after `433cf391f5`. See also [1] for an alternative approach that does not require dynamic allocation, but relies on an assumption that the pthread_t value remains in a fixed memory location. [1] `23829dd2b2` Reviewed-By: Martin Storsjö <martin@martin.st>	3 months ago
Timo Rothenpieler	17e4746687	avcodec/libx265: add alpha layer encoding support	3 months ago
Timo Rothenpieler	fce0622d0b	avformat/hevc: add support for writing alpha layer	3 months ago
James Almer	fb5e8ea971	avformat/iamf_parse: fix setting duration for the last subblock in a parameter definition When subblock durations are constant, the last block may be smaller and the value needs to be calculated. Signed-off-by: James Almer <jamrial@gmail.com>	3 months ago
James Almer	d38fc25519	avformat/iamf_parse: add checks to parameter definition durations Section 3.6.1 of the IAMF spec states "When constant_subblock_duration is equal to 0, the summation of all subblock_duration in this parameter block SHALL be equal to duration.". Signed-off-by: James Almer <jamrial@gmail.com>	3 months ago
Anton Khirnov	8ad34e97b6	fftools/sync_queue: switch from AVFifo+ObjPool to AVContainerFifo Remove now-unused objpool.	3 months ago
Anton Khirnov	8e0cceffa0	fftools/thread_queue: switch from AVFifo+ObjPool to AVContainerFifo The queue needs to track each frame/packet's stream index, this is achieved by maintaining a parallel AVFifo instance for that purpose. This is simpler than implementing custom AVContainerFifo callbacks.	3 months ago
Anton Khirnov	2ac34d0854	lavc/packet: add API for an AVPacket-based AVContainerFifo	3 months ago
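
Commit 095f8038fa above replaces a midpoint-only chroma blend with a blend that uses the true uvalpha weight, keeping a fast path for uvalpha == 0. The following is a minimal sketch of that idea, not the swscale code itself. The function names are hypothetical, and the 0..4096 weight scale is an assumption inferred from the commit naming uvalpha == 2048 as the midpoint case.

```python
# Hypothetical illustration of the blend described in commit 095f8038fa.
# Assumption: weights live on a 0..4096 scale, so 2048 means "halfway".
UVALPHA_ONE = 4096


def blend_midpoint(a: int, b: int) -> int:
    """Old special case: plain (A+B)/2, only right when uvalpha == 2048."""
    return (a + b) >> 1


def blend_weighted(a: int, b: int, uvalpha: int) -> int:
    """General path: weight the two chroma samples by the true uvalpha."""
    return (a * (UVALPHA_ONE - uvalpha) + b * uvalpha) >> 12


def blend(a: int, b: int, uvalpha: int) -> int:
    """uvalpha == 0 fast path, general weighted path otherwise."""
    if uvalpha == 0:
        return a  # no interpolation needed (the unscaled case)
    return blend_weighted(a, b, uvalpha)
```

With samples 100 and 200, a weight of 1024 yields 125 on the weighted path, whereas the midpoint shortcut would return 150. That mismatch is the kind of error the commit message attributes to non-aligned chroma filters.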

118114 Commits (157cd820adbbfcfd1870a6ce12d072dc0f623e9b)