FFmpeg

Commit Graph

Author	SHA1	Message	Date
Rémi Denis-Courmont	295092b46d	lavc/flacdsp: R-V V LPC32 The entire set of 32 coefficients and corresponding past 32 samples can fit in a single vector (with LMUL=8) exactly, but... since widening doubles the needed vector sizes, we still end up too short with 128-bit vectors. This adds a very simple version for future 256+-bit hardware, and for pred_orders values up to 16, and a bit more involved loop for 128-bit hardware with pred_orders between 17 and 32. With 128-bit hardware, the benchmarks look like this: flac_lpc_32_13_c: 30152.0 flac_lpc_32_13_rvv_i32: 10244.7 flac_lpc_32_16_c: 37314.2 flac_lpc_32_16_rvv_i32: 10126.2 flac_lpc_32_29_c: 61910.0 flac_lpc_32_29_rvv_i32: 14495.2 flac_lpc_32_32_c: 68204.0 flac_lpc_32_32_rvv_i32: 13273.7	1 year ago
Rémi Denis-Courmont	8a984aca59	checkasm/flacdsp: add LPC test	1 year ago
Rémi Denis-Courmont	cd6089dc9c	riscv: fix builds without Zbb support	1 year ago
Jun Zhao	2d4aef8982	lavfi/Makefile: fix vf_cropdetect missed edge_common vf_cropdetect depends on edge_common, it's missing in Makefile. Fix trac issue: http://trac.ffmpeg.org/ticket/10664 Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	1 year ago
Diederik de Haas via ffmpeg-devel	c07ed10b0e	apply spelling fixes Fix spelling issue as reported by Debian's lintian tool: accomodate -> accommodate addtional -> additional auxillary -> auxiliary bellow -> below betweeen -> between Calulate -> Calculate coefficents -> coefficients Defalt -> Default defaul -> default higer -> higher neccesary -> necessary orignal -> original ouput -> output precison -> precision processsing -> processing substract -> subtract Transfered -> Transferred upto -> up to Also add several of them to the 'common typos' check in patcheck. Signed-off-by: Diederik de Haas <didi.debian@cknow.org>	1 year ago
Paul B Mahol	5452cbdc15	avfilter/af_afir: add irnorm and irlink options Deprecate gtype option.	1 year ago
Rémi Denis-Courmont	07c303b708	lavc/flacdsp: R-V V decorrelate_indep 16-bit packed flac_decorrelate_indep2_16_c: 981.7 flac_decorrelate_indep2_16_rvv_i32: 199.2 flac_decorrelate_indep4_16_c: 1749.7 flac_decorrelate_indep4_16_rvv_i32: 401.2 flac_decorrelate_indep6_16_c: 2517.7 flac_decorrelate_indep6_16_rvv_i32: 858.0 flac_decorrelate_indep8_16_c: 3285.7 flac_decorrelate_indep8_16_rvv_i32: 1123.5	1 year ago
Rémi Denis-Courmont	fb0295e5fd	lavc/flacdsp: R-V V decorrelate_indep 32-bit packed flac_decorrelate_indep2_32_c: 981.7 flac_decorrelate_indep2_32_rvv_i32: 183.7 flac_decorrelate_indep4_32_c: 1749.7 flac_decorrelate_indep4_32_rvv_i32: 362.5 flac_decorrelate_indep6_32_c: 2517.7 flac_decorrelate_indep6_32_rvv_i32: 715.2 flac_decorrelate_indep8_32_c: 3285.7 flac_decorrelate_indep8_32_rvv_i32: 909.0	1 year ago
Rémi Denis-Courmont	6183a69c0b	lavc/flacdsp: R-V V decorrelate_ms packed flac_decorrelate_ms_16_c: 585.5 flac_decorrelate_ms_16_rvv_i32: 263.0 flac_decorrelate_ms_32_c: 584.7 flac_decorrelate_ms_32_rvv_i32: 250.0	1 year ago
Rémi Denis-Courmont	636ae0e0bc	lavc/flacdsp: R-V V packed decorrelate_{l,r}s flac_decorrelate_ms_16_c: 457.2 flac_decorrelate_ms_16_rvv_i32: 203.0 flac_decorrelate_ms_32_c: 457.2 flac_decorrelate_ms_32_rvv_i32: 203.5 flac_decorrelate_rs_16_c: 456.2 flac_decorrelate_rs_16_rvv_i32: 207.0 flac_decorrelate_rs_32_c: 456.2 flac_decorrelate_rs_32_rvv_i32: 210.5	1 year ago
Rémi Denis-Courmont	be1675035f	checkasm/flacdsp: fix ls/rs/ms tests decorrelate_ls, _rs and _ms are decorrelate[1], [2] and [3] respectively. The code ended up testing indep ([0]) twice, ms never, and misnaming the other two.	1 year ago
Paul B Mahol	08e97dae20	avfilter/af_adynamicequalizer: add adaptive detection mode	1 year ago
Paul B Mahol	82be1e5c0d	avfilter/af_adynamicequalizer: do gain calculations in log domain	1 year ago
sunyuechi	afb967b81e	af_afir: RISC-V V fcmul_add Segmented loads are slow, so here we use unit-strided load and narrowing shifts. c910: fcmul_add_c: 2179 fcmul_add_rvv_f64: 1652 c908: fcmul_add_c: 4891.2 fcmul_add_rvv_f64: 2399.5 Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>	1 year ago
Rémi Denis-Courmont	d076517056	lavc/llauddsp: R-V V scalarproduct_and_madd_int32 scalarproduct_and_madd_int32_c: 10899.7 scalarproduct_and_madd_int32_rvv_i32: 1749.0	1 year ago
Rémi Denis-Courmont	45d0eb3f70	lavc/llauddsp: R-V V scalarproduct_and_madd_int16 scalarproduct_and_madd_int16_c: 10355.7 scalarproduct_and_madd_int16_rvv_i32: 1480.0	1 year ago
Rémi Denis-Courmont	6720a509a7	checkasm: add lossless audio DSP	1 year ago
James Almer	78f55457c9	x86/flacds: clear the high bits from pred_order in lpc_32 functions Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	1 year ago
Dai, Jianhui J	c9fe9fb863	avcodec/cbs_vp8: Add support for VP8 codec bitstream This commit adds support for VP8 bitstream read methods to the cbs codec. This enables the trace_headers bitstream filter to support VP8, in addition to AV1, H.264, H.265, and VP9. This can be useful for debugging VP8 stream issues. The CBS VP8 implements a simple VP8 boolean decoder using GetBitContext to read the bitstream. Only the read methods `read_unit` and `split_fragment` are implemented. The write methods `write_unit` and `assemble_fragment` return the error code AVERROR_PATCHWELCOME. This is because CBS VP8 write is unlikely to be used by any applications at the moment. The write methods can be added later if there is a real need for them. TESTS: ffmpeg -i fate-suite/vp8/frame_size_change.webm -vcodec copy -bsf:v trace_headers -f null - Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	1 year ago
Dai, Jianhui J	5cb8accd09	avcodec/vp8: Export `vp8_token_update_probs` variable This commit exports the `vp8_token_update_probs` variable to internal library scope to facilitate its reuse within the library. Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	1 year ago
Rémi Denis-Courmont	90a779bed6	lavc/huffyuvdsp: basic R-V V add_hfyu_left_pred_bgr32 Better performance can probably be achieved with a more intricate unrolled loop, but this is a start: add_hfyu_left_pred_bgr32_c: 15084.0 add_hfyu_left_pred_bgr32_rvv_i32: 10280.2 This would actually be cleaner with the RISC-V P extension, but that is not ratified yet (I think?) and usually not supported if V is supported.	1 year ago
Rémi Denis-Courmont	6b708cd783	checkasm/huffyuvdsp: test for add_hfyu_left_pred_bgr32	1 year ago
Cosmin Stejerean via ffmpeg-devel	575efc0406	tools/general_assembly.pl - add options to print names, emails or both Signed-off-by: Cosmin Stejerean <cosmin@cosmin.at> Signed-off-by: Anton Khirnov <anton@khirnov.net>	1 year ago
James Almer	b360c91752	avcodec/codecpar: mention how to allocate coded_side_data Signed-off-by: James Almer <jamrial@gmail.com>	1 year ago
Zhao Zhili	a1a6a328f0	fftools/ffplay: add hwaccel decoding support Add vulkan renderer via libplacebo. Simple usage: $ ffplay -hwaccel vulkan foo.mp4 Use cuda to vulkan map: $ ffplay -hwaccel cuda foo.mp4 Create vulkan instance by libplacebo, and enable debug: $ ffplay -hwaccel vulkan \ -vulkan_params create_by_placebo=1:debug=1 foo.mp4 Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	1 year ago
Anton Khirnov	889a022cce	fftools/ffmpeg: rework keeping track of file duration for -stream_loop Current code tracks min/max pts for each stream separately; then when the file ends it combines them with last frame's duration to compute the total duration of each stream; finally it selects the longest of those durations as the file duration. This is incorrect - the total file duration is the largest timestamp difference between any frames, regardless of the stream. Also change the way the last frame information is reported from decoders to the muxer - previously it would be just the last frame's duration, now the end timestamp is sent, which is simpler. Changes the result of the fate-ffmpeg-streamloop-transcode-av test, where the timestamps are shifted slightly forward. Note that the matroska demuxer does not return the first audio packet after seeking (due to buggy interaction between the generic code and the demuxer), so there is a gap in audio.	1 year ago
Anton Khirnov	87016e031f	fftools/thread_queue: count receive-finished streams as finished This ensures that tq_receive() will always return EOF after all streams were receive-finished, even though the sending side might not have closed them yet. This may allow the receiver to avoid manually tracking which streams it has already closed.	1 year ago
Anton Khirnov	4f7b91a698	fftools/thread_queue: do not return elements for receive-finished streams It does not cause any issues in current callers, but still should not happen.	1 year ago
Anton Khirnov	7c97a0c63f	fftools/ffmpeg: move a few inline functions into a new header This will allow using them in future commits without including the whole ffmpeg.h.	1 year ago
Anton Khirnov	6dbde68cb5	lavc/8bps: fix exporting palette after `63767b79a5` It would be left empty on each frame whose packet does not come with palette attached.	1 year ago
Paul B Mahol	7282137f48	lavfi/af_amix: make sure the output does not depend on input ordering Signed-off-by: Anton Khirnov <anton@khirnov.net>	1 year ago
Anton Khirnov	de85815bfa	lavf/mux: do not apply max_interleave_delta to subtitles It is common for subtitle streams to have large gaps between packets. When the caller is interleaving packets from multiple files, it can easily happen that two successive subtitle packets trigger this limit, even though no excessive buffering is happening. Should fix #7064	1 year ago
Anton Khirnov	436b972fc8	doc/ffmpeg: expand -bsf documentation Explain how to pass options to filters.	1 year ago
Anton Khirnov	a8d9d6b08d	tests/fate: replace deprecated -vsync with -fps_mode	1 year ago
Anton Khirnov	23de85d1ec	tests/fate/ffmpeg: replace deprecated -vbsf with -bsf:v	1 year ago
Rémi Denis-Courmont	ce467421dc	lavc/exrdsp: unroll predictor With explicit unrolling, we can skip half of the sign bit flips, and the compiler is then better able to optimise the scalar loop: predictor_c: 31376.0 (before) predictor_c: 23703.0 (after)	1 year ago
Rémi Denis-Courmont	c536e92207	lavc/sbrdsp: R-V V hf_apply_noise functions This is restricted to 128-bit vectors as larger vector sizes could read past the end of the noise array. Support for future hardware with larger vector sizes is left for some other time. hf_apply_noise_0_c: 2319.7 hf_apply_noise_0_rvv_f32: 1229.0 hf_apply_noise_1_c: 2539.0 hf_apply_noise_1_rvv_f32: 1244.7 hf_apply_noise_2_c: 2319.7 hf_apply_noise_2_rvv_f32: 1232.7 hf_apply_noise_3_c: 2541.2 hf_apply_noise_3_rvv_f32: 1244.2	1 year ago
Rémi Denis-Courmont	20e6195c54	checkasm: test the noise case of sbrdsp.hf_apply_noise The tested functions treat s_m[i] == 0 as a special case. Other than that, the functions are slightly complicated vector additions. This actually makes the zero case happen pseudorandomly.	1 year ago
Rémi Denis-Courmont	6d60cc7baf	sws/rgb2rgb: fix unaligned accesses in R-V V YUYV to I422p In my personal opinion, we should not need to support unaligned YUY2 pixel maps. They should always be aligned to at least 32 bits, and the current code assumes just 16 bits. However checkasm does test for unaligned input bitmaps. QEMU accepts it, but real hardware does not. In this particular case, we can at the same time improve performance and handle unaligned inputs, so do just that. uyvytoyuv422_c: 104379.0 uyvytoyuv422_c: 104060.0 uyvytoyuv422_rvv_i32: 25284.0 (before) uyvytoyuv422_rvv_i32: 19303.2 (after)	1 year ago
Rémi Denis-Courmont	5b8b5ec9c5	sws/rgb2rgb: rework R-V V YUY2 to 4:2:2 planar This saves three scratch registers and three instructions per line. The performance gains are mostly negligible. The main point is to free up registers for further rework.	1 year ago
Rémi Denis-Courmont	5b33104fca	lavc/sbrdsp: R-V V hf_gen hf_gen_c: 2922.7 hf_gen_rvv_f32: 731.5	1 year ago
Gyan Doshi	67a2571a55	avcodec/libsvtav1: add version guard for external param Setting of external param 'force_key_frames' was added in `7bcc1b4eb8`. It is available since v1.1.0 but ffmpeg allows linking against v0.9.0.	1 year ago
Paul B Mahol	84e400ae37	avfilter/buffersrc: switch to activate Fixes OOM when caller keeps adding frames into filtergraph that reached EOF by other means, for example EOF is signalled by other filter in filtergraph or by buffersink.	1 year ago
Evgeny Pavlov	da3ce21f68	libavcodec/amfenc: Fix issue with missing headers in AV1 encoder This commit fixes issue with missing SPS/PPS headers in video encoded by AMF AV1 encoder. Missing headers leads to broken seek in MPV video player. Default value for property AV1_HEADER_INSERTION_MODE shouldn't be setup to NONE (no headers insertion). We need to skip definition of this property, because default value depends on USAGE property. Signed-off-by: Dmitrii Ovchinnikov <ovchinnikov.dmitrii@gmail.com>	1 year ago
Rémi Denis-Courmont	427347309b	checkasm: test with random bw value With a value of zero, the function is a glorified memory copy.	1 year ago
Sebastian Ramacher	250471ea17	avcoded/fft: Fix memory leak if ctx2 is used Signed-off-by: James Almer <jamrial@gmail.com>	1 year ago
Sebastian Ramacher	a562cfee2e	avcodec/fft: Use av_mallocz to avoid invalid free/uninit Signed-off-by: James Almer <jamrial@gmail.com>	1 year ago
Rémi Denis-Courmont	cd7b352c53	lavc/sbrdsp: R-V V autocorrelate With 5 accumulator vectors and 6 inputs, this can only use LMUL=2. Also the number of vector loop iterations is small, just 5 on 128-bit vector hardware. The vector loop is somewhat unusual in that it processes data in descending memory order, in order to save on vector slides: in descending order, we can extract elements to carry over to the next iteration from the bottom of the vectors directly. With ascending order (see in the Opus postfilter function), there is no way to get the top elements directly. On the downside, this requires the use of separate shift and sub (the would-be SH3SUB instruction does not exist), with a small pipeline stall on the vector load address. The edge cases are done in scalar as this saves on loads and remains significantly faster than C. autocorrelate_c: 669.2 autocorrelate_rvv_f32: 421.0	1 year ago
Rémi Denis-Courmont	f576a0835b	lavc/aacpsdsp: rework R-V V hybrid_synthesis_deint Given the size of the data set, strided memory accesses cannot be avoided. We can still do better than the current code. ps_hybrid_synthesis_deint_c: 12065.5 ps_hybrid_synthesis_deint_rvv_i32: 13650.2 (before) ps_hybrid_synthesis_deint_rvv_i64: 8181.0 (after)	1 year ago
Rémi Denis-Courmont	eb508702a8	lavc/aacpsdsp: rework R-V V add_squares Segmented loads may be slower than not. So this advantageously uses a unit-strided load and narrowing shifts instead. Before: ps_add_squares_c: 60757.7 ps_add_squares_rvv_f32: 22242.5 After: ps_add_squares_c: 60516.0 ps_add_squares_rvv_i64: 17067.7	1 year ago

112989 Commits (aa1e7681203694c6e2b38e2a627ff90eb3524d37)