FFmpeg

Commit Graph

Author	SHA1	Message	Date
Andreas Rheinhardt	790f793844	avutil/common: Don't auto-include mem.h There are lots of files that don't need it: The number of object files that actually need it went down from 2011 to 884 here. Keep it for external users in order to not cause breakages. Also improve the other headers a bit while just at it. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	8 months ago
Michael Niedermayer	907743239d	avutil/tx_template: fix integer overflow in fft3() Fixes: signed integer overflow: -1028966111 + -1314089526 cannot be represented in type 'int' Fixes: 63174/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5853273711837184 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	1 year ago
Michael Niedermayer	c42a89309a	avutil/tx_template: Fix some signed integer overflows in DECL_FFT5() Fixes: signed integer overflow: -1364715454 + -1468954671 cannot be represented in type 'int' Fixes: 62093/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5538774254485504 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	1 year ago
Lynne	d40672e661	lavu/tx: fix scaling of R2R transforms Still slightly inaccurate, but it's good enough now.	1 year ago
Lynne	59b39d241e	lavu/tx: improve rdft table generation precision slightly	1 year ago
Lynne	ef8fd7bc3c	lavu/tx: add DCT-I and DST-I transforms These are true, actual DCT-I and DST-I transforms, unlike the libavcodec versions, which are plainly not.	1 year ago
Lynne	11e22730e1	lavu/tx: add real to real and real to imaginary RDFT transforms These are in-place transforms, required for DCT-I and DST-I. Templated as the mod2 variant requires minor modifications, and is required specifically for DCT-I/DST-I.	1 year ago
Michael Niedermayer	8f48a62151	avutil/tx_template: extend to 2M Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	1 year ago
Michael Niedermayer	9f04055669	avutil/tx_template: Use more unsigned ints to avoid undefined overflows Fixes: signed integer overflow: 574590586 - -1875616554 cannot be represented in type 'int' Fixes: 53914/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5037125846564864 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2 years ago
Lynne	710d83bdde	lavu/tx: zero-out imaginary of last coefficient in forward RDFTs We didn't do this, because it's zero anyway, but it prevents users from using uninitialized memory in calculations.	2 years ago
Michael Niedermayer	7792825ad6	avutil/tx: Use unsigned in ff_tx_fft_sr_combine() to avoid undefined behavior Fixes: signed integer overflow: -1284837070 - 982101618 cannot be represented in type 'int' Fixes: 53105/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AC3_FIXED_fuzzer-4848015827664896 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2 years ago
Lynne	a56d7e0ca3	lavu/tx: add DCT-III implementation	2 years ago
Lynne	504b7bec1a	lavu/tx: add DCT-II implementation	2 years ago
Lynne	43d285a40f	lavu/tx: fix last coefficient scaling for R2C transforms This was a typo.	2 years ago
Lynne	8547123f3b	lavu/tx: generalize PFA FFTs This commit permits any stacking of FFTs of any size.	2 years ago
Lynne	87bae6b018	lavu/tx: refactor to explicitly track and convert lookup table order Necessary for generalizing PFAs.	2 years ago
Lynne	6ddd10c3e2	lavu/tx: allow codelets to specify a minimum number of matching factors	2 years ago
Lynne	fbe4fd992f	lavu/tx: support output stride in naive transforms Allows them to be used in general PFAs.	2 years ago
Lynne	68cabf8750	lavu/tx: add fft_inplace_small transforms This is much faster than the loop.	2 years ago
Lynne	fff3e1d848	lavu/tx: support out-of-place transforms in fft_inplace This makes testing easier, as a unified path can be used for in/out of place transforms.	2 years ago
Lynne	d260796f11	lavu/tx: make C ptwo transforms in+out of place We assume that _all_ in-place transforms can operate out of place, which isn't true, because the C ptwo transforms were always in-place (dst).	2 years ago
Lynne	37008dc402	lavu/tx: add naive_small FFT The same as naive but with precomputed tables. Makes it more useful for odd-factors we don't support yet.	2 years ago
Lynne	e8a9b7b298	lavu/tx: list all odd-length FFT factors as regular codelets Allows them to be picked just like any other transform.	2 years ago
Lynne	45bd4bf79f	lavu/tx: generalize single-factor transforms Not that useful, but it gives us fast small odd-length transforms.	2 years ago
Lynne	79f11e2409	lavu/tx: make prime factor transforms truly in-place They all overwrote in[0] and then used it as a DC.	2 years ago
Andreas Rheinhardt	f8efd890bf	avutil/tx_template: Move function pointers to const memory This can be achieved by moving the AVOnce out of the structure containing the function pointers; the latter can then be made const. This also has the advantage of eliminating padding in the structure (sizeof(AVOnce) is four here) and allowing the AVOnces to be put into .bss (depending upon the implementation). Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Andreas Rheinhardt	188216581b	avutil/tx_template: Avoid code duplication Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Andreas Rheinhardt	2af5f55b2e	avutil/tx_template: Don't waste space for inexistent factors It is possible to avoid the factors array for the power-of-two tables for which said array is unused by using a different structure for initialization for power-of-two tables than for non-power-of-two-tables. This saves 31516B from .data. Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Lynne	ace42cf581	x86/tx_float: add 15xN PFA FFT AVX SIMD ~4x faster than the C version. The shuffles in the 15pt dim1 are seriously expensive. Not happy with it, but I'm content. Can be easily converted to pure AVX by removing all vpermpd/vpermps instructions.	2 years ago
Lynne	7e7baf8ab8	lavu/tx: do not steal lookup tables of subcontexts in the iMDCT As it happens, some still need their contexts.	2 years ago
Lynne	f1b35fc8f0	lavu/tx: remove av_cold from table definitions How did this get here?	2 years ago
Lynne	c92edd969a	lavu/tx: rotate 3 & 15-point exptabs This just inverts their signs. Simplifies SIMD.	2 years ago
Lynne	51172223fd	lavu/tx: generalize MDCTs The same code can perform any-length MDCTs with minimal changes.	2 years ago
Lynne	645a1f4422	lavu/tx: add the inplace flag to PFA FFTs They support in-place, because they have to use a temporary buffer.	2 years ago
Lynne	ae66a9db7b	lavu/tx: optimize and simplify inverse MDCTs Convert the input from a scatter to a gather instead, which is faster and better for SIMD. Also, add a pre-shuffled exptab version to avoid gathering there at all. This doubles the exptab size, but the speedup makes it worth it. In SIMD, the exptab will likely be purged to a higher cache anyway because of the FFT in the middle, and the amount of loads stays identical. For a 960-point inverse MDCT, the speedup is 10%. This makes it possible to write sane and fast SIMD versions of inverse MDCTs.	2 years ago
Lynne	af94ab7c7c	lavu/tx: add an RDFT implementation RDFTs are full of conventions that vary between implementations. What I've gone for here is what's most common between both fftw, avcodec's rdft and what we use, the equivalent of which is DFT_R2C for forward and IDFT_C2R for inverse. The other 2 conventions (IDFT_R2C and DFT_C2R) were not used at all in our code, and their names are also not appropriate. If there's a use for either, we can easily add a flag which would just flip the sign on one exptab. For some unknown reason, possibly to allow reusing FFT's exp tables, av_rdft's C2R output is 0.5x lower than what it should be to ensure a proper back-and-forth conversion. This code outputs its real samples at the correct level, which matches FFTW's level, and allows the user to change the level and insert arbitrary multiplies for free by setting the scale option.	3 years ago
Lynne	ef4bd81615	lavu/tx: rewrite internal code as a tree-based codelet constructor This commit rewrites the internal transform code into a constructor that stitches transforms (codelets). This allows for transforms to reuse arbitrary parts of other transforms, and allows transforms to be stacked onto one another (such as a full iMDCT using a half-iMDCT which in turn uses an FFT). It also permits for each step to be individually replaced by assembly or a custom implementation (such as an ASIC).	3 years ago
Lynne	1978b143eb	checkasm: add av_tx FFT SIMD testing code This sadly required making changes to the code itself, due to the same context needing to be reused for both versions. The lookup table had to be duplicated for both versions.	4 years ago
Lynne	0072a42388	lavu/tx: add full-sized iMDCT transform flag	4 years ago
Lynne	8c55c82583	lavu/tx: add a 9-point FFT and (i)MDCT	4 years ago
Lynne	bd9ea917a3	lavu/tx: add a 7-point FFT and (i)MDCT	4 years ago
Lynne	89da62f2fc	lavu/tx: refactor power-of-two FFT This commit refactors the power-of-two FFT, making it faster and halving the size of all tables, making the code much smaller on all systems. This removes the big/small pass split, because on modern systems the "big" pass is always faster, and even on older machines there is no measurable speed difference.	4 years ago
Lynne	e20a39a375	lavu/tx: do not invert permutes on MDCTs	4 years ago
Lynne	8e94b7cff0	lavu/tx: invert permutation lookups out[lut[i]] = in[i] lookups were 4.04 times(!) slower than out[i] = in[lut[i]] lookups for an out-of-place FFT of length 4096. The permutes remain unchanged for anything but out-of-place monolithic FFT, as those benefit quite a lot from the current order (it means there's only 1 lookup necessary to add to an offset, rather than a full gather). The code was based around non-power-of-two FFTs, so this wasn't benchmarked early on.	4 years ago
Lynne	10341743d2	lavu/tx: require output argument to match input for inplace transforms This simplifies some assembly code by a lot, by either saving a branch or saving an entire duplicated function.	4 years ago
Lynne	5ca40d6d94	lavu/tx: support in-place FFT transforms This commit adds support for in-place FFT transforms. Since our internal transforms were all in-place anyway, this only changes the permutation on the input. Unfortunately, research papers were of no help here. All focused on dry hardware implementations, where permutes are free, or on software implementations where binary bloat is of no concern, so storing a dozen times the transforms for each permutation and version is not considered bad practice. Still, for a pure C implementation, it's only around 28% slower than the multi-megabyte FFTW3 in unaligned mode. Unlike a closed permutation like with PFA, split-radix FFT bit-reversals contain multiple NOPs, multiple simple swaps, and a few chained swaps, so regular single-loop single-state permute loops were not possible. Instead, we filter out parts of the input indices which are redundant. This allows for a single branch, and with some clever AVX512 asm, could possibly be SIMD'd without refactoring. The inplace_idx array is guaranteed to never be larger than the revtab array, and in practice only requires around log2(len) entries. The power-of-two MDCTs can be done in-place as well. And it's possible to eliminate a copy in the compound MDCTs too, however it'll be slower than doing them out of place, and we'd need to dirty the input array.	4 years ago
James Almer	f6477ac9f4	avutil/tx: use ENOSYS instead of ENOTSUP It's the standard error code used across the codebase to signal unimplemented or unsupported features. Signed-off-by: James Almer <jamrial@gmail.com>	4 years ago
Lynne	06a8596825	lavu: support arbitrary-point FFTs and all even (i)MDCT transforms This patch adds support for arbitrary-point FFTs and all even MDCT transforms. Odd MDCTs are not supported yet as they're based on the DCT-II and DCT-III and they're very niche. With this we can now write tests.	4 years ago
Lynne	2465fe1302	lavu/tx: add 2-point FFT transform By itself, this allows 6-point, 10-point and 30-point transforms. When the 9-point transform is added it allows for 18-point FFT, and also for a 36-point MDCT (used by MP3).	5 years ago
Lynne	e1c84856bb	lavu/tx: improve 3-point fixed precision There's just no reason not to when it's so easy (albeit messy) and it's also reducing the precision of all non-power-of-two transforms that use it.	5 years ago


54 Commits (d163eefd4756111d188c40b3ee4b6cd91e8b9d64)