FFmpeg

Commit Graph

Author	SHA1	Message	Date
Claudio Freire	ca203e9985	AAC encoder: improve SF range utilization This patch does 4 things, all of which interact and thus it wouldn't be possible to commit them separately without causing either quality regressions or assertion failures. Fate comparison targets don't all reflect improvements in quality, yet listening tests show substantially improved quality and stability. 1. Increase SF range utilization. The spec requires SF delta values to be constrained within the range -60..60. The previous code was applying that range to the whole SF array and not only the deltas of consecutive values, because doing so requires smarter code: zeroing or otherwise skipping a band may invalidate lots of SF choices. This patch implements that logic to allow the coders to utilize the full dynamic range of scalefactors, increasing quality quite considerably, and fixing delta-SF-related assertion failures, since now the limitation is enforced rather than asserted. 2. PNS tweaks The previous modification makes big improvements in twoloop's efficiency, and every time that happens PNS logic needs to be tweaked accordingly to keep it from stepping all over twoloop's decisions. This patch includes modifications of the sort. 3. Account for lowpass cutoff during PSY analysis The closer PSY's allocation is to final allocation the better the quality is, and given these modifications, twoloop is now very efficient at avoiding holes. Thus, to compute accurate thresholds, PSY needs to account for the lowpass applied implicitly during twoloop (by zeroing high bands). This patch makes twoloop set the cutoff in psymodel's context the first time it runs, and makes PSY account for it during threshold computation, making PE and threshold computations closer to the final allocation and thus achieving better subjective quality. 4. Tweaks to RC lambda tracking loop in relation to PNS Without this tweak some corner cases cause quality regressions. Basically, lambda needs to react faster to overall bitrate efficiency changes since now PNS can be quite successful in enforcing maximum bitrates, when PSY allocates too many bits to the lower bands, suppressing the signals RC logic uses to lower lambda in those cases and causing aggressive PNS. This tweak makes PNS much less aggressive, though it can still use some further tweaks. Also update MIPS specializations and adjust fuzz Also in lavc/mips/aacpsy_mips.h: remove trailing whitespace	9 years ago
Claudio Freire	88e498a87e	AAC encoder: make pe.min a local minimum As noted in a comment, pe.min in the reference encoder is centered around current pe. The bit reservoir algo needs pe.min to be a local minimum, because it can only account for local PE variations. If it's set to a global minimum as was being done, bit reservoir logic doesn't work as efficiently. This patch tries to forget old minimums and converge to a local minimum without losing the stability of the previous solution. Listening tests until now suggest this solves numerous RC issues.	9 years ago
Claudio Freire	323d37521d	AAC encoder: cosmetics from last commit Reindent	9 years ago
Claudio Freire	01ecb7172b	AAC encoder: Extensive improvements This finalizes merging of the work in the patches in ticket #2686. Improvements to twoloop and RC logic are extensive. The non-exhaustive list of twoloop improvements includes: - Tweaks to distortion limits on the RD optimization phase of twoloop - Deeper search in twoloop - PNS information marking to let twoloop decide when to use it (turned out having the decision made separately wasn't working) - Tonal band detection and prioritization - Better band energy conservation rules - Strict hole avoidance For rate control: - Use psymodel's bit allocation to allow proper use of the bit reservoir. Don't work against the bit reservoir by moving lambda in the opposite direction when psymodel decides to allocate more/less bits to a frame. - Retry the encode if the effective rate lies outside a reasonable margin of psymodel's allocation or the selected ABR. - Log average lambda at the end. Useful info for everyone, but especially for tuning of the various encoder constants that relate to lambda feedback. Psy: - Do not apply lowpass with a FIR filter, instead just let the coder zero bands above the cutoff. The FIR filter induces group delay, and while zeroing bands causes ripple, it's lost in the quantization noise. - Experimental VBR bit allocation code - Tweak automatic lowpass filter threshold to maximize audio bandwidth at all bitrates while still providing acceptable, stable quality. I/S: - Phase decision fixes. Unrelated to #2686, but the bugs only surfaced when the merge was finalized. Measure I/S band energy accounting for phase, and prevent I/S and M/S from being applied both. PNS: - Avoid marking short bands with PNS when they're part of a window group in which there's a large variation of energy from one window to the next. PNS can't preserve those and the effect is extremely noticeable. M/S: - Implement BMLD protection similar to that specified in ISO-IEC/13818:7-2003, Appendix C Section 6.1. Since M/S decision doesn't conform to section 6.1, a different method had to be implemented, but should provide equivalent protection. - Move the decision logic closer to the method specified in ISO-IEC/13818:7-2003, Appendix C Section 6.1. Specifically, make sure M/S needs less bits than dual stereo. - Don't apply M/S in bands that are using I/S Now, this of course needed adjustments in the compare targets and fuzz factors of the AAC encoder's fate tests, but if wondering why the targets go up (more distortion), consider the previous coder was using too many bits on LF content (far more than required by psy), and thus those signals will now be more distorted, not less. The extra distortion isn't audible though, I carried out extensive ABX testing to make sure. A very similar patch was also extensively tested by Kamendo2 in the context of #2686.	9 years ago
Claudio Freire	7ec74ae4aa	AAC encoder: tweak rate-distortion logic This patch modifies the encode frame function to retry encoding the frame when the resulting bit count is too far off target, but only adjusting lambda in small, incremental steps. It also makes the logic more conservative - otherwise it will contend with bit reservoir-related variations in bit allocation, and result in artifacts when frames have to be truncated (usually at high bit rates transitioning from low complexity to high complexity).	9 years ago
Vittorio Giovara	7c6eb0a1b7	lavc: AV-prefix all codec flags Convert doxygen to multiline and express bitfields more simply. Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	9 years ago
Claudio Freire	59216e0525	AAC Encoder: clipping avoidance Prevent clipping caused by quantization noise from producing audible artifacts, by detecting near-clipping signals and both attenuating them a little and encoding escape-encoded bands (usually the loudest) rounding towards zero instead of nearest, which tends to decrease overall energy and thus clipping. Currently fate tests measure numerical error so this change makes tests using asynth (which are near clipping) report higher error not less, because of window attenuation. Yet, they sound better, not worse (albeit subtly; on other samples the difference isn't subtle at all). Only measuring psychoacoustically weighted error would make for a representative test, so that will be left for a future patch. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Rostislav Pehlivanov	8e607c747e	aacpsy: use a different metric for the spread of a band This commit modifies `02dbed6` to use band->active_lines to better gauge how much information is contained within a single band and thus allow the perceptual noise substitution to more accurately determine which bands to code as noise. The spread[w+g] used before this patch behaved more like a low-pass filter for PNS band_types, which could mistakenly mark some low frequency bands as noise. Reviewed-by: Claudio Freire <klaussfreire@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Rostislav Pehlivanov	02dbed6e71	aacpsy: Add energy spread for each band This commit adds the energy spread to the struct for each band and removes 2 unused fields. distortion and perceptual_weight were not referenced in any file nor were they set to any value, so it was safe to remove them. The energy spread is currently only used in the aac psy model. It's defined as being proportional to the tonality of each band. Reviewed-by: Claudio Freire <klaussfreire@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Michael Niedermayer	e7a65142b9	avcodec/aacpsy: Clear the correct pointer Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Vittorio Giovara	074a1b3732	aacpsy: Check memory allocation	10 years ago
Andreas Cadhalpun	110f7f35fb	aacpsy: correct calculation of minath in psy_3gpp_init The minimum of the ath(x, ATH_ADD) function depends on ATH_ADD. This patch uses the first order approximation to determine it. For ATH_ADD = 4 this results in the value at 3407.06812 (-5.24241638) not the one at 3410 (-5.24237967). CC: libav-stabl@libav.org Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	10 years ago
Andreas Cadhalpun	ca9849eecd	aacpsy: correct calculation of minath in psy_3gpp_init The minimum of the ath(x, ATH_ADD) function depends on ATH_ADD. This patch uses the first order approximation to determine it. For ATH_ADD = 4 this results in the value at 3407.06812 (-5.24241638) not the one at 3410 (-5.24237967). Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Approved-by: Claudio Freire <klaussfreire@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Andreas Cadhalpun	e224aa4191	aacpsy: avoid psy_band->threshold becoming NaN If band->thr is 0.0f, the division is undefined, making norm_fac not a number or infinity, which causes psy_band->threshold to become NaN. This is passed on to other variables until it finally reaches sce->sf_idx and is converted to an integer (-2147483648). This causes a segmentation fault when it is used as array index. Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Reviewed-by: Claudio Freire <klaussfreire@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Claudio Freire	84f4be424d	avcodec/aacpsy: Fix AAC Psy PE reduction calculation when multiple iterations are required This is a small change, but it does have a big impact on bit allocation. All the regressions marked in the report have no audible difference (I didn't check them all though), but the improvements can be heard. This affects mostly high bit rates. It's related to issue #2686. In the report, A is the patched version, B is unpatched, all comparisons show deltas in the form (A-B), so a positive pSNR delta means a better quality in the patched version, and negative a regression. Regressions are only considered for pSNR deltas below -1 dB, they're considered serious below -6 dB. All measurements were done with tiny_psnr. The summary of the report inline for quick reading: Files: 58 Bitrates: 6 Tests: 347 Serious Regressions: 0 (0%) Regressions: 10 (2%) Improvements: 54 (15%) Big improvements: 26 (7%) Worst regression - sine_tester.flac - 384k - StdDev: 1.68 pSNR: -3.05 maxdiff: -178.00 Best improvement - 07 - Bound.flac - 384k - StdDev: -1700.05 pSNR: 20.64 maxdiff: -29595.00 Average - StdDev: -55.67 pSNR: 1.20 maxdiff: -1593.00 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Diego Biurrun	7f9f771eac	avcodec: Don't anonymously typedef structs	10 years ago
Michael Niedermayer	ee5145c05d	avcodec/aacpsy: Use av_mallocz_array() Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Claudio Freire	7c71ada4ca	aacenc: Fix a rounding bug in aacpsy channel bitrate computation Signed-off-by: Martin Storsjö <martin@martin.st>	12 years ago
Claudio Freire	c545876d1b	AAC encoder: Fixed a rounding bug in psy's channel bitrate computation. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Diego Biurrun	6fee1b90ce	avcodec: Add av_cold attributes to init functions missing them	12 years ago
Diego Biurrun	a5f8873620	silly typo fixes	12 years ago
Bojan Zivkovic	e54eb8db9c	mips: Optimization of AAC psychoacoustic model functions Signed-off-by: Bojan Zivkovic <bojan@mips.com> Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Bojan Zivkovic	1f5b5b8062	libavcodec: changed mathematical functions in aacpsy.c This patch changes existing mathematical functions with faster ones. Speeds up encoding more than 10%. Tested on x86 and MIPS platforms. Signed-off-by: Bojan Zivkovic <bojan@mips.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Diego Biurrun	511cf612ac	miscellaneous typo fixes	12 years ago
Michael Niedermayer	570931d411	aacpsy: psy_3gpp_analyze_channel() handle energy == 0 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Michael Niedermayer	413b32f808	aacpsy: calc_reduction_3gpp() handle active_lines = 0 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Michael Niedermayer	4819d43d7f	aacpsy: use exp2(f) instead of pow(f)(2,...) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Michael Niedermayer	72dabdfc58	aacenc: new default cutoff Improves subjective quality Formula and testing by: kamedo2 <fujisakihir90@yahoo.co.jp> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Michael Niedermayer	21e5dd93d7	aacpsy: fix "may be used uninitialized" warning Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Diego Biurrun	a92be9b856	Replace memset(0) by zero initializations. Also remove one pointless zero initialization in rangecoder.c.	13 years ago
Nathan Caldwell	9b8e2a8709	aacenc: Deinterleave input samples before processing. Signed-off-by: Alex Converse <alex.converse@gmail.com>	13 years ago
Nathan Caldwell	025ccf1f8b	aacenc: Request normalized float samples instead of converting s16 samples to float. Signed-off-by: Alex Converse <alex.converse@gmail.com>	13 years ago
Nathan Caldwell	6381f913d1	aacpsy: Replace an if with FFMAX in LAME windowing. Signed-off-by: Alex Converse <alex.converse@gmail.com>	13 years ago
Nathan Caldwell	843cd4a3ed	aacpsy: cosmetics, change a FIXME to a NOTE about subshort comparisons Also fix a typo. Signed-off-by: Alex Converse <alex.converse@gmail.com>	13 years ago
Diego Biurrun	58c42af722	doxygen: misc consistency, spelling and wording fixes	13 years ago
Nathan Caldwell	d3a6c2ab7e	psymodel: Remove the single channel analysis function	14 years ago
Nathan Caldwell	01344fe409	aacenc: Implement dummy channel group analysis that just calls the single channel analysis for each channel.	14 years ago
Nathan Caldwell	0bc01cc9fe	psymodel: Add channels and channel groups to the psymodel.	14 years ago
Diego Biurrun	3a0d0ff5e6	aacenc: Mark psy_3gpp_window() as av_unused. It is intentionally left in to allow adding 3GPP-style windowing in the future. Marking it av_unused silences an annoying unused function warning.	14 years ago
Nathan Caldwell	f50d937725	aacenc: Fix whitespace after last commit. Signed-off-by: Martin Storsjö <martin@martin.st>	14 years ago
Nathan Caldwell	230c1a9075	aacenc: Finish 3GPP psymodel analysis for non mid/side cases. There are still a few sections missing relating to TNS (not present) and mid/side (contains other bugs). Overall this improves quality, and vastly improves rate-control. Signed-off-by: Martin Storsjö <martin@martin.st>	14 years ago
Mans Rullgard	2912e87a6c	Replace FFmpeg with Libav in licence headers Signed-off-by: Mans Rullgard <mans@mansr.com>	14 years ago
Nathan Caldwell	350785a662	aacenc: 10l, missed a reference when refactoring the psymodel.	14 years ago
Nathan Caldwell	4afedfd8e5	aacenc: cosmetics, indentation, and comment clarification Correct bad indentation in aaccoder Clarify and correct comments in 3GPP psymodel, other cosmetics.	14 years ago
Nathan Caldwell	b7c96769c5	aacenc: Refactor the parts of the AAC psymodel. 3GPP: Remove ffac from and move min_snr out of AacPsyBand. Rearrange AacPsyCoeffs to make it easier to implement energy spreading. Rename the band[] array to bands[] Copy energies and thresholds at the end of analysis. LAME: Use a loop instead of an if chain in LAME windowing.	14 years ago
Nathan Caldwell	d56920e206	aacenc: Correct spreading calculation for high spreading. The 3GPP spec uses the following calculation for high spreading: thr'_spr = max(thr_scaled, s_h(n) * thr_scaled(n-1)) where, n is defined as the current band, and s_h() is defined as "[...] the distance of adjacent bands in Bark and a constant slope that is 15 dB/Bark [...]". This is a little ambiguous as you would assume you want the Bark width of the previous band for this calculation. However, this assumption appears to be incorrect, and you really want the Bark width of the current band. Coincidentally this is exactly what the spec calls for! =P This noticeably improves Tom's Diner at low bitrates (I tested at 64kbps, with mid/side disabled). Patch by: Nathan Caldwell <saintdev@gmail.com> Originally committed as revision 25622 to svn://svn.ffmpeg.org/ffmpeg/trunk	14 years ago
Nathan Caldwell	3ea12f65ba	aacenc: cosmetics: Swap spreading_hi/low name to match the 3GPP spec. Patch by: Nathan Caldwell <saintdev@gmail.com> Originally committed as revision 25621 to svn://svn.ffmpeg.org/ffmpeg/trunk	14 years ago
Nathan Caldwell	c8dcb9dee1	aacenc: Remove energy 'normalization' modification from the 3GPP psymodel This greatly improves bitrate handling. You will now get within a few kbps of your requested bitrate instead of 20-40kbps higher. There is absolutely no analog to this line in the 3GPP spec, that I can find. patch by Nathan Caldwell saintdev (at) gmail Originally committed as revision 25589 to svn://svn.ffmpeg.org/ffmpeg/trunk	14 years ago
Nathan Caldwell	4df5aebb81	aacenc: Fix threshold-in-quiet calculation in the 3GPP psymodel. Removing the modification vastly improves quality (at a slight bitrate cost) for some samples. castanets.wav is a good example. The closest equivalent I see to the modification in the 3GPP spec is a similar modification (over a specific frequency range) when TNS is used. This also changes the threshold-in-quiet calculation to match the 3GPP spec. patch by Nathan Caldwell saintdev (at) gmail Originally committed as revision 25588 to svn://svn.ffmpeg.org/ffmpeg/trunk	14 years ago
Nathan Caldwell	eafadadaf5	aacenc: Fix the conditions under which 3GPP pre-echo control is run. According to the 3GPP spec: "Thus the pre-echo control is inactive for the first short window (but not all short windows in a short frame) after a start block and for all frames with a stop window sequence." Currently, pre-echo control is only run when the current frame is not a short frame, and the previous frame is not a short frame. patch by Nathan Caldwell saintdev (at) gmail Originally committed as revision 25587 to svn://svn.ffmpeg.org/ffmpeg/trunk	14 years ago


80 Commits (102842d5fbb7c38a437bc128938466b231fe0ce9)