Prevents int->float conversions on every loop iteration.
Performance gain on synthetic benchmarks: 13%.
Suggested by kamedo2.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Most code between the 2 functions was duplicated which made keeping
both in sync difficult.
This also fixes some discovered issues with encoding (incorrect
TF switching buffers) and reduces stack usage (reuse the already
allocated CeltFrame->scratch buffer for the quantized coefficients).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since the PVQ search has been well fuzzed and is guaranteed to never
break SUM(abs(y[])) == K, the assert is no longer needed.
Also, the assert only prevented coding the wrong vector index but didn't
prevent crashes while searching for it, which made the assert
informational rather than practical.
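For illustration, the invariant mentioned above can be written out as a
small check. This is a minimal sketch and not the FFmpeg code:
pvq_sum_matches is a hypothetical helper name, and y[], N and K are
assumed to be the pulse vector, its length and the pulse count.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of FFmpeg): returns 1 when the pulse
 * vector y[] of length N satisfies SUM(abs(y[])) == K, 0 otherwise. */
static int pvq_sum_matches(const int *y, int N, int K)
{
    int sum = 0;
    for (int i = 0; i < N; i++)
        sum += abs(y[i]);
    return sum == K;
}
```

A check like this only reports a broken vector after the search has
finished, so it cannot guard against a crash inside the search itself.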
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since the problem mentioned only happened when the phase was negative
(i.e. the sum had to be decreased), discarding dimensions with a zero
pulse only in that case restores the search's previously low distortion
at low Ks when the phase is never negative.
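A rough sketch of the idea, under loud assumptions: this is not the
FFmpeg search. pvq_adjust_one and absf are hypothetical names, and the
candidate is picked by the magnitude of X[] rather than by the real
distortion metric, purely to keep the example short. The only part
taken from the description above is that positions with no pulse are
skipped when the phase is negative.

```c
/* Hypothetical helper: absolute value of a float. */
static float absf(float v)
{
    return v < 0.0f ? -v : v;
}

/* Illustrative sketch only. Moves SUM(abs(y[])) one step: phase > 0
 * adds a pulse, phase < 0 removes one. Returns the index that was
 * changed, or -1 if no position is viable. */
static int pvq_adjust_one(const float *X, int *y, int N, int phase)
{
    int best = -1;

    for (int i = 0; i < N; i++) {
        /* A position without a pulse cannot be decremented: touching
         * it would raise the sum instead of lowering it. */
        if (phase < 0 && y[i] == 0)
            continue;
        /* Simplified choice: largest |X| when adding a pulse,
         * smallest |X| when removing one. */
        if (best < 0 ||
            (phase > 0 ? absf(X[i]) > absf(X[best])
                       : absf(X[i]) < absf(X[best])))
            best = i;
    }
    if (best < 0)
        return -1;

    /* Follow the sign of the existing pulse, or of X[] if there is none. */
    int sign = y[best] ? (y[best] > 0 ? 1 : -1) : (X[best] < 0 ? -1 : 1);
    y[best] += phase > 0 ? sign : -sign;
    return best;
}
```

With a positive phase every position stays a candidate, which is why
the restriction does not affect the search when the sum only ever has
to be increased.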
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
If the PVQ search picked a place to increment/decrement on the y[]
vector which had no pulse then it would cause a desync since it would
change the sum in the wrong direction. Fix this by not considering
places without pulses as viable.
This makes the PVQ search slightly worse at K < 5 which isn't all that
common. Still, this is a workaround to prevent making broken files until
I can think of a better way of fixing it.
Also add an assertion, which can be removed or moved to assert1/2 once
the PVQ search is stable.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This marks the first time anyone has written an Opus encoder without
using any libopus code. The aim of the encoder is to prove how far
the format can go by writing the craziest encoder for it.
Right now the encoder's basic and only supports CBR encoding. However,
internally every single feature the CELT layer has is implemented
(except the pitch pre-filter which needs to work well with the rest of
whatever gets implemented). Psychoacoustic and rate control systems are
under development.
The encoder takes in frames of 120 samples and depending on the value of
opus_delay the plan is to use the extra buffered frames as lookahead.
Right now the encoder will pick the nearest largest legal frame size and
won't use the lookahead, but that'll change once there's a
psychoacoustic system.
Even though it's a pretty basic encoder it's already outperforming
any other native encoder FFmpeg has by a huge amount.
The PVQ search algorithm is faster and more accurate than libopus's
algorithm so the encoder's performance is close to that of libopus
at zero complexity (libopus has more SIMD).
The algorithm might be ported to libopus or other codecs using PVQ in
the future.
The encoder still has a few minor bugs, like desyncs at ultra low
bitrates (below 9kbps with 20ms frames).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This is meant to be applied on top of my previous patch which
split PVQ into celt_pvq.c and made opus_celt.h.
Essentially nothing has been changed other than renaming CeltFrame
to CeltBlock (CeltFrame had absolutely nothing at all to do with
a frame) and CeltContext to CeltFrame.
3 variables have been put in CeltFrame as they make more sense
there rather than being passed around as arguments.
The coefficients have been moved to the CeltBlock structure
(why the hell were they in CeltContext and not in CeltFrame??).
Now the encoder would be able to use the exact context the decoder
uses (plus a couple of extra fields in there).
FATE passes, no slowdowns, etc.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
A huge amount can be reused by the encoder, as the only things
which need to be done are adding a 10 line celt_icwrsi, adding
a wrapper around it (celt_alg_quant) and templating
ff_celt_decode_band to replace the entropy decoding functions
with entropy encoding ones.
There is no performance loss. In fact there is a performance gain of
around 6%, which is caused by the compiler being able to optimize
the decoding more efficiently.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>