a few minor optimisations
removed unneeded variable shifts from alg_quant()
separated the two passes from interp_bits2pulses()
optimisation: removed the shifts from the vq_index() inner loop
optimisation: better indexing/looping in vq_index()
pseudo-stack no longer checks on every function entry whether it has been
optimisation: one less conditional branch in pulse2comb()
optimisation: Removed a bunch of conditional branches from comb2pulse()
Saturation in SIG2INT16 using MIN/MAX
Removed implicit 32=>16 conversion (changed to EXTRACT16)
optimisation: Got rid of the 32-bit mul in find_spectral_pitch()
minor simplification in alg_quant()
fixed an issue (lacking parentheses) in the no-op version of BITREV
Made twiddle pointer in mdct more explicit
optimisation: Making it clear to the compiler that many of the loops in cwrs
Make use of CELT_MEMSET() in find_spectral_pitch()
Fixed incorrect energy calculation in stereo intra prediction
optimisation: reworked intra_prediction() so that yy is computed fully only
optimisation: intra_prediction() uses a 16-bit numerator for the search
optimisation: managed to avoid dividing in the "full gain" case of alg_quant()
oops, forgot to make the gain a 16-bit var
optimisation: simplified the "full gain" case of alg_quant() to remove some
optimisation: another bunch of simplifications to the "simple case" of the
a few loop optimisations.
optimisation: merged the init loop of vq_quant().
fix minor compilation error/warning
fixed three declaration-after-statement issues
fixed a few warnings, no real change
Reworked the static modes. Now, if all static modes have the same frame size,
some index work (simplifications for dumb compilers) on IMDCT
Added a missing RESTORE_STACK in intra_prediction()
Not all compilers are equal -- making it clearer how the MDCT indexing is done
Defining IMUL32 for 32x32=>32 int multiplications and using it in the range
Simplified indexing in intra_prediction()
fixed ordering of the channels in the intra prediction.
Defining DISABLE_STEREO now optimises for the mono case
Fixed a stereo regression introduced in e28f25f0d14959d521fda0cdb8f1220995bc50e8
Fixed rsqrt testcase for float
Rework CWRS code.
Changed the rules for using the pulse spreading. It should be used less often
Revert ABS16/32 on C55 -- ended up being slower
ABS16 and ABS32 for the C55
Making the pulsesAtOnce code 16-bit safe.
Just commenting -- nothing to see.
Optimisation: got rid of about 10% of the 32-bit divisions by using ec_enc_uint
Removed a few int divisions from the intra prediction code.
fixed-point: using MULT16_16 instead of * in compute_band_energies()
Making a few functions static inline
Trying to clean up celt_ilog2() vs. EC_ILOG a bit.
making {next|prev}_cwrs* inline
optimisation: changed some for() loops to do-while() to give the compiler
Making it obvious to the compiler how to generate a dual-MAC in
mix_pitch_and_residual() no longer computing Ryp twice
optimisation: defined a reciprocal square root (celt_rsqrt) for use in
Fixed the rcp() testcase for new assumptions (x is positive)
optimisation: shaving a few cycles off prev_cwrs* by not computing the values
optimisations: faster handling of the zero for compute_band_energies() and
changed 1*rcp(x) to just rcp(x)
optimisation: intra_prediction() no longer needs to divide inside the search
optimisation: The "simple" Rxy/sqrt(Ryy) case in alg_quant no longer requires
Decision on whether to use pitch is now taken only based on energy in the
properly defined EPSILON for the float case
A bunch of pointers marked as "restrict" to ease the job of the compiler
optimisation: spreading_func now in-place with no branch in the loop and half
oops. find_max32() now uses VERY_LARGE32 (instead of VERY_LARGE16)
optimisations: Another bunch of simplifications to alg_quant(), mainly to
optimisation: Making use of restrict in find_spectral_pitch() to disambiguate
optimisations: caching sign of x in alg_quant(), changed celt_div()/celt_rcp()
Optimised intra prediction a bit -- removed a conditional branch and replaced
Removed support for more than one MDCT block per frame. I don't think there's
Removed the "pitch compression" in the residual quantisation. Also, removed
Unrolled the inner loop in vq_index() so that the codebook unpacking doesn't
Making bits2pulses() use a fixed number of iterations to allow further
include "dsplib.h" in fixed_c5x.h
replaced divisions by reciprocals in intra prediction and folding
defined find_max16 and overrode it for C55x
Made a second version of ec_{en|de}code optimised for encoding bits (no div
No longer trying to save bits when encoding integers near the upper limit
fixed-point: added cheap celt_div() division using a reciprocal
Using restrict to make it clear there are no aliasing issues in the mdct.
Added a few "restrict" keywords and changed some divisions to shifts
fixed TI fft code -- again
Removed potentially unused var in MDCT init
local var name maxval was shadowing the TI function used to compute it
make sure TRIG_UPSCALE is properly defined
fix for TI version of celt_maxabs16()
fixed-point: fix for 32-bit TI FFT
fixed-point: Wrapper for the 32-bit complex FFT used in the MDCT so we can use
fixed-point: defined celt_maxabs16() as basic operator
fixed-point: MULT16_32_Q15 for TI DSP (not entirely happy with it)
fixed-point: using TI intrinsic for celt_ilog2() if available.
Wrapper for the TI dsplib FFT
Making the real/single FFT easier to replace
Random numbers should work on 16-bit archs.