ref: 19caaddba86b86f3152f455fbcfc90bf39a19fc8
parent: 905fa5ba04b512c23f239549c28b9432cd5a63f6
author: Jean-Marc Valin <[email protected]>
date: Wed Feb 23 12:29:42 EST 2011
energy decoding
--- a/doc/draft-ietf-codec-opus.xml
+++ b/doc/draft-ietf-codec-opus.xml
@@ -506,25 +506,26 @@
<ttcol align='center'>Symbol(s)</ttcol>
<ttcol align='center'>PDF</ttcol>
<ttcol align='center'>Condition</ttcol>
-<c>silence</c> <c>logp=15</c> <c></c>
-<c>post-filter</c> <c>logp=1</c> <c></c>
+<c>silence</c> <c>[32767, 1]/32768</c> <c></c>
+<c>post-filter</c> <c>[1, 1]/2</c> <c></c>
<c>octave</c> <c>uniform (6)</c><c>post-filter</c>
<c>period</c> <c>raw bits (4+octave)</c><c>post-filter</c>
<c>gain</c> <c>raw bits (3)</c><c>post-filter</c>
<c>tapset</c> <c>[2, 1, 1]/4</c><c>post-filter</c>
-<c>transient</c> <c>logp=3</c><c></c>
+<c>transient</c> <c>[7, 1]/8</c><c></c>
+<c>intra</c> <c>[7, 1]/8</c><c></c>
<c>coarse energy</c><c><xref target="energy-decoding"/></c><c></c>
<c>tf_change</c> <c><xref target="transient-decoding"/></c><c></c>
-<c>tf_select</c> <c>logp=1</c><c><xref target="transient-decoding"/></c>
+<c>tf_select</c> <c>[1, 1]/2</c><c><xref target="transient-decoding"/></c>
<c>spread</c> <c>[7, 2, 21, 2]/32</c><c></c>
<c>dyn. alloc.</c> <c><xref target="allocation"/></c><c></c>
<c>alloc. trim</c> <c>[2, 2, 5, 10, 22, 46, 22, 10, 5, 2, 2]/128</c><c></c>
-<c>skip (*)</c> <c>logp=1</c><c><xref target="allocation"/></c>
+<c>skip (*)</c> <c>[1, 1]/2</c><c><xref target="allocation"/></c>
<c>intensity (*)</c><c>uniform</c><c><xref target="allocation"/></c>
-<c>dual (*)</c> <c>logp=1</c><c></c>
+<c>dual (*)</c> <c>[1, 1]/2</c><c></c>
<c>fine energy</c> <c><xref target="energy-decoding"/></c><c></c>
<c>residual</c> <c><xref target="PVQ-decoder"/></c><c></c>
-<c>anti-collapse</c><c>logp=1</c><c>stereo && transient</c>
+<c>anti-collapse</c><c>[1, 1]/2</c><c>stereo &amp;&amp; transient</c>
<c>finalize</c> <c><xref target="energy-decoding"/></c><c></c>
<postamble>Order of the symbols in the CELT section of the bit-stream</postamble>
</texttable>
@@ -555,22 +556,70 @@
</section>
<section anchor="energy-decoding" title="Energy Envelope Decoding">
+
<t>
-The energy of each band is extracted from the bit-stream in two steps according
-to the same coarse-fine strategy used in the encoder. First, the coarse energy is
-decoded in unquant_coarse_energy() (quant_bands.c)
-based on the probability of the Laplace model used by the encoder.
-</t>
+It is important to quantize the energy with sufficient resolution because
+any energy quantization error cannot be compensated for at a later
+stage. Regardless of the resolution used for encoding the shape of a band,
+it is perceptually important to preserve the energy in each band. CELT uses a
+three-step coarse-fine-fine strategy for encoding the energy in the base-2 log
+domain, as implemented in quant_bands.c.</t>
+<section anchor="coarse-energy-decoding" title="Coarse energy decoding">
<t>
-After the coarse energy is decoded, the same allocation function as used in the
-encoder is called. This determines the number of
-bits to decode for the fine energy quantization. The decoding of the fine energy bits
-is performed by unquant_fine_energy() (quant_bands.c).
-Finally, like the encoder, the remaining bits in the stream (that would otherwise go unused)
-are decoded using unquant_energy_finalise() (quant_bands.c).
+Coarse quantization of the energy uses a fixed resolution of 6 dB
+(integer part of base-2 log). To minimize the bitrate, prediction is applied
+both in time (using the previous frame) and in frequency (using the previous
+bands). The part of the prediction that is based on the
+previous frame can be disabled, creating an "intra" frame where the energy
+is coded without reference to prior frames. The decoder first reads the intra flag
+to determine what prediction is used.
+The 2-D z-transform of
+the prediction filter is: A(z_l, z_b)=(1-a*z_l^-1)*(1-z_b^-1)/(1-b*z_b^-1)
+where the subscript b denotes the band index and the subscript l the frame index.
+When intra energy is not used, the prediction coefficients a and b depend on the
+frame size in use; when intra energy is used, a=0 and b=4915/32768.
+The time-domain prediction is based on the final fine quantization of the previous
+frame, while the frequency domain (within the current frame) prediction is based
+on coarse quantization only (because the fine quantization has not been computed
+yet). The prediction is clamped internally so that fixed-point implementations with
+limited dynamic range do not suffer desynchronization.
+We approximate the ideal
+probability distribution of the prediction error using a Laplace distribution
+with separate parameters for each frame size in intra and inter-frame modes. The
+coarse energy decoding is performed by unquant_coarse_energy() and
+unquant_coarse_energy_impl() (quant_bands.c). The decoding of the Laplace-distributed values is
+implemented in ec_laplace_decode() (laplace.c).
</t>
+
</section>
+
+<section anchor="fine-energy-decoding" title="Fine energy decoding">
+<t>
+The number of bits assigned to fine energy quantization in each band is determined
+by the bit allocation computation described in <xref target="allocation"></xref>.
+Let B_i be the number of fine energy bits
+for band i; the refinement is an integer f in the range [0,2^B_i-1]. The correction
+applied to the coarse energy for a given f is (f+1/2)/2^B_i - 1/2. Fine
+energy decoding is implemented in unquant_fine_energy() (quant_bands.c).
+</t>
+<t>
+When some bits are left "unused" after all other flags have been decoded, these bits
+are assigned to a "final" step of fine allocation. In effect, these bits are used
+to add one extra fine energy bit per band per channel. The allocation process
+determines two <spanx style="emph">priorities</spanx> for the final fine bits.
+Any remaining bits are first assigned only to bands of priority 0, starting
+from band 0 and going up. If all bands of priority 0 have received one bit per
+channel, then bands of priority 1 are assigned an extra bit per channel,
+starting from band 0. If any bits remain after this, they are left unused.
+This is implemented in unquant_energy_finalise() (quant_bands.c).
+</t>
+
+</section> <!-- fine energy -->
+
+</section> <!-- Energy decode -->
+
+
<section anchor="allocation" title="Bit allocation">
<t>Bit allocation is performed based only on information available to both