ref: 2ad6eafcda3945c7c48f6d4001402346c5dc8205
parent: 25ffd5cd91e61856ffb70549f16cc74273688f84
author: Ralph Giles <[email protected]>
date: Fri May 24 14:28:58 EDT 2013
Merge JM's encoder suggestions. I've done some editing for clarity, but more needs to be done. The language needs clean-up, we should forward-reference the LPC Extrapolation section, and we need a reference for actually computing linear prediction coefficients.
--- a/doc/draft-ietf-codec-oggopus.xml
+++ b/doc/draft-ietf-codec-oggopus.xml
@@ -1138,6 +1138,81 @@
</t>
</section>
+<section anchor="encoder" title="Encoder Guidelines">
+<t>
+When encoding Opus files, Ogg encoders should take into account the
+ algorithmic delay of the Opus encoder.
+In encoders derived from the reference implementation, the delay in samples
+ can be queried with:
+<figure align="center"><artwork align="center"><![CDATA[
+ opus_encoder_ctl(encoder_state, OPUS_GET_LOOKAHEAD(&samples_delay));
+]]></artwork></figure>
+To achieve good quality in the very first samples of a stream, the Ogg encoder
+ MAY use LPC extrapolation (see <xref target="lpc"/>) to generate at least 120
+ extra samples (extra_samples) at the beginning, so that the Opus encoder does
+ not have to encode a discontinuous signal.
+For an input file containing length samples, the Ogg encoder SHOULD set the
+ pre-skip header field to samples_delay+extra_samples, encode at least
+ length+samples_delay+extra_samples samples, and set the granule position of
+ the last page to length+samples_delay+extra_samples.
+This ensures that the encoded file has the same duration as the original, with
+ no time offset. LPC extrapolation is also the best way to pad the end of the
+ stream, but zero-padding is acceptable.
+</t>
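+<t>
+As an illustration only, the sample accounting described above can be sketched
+ in C as follows.
+The structure and function names are hypothetical and are not part of any Opus
+ or Ogg API, and all counts are assumed to be expressed in 48 kHz samples.
+<figure><artwork><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three values derived in the text. */
typedef struct {
    uint16_t pre_skip;          /* value for the pre-skip header field */
    int64_t  samples_to_encode; /* minimum number of samples to encode */
    int64_t  final_granulepos;  /* granule position of the last page */
} ogg_opus_plan;

/* length:        number of samples in the input file
 * samples_delay: encoder delay, e.g. as reported by OPUS_GET_LOOKAHEAD
 * extra_samples: samples generated in front of the input (0 if none) */
ogg_opus_plan plan_stream(int64_t length, int samples_delay,
                          int extra_samples)
{
    ogg_opus_plan p;
    assert(length >= 0 && samples_delay >= 0 && extra_samples >= 0);
    /* The pre-skip field is a 16-bit unsigned value. */
    assert(samples_delay + extra_samples <= UINT16_MAX);
    p.pre_skip = (uint16_t)(samples_delay + extra_samples);
    p.samples_to_encode = length + samples_delay + extra_samples;
    /* After the decoder drops pre_skip samples, exactly length remain. */
    p.final_granulepos = length + p.pre_skip;
    return p;
}
```
+]]></artwork></figure>
+With hypothetical inputs length = 48000, samples_delay = 312, and
+ extra_samples = 120, this yields a pre-skip of 432 and a final granule
+ position of 48432.
+</t>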
+
+<section anchor="lpc" title="LPC Extrapolation">
+<t>
+The first step in LPC extrapolation is to compute linear prediction
+ coefficients.
+When extending the end of the signal, order-N (typically with N ranging from 8
+ to 40) LPC analysis is performed on a window near the end of the signal.
+The last N samples are used as memory to an infinite impulse response (IIR)
+ filter.
+The filter is then applied on a zero input to extrapolate the end of the signal.
+Let a(k) be the kth LPC coefficient and x(n) be the nth sample of the signal.
+Each new sample past the end of the signal is computed as:
+<figure align="center"><artwork align="center"><![CDATA[
+        N
+       ---
+x(n) = \   a(k)*x(n-k)
+       /
+       ---
+       k=1
+]]></artwork></figure>
+The process is repeated independently for each channel.
+It is possible to extend the beginning of the signal by applying the same
+ process backward in time.
+When extending the beginning of the signal, it is best to apply a "fade in" to
+ the extrapolated signal, e.g. by multiplying it by a half-Hanning window.
+</t>
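+<t>
+The text above does not say how the coefficients are computed.
+The following C sketch assumes the autocorrelation method with the
+ Levinson-Durbin recursion, which is one common choice and not necessarily the
+ one intended here, and then applies the extrapolation formula to one channel.
+The function names and LPC_MAX_ORDER are hypothetical, and no analysis window
+ shaping is applied.
+<figure><artwork><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define LPC_MAX_ORDER 40

/* Estimate a[1..order] (a[0] is unused) from the window x[0..len-1].
 * Returns 0 on success, or -1 if the window is silent, in which case
 * all coefficients are zero and extrapolation yields zero-padding. */
int lpc_analyze(const float *x, size_t len, int order, double *a)
{
    double r[LPC_MAX_ORDER + 1];
    double tmp[LPC_MAX_ORDER + 1];
    double err;
    int i, j;

    assert(order >= 1 && order <= LPC_MAX_ORDER && len > (size_t)order);
    for (i = 0; i <= order; i++)
        a[i] = 0.0;
    /* Autocorrelation of the analysis window for lags 0..order. */
    for (i = 0; i <= order; i++) {
        r[i] = 0.0;
        for (size_t n = (size_t)i; n < len; n++)
            r[i] += (double)x[n] * (double)x[n - (size_t)i];
    }
    if (r[0] <= 0.0)
        return -1;
    /* Levinson-Durbin recursion on the normal equations. */
    err = r[0];
    for (i = 1; i <= order; i++) {
        double acc = r[i];
        double k;
        for (j = 1; j < i; j++)
            acc -= a[j] * r[i - j];
        k = acc / err;            /* reflection coefficient */
        for (j = 1; j < i; j++)
            tmp[j] = a[j] - k * a[i - j];
        for (j = 1; j < i; j++)
            a[j] = tmp[j];
        a[i] = k;
        err *= 1.0 - k * k;       /* remaining prediction error */
        if (err <= 0.0)
            break;                /* signal is fully predicted */
    }
    return 0;
}

/* Write count predicted samples to x[len..len+count-1] using
 * x(n) = sum over k = 1..order of a(k) * x(n-k).
 * The buffer x must have room for len + count samples. */
void lpc_extrapolate(float *x, size_t len, int order, const double *a,
                     size_t count)
{
    assert(order >= 1 && len >= (size_t)order);
    for (size_t n = len; n < len + count; n++) {
        double acc = 0.0;
        for (int k = 1; k <= order; k++)
            acc += a[k] * (double)x[n - (size_t)k];
        x[n] = (float)acc;
    }
}
```
+]]></artwork></figure>
+A caller would run both functions once per channel, passing the last samples
+ of that channel as the analysis window.
+</t>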
+
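+<t>
+Extending the beginning of the signal can be sketched in the same way.
+In the C fragment below, the function name and buffer layout are hypothetical,
+ the coefficients are assumed to have been estimated on the time-reversed start
+ of the signal, and the fade-in gains (for example a half-Hanning ramp) are
+ assumed to be supplied by the caller.
+<figure><artwork><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* buf holds pad + len samples, with the real signal starting at buf[pad].
 * a[1..order] are the prediction coefficients (a[0] is unused).
 * fade[0..pad-1] are fade-in gains rising from 0 toward 1; it may be
 * NULL to skip the fade. */
void lpc_extend_start(float *buf, size_t pad, size_t len, int order,
                      const double *a, const float *fade)
{
    assert(order >= 1 && len >= (size_t)order);
    /* Predict backward in time: x(n) = sum of a(k) * x(n+k). */
    for (size_t n = pad; n-- > 0; ) {
        double acc = 0.0;
        for (int k = 1; k <= order; k++)
            acc += a[k] * (double)buf[n + (size_t)k];
        buf[n] = (float)acc;
    }
    /* Fade only after the recursion has finished, so that the
     * prediction itself runs on unattenuated samples. */
    if (fade != NULL) {
        for (size_t n = 0; n < pad; n++)
            buf[n] *= fade[n];
    }
}
```
+]]></artwork></figure>
+As with the end of the signal, this would be repeated independently for each
+ channel.
+</t>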
+</section>
+
+<section anchor="continuous_chaining" title="Continuous Chaining">
+<t>
+In some applications, such as Internet radio, it is desirable to cut a long
+ stream into smaller chains, e.g., so the comment header can be updated.
+This can be done simply by separating the input stream into segments and
+ encoding each segment independently.
+The drawback of this approach is that it creates a small discontinuity
+ at the boundary due to the lossy nature of Opus.
+An encoder MAY avoid this discontinuity by using the following procedure:
+<list style="numbers">
+<t>Encode the last frame of the first segment as an independent frame by
+ turning off all forms of inter-frame prediction.
+De-emphasis is allowed.</t>
+<t>Set the granulepos of the last page to a point near the end of the last
+ frame.</t>
+<t>Begin the second segment with a copy of the last frame of the first
+ segment.</t>
+<t>Set the pre-skip field of the second stream in such a way as to properly
+ join the two streams.</t>
+<t>Continue the encoding process normally from there, without any reset to
+ the encoder.</t>
+</list>
+</t>
+</section>
+
<section anchor="implementation" title="Implementation Status">
<t>
A brief summary of major implementations of this draft is available