ref: 22757c390ad0f75016910283cb34e359a73df7aa
parent: e8c437c43278ba95d4de1c0139cc61b0b98cb980
author: Jean-Marc Valin <[email protected]>
date: Fri May 11 12:00:45 EDT 2012
More Gen-art changes
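
The Code 3 wording touched in the hunks below (frame count byte layout and the
CBR rule on R) can be checked against a minimal Python sketch. This is an
illustration only, not code from libopus: the function names are hypothetical,
it assumes the draft's MSB-first bit numbering (bit 0 is the most significant
bit of the byte), and it assumes the caller has already determined P, the
number of bytes the draft subtracts for the optional padding.

```python
def parse_frame_count_byte(b):
    """Split a Code 3 frame count byte into (vbr, padding, m).

    Hypothetical helper. Assumes MSB-first bit numbering, so "bit 0"
    in the draft text is mask 0x80 here.
    """
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte value")
    vbr = bool(b & 0x80)      # bit 0, marked "v"
    padding = bool(b & 0x40)  # bit 1, marked "p"
    m = b & 0x3F              # bits 2 to 7, marked "M"
    if m == 0:
        raise ValueError("M MUST NOT be zero")
    return vbr, padding, m


def cbr_frame_size(n, p, m):
    """Return the per-frame size R/M for a CBR Code 3 packet.

    n: total packet size N in bytes
    p: padding bytes P (assumed to be already known by the caller)
    m: frame count M from the frame count byte
    """
    r = n - 2 - p  # R = N-2-P: bytes left after TOC byte, frame count byte, padding
    if r < 0 or r % m != 0:
        raise ValueError("R MUST be a non-negative integer multiple of M")
    return r // m
```

For example, parse_frame_count_byte(0x04) describes a CBR packet with no
padding and M=4, and cbr_frame_size(10, 0, 4) gives 2 bytes per frame, while
cbr_frame_size(11, 0, 4) is rejected because R=9 is not a multiple of 4.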
--- a/doc/draft-ietf-codec-opus.xml
+++ b/doc/draft-ietf-codec-opus.xml
@@ -554,13 +554,13 @@
<t>
The top five bits of the TOC byte, labeled "config", encode one of 32 possible
configurations of operating mode, audio bandwidth, and frame size.
-As described, the LP layer and MDCT layer can be combined in three possible
+As described, the LP (SILK) layer and MDCT (CELT) layer can be combined in three possible
operating modes:
<list style="numbers">
-<t>An LP-only mode for use in low bitrate connections with an audio bandwidth
+<t>A SILK-only mode for use in low bitrate connections with an audio bandwidth
of WB or less,</t>
-<t>A Hybrid (LP+MDCT) mode for SWB or FB speech at medium bitrates, and</t>
-<t>An MDCT-only mode for very low delay speech transmission as well as music
+<t>A Hybrid (SILK+CELT) mode for SWB or FB speech at medium bitrates, and</t>
+<t>A CELT-only mode for very low delay speech transmission as well as music
transmission (NB to FB).</t>
</list>
The 32 possible configurations each identify which one of these operating modes
@@ -712,7 +712,7 @@
<section title="Code 2: Two Frames in the Packet, with Different Compressed Sizes">
<t>
For code 2 packets, the TOC byte is followed by a one- or two-byte sequence
- indicating the length of the first frame (marked N1 in the figure below),
+ indicating the length of the first frame (marked N1 in <xref target='code2_packet'/>),
followed by N1 bytes of compressed data for the first frame.
The remaining N-N1-2 or N-N1-3 bytes are the compressed data for the
second frame.
@@ -752,9 +752,9 @@
Opus layer, rather than at the transport layer.
Code 3 packets MUST have at least 2 bytes.
The TOC byte is followed by a byte encoding the number of frames in the packet
- in bits 2 to 7 (marked "M" in the figure below), with bit 1 indicating whether
- or not Opus padding is inserted (marked "p" in the figure below), and bit 0
- indicating VBR (marked "v" in the figure below).
+ in bits 2 to 7 (marked "M" in <xref target='frame_count_byte'/>), with bit 1 indicating whether
+ or not Opus padding is inserted (marked "p" in <xref target='frame_count_byte'/>), and bit 0
+ indicating VBR (marked "v" in <xref target='frame_count_byte'/>).
M MUST NOT be zero, and the audio duration contained within a packet MUST NOT
exceed 120 ms.
This limits the maximum frame count for any frame size to 48 (for 2.5 ms
@@ -802,9 +802,9 @@
</t>
<t>
In the CBR case, the compressed length of each frame in bytes is equal to the
- number of remaining bytes in the packet after subtracting the (optional)
- padding, (N-2-P), divided by M.
-This number MUST be a non-negative integer multiple of M.
+ number of remaining bytes R in the packet after subtracting the (optional)
+ padding, (R=N-2-P), divided by M.
+The value R MUST be a non-negative integer multiple of M.
The compressed data for all M frames then follows, each of size
(N-2-P)/M bytes, as illustrated in <xref target="code3cbr_packet"/>.
</t>
@@ -839,7 +839,7 @@
<t>
In the VBR case, the (optional) padding length is followed by M-1 frame
- lengths (indicated by "N1" to "N[M-1]" in the figure below), each encoded in a
+ lengths (indicated by "N1" to "N[M-1]" in <xref target='code3vbr_packet'/>), each encoded in a
one- or two-byte sequence as described above.
The packet MUST contain enough data for the M-1 lengths after removing the
(optional) padding, and the sum of these lengths MUST be no larger than the
@@ -848,8 +848,8 @@
indicated number of bytes, with the final frame consuming any remaining bytes
before the final padding, as illustrated in <xref target="code3cbr_packet"/>.
The number of header bytes (TOC byte, frame count byte, padding length bytes,
- and frame length bytes), plus the length of the first M-1 frames themselves,
- plus the length of the padding MUST be no larger than N, the total size of the
+ and frame length bytes), plus the signalled length of the first M-1 frames themselves,
+ plus the signalled length of the padding MUST be no larger than N, the total size of the
packet.
</t>
@@ -890,7 +890,7 @@
Simplest case, one NB mono 20 ms SILK frame:
</t>
-<figure>
+<figure anchor='framing_example_1'>
<artwork><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
@@ -904,7 +904,7 @@
Two FB mono 5 ms CELT frames of the same compressed size:
</t>
-<figure>
+<figure anchor='framing_example_2'>
<artwork><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
@@ -918,7 +918,7 @@
Two FB mono 20 ms Hybrid frames of different compressed size:
</t>
-<figure>
+<figure anchor='framing_example_3'>
<artwork><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
@@ -934,7 +934,7 @@
Four FB stereo 20 ms CELT frames of the same compressed size:
</t>
-<figure>
+<figure anchor='framing_example_4'>
<artwork><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1