ref: 71019f5f37db5dea30b21d7bd90c3f0d0d456736
parent: cfaf14788e28a561099d8bb4898e6bebd3098174
author: Timothy B. Terriberry <[email protected]>
date: Fri Dec 11 06:30:06 EST 2015
oggopus: First pass updates for AD review comments.
--- a/doc/draft-ietf-codec-oggopus.xml
+++ b/doc/draft-ietf-codec-oggopus.xml
@@ -12,7 +12,8 @@
]>
<?rfc toc="yes" symrefs="yes" ?>
-<rfc ipr="trust200902" category="std" docName="draft-ietf-codec-oggopus-09">
+<rfc ipr="trust200902" category="std" docName="draft-ietf-codec-oggopus-09"
+ updates="5334">
<front>
<title abbrev="Ogg Opus">Ogg Encapsulation for the Opus Audio Codec</title>
@@ -105,8 +106,8 @@
Packets can be split arbitrarily across pages, and continued from one page to
the next (allowing packets much larger than would fit on a single page).
Each page contains 'lacing values' that indicate how the data is partitioned
- into packets, allowing a demuxer to recover the packet boundaries without
- examining the encoded data.
+ into packets, allowing a demultiplexer (demuxer) to recover the packet
+ boundaries without examining the encoded data.
A packet is said to 'complete' on a page when the page contains the final
lacing value corresponding to that packet.
</t>
@@ -128,14 +129,6 @@
document are to be interpreted as described in <xref target="RFC2119"/>.
</t>
-<t>
-Implementations that fail to satisfy one or more "MUST" requirements are
- considered non-compliant.
-Implementations that satisfy all "MUST" requirements, but fail to satisfy one
- or more "SHOULD" requirements are said to be "conditionally compliant".
-All other implementations are "unconditionally compliant".
-</t>
-
</section>
<section anchor="packet_organization" title="Packet Organization">
@@ -180,15 +173,18 @@
same duration.
An implementation of this specification SHOULD treat any Opus packet whose
duration is different from that of the first Opus packet in an Ogg packet as
- if it were a malformed Opus packet with an invalid TOC sequence.
+ if it were a malformed Opus packet with an invalid Table Of Contents (TOC)
+ sequence.
</t>
<t>
-The coding mode (SILK, Hybrid, or CELT), audio bandwidth, channel count,
- duration (frame size), and number of frames per packet, are indicated in the
- TOC (table of contents) sequence at the beginning of each Opus packet, as
- described in Section 3.1 of <xref target="RFC6716"/>.
-The combination of mode, audio bandwidth, and frame size is referred to as
- the configuration of an Opus packet.
+The TOC sequence at the beginning of each Opus packet indicates the coding
+ mode, audio bandwidth, channel count, duration (frame size), and number of
+ frames per packet, as described in Section 3.1
+ of <xref target="RFC6716"/>.
+The coding mode is one of SILK, Hybrid, or Constrained Energy Lapped Transform
+ (CELT).
+The combination of coding mode, audio bandwidth, and frame size is referred to
+ as the configuration of an Opus packet.
</t>
<t>
The first audio data page SHOULD NOT have the 'continued packet' flag set
@@ -269,8 +265,9 @@
<section anchor="gap-repair" title="Repairing Gaps in Real-time Streams">
<t>
In order to support capturing a real-time stream that has lost or not
- transmitted packets, a muxer SHOULD emit packets that explicitly request the
- use of Packet Loss Concealment (PLC) in place of the missing packets.
+ transmitted packets, a multiplexer (muxer) SHOULD emit packets that explicitly
+ request the use of Packet Loss Concealment (PLC) in place of the missing
+ packets.
Implementations that fail to do so still MUST NOT increment the granule
position for a page by anything other than the number of samples contained in
packets that actually complete on that page.
@@ -379,11 +376,11 @@
<t>
A 'pre-skip' field in the ID header (see <xref target="id_header"/>) signals
- the number of samples which SHOULD be skipped (decoded but discarded) at the
+ the number of samples that SHOULD be skipped (decoded but discarded) at the
beginning of the stream.
This amount need not be a multiple of 2.5 ms, MAY be smaller than a single
packet, or MAY span the contents of several packets.
-These samples are not valid audio, and SHOULD NOT be played.
+These samples are not valid audio.
</t>
<t>
@@ -644,6 +641,7 @@
<t>Input Sample Rate (32 bits, unsigned, little
endian):
<vspace blankLines="1"/>
+This is the sample rate of the original input (before encoding), in Hz.
This field is <spanx style="emph">not</spanx> the sample rate to use for
playback of the encoded data.
<vspace blankLines="1"/>
@@ -701,7 +699,7 @@
</figure>
where output_gain is the raw 16-bit value from the header.
<vspace blankLines="1"/>
-Virtually all players and media frameworks SHOULD apply it by default.
+Players and media frameworks SHOULD apply it by default.
If a player chooses to apply any volume adjustment or gain modification, such
as the R128_TRACK_GAIN (see <xref target="comment_header"/>), the adjustment
MUST be applied in addition to this output gain in order to achieve playback
@@ -725,15 +723,13 @@
<vspace blankLines="1"/>
This octet indicates the order and semantic meaning of the output channels.
<vspace blankLines="1"/>
-Each possible value of this octet indicates a mapping family, which defines a
- set of allowed channel counts, and the ordered set of channel names for each
- allowed channel count.
+Each currently specified value of this octet indicates a mapping family, which
+ defines a set of allowed channel counts, and the ordered set of channel names
+ for each allowed channel count.
The details are described in <xref target="channel_mapping"/>.
</t>
<t>Channel Mapping Table:
This table defines the mapping from encoded streams to output channels.
-It MUST be omitted when the channel mapping family is 0, but is
- REQUIRED otherwise.
Its contents are specified in <xref target="channel_mapping"/>.
</t>
</list>
@@ -743,8 +739,8 @@
All fields in the ID headers are REQUIRED, except for the channel mapping
table, which MUST be omitted when the channel mapping family is 0, but
is REQUIRED otherwise.
-Implementations SHOULD reject ID headers which do not contain enough data for
- these fields, even if they contain a valid Magic Signature.
+Implementations SHOULD reject streams with ID headers that do not contain
+ enough data for these fields, even if they contain a valid Magic Signature.
Future versions of this specification, even backwards-compatible versions,
might include additional fields in the ID header.
If an ID header has a compatible major version, but a larger minor version,
@@ -874,7 +870,7 @@
<section anchor="channel_mapping_1" title="Channel Mapping Family 1">
<t>
Allowed numbers of channels: 1...8.
-Vorbis channel order.
+Vorbis channel order (see below).
</t>
<t>
Each channel is assigned to a speaker location in a conventional surround
@@ -897,7 +893,7 @@
as those used by the Vorbis codec <xref target="vorbis-mapping"/>.
The ordering is different from the one used by the
WAVE <xref target="wave-multichannel"/> and
- FLAC <xref target="flac"/> formats,
+ Free Lossless Audio Codec (FLAC) <xref target="flac"/> formats,
so correct ordering requires permutation of the output channels when decoding
to or encoding from those formats.
'LFE' here refers to a Low Frequency Effects channel, often mapped to a
@@ -929,8 +925,8 @@
title="Undefined Channel Mappings">
<t>
The remaining channel mapping families (2...254) are reserved.
-An implementation encountering a reserved channel mapping family value SHOULD
- act as though the value is 255.
+An implementation encountering a reserved channel mapping family value MUST act
+ as though the value is 255.
</t>
</section>
@@ -1193,7 +1189,7 @@
<t>
The comment header can be arbitrarily large and might be spread over a large
number of Ogg pages.
-Implementations SHOULD avoid attempting to allocate excessive amounts of memory
+Implementations MUST avoid attempting to allocate excessive amounts of memory
when presented with a very large comment header.
To accomplish this, implementations MAY reject a comment header larger than
125,829,120 octets, and MAY ignore individual comments that are not fully
@@ -1238,8 +1234,8 @@
'output gain' field.
</t>
<t>
-An Ogg Opus stream MUST NOT have more than one of each tag, and if present
- their values MUST be an integer from -32768 to 32767, inclusive,
+An Ogg Opus stream MUST NOT have more than one of each of these tags, and if
+ present their values MUST be an integer from -32768 to 32767, inclusive,
represented in ASCII as a base 10 number with no whitespace.
A leading '+' or '-' character is valid.
Leading zeros are also permitted, but the value MUST be represented by
@@ -1255,8 +1251,8 @@
<spanx style="emph">in addition</spanx> to the 'output gain' value.
If a tool modifies the ID header's 'output gain' field, it MUST also update or
remove the R128_TRACK_GAIN and R128_ALBUM_GAIN comment tags if present.
-A muxer SHOULD assume that by default tools will respect the 'output gain'
- field, and not the comment tag.
+A muxer SHOULD place the gain it wants other tools to use by default into the
+ 'output gain' field, and not the comment tag.
</t>
<t>
To avoid confusion with multiple normalization schemes, an Opus comment header
@@ -1282,10 +1278,11 @@
When encoding, implementations SHOULD limit the use of padding in audio data
packets to no more than is necessary to make a variable bitrate (VBR) stream
constant bitrate (CBR).
-Demuxers SHOULD reject audio data packets larger than 61,440 octets per
+Demuxers SHOULD reject audio data packets (treat them as if they were malformed
+ Opus packets with an invalid TOC sequence) larger than 61,440 octets per
Opus stream.
Such packets necessarily contain more padding than needed for this purpose.
-Demuxers SHOULD avoid attempting to allocate excessive amounts of memory when
+Demuxers MUST avoid attempting to allocate excessive amounts of memory when
presented with a very large packet.
Demuxers MAY reject or partially process audio data packets larger than
61,440 octets in an Ogg Opus stream with channel mapping families 0
@@ -1336,8 +1333,9 @@
algorithmic delay of the Opus encoder.
</t>
<t>
-In encoders derived from the reference implementation, the number of
- samples can be queried with:
+In encoders derived from the reference
+ implementation <xref target="RFC6716"/>, the number of samples can be
+ queried with:
</t>
<figure align="center">
<artwork align="center"><![CDATA[
@@ -1550,6 +1548,7 @@
&rfc2119;
&rfc3533;
&rfc3629;
+ &rfc4732;
&rfc5334;
&rfc6381;
&rfc6716;
@@ -1580,7 +1579,6 @@
<references title="Informative References">
<!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml"?-->
- &rfc4732;
&rfc6982;
&rfc7587;