shithub: opus

ref: 71019f5f37db5dea30b21d7bd90c3f0d0d456736
parent: cfaf14788e28a561099d8bb4898e6bebd3098174
author: Timothy B. Terriberry <[email protected]>
date: Fri Dec 11 06:30:06 EST 2015

oggopus: First pass updates for AD review comments.

--- a/doc/draft-ietf-codec-oggopus.xml
+++ b/doc/draft-ietf-codec-oggopus.xml
@@ -12,7 +12,8 @@
 ]>
 <?rfc toc="yes" symrefs="yes" ?>
 
-<rfc ipr="trust200902" category="std" docName="draft-ietf-codec-oggopus-09">
+<rfc ipr="trust200902" category="std" docName="draft-ietf-codec-oggopus-09"
+ updates="5334">
 
 <front>
 <title abbrev="Ogg Opus">Ogg Encapsulation for the Opus Audio Codec</title>
@@ -105,8 +106,8 @@
 Packets can be split arbitrarily across pages, and continued from one page to
  the next (allowing packets much larger than would fit on a single page).
 Each page contains 'lacing values' that indicate how the data is partitioned
- into packets, allowing a demuxer to recover the packet boundaries without
- examining the encoded data.
+ into packets, allowing a demultiplexer (demuxer) to recover the packet
+ boundaries without examining the encoded data.
 A packet is said to 'complete' on a page when the page contains the final
  lacing value corresponding to that packet.
 </t>
@@ -128,14 +129,6 @@
  document are to be interpreted as described in <xref target="RFC2119"/>.
 </t>
 
-<t>
-Implementations that fail to satisfy one or more "MUST" requirements are
- considered non-compliant.
-Implementations that satisfy all "MUST" requirements, but fail to satisfy one
- or more "SHOULD" requirements are said to be "conditionally compliant".
-All other implementations are "unconditionally compliant".
-</t>
-
 </section>
 
 <section anchor="packet_organization" title="Packet Organization">
@@ -180,15 +173,18 @@
  same duration.
 An implementation of this specification SHOULD treat any Opus packet whose
  duration is different from that of the first Opus packet in an Ogg packet as
- if it were a malformed Opus packet with an invalid TOC sequence.
+ if it were a malformed Opus packet with an invalid Table Of Contents (TOC)
+ sequence.
 </t>
 <t>
-The coding mode (SILK, Hybrid, or CELT), audio bandwidth, channel count,
- duration (frame size), and number of frames per packet, are indicated in the
- TOC (table of contents) sequence at the beginning of each Opus packet, as
- described in Section&nbsp;3.1 of&nbsp;<xref target="RFC6716"/>.
-The combination of mode, audio bandwidth, and frame size is referred to as
- the configuration of an Opus packet.
+The TOC sequence at the beginning of each Opus packet indicates the coding
+ mode, audio bandwidth, channel count, duration (frame size), and number of
+ frames per packet, as described in Section&nbsp;3.1
+ of&nbsp;<xref target="RFC6716"/>.
+The coding mode is one of SILK, Hybrid, or Constrained Energy Lapped Transform
+ (CELT).
+The combination of coding mode, audio bandwidth, and frame size is referred to
+ as the configuration of an Opus packet.
 </t>
 <t>
 The first audio data page SHOULD NOT have the 'continued packet' flag set
@@ -269,8 +265,9 @@
 <section anchor="gap-repair" title="Repairing Gaps in Real-time Streams">
 <t>
 In order to support capturing a real-time stream that has lost or not
- transmitted packets, a muxer SHOULD emit packets that explicitly request the
- use of Packet Loss Concealment (PLC) in place of the missing packets.
+ transmitted packets, a multiplexer (muxer) SHOULD emit packets that explicitly
+ request the use of Packet Loss Concealment (PLC) in place of the missing
+ packets.
 Implementations that fail to do so still MUST NOT increment the granule
  position for a page by anything other than the number of samples contained in
  packets that actually complete on that page.
@@ -379,11 +376,11 @@
 
 <t>
 A 'pre-skip' field in the ID header (see <xref target="id_header"/>) signals
- the number of samples which SHOULD be skipped (decoded but discarded) at the
+ the number of samples that SHOULD be skipped (decoded but discarded) at the
  beginning of the stream.
 This amount need not be a multiple of 2.5&nbsp;ms, MAY be smaller than a single
  packet, or MAY span the contents of several packets.
-These samples are not valid audio, and SHOULD NOT be played.
+These samples are not valid audio.
 </t>
 
 <t>
@@ -644,6 +641,7 @@
 <t>Input Sample Rate (32 bits, unsigned, little
  endian):
 <vspace blankLines="1"/>
+This is the sample rate of the original input (before encoding), in Hz.
 This field is <spanx style="emph">not</spanx> the sample rate to use for
  playback of the encoded data.
 <vspace blankLines="1"/>
@@ -701,7 +699,7 @@
 </figure>
  where output_gain is the raw 16-bit value from the header.
 <vspace blankLines="1"/>
-Virtually all players and media frameworks SHOULD apply it by default.
+Players and media frameworks SHOULD apply it by default.
 If a player chooses to apply any volume adjustment or gain modification, such
  as the R128_TRACK_GAIN (see <xref target="comment_header"/>), the adjustment
  MUST be applied in addition to this output gain in order to achieve playback
@@ -725,15 +723,13 @@
 <vspace blankLines="1"/>
 This octet indicates the order and semantic meaning of the output channels.
 <vspace blankLines="1"/>
-Each possible value of this octet indicates a mapping family, which defines a
- set of allowed channel counts, and the ordered set of channel names for each
- allowed channel count.
+Each currently specified value of this octet indicates a mapping family, which
+ defines a set of allowed channel counts, and the ordered set of channel names
+ for each allowed channel count.
 The details are described in <xref target="channel_mapping"/>.
 </t>
 <t>Channel Mapping Table:
 This table defines the mapping from encoded streams to output channels.
-It MUST be omitted when the channel mapping family is 0, but is
- REQUIRED otherwise.
 Its contents are specified in <xref target="channel_mapping"/>.
 </t>
 </list>
@@ -743,8 +739,8 @@
 All fields in the ID headers are REQUIRED, except for the channel mapping
  table, which MUST be omitted when the channel mapping family is 0, but
  is REQUIRED otherwise.
-Implementations SHOULD reject ID headers which do not contain enough data for
- these fields, even if they contain a valid Magic Signature.
+Implementations SHOULD reject streams with ID headers that do not contain
+ enough data for these fields, even if they contain a valid Magic Signature.
 Future versions of this specification, even backwards-compatible versions,
  might include additional fields in the ID header.
 If an ID header has a compatible major version, but a larger minor version,
@@ -874,7 +870,7 @@
 <section anchor="channel_mapping_1" title="Channel Mapping Family 1">
 <t>
 Allowed numbers of channels: 1...8.
-Vorbis channel order.
+Vorbis channel order (see below).
 </t>
 <t>
 Each channel is assigned to a speaker location in a conventional surround
@@ -897,7 +893,7 @@
  as those used by the Vorbis codec <xref target="vorbis-mapping"/>.
 The ordering is different from the one used by the
  WAVE <xref target="wave-multichannel"/> and
- FLAC <xref target="flac"/> formats,
+ Free Lossless Audio Codec (FLAC) <xref target="flac"/> formats,
  so correct ordering requires permutation of the output channels when decoding
  to or encoding from those formats.
 'LFE' here refers to a Low Frequency Effects channel, often mapped to a
@@ -929,8 +925,8 @@
  title="Undefined Channel Mappings">
 <t>
 The remaining channel mapping families (2...254) are reserved.
-An implementation encountering a reserved channel mapping family value SHOULD
- act as though the value is 255.
+An implementation encountering a reserved channel mapping family value MUST act
+ as though the value is 255.
 </t>
 </section>
 
@@ -1193,7 +1189,7 @@
 <t>
 The comment header can be arbitrarily large and might be spread over a large
  number of Ogg pages.
-Implementations SHOULD avoid attempting to allocate excessive amounts of memory
+Implementations MUST avoid attempting to allocate excessive amounts of memory
  when presented with a very large comment header.
 To accomplish this, implementations MAY reject a comment header larger than
  125,829,120&nbsp;octets, and MAY ignore individual comments that are not fully
@@ -1238,8 +1234,8 @@
  'output gain' field.
 </t>
 <t>
-An Ogg Opus stream MUST NOT have more than one of each tag, and if present
- their values MUST be an integer from -32768 to 32767, inclusive,
+An Ogg Opus stream MUST NOT have more than one of each of these tags, and if
+ present their values MUST be an integer from -32768 to 32767, inclusive,
  represented in ASCII as a base 10 number with no whitespace.
 A leading '+' or '-' character is valid.
 Leading zeros are also permitted, but the value MUST be represented by
@@ -1255,8 +1251,8 @@
  <spanx style="emph">in addition</spanx> to the 'output gain' value.
 If a tool modifies the ID header's 'output gain' field, it MUST also update or
  remove the R128_TRACK_GAIN and R128_ALBUM_GAIN comment tags if present.
-A muxer SHOULD assume that by default tools will respect the 'output gain'
- field, and not the comment tag.
+A muxer SHOULD place the gain it wants other tools to use by default into the
+ 'output gain' field, and not the comment tag.
 </t>
 <t>
 To avoid confusion with multiple normalization schemes, an Opus comment header
@@ -1282,10 +1278,11 @@
 When encoding, implementations SHOULD limit the use of padding in audio data
  packets to no more than is necessary to make a variable bitrate (VBR) stream
  constant bitrate (CBR).
-Demuxers SHOULD reject audio data packets larger than 61,440 octets per
+Demuxers SHOULD reject audio data packets (treat them as if they were malformed
+ Opus packets with an invalid TOC sequence) larger than 61,440 octets per
  Opus stream.
 Such packets necessarily contain more padding than needed for this purpose.
-Demuxers SHOULD avoid attempting to allocate excessive amounts of memory when
+Demuxers MUST avoid attempting to allocate excessive amounts of memory when
  presented with a very large packet.
 Demuxers MAY reject or partially process audio data packets larger than
  61,440&nbsp;octets in an Ogg Opus stream with channel mapping families&nbsp;0
@@ -1336,8 +1333,9 @@
  algorithmic delay of the Opus encoder.
 </t>
 <t>
-In encoders derived from the reference implementation, the number of
- samples can be queried with:
+In encoders derived from the reference
+ implementation&nbsp;<xref target="RFC6716"/>, the number of samples can be
+ queried with:
 </t>
 <figure align="center">
 <artwork align="center"><![CDATA[
@@ -1550,6 +1548,7 @@
  &rfc2119;
  &rfc3533;
  &rfc3629;
+ &rfc4732;
  &rfc5334;
  &rfc6381;
  &rfc6716;
@@ -1580,7 +1579,6 @@
 <references title="Informative References">
 
 <!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml"?-->
- &rfc4732;
  &rfc6982;
  &rfc7587;