shithub: opus

Download patch

ref: db5f38ef65560319ed530fc88b2618511c2c2d67
parent: e95de9a24411046f51b121b47b7de8ece6abfbd8
author: Jean-Marc Valin <[email protected]>
date: Mon May 11 12:33:05 EDT 2009

Change to ipr="trust200902" to make the experimental xml2rfc happy. Also, added
(old) version -00 of the draft

--- /dev/null
+++ b/doc/ietf/draft-valin-celt-rtp-profile-00.txt
@@ -1,0 +1,1176 @@
+
+
+
+AVT Working Group                                             J-M. Valin
+Internet-Draft                                     Octasic Semiconductor
+Expires: November 9, 2009                                     G. Maxwell
+                                                        Juniper Networks
+                                                             May 8, 2009
+
+
+                    draft-valin-celt-rtp-profile-00
+                 RTP Payload Format for the CELT Codec
+
+Status of this Memo
+
+   This Internet-Draft is submitted to IETF in full conformance with the
+   provisions of BCP 78 and BCP 79.
+
+   Internet-Drafts are working documents of the Internet Engineering
+   Task Force (IETF), its areas, and its working groups.  Note that
+   other groups may also distribute working documents as Internet-
+   Drafts.
+
+   Internet-Drafts are draft documents valid for a maximum of six months
+   and may be updated, replaced, or obsoleted by other documents at any
+   time.  It is inappropriate to use Internet-Drafts as reference
+   material or to cite them other than as "work in progress."
+
+   The list of current Internet-Drafts can be accessed at
+   http://www.ietf.org/ietf/1id-abstracts.txt.
+
+   The list of Internet-Draft Shadow Directories can be accessed at
+   http://www.ietf.org/shadow.html.
+
+   This Internet-Draft will expire on November 9, 2009.
+
+Copyright Notice
+
+   Copyright (c) 2009 IETF Trust and the persons identified as the
+   document authors.  All rights reserved.
+
+   This document is subject to BCP 78 and the IETF Trust's Legal
+   Provisions Relating to IETF Documents in effect on the date of
+   publication of this document (http://trustee.ietf.org/license-info).
+   Please review these documents carefully, as they describe your rights
+   and restrictions with respect to this document.
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009                [Page 1]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+Abstract
+
+   CELT is an open-source voice codec suitable for use in very low delay
+   audio communication applications, including Voice over IP (VoIP).
+   This document describes the payload format for CELT generated bit
+   streams within an RTP packet.  Also included here are the necessary
+   details for the use of CELT with the Session Description Protocol
+   (SDP).  At the time of this writing, the CELT bit-stream has NOT been
+   finalized yet, and compatibility is usually broken with every new
+   release of the codec.
+
+
+Table of Contents
+
+   1.  Conventions used in this document  . . . . . . . . . . . . . .  3
+   2.  Overview of the CELT Codec . . . . . . . . . . . . . . . . . .  4
+   3.  RTP payload format for CELT  . . . . . . . . . . . . . . . . .  5
+     3.1.  RTP Header . . . . . . . . . . . . . . . . . . . . . . . .  5
+     3.2.  CELT payload . . . . . . . . . . . . . . . . . . . . . . .  6
+     3.3.  Multiple CELT frames in a RTP packet . . . . . . . . . . .  7
+     3.4.  Multiple channels  . . . . . . . . . . . . . . . . . . . .  8
+   4.  MIME registration of CELT  . . . . . . . . . . . . . . . . . . 10
+   5.  SDP usage of CELT  . . . . . . . . . . . . . . . . . . . . . . 12
+     5.1.  Multichannel Mapping . . . . . . . . . . . . . . . . . . . 14
+     5.2.  Low-Overhead Mode  . . . . . . . . . . . . . . . . . . . . 15
+   6.  Congestion Control . . . . . . . . . . . . . . . . . . . . . . 17
+   7.  Security Considerations  . . . . . . . . . . . . . . . . . . . 18
+   8.  Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 19
+   9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 20
+     9.1.  Normative References . . . . . . . . . . . . . . . . . . . 20
+     9.2.  Informative References . . . . . . . . . . . . . . . . . . 20
+   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009                [Page 2]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+1.  Conventions used in this document
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in RFC 2119 [rfc2119].
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009                [Page 3]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+2.  Overview of the CELT Codec
+
+   CELT stands for "Constrained Energy Lapped Transform".  It applies
+   some of the CELP principles, but does everything in the frequency
+   domain, which removes some of the limitations of CELP.  CELT is
+   suitable for both speech and music and currently features:
+
+   o  Ultra-low algorithmic delay (as low as 2 ms)
+
+   o  Full audio bandwidth (up to 20 kHz audio bandwidth)
+
+   o  Support for both voice and music
+
+   o  Stereo support
+
+   o  Packet loss concealment
+
+   o  Constant bitrates from under 32 kbps to 128 kbps and above
+
+   o  Free software/open-source
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009                [Page 4]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+3.  RTP payload format for CELT
+
+   For RTP based transportation of CELT encoded audio the standard RTP
+   header [rfc3550] is followed by one or more payload data blocks.  An
+   optional padding terminator may also be used.
+
+
+        0                   1                   2                   3
+        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+       |                         RTP Header                            |
+       +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+       |                   one or more frames of CELT ....             |
+       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+       |                              ....                             |
+       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+3.1.  RTP Header
+
+
+        0                   1                   2                   3
+        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+       |V=2|P|X|  CC   |M|     PT      |       sequence number         |
+       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+       |                           timestamp                           |
+       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+       |           synchronization source (SSRC) identifier            |
+       +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+       |            contributing source (CSRC) identifiers             |
+       |                              ...                              |
+       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   The RTP header is defined in the RTP specification [rfc3550].  This
+   section defines how fields in the RTP header are used.
+
+   Padding (P): 1 bit
+
+   If the padding bit is set, the packet contains one or more additional
+   padding octets at the end which are not part of the payload.  The
+   last octet of the padding contains a count of how many padding octets
+   should be ignored, including itself.  Padding may be needed by some
+   encryption algorithms with fixed block sizes or for carrying several
+   RTP packets in a lower-layer protocol data unit.
+
+   Extension (X): 1 bit
+
+   If the extension, X, bit is set, the fixed header MUST be followed by
+
+
+
+Valin & Maxwell         Expires November 9, 2009                [Page 5]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+   exactly one header extension, with a format defined in Section 5.3.1.
+   of [rfc3550].
+
+   Marker (M): 1 bit
+
+   The M bit MUST be set to zero in all packets.  The receiver MUST
+   ignore the M bit.
+
+   Payload Type (PT): 7 bits
+
+   Payload Type (PT): The assignment of an RTP payload type for this
+   packet format is outside the scope of this document; it is specified
+   by the RTP profile under which this payload format is used, or
+   signaled dynamically out-of-band (e.g., using SDP).
+
+   Timestamp: 32 bits
+
+   A timestamp representing the sampling time of the first sample of the
+   first CELT frame in the RTP payload.  The clock frequency MUST be set
+   to the sample rate of the encoded audio data and is conveyed out-of-
+   band (e.g., as an SDP parameter).
+
+3.2.  CELT payload
+
+   For the purposes of packetizing the bit stream in RTP, it is only
+   necessary to consider the sequence of bits as output by the CELT
+   encoder [celt-website], and present the same sequence to the decoder.
+   The payload format described here maintains this sequence.
+
+   A typical CELT frame, encoded at a high bitrate, is approx. 128
+   octets and the total size of the CELT frames SHOULD be kept below the
+   path MTU to prevent fragmentation.  CELT frames MUST NOT be split
+   across multiple RTP packets,
+
+   An RTP packet MAY contain CELT frames of the same bit rate or of
+   varying bit rates, since the bitrate for the frames is explicitly
+   conveyed in band with the signal.  The encoding and decoding
+   algorithm can change the bit rate at any frame boundary, with the bit
+   rate change notification provided in-band.  No out-of-band
+   notification is required for the decoder to process changes in the
+   bit rate sent by the encoder.
+
+   It is RECOMMENDED that sampling rates 32000, 44100, or 48000 Hz be
+   used for most applications, unless a specific reason exists -- such
+   as requirements for a very specific packetization time.  For example,
+   51200 Hz sampling may be useful to obtain a 5 ms packetization time
+   with 256-sample frames.  For compatibility reasons, the sender and
+   receiver MUST support 48000 Hz sampling rate.
+
+
+
+Valin & Maxwell         Expires November 9, 2009                [Page 6]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+   The CELT codec always produces an integer number of bytes and can
+   produce any integer number of bytes, so no padding is ever required.
+   Bitrate adjustment SHOULD be used instead of padding.
+
+3.3.  Multiple CELT frames in a RTP packet
+
+   The bitrate used by CELT is implicitly determined by the size of the
+   compressed data.  When more than one frame is encoded in the same
+   packet, it is not possible to determine the size of each encoded
+   frame, so the information MUST be explicitly encoded.  If N frames
+   are present in a packet, N compressed frame sizes need to be encoded
+   at the beginning of the packet.  Each size that is less than 255
+   bytes is encoded in one byte (unsigned 8-bit integer).  For sizes
+   greater or equal to 255, a 0xff byte is encoded, followed by the
+   size-255.  Multiple 0xff bytes are allowed if there are more than 510
+   bytes transmitted.  The length is always the size of the CELT frame
+   excluding the length byte itself.  The payload MUST NOT be padded,
+   except in accordance with the padding bit definition in the RTP
+   header.
+
+   Below is an example of two CELT frames contained within one RTP
+   packet.
+
+
+       0                   1                   2                   3
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |V=2|P|X|  CC   |M|     PT      |       sequence number         |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                           timestamp                           |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |         synchronization source (SSRC) identifier              |
+      +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+      |         contributing source (CSRC) identifiers                |
+      |                              ...                              |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      | length frame 1| length frame 2|          CELT frame 1...      |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |         (frame 1)             |        CELT frame 2...        |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                          (frame 2)                            |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   The following is an example of C code that interprets the length
+   bytes:
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009                [Page 7]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+      int i, N, pos;
+      int sizes[MAX_FRAMES][channels];
+      unsigned int total_size;
+      total_size=0;
+      N = 0;
+      pos = 0;
+      while (total_size < payload_size) {
+         for (i=0;i<channels;i++) {
+            int s;
+            int sum;
+            sum = 0;
+            do {
+               s = payload[pos++];
+               sum += s;
+               total_size += s+1;
+            } while (s == 255);
+            sizes[N][i] = sum;
+         }
+         N++;
+      }
+
+3.4.  Multiple channels
+
+   CELT supports both mono streams and stereo streams.  If more than two
+   channels are desired, it is possible to use transmit multiple streams
+   in the same packet.  In this case, the number of streams S and the
+   pairing must be agreed with out-of-band negotiation such as SDP.
+   Each stream can be either mono or stereo, depending on whether the
+   channels are assumed to be correlated.  For example, a 5.1 surround
+   could have the front-left and front-right channels in a stereo
+   stream, the rear-left and rear-right channels in a separate stereo
+   stream, while the center and low-frequency channels would be in
+   separate mono streams.  In that example, the RTP packet would be:
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009                [Page 8]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+       0                   1                   2                   3
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |V=2|P|X|  CC   |M|     PT      |       sequence number         |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                           timestamp                           |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |         synchronization source (SSRC) identifier              |
+      +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+      |         contributing source (CSRC) identifiers                |
+      |                              ...                              |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |  Front length |  rear length  | center length |  LFE length   |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                        Front stereo                           |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                             ...                               |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |      ...      |        Rear stereo data...                    |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                             ...                               |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                       Center mono data...                     |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                             ...                               |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                               |      LFE mono data...         |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |                             ...                               |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   In the case where streams for multiple channels are used with
+   multiple frames of the same streams per packet, then all streams for
+   a certain timestamp are encoded before all streams for the following
+   timestamp.  In the case of the 5.1 example above with two frames per
+   packet, the number of compressed length fields would be S*N = 8.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009                [Page 9]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+4.  MIME registration of CELT
+
+   Full definition of the MIME [rfc2045] type for CELT will be part of
+   the Ogg Vorbis MIME type definition application [rfc3534].
+
+   MIME media type name: audio
+
+   MIME subtype: celt
+
+   Optional parameters:
+
+   Required parameters: to be included in the Ogg MIME specification.
+
+   Encoding considerations:
+
+   Security Considerations:
+
+   See Section 6 of RFC 3047.
+
+   Interoperability considerations: none
+
+   Published specification:
+
+   Applications which use this media type:
+
+   Additional information: none
+
+   Person & email address to contact for further information:
+
+
+      Jean-Marc Valin <[email protected]>
+
+   Intended usage: COMMON
+
+   Author/Change controller:
+
+      Author: Jean-Marc Valin <[email protected]>
+
+      Change controller: Jean-Marc Valin <[email protected]>
+
+      Change controller: IETF AVT Working Group
+
+   This transport type signifies that the content is to be interpreted
+   according to this document if the contents are transmitted over RTP.
+   Should this transport type appear over a lossless streaming protocol
+   such as TCP, the content encapsulation should be interpreted as an
+   Ogg Stream in accordance with [rfc3534], with the exception that the
+   content of the Ogg Stream may be assumed to be CELT audio and CELT
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 10]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+   audio only.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 11]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+5.  SDP usage of CELT
+
+   When conveying information by SDP [rfc2327], the encoding name MUST
+   be set to "CELT".  The sampling frequency is typically between 32000
+   and 48000 Hz.  Implementations SHOULD support both 44100 Hz and 48000
+   Hz.  The maximum bandwidth permitted for the CELT audio is encoded
+   using the "b=AS:" header, as explained in SDP [rfc2327].
+
+   The SDP parameters have the following interpretation with respect to
+   CELT:
+
+      b=AS: The maximum bandwidth (in kbit/s) allowed for CELT,
+      excluding the header overhead.  The default is 64 kbit/s.
+
+      ptime: The desired packetization time.  The sender SHOULD choose a
+      number of frames per packet that corresponds to the smallest
+      packetization time greater or equal to the specified ptime for the
+      selected frame size.  The default is 20 ms as specified in
+      [rfc3551]
+
+      maxptime: The maximum packetization time desired.  If the maximum
+      is lower than the smallest packetization time determined from the
+      chosen frame size (as described above), then that packtization
+      time SHOULD be used despite the maxptime value.  The default is
+      "no maximum".
+
+   CELT-specific parameters can be given via the "a=fmtp:" directive.
+   Several parameters can be given in a single a=fmtp line provided that
+   they are separated by a semi-colon.  The following parameters are
+   defined for use in this way:
+
+      frame-size: The frame size is the duration of each frame in
+      samples.  If more than one frame size is supported, a comma-
+      separated list can be used.  It is possible to use "any" to denote
+      that all even frame sizes are supported.  The default is 480.
+
+      mapping: Optional string describing the multi-channel mapping.
+
+   Because the frame-size is not transmitted in-band, an SDP answer MUST
+   contain only one frame-size, even if multiple frame sizes were
+   offered.
+
+   The selected frame-size values MUST be even.  They SHOULD be
+   divisible by 8 and have a prime factorization which consists only of
+   2, 3, or 5 factors.  For example, powers-of-two and values such as
+   160, 320, 240, and 480 are recommended.  Implementations MUST support
+   receiving and sending the default value of 480, and if the size 480
+   is supported it MUST be offered.  Implementations SHOULD also support
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 12]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+   frame sizes of 256 and 512 since these are the ones that lead to the
+   lowest complexity.  When frame sizes that are powers of two are
+   supported, they SHOULD be listed first in the offer and chosen over
+   non powers of two in the answer.
+
+   Care must be taken when setting the value of ptime: and b=AS: so that
+   the RTP packet size does not exceed the path MTU.
+
+   An example of the media representation in SDP for offering a single
+   channel of CELT at 48000 samples per second might be:
+
+
+      m=audio 8088 RTP/AVP 97
+
+      a=rtpmap:97 CELT/48000
+
+   Note that the RTP payload type code of 97 is defined in this media
+   definition to be 'mapped' to the CELT codec at a 48kHz sampling
+   frequency using the 'a=rtpmap' line.  Any number from 96 to 127 could
+   have been chosen (the allowed range for dynamic types).  If there is
+   more than one channel being encoded the rtpmap MUST specify the
+   channel count.
+
+   The following example illustrates the case where the offerer cannot
+   receive more than 64 kbit/s.
+
+
+      m=audio 8088 RTP/AVP 97
+
+      b=AS:64
+
+      a=rtmap:97 CELT/48000
+
+   In this case, if the remote party agrees, it should configure its
+   CELT encoder so that it does not use modes that produce more than 64
+   kbit/s.  Note that the "b=" constraint also applies on all payload
+   types that may be proposed in the media line ("m=").
+
+   The following example demonstrates the use of the a=fmtp: parameters:
+
+
+      m=audio 8008 RTP/AVP 97
+
+      a=ptime: 21
+
+      a=rtpmap:97 CELT/44100
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 13]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+      a=fmtp:97 frame-size=512;
+
+   This examples illustrate an offerer that wishes to receive a CELT
+   stream at 44100 Hz, by packing two 512-sample frames in each packet.
+
+5.1.  Multichannel Mapping
+
+   When more than two channels are used, a mapping parameter MUST be
+   provided.  The mapping parameter is defined as comma separated list
+   of integers which specify the number of channels contained in each
+   CELT stream, OPTIONALLY followed by a '/' and a comma separated list
+   of channel identifiers, then OPTIONALLY another '/' and a string
+   which provides an application specific elaboration on any speaker-
+   feed definitions.  The channels per stream entries MUST be either 1
+   or 2.  The total number of channels is indicated by the sum of the
+   channels per stream entries.  The sum of the channel counts MUST be
+   equal to the total number of channels.
+
+   Channel identifiers are short alphanumeric strings.  Each identifier
+   MUST begin with a letter indicating the type of channel.  'A' MUST be
+   used to indicate an ambisonic channel, 'S' to indicate a speaker-feed
+   channel, or 'O' indicating other usage.
+
+   A channel identifier MAY be repeated, but the meaning of such
+   repetition is application specific.  Applications SHOULD attempt to
+   utilize channel identifiers such that mixing all identical
+   identifiers would produce a reasonable result.
+
+   Non-surround usage such as individual performer tracks, effect send,
+   "order wire", or other administrative channels may be given
+   application specific identifiers which MUST not conflict with the
+   identifiers defined in this draft.  These identifiers SHOULD begin
+   with S if it would be sensible to include them in a mono-downmix, or
+   O if it would be most sensible to exclude them from a mono-downmix.
+   An example usage might be mapping=2,1,2,1,1/
+   SLguitar,SRguitar,OheadsetG,SLkeyboard,SRkeyboard,OheadsetK,SMbass,Oh
+   eadsetB"
+
+   Ambisonic channels MUST follow the Furse-Malham naming and weighing
+   conventions for up to third order spherical[Ambisonic].  Higher order
+   ambisonic support is application defined but MUST NOT reuse any of
+   WXYZRSTUVKLMNOPQ for higher order components.  For example, second
+   order spherical ambisonics SHOULD use the mapping
+   "mapping=1,1,1,1,1,1,1,1,1/AW,AX,AY,AZ,AR,AS,AT,AU,AV".  Any set of
+   Ambisonic channels MUST contain at least one "AW" channel.
+
+   Speaker-feed identifiers are named based on the intended speaker
+   locations.  "L", "R" for the left and right speakers, respectively,
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 14]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+   in conventional stereo or the front left and right in 4, 5, 5.1, or
+   7.1 channel surround.  "LR", "RR" for the left and right rear
+   speakers in 4,5 or 5.1 channel surround.  C" is used for a center
+   channel, "MLFE" for a low frequency extension channel.  "LS", "RS"
+   for the side channels in 7.1 channel surround.  Additional speaker-
+   feeds are application specific but should not reuse the prior
+   identifiers.  For 5.1 surround in non-ambisonic form the mapping
+   SHOULD be "mapping=2,2,1,1/L,R,LR,RR,C,MLFE/ITU-RBS.775-1".  When
+   only one or two channels are used, the mapping parameter MAY be
+   omitted, in which case the default mapping is used.  For one channel,
+   the default is "mapping=1/C", while for two channels, the default is
+   "mapping=2/L,R".
+
+   For example a stereo configuration might signal:
+
+
+      m=audio 8008 RTP/AVP 97
+
+      a=ptime: 5
+
+      a=rtpmap:97 CELT/44100/2
+
+      a=fmtp:97 frame-size=256;
+
+   Which specifies a single two-channel CELT stream according to the
+   default mapping.
+
+5.2.  Low-Overhead Mode
+
+   A low-overhead mode is defined to make more efficient use of
+   bandwidth when transmitting CELT frames.  In that mode none of the
+   length values need to be transmitted.  One the a=fmtp: parameter low-
+   overhead: is defined and contains a single frame size, followed by a
+   '/', followed by the number of frames (per channel) per packet,
+   followed by a '/', followed by a comma-separated list of the number
+   of bytes per frame for each stream defined in the channel mapping.
+   The frame-size: parameter MUST not be specified and SHOULD be ignored
+   if encountered in an SDP offer or answer.  The ptime:, maxptime: and
+   b=AS: parameters SHOULD also be ignored since the low-overhead:
+   parameter makes them redundant.  When the low-overhead: parameter is
+   specified, the length of each frame MUST NOT be encoded in the
+   payload and the bit-rate MUST NOT be changed during the session.
+
+   For example a low-overhead surround configuration could be signaled
+   as:
+
+      m=audio 8008 RTP/AVP 97
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 15]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+      a=ptime: 5
+
+      a=rtpmap:97 CELT/48000/6
+
+      a=fmtp:97 low-overhead=256/1/86,86,43,30;mapping=2,2,1,1/
+      L,R,LR,RR,C,MLFE/ITU-RBS.775-1
+
+   In this example, 4 bytes per packet would be saved.  This corresponds
+   to a 6 kbit/s reduction in the overhead, although the 60 kbit/s
+   overhead of the IP, UDP and RTP headers is still present.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 16]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+6.  Congestion Control
+
+   CELT allows for bitrate adjustment in one byte per frame increments
+   without any signaling requirement or overhead.  Applications SHOULD
+   utilize congestion control to regulate the transmitted bitrate.  In
+   some applications it may make sense to increase the packetization
+   interval rather than decreasing the codec bitrate.  Congestion
+   control implementations should consider the users differential
+   tolerance for high latency and low quality.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 17]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+7.  Security Considerations
+
+   RTP packets using the payload format defined in this specification
+   are subject to the security considerations discussed in the RTP
+   specification [rfc3550], and in any applicable RTP profile.  The main
+   security considerations for the RTP packet carrying the RTP payload
+   format defined within this memo are confidentiality, integrity and
+   source authenticity.  Confidentiality is achieved by encryption of
+   the RTP payload.  Integrity of the RTP packets through suitable
+   cryptographic integrity protection mechanism.  Cryptographic system
+   may also allow the authentication of the source of the payload.  A
+   suitable security mechanism for this RTP payload format should
+   provide confidentiality, integrity protection and at least source
+   authentication capable of determining if an RTP packet is from a
+   member of the RTP session or not.
+
+   Note that the appropriate mechanism to provide security to RTP and
+   payloads following this memo may vary.  It is dependent on the
+   application, the transport, and the signalling protocol employed.
+   Therefore a single mechanism is not sufficient, although if suitable
+   the usage of SRTP [rfc3711] is recommended.  Other mechanism that may
+   be used are IPsec [rfc4301] and TLS [rfc5246] (RTP over TCP), but
+   also other alternatives may exist.
+
+   This RTP payload format and its media decoder do not exhibit any
+   significant non-uniformity in the receiver-side computational
+   complexity for packet processing, and thus are unlikely to pose a
+   denial-of-service threat due to the receipt of pathological data.
+   Nor does the RTP payload format contain any active content.
+
+   Because this format supports VBR operation small amounts of
+   information about the transmitted audio may be leaked by a length
+   preserving cryptographic transport.  Accordingly, when CELT is used
+   inside a secure transport the sender SHOULD restrict the use of VBR
+   to congestion control purposes.
+
+   CELT implementations will typically exhibit tiny content-sensitive
+   encoding time variances.  Since transmission is usually triggered by
+   an accurate hardware clock and the encoded data is typically
+   transmitted as soon as encoding is complete this variance may result
+   in a small amount of additional frame to frame jitter which could be
+   measured by a third-party.  Encrypted implementations SHOULD transmit
+   packets at fixed intervals to avoid the possible information leak.
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 18]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+8.  Acknowledgments
+
+   The authors would also like to thank the following people for their
+   input: Timothy B. Terriberry, Ben Schwartz, Alexander Carot, Thorvald
+   Natvig, Brian West, Steve Underwood, and Anthony Minessale.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 19]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+9.  References
+
+9.1.  Normative References
+
+   [rfc2119]  Bradner, S., "Key words for use in RFCs to Indicate
+              Requirement Levels", RFC 2119.
+
+   [rfc3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
+              Jacobson, "RTP: A Transport Protocol for real-time
+              applications", RFC 3550.
+
+   [rfc2045]  "Multipurpose Internet Mail Extensions (MIME) Part One:
+              Format of Internet Message Bodies", RFC 2045,
+              November 1998.
+
+   [rfc2327]  Jacobson, V. and M. Handley, "SDP: Session Description
+              Protocol", RFC 2327, April 1998.
+
+   [rfc3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
+              Video Conferences with Minimal Control.", RFC 3551,
+              July 2003.
+
+   [rfc3534]  Walleij, L., "The application/ogg Media Type", RFC 3534,
+              May 2003.
+
+   [rfc3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
+              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
+              RFC 3711, March 2004.
+
+   [rfc4301]  Kent, S. and K. Seo, "Security Architecture for the
+              Internet Protocol", RFC 4301, December 2005.
+
+   [rfc5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
+              (TLS) Protocol Version 1.2", RFC 5246, August 2008.
+
+9.2.  Informative References
+
+   [celt-website]
+              Xiph.Org Foundation, "The CELT ultra-low delay audio
+              codec", CELT website http://www.celt-codec.org/.
+
+   [Ambisonic]
+              Malham, D., "Higher order Ambisonic systems", Paper http:/
+              /www.york.ac.uk/inst/mustech/3d_audio/
+              higher_order_ambisonics.pdf, December 2003.
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 20]
+
+Internet-Draft       draft-valin-celt-rtp-profile-00            May 2009
+
+
+Authors' Addresses
+
+   Jean-Marc Valin
+   Octasic Semiconductor
+   4101, Molson Street, suite 300
+   Montreal, Quebec  H1Y 3L1
+   Canada
+
+   Email: [email protected]
+
+
+   Gregory Maxwell
+   Juniper Networks
+   2251 Corporate Park Drive, Suite 100
+   Herndon, VA  20171-1817
+   USA
+
+   Email: [email protected]
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Valin & Maxwell         Expires November 9, 2009               [Page 21]
+
--- a/doc/ietf/draft-valin-celt-rtp-profile.xml
+++ b/doc/ietf/draft-valin-celt-rtp-profile.xml
@@ -2,7 +2,7 @@
 <!DOCTYPE rfc SYSTEM 'rfc2629.dtd'>
 <?rfc symrefs="yes" toc="yes" ?>
 
-<rfc ipr="full3978" docName="RTP Payload Format for the CELT Codec">
+<rfc ipr="trust200902" docName="RTP Payload Format for the CELT Codec">
 
 <front>
 <title>draft-valin-celt-rtp-profile-01</title>
@@ -38,7 +38,7 @@
 </address>
 </author>
 
-<date day="27" month="February" year="2009" />
+<date day="11" month="May" year="2009" />
 
 <area>General</area>
 <workgroup>AVT Working Group</workgroup>