shithub: opus

--- a/doc/ietf/draft-valin-celt-codec.xml

+++ b/doc/ietf/draft-valin-celt-codec.xml

@@ -367,8 +367,8 @@

 <section anchor="encoding-symbols" title="Encoding Symbols">

<t>

-   The main encoding function is ec_encode() (rangeenc.c (Appendix

-   A.29)), which takes as an argument a three-tuple (fl,fh,ft)

+   The main encoding function is ec_encode() (<xref target="rangeenc.c">rangeenc.c</xref>),

+   which takes as an argument a three-tuple (fl,fh,ft)

    describing the range of the symbol to be encoded in the current

    context, with 0 &lt;= fl &lt; fh &lt;= ft &lt;= 65535. The values of this tuple

    are derived from the probability model for the symbol. Let f(i) be

@@ -390,7 +390,7 @@

    rng > 2^23. First, the top 9 bits of low, (low>>23), are placed into

    a carry buffer. Then, low is set to <![CDATA[(low << 8 & 0x7FFFFFFF) and rng

    is set to (rng<<8)]]>. This process is carried out by

-   ec_enc_normalize() (rangeenc.c (Appendix A.29)).

+   ec_enc_normalize() (<xref target="rangeenc.c">rangeenc.c</xref>).

 </t>

<t>

    The 9 bits produced in each iteration of the normalization loop

@@ -402,17 +402,17 @@

    any mathematically equivalent scheme to perform carry propagation.

 </t>

<t>

-   The function ec_enc_carry_out() (rangeenc.c (Appendix A.29)) performs

+   The function ec_enc_carry_out() (<xref target="rangeenc.c">rangeenc.c</xref>) performs

    this buffering. It takes a 9-bit input value, c, from the normalization

    8-bit output and a carry bit. If c is 0xFF, then ext is incremented

    and no bytes are output. Otherwise, if rem is not the special value

    -1, then the byte (rem+(c>>8)) is output. Then ext bytes are output

    with the value 0 if the carry bit is set, or 0xFF if it is not, and

-   rem is set to the lower 8 bits of c.

+   rem is set to the lower 8 bits of c. After this, ext is set to zero

 </t>

<t>

    In the reference implementation, a special version of ec_encode()

-   called ec_encode_bin() (rangeenc.c (Appendix A.29)) is defined to

+   called ec_encode_bin() (<xref target="rangeenc.c">rangeenc.c</xref>) is defined to

    take a two-tuple (fl,ftb), where <![CDATA[0 <= fl < 2^ftb and ftb < 16. It is

    mathematically equivalent to calling ec_encode() with the three-tuple

    (fl,fl+1,1<<ftb)]]>, but avoids using division.

@@ -429,7 +429,7 @@

    symbols in smaller contexts.

 </t>

<t>

-   ec_enc_bits() (entenc.c (Appendix A.25)) is defined, like

+   ec_enc_bits() (<xref target="entenc.c">entenc.c</xref>) is defined, like

    ec_encode_bin(), to take a two-tuple (fl,ftb), with <![CDATA[0 <= fl < 2^ftb

    and ftb < 32. While ftb is greater than 8, it encodes bits (ftb-8) to

    (ftb-1) of fl, e.g., (fl>>ftb-8&0xFF) using ec_encode_bin() and

@@ -437,7 +437,7 @@

    (fl&(1<<ftb)-1)]]>, again using ec_encode_bin().

 </t>

<t>

-   ec_enc_uint() (entenc.c (Appendix A.25)) takes a two-tuple (fl,ft),

+   ec_enc_uint() (<xref target="entenc.c">entenc.c</xref>) takes a two-tuple (fl,ft),

    where ft is not necessarily a power of two. Let ftb be the location

    of the highest one bit in the two's-complement representation of

    (ft-1), or -1 if no bits are set. If ftb>8, then the top 8 bits of fl

@@ -462,7 +462,7 @@

    After the carry buffer is finished outputting bytes, the rest of the

    output buffer is padded with zero bytes. Finally, rem is set to the

    special value -1. This process is implemented by ec_enc_done()

-   (rangeenc.c (Appendix A.29)).

+   (<xref target="rangeenc.c">rangeenc.c</xref>).

 </t>

 </section>

@@ -473,8 +473,9 @@

    to encode the current frame thus far. This drives allocation

    decisions and ensures that the range code will not overflow the

    output buffer. This is computed in the reference implementation to

-   fractional bit precision by the function ec_enc_tell() (rangeenc.c

-   (Appendix A.29)). Like all operations in the range encoder, it must

+   fractional bit precision by the function ec_enc_tell()

+   (<xref target="rangeenc.c">rangeenc.c</xref>).

+   Like all operations in the range encoder, it must

    be implemented in a bit-exact manner.

 </t>

 </section>

@@ -875,17 +876,18 @@

    a value fs that lies within the range of some symbol in the current

    context. The second step updates the range decoder state with the

    three-tuple (fl,fh,ft) corresponding to that symbol, as defined in

-   Section 4.2.1.

+   <xref target="encoding-symbols"></xref>.

 </t>

<t>

-   The first step is implemented by ec_decode() (rangedec.c (Appendix

-   A.30)), and computes fs = ft-min((dif-1)/(rng/ft)+1,ft), where ft is

+   The first step is implemented by ec_decode()

+   (<xref target="rangedec.c">rangedec.c</xref>),

+   and computes fs = ft-min((dif-1)/(rng/ft)+1,ft), where ft is

    the sum of the frequency counts in the current context, as described

-   in Section 4.2.1. The divisions here are exact integer division.

+   in <xref target="encoding-symbols"></xref>. The divisions here are exact integer division.

 </t>

<t>

    In the reference implementation, a special version of ec_decode()

-   called ec_decode_bin() (rangeenc.c (Appendix A.29)) is defined using

+   called ec_decode_bin() (<xref target="rangeenc.c">rangeenc.c</xref>) is defined using

    the parameter ftb instead of ft. It is mathematically equivalent to

    calling ec_decode() with ft = (1&lt;&lt;ftb), but avoids one of the

    divisions.

@@ -908,7 +910,7 @@

    (dif&lt;&lt;8)-sym&amp;0xFFFFFFFF (i.e., using wrap-around if the subtraction

    overflows a 32-bit register). Finally, if dif is larger than 2^31,

    then dif is set to dif - 2^31. This process is carred out by

-   ec_dec_normalize() (rangedec.c (Appendix A.30)).

+   ec_dec_normalize() (<xref target="rangedec.c">rangedec.c</xref>).

 </t>

 </section>

@@ -921,7 +923,7 @@

    symbols in smaller contexts.

 </t>

<t>

-   ec_dec_bits() (entdec.c (Appendix A.27)) is defined, like

+   ec_dec_bits() (<xref target="entdec.c">entdec.c</xref>) is defined, like

    ec_decode_bin(), to take a single parameter ftb, with ftb &lt; 32.

    and ftb &lt; 32, and produces an ftb-bit decoded integer value, t,

    initalized to zero. While ftb is greater than 8, it decodes the next

@@ -933,7 +935,7 @@

    those bits to the final values of t, t = t&lt;&lt;ftb | s.

 </t>

<t>

-   ec_dec_uint() (entdec.c (Appendix A.27)) takes a single parameter,

+   ec_dec_uint() (<xref target="entdec.c">entdec.c</xref>) takes a single parameter,

    ft, which is not necessarily a power of two, and returns an integer,

    t, between 0 and ft-1, inclusive, which is intialized to zero. Let

    ftb be the location of the highest one bit in the two's-complement

@@ -956,7 +958,7 @@

    to decoded from the current frame thus far. This drives allocation

    decisions which must match those made in the encoder. This is

    computed in the reference implementation to fractional bit precision

-   by the function ec_dec_tell() (rangedec.c (Appendix A.30)). Like all

+   by the function ec_dec_tell() (<xref target="rangedec.c">rangedec.c</xref>). Like all

    operations in the range decoder, it must be implemented in a

    bit-exact manner, and must produce exactly the same value returned by

    ec_enc_tell() after encoding the same symbols.

--- a/libcelt/mfrngcod.h

+++ b/libcelt/mfrngcod.h

@@ -57,11 +57,4 @@

   We will only use EC_CODE_BITS of it.*/

 # define EC_CODE_MASK  ((((ec_uint32)1U)<<EC_CODE_BITS-1)-1<<1|1)

-/*The non-zero symbol of the second possible reserved ending.

-  This must be the high-bit.*/

-# define EC_FOF_RSV1      (1<<EC_SYM_BITS-1)

-/*A mask for all the other bits.*/

-# define EC_FOF_RSV1_MASK (EC_FOF_RSV1-1)

 #endif

--- a/libcelt/rangedec.c

+++ b/libcelt/rangedec.c

@@ -61,60 +61,6 @@

    encoding for efficiency actually re-discovers many of the principles

    behind range encoding, and presents a good theoretical analysis of them.

-  This coder handles the end of the stream in a slightly more graceful fashion

-   than most arithmetic or range coders.

-  Once the final symbol has been encoded, the coder selects the code word with

-   the shortest number of bits that still falls within the final interval.

-  This method is not novel.

-  Here, by the length of the code word, we refer to the number of bits until

-   its final 1.

-  Any trailing zeros may be discarded, since the encoder, once it runs out of

-   input, will pad its buffer with zeros.

-  But this means that no encoded stream would ever have any zero bytes at the

-   end.

-  Since there are some coded representations we cannot produce, it implies that

-   there is still some redundancy in the stream.

-  In this case, we can pick a special byte value, RSV1, and should the stream

-   end in a sequence of zeros, followed by the RSV1 byte, we can code the

-   zeros, and discard the RSV1 byte.

-  The decoder, knowing that the encoder would never produce a sequence of zeros

-   at the end, would then know to add in the RSV1 byte if it observed it.

-  Now, the encoder would never produce a stream that ended in a sequence of

-   zeros followed by a RSV1 byte.

-  So, if the stream ends in a non-empty sequence of zeros, followed by any

-   positive number of RSV1 bytes, the last RSV1 byte is discarded.

-  The decoder, if it encounters a stream that ends in non-empty sequence of

-   zeros followed by any non-negative number of RSV1 bytes, adds an additional

-   RSV1 byte to the stream.

-  With this strategy, every possible sequence of input bytes is transformed to

-   one that could actually be produced by the encoder.

-  The only question is what non-zero value to use for RSV1.

-  We select 0x80, since it has the nice property of producing the shortest

-   possible byte streams when using our strategy for selecting a number within

-   the final interval to encode.

-  Clearly if the shortest possible code word that falls within the interval has

-   its last one bit as the most significant bit of the final byte, and the

-   previous bytes were a non-empty sequence of zeros followed by a non-negative

-   number of 0x80 bytes, then the last byte would be discarded.

-  If the shortest code word is not so formed, then no other code word in the

-   interval would result in any more bytes being discarded.

-  Any longer code word would have an additional one bit somewhere, and so would

-   require at a minimum that that byte would be coded.

-  If the shortest code word has a 1 before the final one that is preventing the

-   stream from ending in a non-empty sequence of zeros followed by a

-   non-negative number of 0x80's, then there is no code word of the same length

-   which contains that bit as a zero.

-  If there were, then we could simply leave that bit a 1, and drop all the bits

-   after it without leaving the interval, thus producing a shorter code word.

-  In this case, RSV1 can only drop 1 bit off the final stream.

-  Other choices could lead to savings of up to 8 bits for particular streams,

-   but this would produce the odd situation that a stream with more non-zero

-   bits is actually encoded in fewer bytes.

   @PHDTHESIS{Pas76,

     author="Richard Clark Pasco",

     title="Source coding algorithms for fast data compression",