Codec Freedom -- Specifications -- G.711 bit-insertion mode

The bit-insertion mode for G.711 is based on a fixed noise level, which it applies to the A-law representation. This leaves room for data in the lower-valued samples. A transformation table and procedures define the insertion and extraction of data bits.

Setting a Noise Level for G.711

The G.711 codec represents each sample in one byte, using a floating-point format. This format reserves 1 sign bit, 3 exponent bits and 4 mantissa bits. There is no compression based on the relationships between individual samples.

The exponent sets a base sample value, and the mantissa adds detail in 1/16th steps. This is done for each individual sample, meaning that the loudest samples determine the noise level, while quieter samples carry resolution that goes unused. Although the exponent does help to handle various volume levels, in practice the sound level of the codec is reasonably fixed; for instance, phones do not increase playback volume when the line is silent.
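As a minimal sketch of the byte layout described above, the following Python helper splits a byte into its three fields. The function name `split_g711` is an assumption made for illustration, and the sketch works on the logical bit layout (sign, exponent, mantissa) as this text describes it, leaving aside any bit inversion that may be applied on the wire.

```python
def split_g711(byte):
    """Split one G.711 byte into (sign, exponent, mantissa).

    Hypothetical helper: assumes the logical layout given in the text,
    1 sign bit, 3 exponent bits, 4 mantissa bits, most significant first.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    sign = (byte >> 7) & 0b1          # 1 means positive in this document
    exponent = (byte >> 4) & 0b111    # selects the base sample value
    mantissa = byte & 0b1111          # detail in 1/16th steps
    return sign, exponent, mantissa


if __name__ == "__main__":
    print(split_g711(0b11010110))
```

The example byte 1 101 0110 splits into a positive sign, exponent 5 and mantissa 6.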

This means that it is reasonable to define a fixed noise level for the G.711 codec, and treat any G.711 bit below the noise level as a candidate to carry data. Most candidates will pass through, but a few must be held back on account of the translation between the μ-law and A-law variants of G.711. The result of this system is a G.711 codec that sounds like the normal one, albeit with more noise, but with room for data.

The noise level set by this specification is -48 dB, comparable to 8-bit PCM and reminiscent of AM radio. The noise can be heard at this level, but it is not disruptive to vocal communication. When the opportunistic negotiation of G.711+data fails, the codec can stop and return to unmodified G.711 operation. If it succeeds, however, the sound quality can be greatly improved; in that sense, the noise can be seen as an investment in better sound quality.
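One way to read the -48 dB figure, offered here as an assumption rather than as part of the specification, is that a level 8 binary digits below full scale corresponds to 20 times the base-10 logarithm of 2 to the power -8. The helper name `level_db` is illustrative only.

```python
import math


def level_db(bits):
    """Level, in dB relative to full scale, of a step `bits` binary digits down."""
    return 20 * math.log10(2.0 ** -bits)


if __name__ == "__main__":
    # For 8 bits this lands close to the -48 dB named in the text.
    print(round(level_db(8), 2))
```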

Defining the sPCM7 Codec

This specification assumes linear sample data to be laid out in a particular format, which shall be referred to as sPCM7. The goal is not to define a new codec for broad use, but rather to have a well-defined format that translates easily to and from existing linear formats.

The sPCM7 format represents each sample as a sign bit followed by 7 bits of linear volume information. In correspondence with G.711, the sign bit is set to 1 for positive values, and to 0 for negative values.

The zero volume occurs with both a positive and a negative sign; this should be interpreted as the result of rounding sample values down. The sample values -1, -0, +0, +1 should describe an equidistant sequence.

To map a linear sample value in two’s complement to a sample with a separate sign, the sign is extracted and the sample value is copied if it is non-negative, or complemented (but not incremented) if it is negative. The resulting linear sample values include support for -0 as well as +0. The most significant 7 bits of the result (not counting the sign bit) are used in sPCM7, but only after subtracting 00000001000… from it, which in terms of sPCM7 means subtracting 0.5; if the result would be negative, it is set to 0. Effectively, the distance between -0 and +0 is now the same as the distance between +0 and +1, as sPCM7 requires.
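The mapping above can be sketched as follows, assuming 16-bit two's complement input. Under that assumption the sPCM7 volume occupies bits 14 down to 8, so the 0.5 step is taken to be 0x0080. Both the 16-bit width and the name `pcm16_to_spcm7` are assumptions made for illustration.

```python
def pcm16_to_spcm7(sample):
    """Map a 16-bit two's complement sample to an sPCM7 byte (sketch).

    Assumption: 16-bit input, so half an sPCM7 step equals 0x0080.
    """
    if not -32768 <= sample <= 32767:
        raise ValueError("expected a 16-bit signed sample")
    if sample >= 0:
        sign = 1                 # 1 marks positive values, as in G.711
        magnitude = sample       # copied unchanged
    else:
        sign = 0
        magnitude = ~sample      # complemented, not incremented: -1 becomes 0
    # Subtract 0.5 in sPCM7 terms; a negative result is set to 0.
    magnitude = max(magnitude - 0x0080, 0)
    # Keep the 7 most significant magnitude bits (bits 14..8).
    return (sign << 7) | (magnitude >> 8)


if __name__ == "__main__":
    for value in (0, -1, 383, 384, 32767, -32768):
        print(value, format(pcm16_to_spcm7(value), "08b"))
```

In this sketch the inputs 0 and -1 map to +0 and -0 respectively, which is the pair of zero values the text describes.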

To map an sPCM7 value to another format, it should be observed that the value has only 5 bits of actual value, in most cases a 1 followed by 4 mantissa bits. This value can be repeated as often as needed to fill a larger sample value: the sPCM7 absolute value is placed in the top 7 bits (not counting the sign bit) of the new sample value, and the rest is filled with zeroes. The value is then shifted 5 bits down, and the result is applied to the sample with bitwise-or. This repeats until nothing more is added. After this operation, 000000001000… is added, that is 0.5 relative to the original sPCM7 value, to increase the distance between -0 and +0. Finally, when the sign is negative, the absolute value is complemented (but not incremented) to obtain a two’s complement sample value.
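A minimal sketch of this expansion, assuming a 16-bit two's complement target, is given below. The name `spcm7_to_pcm16` is hypothetical. The clamp to 0x7FFF is also an assumption of this sketch: the text does not say what happens when adding 0.5 to the largest value would exceed the available bits.

```python
def spcm7_to_pcm16(spcm7):
    """Expand an sPCM7 byte to a 16-bit two's complement sample (sketch)."""
    if not 0 <= spcm7 <= 0xFF:
        raise ValueError("expected a single byte")
    sign = spcm7 >> 7
    # Place the absolute value in the top 7 bits below the sign bit.
    magnitude = (spcm7 & 0x7F) << 8
    # Repeat the value 5 bits further down until nothing more is added.
    shifted = magnitude >> 5
    while shifted:
        magnitude |= shifted
        shifted >>= 5
    # Add 0.5 in sPCM7 terms; saturating here is an assumption of this sketch.
    magnitude = min(magnitude + 0x0080, 0x7FFF)
    # Negative values are complemented, not incremented.
    return magnitude if sign else ~magnitude


if __name__ == "__main__":
    for byte in (0x80, 0x00, 0x81, 0xFF, 0x7F):
        print(format(byte, "08b"), spcm7_to_pcm16(byte))
```

Under these assumptions +0 and -0 expand to 128 and -129, so they end up one full sPCM7 step apart, just like +0 and +1.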

G.711 Transformation Table

The following table defines patterns that are used for translation to and from G.711+data. Note that the table is constructed such that both the sPCM7 and G.711+data columns can be used to find exactly one row for every possible byte value, if the bits marked s, x, y and z are bound as variable bits.

sPCM7     Data Field  G.711+data  Distorted Values           Data Values
s1xxxxyy              s111xxxx                               NA
s01xxxxy              s110xxxx                               NA
s001xxxx              s101xxxx    -80, 80                    NA
s0001xxx  z           s100xxxz                               0, 1
s0000111  zz          s01111zz    -63, 63                    0, 10, 11, NA
s0000110  zz          s01110zz                               00, 01, 10, 11
s000010x  zz          s0110xzz                               00, 01, 10, 11
s0000011  zzz         s0101zzz    -47, -45, 45, 47           00, 010, 011, 100, 101, NA, 11, NA
s0000010  zzz         s0100zzz    -32, 32                    NA, 000, 001, 010, 011, 100, 101, 11
s0000001  zzzz        s001zzzz    -30, -28, -26, 26, 28, 30  0000, 0001, 0010, 0011, 0100, 0101,
                                                             0110, 0111, 100, NA, 101, NA, 11, NA,
                                                             ESC, NA
s0000000  zzzz        s000zzzz                               0000, 0001, 0010, 0011, 0100, 0101,
                                                             0110, 0111, 1000, 1001, 1010, 1011,
                                                             1100, 1101, 1110, 1111

The Data Values column gives the meaning of the z bits in the Data Field column, ordered by the binary value of the z bits. Not all bit sequences should be transmitted, because their values could be distorted when passed over a μ-law connection leg; such bit sequences are marked NA in the Data Values column. Because of NA and a few other special values, not all entries in the Data Values column carry the same number of data bits as there are z bits.
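The claim that each column selects exactly one row for every byte value can be checked mechanically. The sketch below transcribes the table into a Python list and counts matching rows for all 256 byte values. The names `TABLE`, `matches` and `rows_matching`, and the use of `None` for NA, are assumptions made for illustration.

```python
# (sPCM7 pattern, G.711+data pattern, data values ordered by z value).
# None stands for NA; rows without z bits carry no data values.
TABLE = [
    ("s1xxxxyy", "s111xxxx", []),
    ("s01xxxxy", "s110xxxx", []),
    ("s001xxxx", "s101xxxx", []),
    ("s0001xxx", "s100xxxz", ["0", "1"]),
    ("s0000111", "s01111zz", ["0", "10", "11", None]),
    ("s0000110", "s01110zz", ["00", "01", "10", "11"]),
    ("s000010x", "s0110xzz", ["00", "01", "10", "11"]),
    ("s0000011", "s0101zzz",
     ["00", "010", "011", "100", "101", None, "11", None]),
    ("s0000010", "s0100zzz",
     [None, "000", "001", "010", "011", "100", "101", "11"]),
    ("s0000001", "s001zzzz",
     ["0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
      "100", None, "101", None, "11", None, "ESC", None]),
    ("s0000000", "s000zzzz", [format(i, "04b") for i in range(16)]),
]


def matches(pattern, bits):
    """True when every literal 0 or 1 in the pattern equals the input bit."""
    return all(p == b for p, b in zip(pattern, bits) if p in "01")


def rows_matching(column, byte):
    """Rows whose pattern in the given column (0 or 1) matches the byte."""
    bits = format(byte, "08b")
    return [row for row in TABLE if matches(row[column], bits)]


spcm7_counts = [len(rows_matching(0, b)) for b in range(256)]
g711_counts = [len(rows_matching(1, b)) for b in range(256)]

if __name__ == "__main__":
    print(set(spcm7_counts), set(g711_counts))
```

Both sets come out as a single count of one row per byte, which agrees with the construction rule stated for the table.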

ESC marks a special value; sending it may be disruptive to the G.711 audio flow. It is sent to switch to packet mode. When ESC is sent with a positive sign, it indicates that the line is certainly free from distortion; with a negative sign it indicates the opposite (that is, there is or there may be distortion). Distortion Detection is part of the landline connection profile and not considered here.

Translating G.711 and Data to G.711+data

This procedure involves finding the G.711 value’s row in the transformation table, masking out s, x and z bits. If room exists for z bits, then their value is replaced with the index number in the Data Values column for the data to send. The s and x bits are retained as they originally were.

Translating PCM and Data to G.711+data

This procedure involves finding the sPCM7 value's row in the transformation table, masking out s, x and y bits. If the input is in another PCM format than sPCM7, then a translation must first be made. If room exists for z bits, then their value is found as the index number in the data values column for data to send. The data bits are taken from the most significant bits of the data flow; the number of bits consumed is determined by the number of bits matched by the data value found. The G.711+data value is composed from the table row pattern, and any x and z bits found; the s bit is copied from the sPCM7 input.
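The procedure above can be sketched as follows. The data flow is modelled as a string of '0' and '1' characters, which is an assumption of this sketch, as are the names `TABLE`, `matches` and `encode`. The sketch never selects ESC as data, and it raises an error when the data flow does not start with any sendable value (for example because it has run out), a case the text leaves open.

```python
# (sPCM7 pattern, G.711+data pattern, data values ordered by z value).
# None stands for NA; rows without z bits carry no data values.
TABLE = [
    ("s1xxxxyy", "s111xxxx", []),
    ("s01xxxxy", "s110xxxx", []),
    ("s001xxxx", "s101xxxx", []),
    ("s0001xxx", "s100xxxz", ["0", "1"]),
    ("s0000111", "s01111zz", ["0", "10", "11", None]),
    ("s0000110", "s01110zz", ["00", "01", "10", "11"]),
    ("s000010x", "s0110xzz", ["00", "01", "10", "11"]),
    ("s0000011", "s0101zzz",
     ["00", "010", "011", "100", "101", None, "11", None]),
    ("s0000010", "s0100zzz",
     [None, "000", "001", "010", "011", "100", "101", "11"]),
    ("s0000001", "s001zzzz",
     ["0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
      "100", None, "101", None, "11", None, "ESC", None]),
    ("s0000000", "s000zzzz", [format(i, "04b") for i in range(16)]),
]


def matches(pattern, bits):
    """True when every literal 0 or 1 in the pattern equals the input bit."""
    return all(p == b for p, b in zip(pattern, bits) if p in "01")


def encode(spcm7, data):
    """Return (G.711+data byte, remaining data bits) for one sPCM7 sample."""
    if not 0 <= spcm7 <= 0xFF:
        raise ValueError("expected a single byte")
    bits = format(spcm7, "08b")
    spat, gpat, values = next(r for r in TABLE if matches(r[0], bits))
    x_bits = [b for p, b in zip(spat, bits) if p == "x"]   # y bits are dropped
    z_count = gpat.count("z")
    z_bits, consumed = [], 0
    if z_count:
        for index, value in enumerate(values):
            if value not in (None, "ESC") and data.startswith(value):
                z_bits = list(format(index, "0{}b".format(z_count)))
                consumed = len(value)      # bits taken from the data flow
                break
        else:
            raise ValueError("data flow does not start with a sendable value")
    out = []
    for p in gpat:
        if p == "s":
            out.append(bits[0])            # sign copied from the sPCM7 input
        elif p == "x":
            out.append(x_bits.pop(0))
        elif p == "z":
            out.append(z_bits.pop(0))
        else:
            out.append(p)                  # literal 0 or 1 from the pattern
    return int("".join(out), 2), data[consumed:]


if __name__ == "__main__":
    byte, rest = encode(0b00000111, "110")
    print(format(byte, "08b"), rest)
```

In the usage example, the row s0000111 offers the data values 0, 10 and 11; the data flow starts with 11, so the z bits become the index of that entry and two bits are consumed.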

Translating G.711+data to PCM and Data

This procedure involves finding the G.711+data value’s row in the transformation table, masking out s, x and z bits. The z bits are looked up in the Data Values column, and the result is delivered to the data output. The sPCM7 value to output is defined in the table, after replacing its s and x bits with the values from the G.711+data input. Any y bits in the sPCM7 format should be set to zero, to avoid adding noise at the higher level where these y bits occur.
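A sketch of this decoding direction is shown below, under the same modelling assumptions: data bits are '0' and '1' characters in a string, `None` stands for NA, and the names `TABLE`, `matches` and `decode` are hypothetical. The sketch returns the table entry as found, so the caller has to decide what to do with NA or ESC, which this section does not settle.

```python
# (sPCM7 pattern, G.711+data pattern, data values ordered by z value).
# None stands for NA; rows without z bits carry no data values.
TABLE = [
    ("s1xxxxyy", "s111xxxx", []),
    ("s01xxxxy", "s110xxxx", []),
    ("s001xxxx", "s101xxxx", []),
    ("s0001xxx", "s100xxxz", ["0", "1"]),
    ("s0000111", "s01111zz", ["0", "10", "11", None]),
    ("s0000110", "s01110zz", ["00", "01", "10", "11"]),
    ("s000010x", "s0110xzz", ["00", "01", "10", "11"]),
    ("s0000011", "s0101zzz",
     ["00", "010", "011", "100", "101", None, "11", None]),
    ("s0000010", "s0100zzz",
     [None, "000", "001", "010", "011", "100", "101", "11"]),
    ("s0000001", "s001zzzz",
     ["0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
      "100", None, "101", None, "11", None, "ESC", None]),
    ("s0000000", "s000zzzz", [format(i, "04b") for i in range(16)]),
]


def matches(pattern, bits):
    """True when every literal 0 or 1 in the pattern equals the input bit."""
    return all(p == b for p, b in zip(pattern, bits) if p in "01")


def decode(g711d):
    """Return (sPCM7 byte, data value) for one G.711+data byte.

    The data value is "" when the row has no z bits, None for NA,
    "ESC" for the escape value, and a bit string otherwise.
    """
    if not 0 <= g711d <= 0xFF:
        raise ValueError("expected a single byte")
    bits = format(g711d, "08b")
    spat, gpat, values = next(r for r in TABLE if matches(r[1], bits))
    x_bits = [b for p, b in zip(gpat, bits) if p == "x"]
    z = "".join(b for p, b in zip(gpat, bits) if p == "z")
    data = values[int(z, 2)] if z else ""
    out = []
    for p in spat:
        if p == "s":
            out.append(bits[0])        # sign copied from the input
        elif p == "x":
            out.append(x_bits.pop(0))
        elif p == "y":
            out.append("0")            # y bits are set to zero
        else:
            out.append(p)              # literal 0 or 1 from the pattern
    return int("".join(out), 2), data


if __name__ == "__main__":
    sample, data = decode(0b00111110)
    print(format(sample, "08b"), data)
```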

Translating G.711+data to G.711 and Data

This procedure involves finding the G.711+data value’s row in the transformation table, masking out s, x and z bits. The z bits are looked up in the Data Values column, and the result is delivered to the data output. The G.711 value to output takes the s, x, 1 and 0 bits from the G.711+data pattern in the table. The z bits SHOULD NOT be the same as the incoming data, because that might enable injected sound; instead, the z bits SHOULD be replaced by one 1 bit followed by as many 0 bits as there are z bits left.
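The replacement of the z bits can be sketched as below. Only the audio side is shown; the data lookup is the same as in the translation to PCM. The names `G711_PATTERNS`, `matches` and `strip_data` are assumptions made for illustration, and the sketch relies on the z bits always being the trailing bits of the patterns in the table.

```python
# G.711+data patterns from the transformation table, in table order.
G711_PATTERNS = [
    "s111xxxx", "s110xxxx", "s101xxxx", "s100xxxz",
    "s01111zz", "s01110zz", "s0110xzz", "s0101zzz",
    "s0100zzz", "s001zzzz", "s000zzzz",
]


def matches(pattern, bits):
    """True when every literal 0 or 1 in the pattern equals the input bit."""
    return all(p == b for p, b in zip(pattern, bits) if p in "01")


def strip_data(g711d):
    """Return a plain G.711 byte with the z bits replaced by 1 then 0s."""
    if not 0 <= g711d <= 0xFF:
        raise ValueError("expected a single byte")
    bits = format(g711d, "08b")
    pattern = next(p for p in G711_PATTERNS if matches(p, bits))
    z_count = pattern.count("z")
    if z_count == 0:
        return g711d                      # no data bits in this row
    # The z bits are the trailing bits of every pattern that has them.
    kept = bits[:8 - z_count]             # s, x, 1 and 0 bits are retained
    return int(kept + "1" + "0" * (z_count - 1), 2)


if __name__ == "__main__":
    for byte in (0xCA, 0x3D, 0x9F, 0xF5):
        print(format(byte, "08b"), format(strip_data(byte), "08b"))
```

Replacing the z bits with a 1 followed by zeroes puts the output near the middle of the range those bits span, whatever data was carried.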
