Golomb coding
Golomb coding is a form of entropy encoding invented by Solomon W. Golomb that is optimal for alphabets following geometric distributions, that is, when small values are vastly more common than large values. If negative values are present in the input stream, an overlap and interleave scheme is used, where each nonnegative input x is mapped to an even number (2x), and each negative input x is mapped to an odd number (2|x| - 1).
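The overlap-and-interleave mapping can be sketched as follows; `zigzag` is an illustrative name for this helper, not part of any standard library:

```cpp
#include <cassert>

// Map a signed integer onto the nonnegative integers so that values of
// small magnitude get small codes: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...
unsigned int zigzag(int x) {
    if (x >= 0)
        return 2u * (unsigned int)x;         // nonnegative -> even
    else
        return 2u * (unsigned int)(-x) - 1u; // negative -> odd
}
```

After this remapping, the transformed stream is nonnegative and still geometric-like, so ordinary Golomb coding applies.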
Golomb coding can also be used (as in Golomb's original article) to encode an alphabet of two symbols, where one is more likely than the other. In this case it can be thought of as a kind of run-length encoding.
Rice coding is a special case of Golomb coding first described by, and independently invented by, Robert F. Rice. It is equivalent to Golomb coding where the tunable parameter M is a power of two. This makes it extremely efficient for use on computers, since the division operation becomes a bit shift and the remainder operation becomes a bit mask.
Rice coding is used as the entropy encoding stage in a number of lossless image compression and audio data compression methods.
Overview
Golomb coding uses a tunable parameter M to divide an input value N into two parts: q, the result of a division by M, and r, the remainder. The quotient is sent in unary coding, followed by the remainder in truncated binary encoding. When M = 1, Golomb coding is equivalent to unary coding. Golomb-Rice codes can be thought of as codes that indicate a number by the position of the bin (q) and the offset within the bin (r). The above figure shows the position q and offset r for the encoding of integer N using Golomb-Rice parameter M.
Formally, the two parts are given by the following expressions, where N is the number being encoded: q = floor(N / M) and r = N - qM. The final result looks like: <Quotient Code><Remainder Code>, with the quotient in unary and the remainder in truncated binary.
Note that r can take a varying number of bits: it is exactly b bits for a Rice code, and switches between b - 1 and b bits for a Golomb code (i.e., when M is not a power of 2). Let b = ceiling(log2(M)). If r < 2^b - M, then use b - 1 bits to encode r. If r >= 2^b - M, then use b bits to encode the number r + 2^b - M. Clearly, if M is a power of 2, then 2^b - M = 0 and we can encode all values of r with b bits.
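The remainder rule above can be sketched as a small helper; `remainderCode` and its bit-string output are illustrative choices, not part of the original presentation:

```cpp
#include <cassert>
#include <string>

// Truncated binary code for a remainder r, 0 <= r < M.
// b = ceiling(log2(M)); the first 2^b - M remainders use b - 1 bits,
// the rest are offset by 2^b - M and use b bits.
std::string remainderCode(unsigned int r, unsigned int M) {
    unsigned int b = 0;
    while ((1u << b) < M) b++;           // b = ceiling(log2(M))
    unsigned int cutoff = (1u << b) - M; // number of short codes
    unsigned int bits, value;
    if (r < cutoff) { bits = b - 1; value = r; }
    else            { bits = b;     value = r + cutoff; }
    std::string out;
    for (int i = (int)bits - 1; i >= 0; i--)
        out += ((value >> i) & 1u) ? '1' : '0';
    return out;
}
```

For a power-of-two M the cutoff is 0, so every remainder takes the full b bits (the Rice case).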
The parameter M is a function of the corresponding Bernoulli process, which is parameterized by p, the probability of success in a given Bernoulli trial. M and p are related by these inequalities: (1 - p)^M + (1 - p)^(M+1) <= 1 < (1 - p)^(M-1) + (1 - p)^M.
The Golomb code for this distribution is equivalent to the Huffman code for the same probabilities, if it were possible to compute the Huffman code.
Simple Algorithm
Note below that this is the Rice-Golomb encoding, where the remainder code uses simple truncated binary encoding, also named "Rice coding" (other variable-length binary encodings, like arithmetic or Huffman encodings, are possible for the remainder codes, if the statistical distribution of remainder codes is not flat, and notably when not all possible remainders after the division are used). In this algorithm, if the M parameter is a power of 2, it becomes equivalent to the simpler Rice encoding. Fix the parameter M to an integer value.
 For N, the number to be encoded, find
 quotient = q = int[N/M]
 remainder = r = N%M
 Generate Codeword
 The code format: <Quotient Code><Remainder Code>, where
 Quotient Code (in unary coding)
 Write a q-length string of 1 bits
 Write a 0 bit
 Remainder Code (in truncated binary encoding)
 If M is a power of 2, code the remainder in plain binary format. So b = log2(M) bits are needed. (Rice code)
 If M is not a power of 2, set b = ceiling(log2(M))
 Code the first 2^b - M values of r as plain binary using b - 1 bits.
 Code the rest of the values of r by coding the number r + 2^b - M in plain binary representation using b bits.
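The steps above can be sketched end to end; emitting the codeword as a string of '0'/'1' characters is an illustrative simplification (a real coder would pack bits):

```cpp
#include <cassert>
#include <string>

// Golomb/Rice codeword for N >= 0 with parameter M >= 1,
// returned as a string of '0'/'1' characters.
std::string golombCodeword(unsigned int N, unsigned int M) {
    unsigned int q = N / M;              // quotient, sent in unary
    unsigned int r = N % M;              // remainder, sent in truncated binary
    std::string out(q, '1');             // q one-bits...
    out += '0';                          // ...terminated by a zero
    unsigned int b = 0;
    while ((1u << b) < M) b++;           // b = ceiling(log2(M))
    unsigned int cutoff = (1u << b) - M;
    unsigned int bits  = (r < cutoff) ? b - 1 : b;
    unsigned int value = (r < cutoff) ? r : r + cutoff;
    for (int i = (int)bits - 1; i >= 0; i--)
        out += ((value >> i) & 1u) ? '1' : '0';
    return out;
}
```

With M = 10 this reproduces the worked example below: golombCodeword(42, 10) yields 11110010.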
Example
Set M = 10. Thus b = ceiling(log2(10)) = 4, and the cutoff is 2^4 - 10 = 6. The first six remainders (r = 0 to 5) are therefore coded in b - 1 = 3 bits as 000 to 101, and the last four remainders (r = 6 to 9) are coded as r + 6 in b = 4 bits: 1100, 1101, 1110, and 1111.
For example, with a Rice-Golomb encoding of parameter M = 10, the decimal number 42 would first be split into q = 4, r = 2, and would be encoded as qcode(q),rcode(r) = qcode(4),rcode(2) = 11110,010 (you don't need to encode the separating comma in the output stream, because the 0 at the end of the q code is enough to say when q ends and r begins; both the qcode and rcode are self-delimiting).
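Decoding reverses the two parts: count the one-bits to get q, then undo the truncated binary remainder. A minimal sketch, assuming the codeword is available as a '0'/'1' string:

```cpp
#include <cassert>
#include <string>

// Decode a single Golomb codeword (given as a '0'/'1' string) with parameter M.
unsigned int golombDecodeOne(const std::string& code, unsigned int M) {
    size_t pos = 0;
    unsigned int q = 0;
    while (code[pos] == '1') { q++; pos++; }     // unary quotient
    pos++;                                       // skip the terminating 0 bit
    unsigned int b = 0;
    while ((1u << b) < M) b++;                   // b = ceiling(log2(M))
    unsigned int cutoff = (1u << b) - M;
    unsigned int r = 0;
    if (b > 0) {
        for (unsigned int i = 0; i + 1 < b; i++) // first b - 1 remainder bits
            r = (r << 1) | (code[pos++] == '1' ? 1u : 0u);
        if (r >= cutoff)                         // one extra bit is needed
            r = ((r << 1) | (code[pos++] == '1' ? 1u : 0u)) - cutoff;
    }
    return q * M + r;
}
```

On the worked example, golombDecodeOne("11110010", 10) recovers 42; the M = 1 case (no remainder bits) falls out of the b > 0 guard.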
Example code
Note: this basic code assumes that the M parameter is a power of 2; it does not implement the case where truncated binary encoding of division remainders is preferable (when M is not a power of 2, as in the previous example).
Encoding
 void golombEncode(char* source, char* dest, int M) {
     IntReader intreader(source);
     BitWriter bitwriter(dest);
     while (intreader.hasLeft()) {
         int num = intreader.getInt();
         int q = num / M;
         for (int i = 0; i < q; i++)
             bitwriter.putBit(true);         // write q ones
         bitwriter.putBit(false);            // write one zero
         int v = 1;
         for (int i = 0; i < log2(M); i++) { // write the log2(M) remainder bits, least significant first
             bitwriter.putBit(v & num);
             v = v << 1;
         }
     }
     bitwriter.close();
     intreader.close();
 }
Decoding
 void golombDecode(char* source, char* dest, int M) {
     BitReader bitreader(source);
     IntWriter intwriter(dest);
     while (bitreader.hasLeft()) {
         int q = 0;
         while (bitreader.getBit())
             q++;                            // count the unary quotient; potentially dangerous with malformed files
         int nr = 0;
         for (int a = 0; a < log2(M); a++)   // read out the sequential log2(M) remainder bits
             if (bitreader.getBit())
                 nr += 1 << a;
         nr += q * M;                        // combine the remainder and the multiple of M
         intwriter.putInt(nr);               // write out the value
     }
     bitreader.close();
     intwriter.close();
 }
Use for Run-Length Encoding
Given an alphabet of two symbols, or a set of two events, P and Q, with probabilities p and (1 - p) respectively, where p >= 1/2, Golomb coding can be used to encode runs of zero or more P's separated by single Q's. In this application, the best setting of the parameter M is the nearest integer to -1/log2(p). When p = 1/2, M = 1, and the Golomb code corresponds to unary coding (n >= 0 ones followed by a zero codes for n P's followed by a Q).
Applications
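The parameter choice above can be sketched numerically; the clamp to at least 1 is a defensive assumption, since -1/log2(p) is already >= 1 for p >= 1/2:

```cpp
#include <cassert>
#include <cmath>

// Nearest-integer Golomb parameter for run-length coding of P-runs,
// where p >= 1/2 is the probability of the common symbol P.
int runLengthParameter(double p) {
    double m = -1.0 / std::log2(p); // grows as runs of P get longer
    int M = (int)std::lround(m);
    return M < 1 ? 1 : M;
}
```

For p = 1/2 this gives M = 1 (pure unary), and the parameter grows quickly as p approaches 1, matching the longer expected runs.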
The FLAC audio codec uses Rice coding to represent linear prediction residues (because these residues follow a near-geometric distribution, with small residues being more frequent than large residues, and can then be encoded with fewer bits using a predefined Rice-Golomb encoding, without requiring more complex variable-length encodings like Huffman or arithmetic coding). However, this assumption fails for a simple sinusoidal signal, because the differential residues themselves form a sinusoidal signal whose values do not follow a geometric distribution (the highest and lowest residue values occur with similarly high frequency; only the median positive and negative residues occur less often).
Apple's ALAC audio codec uses modified Rice coding after its adaptive FIR filter.
The Golomb-Rice coder is used in the entropy coding stage of Rice-algorithm-based lossless image codecs. One such experiment yields the compression ratio graph given below. In those codecs, the progressive spatial differential data yields an alternating sequence of positive and negative values around 0, which are remapped to positive-only integers (by doubling the absolute value and adding one if the input is negative); Rice-Golomb coding is then applied with a varying divisor, which remains small.
Note that in those results, the Rice coding may create very long sequences of one-bits for the quotient; for practical reasons, it is often necessary to limit the total run length of one-bits, so a modified version of the Rice-Golomb encoding replaces a long string of one-bits with an encoding of its length, itself given a recursive Rice-Golomb encoding; this requires reserving some values in addition to the initial divisor k to allow the necessary distinction.
References
 Golomb, S. W. (1966). Run-length encodings. IEEE Transactions on Information Theory, IT-12(3):399-401.
 R. F. Rice, "Some Practical Universal Noiseless Coding Techniques," Jet Propulsion Laboratory, Pasadena, California, JPL Publication 79-22, Mar. 1979.
 Witten, Ian; Moffat, Alistair; Bell, Timothy. "Managing Gigabytes: Compressing and Indexing Documents and Images." Second Edition. Morgan Kaufmann Publishers, San Francisco, CA. 1999. ISBN 1-55860-570-3.
 David Salomon. "Data Compression," ISBN 0-387-95045-1.
 http://ese.wustl.edu/class/fl06/ese578/GolombCodingNotes.pdf
This article is copied from an article on Wikipedia.org  the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.