In recent years there has been extraordinary growth in digital communications, especially in the areas of mobile telephony, personal computers, satellites, and computer networking. In these digital communication systems, data is represented as a sequence of 0s and 1s. These binary bits are expressed as analog signal waveforms and transmitted over a communication channel. Communication channels, however, introduce noise and interference that corrupt the transmitted signal. At the receiver, the corrupted signal is demodulated back into binary bits. The received binary data is therefore only an estimate of the transmitted binary data; bit errors may result from the transmission, and their number depends on the amount of noise and interference in the communication channel.
Channel coding is used in digital communications to protect the digital data and reduce the number of bit errors caused by noise and interference. It is most often achieved by adding redundant bits to the transmitted data. These additional bits allow the detection and correction of bit errors in the received information, providing much more reliable transmission. The cost of using channel coding to protect the transmitted information is a reduction in data transfer rate or an increase in bandwidth.
1. FORWARD ERROR CORRECTION BLOCK CODES
1.1 ERROR DETECTION - CORRECTION
Error detection and correction are methods to make sure that information is transmitted error free, even across unreliable networks or media.
Error detection is the ability to detect errors caused by noise, interference, or other problems in the communication channel during transmission from the transmitter to the receiver. Error correction is the ability, furthermore, to reconstruct the original, error-free information.
There are two basic protocols of channel coding for an error detection-correction system:
Automatic Repeat-reQuest (ARQ): In this protocol, the transmitter sends, along with the data, an error-detection code, which the receiver uses to check whether errors are present, requesting retransmission of any erroneous data found. Usually this request is implicit: the receiver sends back an acknowledgement of data received correctly, and the transmitter resends, as soon as possible, anything not acknowledged by the receiver.
Forward Error Correction (FEC): In this protocol, the transmitter applies an error-correcting code to the data and sends the coded information. The receiver never sends messages or requests back to the transmitter; it simply decodes what it receives into the "most likely" data. The codes are constructed so that it would take a great amount of noise to fool the receiver into decoding the data incorrectly.
1.2 FORWARD ERROR CORRECTION (FEC)
As mentioned above, forward error correction is a system for controlling the errors that occur in data transmission, in which the sender adds additional information, known as an error correction code, to its messages. This gives the receiver the ability to detect and (at least partially) correct errors without requesting additional data from the transmitter. It also means that there is no real-time return channel, so the sender cannot verify whether a block of data was received correctly. The receiver must therefore judge the received transmission on its own and either repair it or raise an alarm.
The advantage of forward error correction is that no return channel to the sender is needed and retransmission of data is usually avoided (at the expense, of course, of higher bandwidth requirements). Forward error correction is therefore used in cases where retransmissions are costly or even impossible. In particular, FEC is commonly applied in mass storage devices to protect against corruption of the stored data.
However, forward error correction techniques place a heavy burden on the channel by adding redundant data and delay. Moreover, many forward error correction methods do not adapt to the actual channel conditions, so the overhead is incurred whether it is needed or not. Another significant disadvantage is the lower data transfer rate. On the other hand, FEC methods reduce transmit power requirements: for the same amount of power, a lower error rate can be achieved. Communication remains simple, and the receiver alone carries the responsibility for error detection and correction; complexity is removed from the sender and assigned entirely to the receiver.
Forward error correction devices are usually placed close to the receiver, in the first stage of digital processing after an analog signal has been received. In other words, forward error correction systems are often an integral part of the analog-to-digital conversion process, which also involves digital mapping and demapping, or line coding and decoding. Many forward error correction coders can also produce a bit-error rate (BER) signal that can be used as feedback to fine-tune the analog receiving circuits. Software-controlled algorithms, such as the Viterbi decoder, can accept analog data and output digital data.
The maximum number of errors a forward error correction system can correct is initially defined by the design of the code, so different FEC codes are suitable for different situations.
The three main types of forward error correction codes are:
Block codes that work on fixed-length blocks (packets) of symbols or bits with a predefined size. Block codes can often be decoded in time polynomial in their block size.
Convolutional codes that work on symbol or bit streams of arbitrary length. They are usually decoded with the Viterbi algorithm, though other algorithms are often used as well. The Viterbi algorithm approaches asymptotically optimal decoding efficiency as the constraint length of the convolutional code increases, but at the cost of sharply increasing complexity. A convolutional code can be transformed into a block code, if needed.
Interleaving codes, which mitigate the effects of fading channels and work well in combination with the other two types of forward error correction coding.
1.3 BLOCK CODING
Block coding was the first type of channel coding implemented in early mobile communication systems. There are many types of block coding; among those in common use the most important is the Reed-Solomon code, which is presented in the second part of this coursework because of its extensive use in well-known applications. Hamming, Golay, multidimensional parity, and BCH codes are other well-known examples of classical block coding.
The main feature of block coding is that it is a fixed-size channel code (in contrast to source coding schemes such as Huffman coders, and to channel coding techniques such as convolutional coding). Using a preset algorithm, a block coder takes a k-digit information word S and transforms it into an n-digit codeword C(S). The block size of such a code is n. This block is examined at the receiver, which then decides on the validity of the sequence it received.
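As a minimal sketch of this k-to-n transformation, consider the classic Hamming(7,4) block code (chosen here purely as an illustration; the document's own worked example is Reed-Solomon). It turns every 4-bit information word into a 7-bit codeword by inserting three even-parity check bits:

```python
def hamming74_encode(d):
    """Encode a 4-bit information word into a 7-bit Hamming codeword.

    d is a list of 4 data bits [d1, d2, d3, d4]; the returned block is
    [p1, p2, d1, p3, d2, d3, d4] with even-parity check bits.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

# k = 4 information bits become an n = 7 bit codeword
codeword = hamming74_encode([1, 0, 1, 1])
```

The receiver recomputes the three parity checks over the received block; any mismatch pattern points at the bit position to correct.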
1.3.2 FORMAL DEFINITION
As mentioned above, block codes encode strings taken from an alphabet set S into codewords by encoding each letter of S independently. Suppose (k1, k2, ..., km) is a sequence of natural numbers, each one less than |S|. If S = {s1, s2, ..., sn} and a specific word W is written as W = sk1 sk2 ... skm, then the codeword that represents W, that is to say C(W), is:

C(W) = C(sk1) C(sk2) ... C(skm)
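A toy illustration of this letter-by-letter construction, with a hypothetical three-letter alphabet and codebook invented purely for demonstration:

```python
# A hypothetical per-letter codebook C over the alphabet S = {'a', 'b', 'c'};
# the particular bit patterns are arbitrary, only the construction matters.
C = {'a': '000', 'b': '011', 'c': '101'}

def encode_word(word):
    """C(W) = C(sk1) C(sk2) ... C(skm): encode each letter of W
    independently and concatenate the fixed-length codewords."""
    return ''.join(C[letter] for letter in word)

encoded = encode_word('cab')
```

Because each letter maps to a codeword of the same fixed length, the block boundaries are implicit and the receiver can split the stream back into per-letter blocks.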
1.3.3 HAMMING DISTANCE
Hamming distance is a rather significant parameter in block coding. For continuous variables, distance is measured as a length, an angle, or a vector. In the binary field, the distance between two binary words is measured by the Hamming distance: the number of differing bits between two binary sequences of the same size. It is, basically, a measure of how far apart two binary objects are. For example, the Hamming distance between the sequences 101 and 001 is 1, and between the sequences 1010100 and 0011001 it is 4.
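The distance computation can be sketched as a small helper, checked against the two examples from the text:

```python
def hamming_distance(x, y):
    """Number of positions at which two equal-length binary strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs sequences of the same size")
    return sum(a != b for a, b in zip(x, y))

# The two examples from the text
d1 = hamming_distance("101", "001")          # sequences differ in one bit
d2 = hamming_distance("1010100", "0011001")  # sequences differ in four bits
```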
Hamming distance is a variable of great importance and usefulness in block coding. Knowledge of the Hamming distance determines the capability of a block code to detect and correct errors. The maximum number of errors a block code can detect is t = dmin − 1, where dmin is the minimum Hamming distance between codewords. A code with dmin = 3 can detect 1 or 2 bit errors. The Hamming distance of a block code is therefore preferred to be as high as possible, since it directly affects the code's ability to detect bit errors. This also means that, in order to have a large Hamming distance, codewords need to be longer, which leads to additional overhead and a reduced data bit rate.
After detection, the number of errors that a block code can correct is given by t = floor((dmin − 1)/2).
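Both formulas can be sketched briefly, using the 3-bit repetition code {000, 111} as the example (an assumption for illustration; its dmin is 3, so it detects up to 2 errors and corrects 1):

```python
from itertools import combinations

def minimum_distance(codewords):
    """dmin of a block code: the smallest pairwise Hamming distance."""
    def hd(x, y):
        return sum(a != b for a, b in zip(x, y))
    return min(hd(x, y) for x, y in combinations(codewords, 2))

# 3-bit repetition code: every information bit is sent three times
d_min = minimum_distance(['000', '111'])
detectable = d_min - 1            # t = dmin - 1
correctable = (d_min - 1) // 2    # t = floor((dmin - 1) / 2)
```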
1.3.4 PROBLEMS IN BLOCK CODING
Block codes are constrained by the sphere packing problem, which has received considerable attention in recent years. This is easy to picture in two dimensions: if someone lays some pennies flat on a table and pushes them together, the result is a hexagonal pattern like a bee's nest. Block coding, though, relies on more dimensions, which cannot be visualized so easily. The famous Golay code, for instance, applied in deep space communications, uses 24 dimensions. If used as a binary code (which it very often is), the dimensions refer to the length of the codeword as defined above.
The theory of block coding uses the N-dimensional sphere model: for instance, how many pennies can be packed into a circle on a tabletop or, in the 3-dimensional model, how many marbles can be packed into a globe. It all comes down to the choice of code. Hexagonal packing in a rectangular box, for example, will leave the four corners empty. A greater number of dimensions means a smaller percentage of empty space, until eventually, at certain numbers of dimensions, the packing uses all the available space. Such codes are called perfect codes, and there are very few of them.
The number of neighbors a single codeword has is another detail that is usually overlooked in block coding. Back to the pennies example: first, pennies are packed in a rectangular grid, where each penny has four direct neighbors (and another four at the corners, which are farther away). In the hexagonal formation, each penny has six direct neighbors. In the same way, in three and four dimensions there are twelve and twenty-four neighbors, respectively. Thus, as the number of dimensions increases, the number of close neighbors grows rapidly. The result is that noise finds numerous ways to make the receiver choose a neighbor, hence an error. This is a fundamental constraint of block coding, and of coding in general. It may be more difficult to push a codeword onto any one particular neighbor, but the number of neighbors can be so large that the total error probability actually suffers.
2. REED-SOLOMON CODE
The Reed-Solomon code is an error correction block coding system that cycles data through a mathematical transformation numerous times, increasing the effectiveness of the code, especially against burst errors.
The Reed-Solomon code was invented in 1960 by Irving S. Reed and Gustave Solomon, who were at that time members of the MIT Lincoln Laboratory. Their innovative and forward-looking article, titled "Polynomial Codes over Certain Finite Fields", changed the course of digital technology, which at that time was not advanced enough to apply the concept. The first application of Reed-Solomon coding in mass-produced products was the Compact Disc in 1982, where two interleaved Reed-Solomon codes are used. An efficient decoding algorithm for large-distance Reed-Solomon codes was also developed by Elwyn Berlekamp and James Massey in 1969.
Reed-Solomon is an error-correction code that operates by oversampling a polynomial constructed from the data. The polynomial is evaluated at several points, and these values are sent or recorded. Sampling the polynomial at more points than necessary makes it over-determined, so as long as enough of the values are received correctly, the receiver can recover the original polynomial even in the presence of a few errors.
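The oversampling idea can be illustrated over the rationals (real Reed-Solomon codes do the same thing over a finite field); the polynomial and sample points below are invented for the example:

```python
from fractions import Fraction

def lagrange_eval(points, x):
    """Evaluate, at x, the unique polynomial that interpolates `points`."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

# A degree-2 "message" polynomial p(x) = 3x^2 + 2x + 5, oversampled at
# five points although three would pin it down exactly.
p = lambda x: 3 * x**2 + 2 * x + 5
samples = [(x, p(x)) for x in range(5)]

# Any three surviving samples determine the polynomial, so values lost
# in transit can be recomputed from the rest.
survivors = [samples[0], samples[2], samples[4]]
recovered = lagrange_eval(survivors, 1)
```

Here two of the five samples are discarded, yet the remaining three still reproduce every original value; that redundancy margin is exactly what the parity symbols of a Reed-Solomon code buy.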
The basic building block of Reed-Solomon coding is a symbol of m binary bits, where m > 2. For a given m, the length of every Reed-Solomon code with m-bit symbols is 2^m − 1. For instance, for 8-bit symbols the length of the Reed-Solomon code will be 2^8 − 1 = 255 symbols.
Reed-Solomon codes consist of two parts, the data part and the parity part. Given a Reed-Solomon code with n symbols, the first k symbols form the data part, that is, the information which has to be protected from errors, and the remaining (n − k) symbols form the parity part, which is computed from the data part. Such a code is known as an (n, k) Reed-Solomon code, or simply an RS(n, k) code. The number of parity symbols is (n − k), usually an even number written as 2t. A Reed-Solomon code with 2t parity symbols can correct up to t symbol errors.
When a Reed-Solomon code's length has to be smaller than 2^m − 1, zero padding is applied to bring the length up to 2^m − 1. After the encoding process, the padded zeros are removed, forming a so-called shortened Reed-Solomon code. For instance, suppose that with 8-bit symbols there are 100 data bytes that have to be protected against at most 8 errors, so 16 parity bytes are needed. The total size of the code will be 116 bytes, a number smaller than 255. To calculate the 16 parity bytes, 139 zeros are padded onto the data bytes and the resulting 239 bytes are encoded. After this encoding, the padded zeros are removed and only the initial data bytes and the calculated parity bytes are sent or recorded. When decoding such a code, the removed padding zeros are first re-inserted and the decoding is then performed.
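The byte arithmetic of this shortened-code example can be checked directly:

```python
# Parameters for the shortened RS example in the text: 8-bit symbols,
# 100 data bytes, protection against up to 8 symbol errors.
m = 8
n_full = 2**m - 1            # 255: natural length of an 8-bit RS code
data = 100
t = 8
parity = 2 * t               # 16 parity bytes correct up to t = 8 errors
code_len = data + parity     # 116-byte shortened code
padding = n_full - code_len  # zeros padded before encoding, removed after
```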
Reed-Solomon coding, just like convolutional coding, is a transparent code, meaning that if the channel symbols have been inverted at some point during transmission, the decoder will still function; the result will be the complement of the original data. However, Reed-Solomon coding loses its transparency when the code is shortened. The bits that are "missing" in a shortened code have to be filled in with either ones or zeros, depending on whether the data is complemented or not. (In other words, if the symbols are inverted, then the zero-fill must be inverted to a ones-fill.) For this reason it is mandatory that the sense of the data (true or complemented) be resolved before Reed-Solomon decoding.
2.4 REED-SOLOMON CODE MATHEMATICAL FORMULA
2.4.1 REED-SOLOMON ENCODING
A Reed-Solomon code is determined by its generator polynomial. Given a Reed-Solomon code that corrects t errors, its generator polynomial will be:

g(X) = (X + a^(m0)) (X + a^(m0+1)) ... (X + a^(m0+2t−1))
In the formula above, a is a primitive element of GF(2^m), represented as an m-bit binary symbol, and m0 is a pre-set integer, most often 0 or 1. Before a message sequence is encoded, the message polynomial is built first:

a(X) = a_(k−1) X^(k−1) + a_(k−2) X^(k−2) + ... + a_1 X + a_0
In this formula, k = n − 2t. The parity polynomial is the remainder of X^(2t) a(X) / g(X), that is:

b(X) = X^(2t) a(X) mod g(X) = b_(2t−1) X^(2t−1) + ... + b_1 X + b_0
The parity sequence consists of the parity polynomial's coefficients. The codeword is formed as the data sequence followed by the parity sequence. The final code polynomial is represented as:

c(X) = X^(2t) a(X) + b(X)
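The encoding steps above can be sketched in Python. This is a minimal illustration over GF(2^8) with primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d) and m0 = 0 (both are assumptions made for the example, though common in practice), not a production encoder:

```python
# Exponent/log tables for GF(2^8); EXP is doubled so that
# EXP[LOG[p] + LOG[q]] never needs an explicit modulo reduction.
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D          # reduce by the primitive polynomial
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def gf_mul(p, q):
    """Multiply two GF(2^8) elements via log/antilog tables."""
    if p == 0 or q == 0:
        return 0
    return EXP[LOG[p] + LOG[q]]

def generator_poly(nsym):
    """g(X) = (X + a^0)(X + a^1)...(X + a^(2t-1)), m0 = 0.

    Coefficients are kept highest degree first; over GF(2^m),
    addition and subtraction are both XOR.
    """
    g = [1]
    for i in range(nsym):
        h = [0] * (len(g) + 1)
        for j, c in enumerate(g):
            h[j] ^= c                       # c * X
            h[j + 1] ^= gf_mul(c, EXP[i])   # c * a^i
        g = h
    return g

def rs_encode(data, nsym):
    """Parity = remainder of X^(2t) * a(X) divided by g(X), found by
    synthetic division; the codeword is the data followed by parity."""
    g = generator_poly(nsym)
    rem = list(data) + [0] * nsym
    for i in range(len(data)):
        coef = rem[i]
        rem[i] = 0                          # leading term cancels (g[0] == 1)
        if coef:
            for j in range(1, len(g)):
                rem[i + j] ^= gf_mul(g[j], coef)
    return list(data) + rem[len(data):]

def poly_eval(p, x):
    """Horner's rule; p holds coefficients, highest degree first."""
    y = 0
    for c in p:
        y = gf_mul(y, x) ^ c
    return y

# A t = 2 example: 4 parity symbols protect 4 data symbols.
codeword = rs_encode([0x40, 0xD2, 0x75, 0x47], 4)
```

Because c(X) = X^(2t) a(X) + b(X) is divisible by g(X), evaluating the codeword polynomial at each root a^0, ..., a^(2t−1) of g(X) must yield zero; that property is exactly what the decoder's syndrome computation checks.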
2.4.2 REED-SOLOMON DECODING
Suppose the transmitted code vector is given by:

c = (c_0, c_1, ..., c_(n−1))
while the received one is given by:

r = (r_0, r_1, ..., r_(n−1))
In the first step of Reed-Solomon decoding, the 2t syndrome components are found by evaluating the received polynomial r(X) = r_0 + r_1 X + ... + r_(n−1) X^(n−1) at the roots of the generator polynomial (here with m0 = 0):

S_i = r(a^i) = r_0 + r_1 a^i + r_2 (a^i)^2 + ... + r_(n−1) (a^i)^(n−1),  for i = 0, 1, ..., 2t−1
The syndrome polynomial will be:

S(X) = S_0 + S_1 X + S_2 X^2 + ... + S_(2t−1) X^(2t−1)
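A minimal sketch of the syndrome computation, here over the small field GF(2^4) with primitive polynomial x^4 + x + 1 and m0 = 0 (choices made only for this example). For a single error of value e at position j, S_i = e · (a^j)^i, so the ratio S_1/S_0 = a^j already reveals the error location:

```python
# Exponent/log tables for GF(2^4); EXP is doubled to skip the modulo.
EXP = [0] * 30
LOG = [0] * 16
x = 1
for i in range(15):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x10:
        x ^= 0b10011        # reduce by x^4 + x + 1
for i in range(15, 30):
    EXP[i] = EXP[i - 15]

def gf_mul(p, q):
    """Multiply two GF(2^4) elements via log/antilog tables."""
    if p == 0 or q == 0:
        return 0
    return EXP[LOG[p] + LOG[q]]

def syndromes(r, nsym):
    """S_i = r(a^i) = r_0 + r_1*a^i + ... + r_(n-1)*(a^i)^(n-1),
    with r_j the coefficient of X^j in the received polynomial."""
    out = []
    for i in range(nsym):
        s = 0
        for j, rj in enumerate(r):
            s ^= gf_mul(rj, EXP[(i * j) % 15])   # rj * (a^i)^j
        out.append(s)
    return out
```

An error-free codeword produces all-zero syndromes (the all-zero word, being a valid codeword of this linear code, makes the cleanest demonstration), while any error makes them nonzero.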
In the second step of Reed-Solomon decoding, the error evaluator polynomial and the error locator polynomial have to be calculated. Writing e_i for the value of the i-th error and X_i for its locator, the error evaluator polynomial is given by:

W(X) = Σ_(i=1..e) e_i Π_(j≠i) (1 + X_j X)
And the error locator polynomial by:

L(X) = Π_(i=1..e) (1 + X_i X) = 1 + L_1 X + ... + L_e X^e
where e is the number of errors. W(X) and L(X) are related to the syndrome polynomial through this key equation:

W(X) = S(X) L(X) mod X^(2t)
The well-known iterative Berlekamp-Massey algorithm is applied to solve this key equation for L(X) and W(X).
In the last step of Reed-Solomon decoding, the error locations and error values have to be found. The error locations are found using Chien's search algorithm: X in L(X) is substituted with a^n for every possible position n in the code, in order to find the roots of L(X). The inverse of each root of the error locator polynomial is an error locator, which gives the error position. Once an error location is found, the error value is determined by Forney's error evaluation algorithm. The error value is then added to the corrupted symbol in order to correct the error.
2.5 REED-SOLOMON CODING APPLICATIONS
Reed-Solomon coding is widely implemented in digital communication systems and digital data storage systems. Digital communication systems that use Reed-Solomon coding for error protection include digital video broadcasting systems, wireless communication systems, broadband communication systems, as well as space and deep space communication systems. Digital data storage systems that use Reed-Solomon coding to correct the burst errors associated with media corruption include the most common implementation, compact disc (CD) storage systems, their larger siblings, digital versatile disc (DVD) storage systems, and of course hard disk storage systems. Recent research also shows that Reed-Solomon codes are beginning to find use in dynamic memory protection systems.
The use of Forward Error Correction codes is a classical solution for improving the reliability of multicast and broadcast transmissions. Block coding clearly has some advantages over convolutional and interleaving coding, which explains why it is often the preferred choice of designers, and Reed-Solomon coding is its most famous example. The efficient encoding and decoding algorithms, the systematic structure, and the powerful error correction capability of this coding method make it one of the most commonly used error correction systems in the industry.