The Error Detection Strategies Computer Science Essay

Error detection can be introduced into a system by means of redundancy. When an error occurs in the circuit, a redundancy method uses additional resources to detect it; error detection is impossible without some form of redundancy. This section summarizes the three broad categories of error detection: temporal redundancy, hardware redundancy and information redundancy.

Temporal redundancy: Temporal redundancy uses additional time (clock cycles) as the redundant resource for error detection (Pradhan, 1998). Its advantage is that it needs comparatively less additional hardware than other on-line detection techniques: it conserves area at the expense of increased latency. Figure 2 shows the structure of this technique. The operation is first executed normally during t0 and its output is stored. The operation is then performed a second time during t0+1, this time with the inputs subjected to an encoding scheme. Once the re-computation is complete, its outputs are decoded and compared with the stored original. If the two differ, an error flag is raised to signal that the output is wrong. The encoding and decoding applied to the second operation allow both transient and permanent faults of the system to be detected.

Figure 2: Flow chart of temporal redundancy operation (Pradhan, 1998).
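
The two-pass flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the operation is an addition, the encoding is a one-bit left shift of both inputs (decoded by a right shift), and the fault is modeled as one adder output bit stuck at 0. The names `faulty_add` and `add_with_temporal_redundancy` are hypothetical and do not come from the cited source.

```python
def faulty_add(a, b, stuck_bit=None):
    """Adder model. If stuck_bit is given, that output bit is stuck at 0."""
    result = a + b
    if stuck_bit is not None:
        result &= ~(1 << stuck_bit)
    return result


def add_with_temporal_redundancy(a, b, stuck_bit=None):
    """Run the addition twice and compare, as in the flow of Figure 2."""
    # First pass (t0): normal operation, the result is stored.
    first = faulty_add(a, b, stuck_bit)
    # Second pass (t0+1): inputs are encoded by shifting left one position.
    second = faulty_add(a << 1, b << 1, stuck_bit)
    # Decode the second result by shifting it back, then compare.
    decoded = second >> 1
    error_flag = first != decoded
    return first, error_flag


print(add_with_temporal_redundancy(5, 6))               # fault-free run
print(add_with_temporal_redundancy(5, 6, stuck_bit=0))  # permanent fault
```

Because the shifted operands exercise different bit positions of the same adder, a permanent stuck-at fault corrupts the two passes differently and the comparison exposes it. Without the shift, both passes would return the same wrong value.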

Information redundancy: Information redundancy allows faults in the system to be detected by adding information bits to the data. The data, together with these additional bits, is represented in the form of codewords. The parity bit is the most common example: to verify that the data has not been corrupted, the parity is recalculated and compared with the stored original parity bit. State machine encoding is another example of information redundancy. The state variables are encoded as codewords, which are checked for validity (Hamming, 1950). The valid codewords form a subset of a much larger universal set of words, and the error flag is set if a word appears that is not part of the valid subset.
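
As a minimal sketch of the state machine example, assume a hypothetical three-state machine with one-hot encoding. Only 3 of the 8 possible 3-bit words are valid codewords, so any other register value raises the error flag. The encoding and the names `VALID_STATES` and `state_error_flag` are illustrative assumptions.

```python
# Hypothetical one-hot encoding of a three-state machine.
# The universal set has 2**3 = 8 words; only these three are valid.
VALID_STATES = {0b001, 0b010, 0b100}


def state_error_flag(state_register):
    """Return True if the register holds a word outside the valid subset."""
    return state_register not in VALID_STATES


print(state_error_flag(0b010))  # valid state
print(state_error_flag(0b011))  # a single bit flip leaves the valid subset
```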

Hardware redundancy: Hardware redundancy uses additional hardware components to detect errors caused by SEUs. Redundant circuitry is added, and its outputs are compared with those of the original circuitry to check whether an error has occurred. The added hardware is similar to the original hardware. For instance, hardware redundancy can be applied to a multiplier in the form of a second multiplier of reduced precision, whose output is compared with that of the original to check for errors. The technique can be applied uniformly throughout a system (McMurtrey, 2006).
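
The comparison step can be illustrated with a small sketch. For simplicity it assumes a full duplicate of the multiplier rather than the reduced-precision copy mentioned above, and it models an upset as an XOR mask applied to the output of the primary unit. All names are hypothetical.

```python
def multiply(a, b, upset_mask=0):
    """Multiplier model. upset_mask flips output bits to emulate an SEU."""
    return (a * b) ^ upset_mask


def duplicate_and_compare(a, b, upset_mask=0):
    """Run the primary and redundant multipliers and compare their outputs."""
    primary = multiply(a, b, upset_mask)  # unit that may be hit by an upset
    replica = multiply(a, b)              # redundant copy, assumed fault free
    error_flag = primary != replica
    return primary, error_flag


print(duplicate_and_compare(6, 7))
print(duplicate_and_compare(6, 7, upset_mask=0b100))
```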

3.3. Error Detection in FPGAs

Because of their composition and routing, FPGAs pose specific challenges for error detection schemes. As in ASICs, the values carried on wires and stored in memories and flip-flops are susceptible to radiation. In addition, an FPGA contains configuration bits that determine its logic and routing behavior, and these memory cells are also subject to errors. A single event upset (SEU) can therefore change the circuit's I/O, routing, clock or logic behavior, producing an incorrectly formed logic function that no longer performs the originally intended function of the system; see (R. Katz, 1998) and (M. J. Wirthlin, 2003b) for a fuller account of the problem. The main task of an FPGA error detection scheme is to detect when an error in the configuration bit stream has altered the behavior of the circuit. Such changes are subtle, for instance an OR gate turning into an XOR gate, and are therefore difficult to detect. Traditional techniques cannot always detect them. In traditional temporal redundancy, for instance, an operation is performed twice and the results are compared to determine whether an upset has occurred. This is very effective at detecting faults in signals, wires or states that occur while the operation is being performed. However, it fails to detect an upset that alters the actual logic unless an encoding scheme is used, and such encoding schemes are application specific: an encoding scheme that works for multiplication will not work for addition. This makes error detection challenging.
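
The OR-to-XOR example can be made concrete with a toy model of a 2-input look-up table (LUT), assuming the LUT is configured by a 4-bit truth table indexed by its inputs. Flipping one configuration bit turns OR into XOR, and plain recompute-and-compare does not notice because both passes use the same corrupted logic. The model and its names are illustrative assumptions, not a description of any particular FPGA.

```python
def lut2(config, a, b):
    """2-input LUT: config is a 4-bit truth table indexed by (b << 1) | a."""
    return (config >> ((b << 1) | a)) & 1


OR_CONFIG = 0b1110                   # outputs 0, 1, 1, 1 for indices 0..3
UPSET_CONFIG = OR_CONFIG ^ 0b1000    # one flipped bit gives 0b0110, i.e. XOR


def recompute_and_compare(config, a, b):
    """Temporal redundancy without encoding: run twice, compare results."""
    first = lut2(config, a, b)
    second = lut2(config, a, b)
    return first, first != second


print(lut2(OR_CONFIG, 1, 1))                    # intended OR output
print(recompute_and_compare(UPSET_CONFIG, 1, 1))  # wrong output, no flag
```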

3.4. Concurrent Error Detection Schemes

Concurrent error detection (CED) has been adopted in a wide range of applications, since preventive measures can be started only after an error has been detected. The principle of an error detecting scheme is simple: some characteristic of the data is encoded into a codeword, and any deviation from a valid codeword indicates that an error has occurred. Some common CED techniques are explained below.

3.4.1 Parity Codes

This is the simplest form of error detection code, with a single check bit (irrespective of the input data size) and a minimum Hamming distance of d = 2. There are two basic types of parity code, odd and even. In an even-parity code the check bit is chosen so that the total number of 1s in the codeword is even; in an odd-parity code the total is odd. A single-bit fault changes this count, so the error is easily detected. The major drawback of parity codes is their limited ability to detect multiple faults: an even number of bit errors leaves the parity unchanged.
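
A short sketch of even parity follows, with bits held in a Python list. The function names are illustrative. The last call shows the limitation noted above: two bit flips cancel out and go undetected.

```python
def even_parity_bit(bits):
    """Check bit that makes the total number of 1s in the codeword even."""
    return sum(bits) % 2


def encode(bits):
    """Append the even-parity check bit to the data bits."""
    return bits + [even_parity_bit(bits)]


def error_flag(codeword):
    """True if the codeword contains an odd number of 1s."""
    return sum(codeword) % 2 != 0


codeword = encode([1, 0, 1, 1])
single = codeword.copy()
single[0] ^= 1                     # one bit flipped
double = single.copy()
double[1] ^= 1                     # a second bit flipped

print(error_flag(codeword))  # fault free
print(error_flag(single))    # single error is detected
print(error_flag(double))    # double error escapes detection
```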

3.4.2 Checksum Codes

Here a b-bit checksum, formed by summing all the information words, is appended to the information. If an error occurs during transmission, it shows up as a mismatch in the checksum. When b = 1, these codes reduce to parity check codes. The scheme requires little additional hardware, and the codes used are symmetric in nature.
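
The sketch below assumes the common single-precision variant, in which the checksum is the sum of the information words taken modulo 2^b. The word values and function names are illustrative. With b = 1 the same function computes a parity bit.

```python
def checksum(words, b=8):
    """b-bit checksum: sum of all information words modulo 2**b (assumed)."""
    return sum(words) % (1 << b)


def error_flag(words, received_checksum, b=8):
    """Recompute the checksum at the receiver and compare."""
    return checksum(words, b) != received_checksum


data = [0x12, 0xF0, 0x34]
sent = checksum(data)

print(hex(sent))
print(error_flag(data, sent))                  # clean transmission
print(error_flag([0x12, 0xF1, 0x34], sent))    # one corrupted word
print(checksum([1, 0, 1, 1], b=1))             # b = 1 reduces to parity
```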

3.4.3 m-out-of-n Codes

In this scheme every codeword has a fixed length of n bits and a fixed weight of m, that is, exactly m of its bits are 1. If an error occurs during transmission, the weight of the codeword changes and the error is detected. A 0-to-1 error transition increases the weight, whereas a 1-to-0 transition reduces it, so in either case the error is easily detected. This is the most common code used for detecting unidirectional errors in digital systems.
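
The weight check can be sketched as follows, using a 2-out-of-5 code as an assumed example. The helper names are illustrative.

```python
from itertools import combinations


def all_codewords(m, n):
    """Every n-bit word with exactly m ones."""
    words = []
    for ones in combinations(range(n), m):
        words.append([1 if i in ones else 0 for i in range(n)])
    return words


def error_flag(word, m):
    """True if the weight of the received word differs from m."""
    return sum(word) != m


print(len(all_codewords(2, 5)))          # number of valid codewords
print(error_flag([1, 1, 0, 0, 0], 2))    # valid codeword
print(error_flag([1, 1, 1, 0, 0], 2))    # 0-to-1 error raises the weight
print(error_flag([1, 0, 0, 0, 0], 2))    # 1-to-0 error lowers the weight
```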

3.4.4 Berger Codes

Berger codes are unidirectional error detecting codes and can be regarded as an extension of parity codes. A parity code needs a single check bit, which can be viewed as the number of information bits with value 1, taken modulo 2. A Berger code instead uses several check bits to represent the full count of information bits with value 0. Since this count can take any of the k + 1 values from 0 to k, the total number of check bits r required for k information bits is

$r = \lceil \log_2 (k + 1) \rceil$
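
A minimal Berger encoder and checker is sketched below. It assumes the usual check-bit count of ceil(log2(k + 1)), which is what is needed to hold a zero count between 0 and k, and it stores the count of 0s in plain binary, most significant bit first. The function names and the example word are illustrative.

```python
def check_bit_count(k):
    """ceil(log2(k + 1)) for k >= 1, computed exactly with integers."""
    return k.bit_length()


def berger_encode(info_bits):
    """Append the binary count of 0s in the information bits."""
    zeros = info_bits.count(0)
    r = check_bit_count(len(info_bits))
    check = [(zeros >> i) & 1 for i in reversed(range(r))]
    return info_bits + check


def error_flag(codeword, k):
    """Recompute the check bits from the first k bits and compare."""
    return berger_encode(codeword[:k])[k:] != codeword[k:]


codeword = berger_encode([1, 0, 1, 1, 0, 1, 0])
print(codeword)

# Unidirectional (1-to-0) errors in one information bit and one check bit.
corrupted = codeword.copy()
corrupted[0] = 0
corrupted[9] = 0
print(error_flag(codeword, 7), error_flag(corrupted, 7))
```

A 1-to-0 error in the information part raises the zero count while a 1-to-0 error in the check part can only lower the stored count, so the two can never cancel. This is why the code detects unidirectional errors.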

The m-out-of-n codes, which are non-separable, are the optimal codes among all existing unidirectional error detecting codes (Lo et al., 1989). Among the separable codes, the optimal code is the Berger code, which requires the fewest check bits (Lo et al., 1989).

If only a limited number of unidirectional errors needs to be detected, Berger codes are not the best choice. For this reason several modified Berger codes exist, such as the Bose-Lin and Hao Dong codes. The code introduced by Hao Dong has a lower error detection capability, but it uses fewer check bits and its checker is very small. Moreover, its number of check bits is independent of the number of information bits. Another variation of the Berger code was introduced by Bose and Lin (Lo et al., 1989). Bose later introduced a code that improves the burst error detection capability of his previous code, although it requires more bits (G. C. Cardarilli).

3.5. Concurrent Error Correction Schemes

Error-correcting codes were first introduced in the 1940s, following Claude Shannon's result on the maximum rate of error-free communication over a noisy channel (Blahut, 1983). The error correcting ability of a code determines the quality of the recovered signal. Error correction requires lower-rate codes than error detection; nevertheless, it is a basic requirement in safety-critical systems, where the result must be correct the first time. In such circumstances the extra bandwidth required for the redundancy is an acceptable price.

Over the years, error correction schemes have steadily become more capable within a constrained number of computation steps, while the hardware and time overhead required to perform those steps has fallen greatly. These trends have brought error-correcting techniques into high-end applications. One such application is detecting or correcting errors in a communication system where errors appear in bursts, that is, grouped so that several neighboring symbols are received incorrectly. Non-binary codes are applied to correct such errors. In a binary code an error is always a change between 0 and 1, so locating the error is enough to correct it. In a non-binary code the error can take many values, so its magnitude must also be calculated before it can be corrected. Some of the relevant codes are described below.

3.5.1 Bose-Chaudhuri-Hocquenghem (BCH) Codes

BCH codes are an important and powerful class of linear block codes. They are cyclic codes and are available for a wide range of parameters. The most common BCH codes are characterized as follows. For any positive integers m and t, with $m \geq 3$ and $t < (2^m - 1)/2$, there exists a binary BCH code with the following parameters:

Block length: $n = 2^m - 1$

Number of message bits: $k \geq n - mt$

Minimum distance: $d_{min} \geq 2t + 1$

Here t is the number of errors that can be corrected and m fixes the block length; the number of parity bits, n − k, is at most mt. Each BCH code can detect and correct up to t errors per codeword. These codes offer flexibility in the choice of code rate, block length and other code parameters. Moreover, the Hamming single-error correcting codes can be described as BCH codes (Blahut, 1983).
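
As a worked illustration of these parameters, the helper below evaluates the standard bounds for a binary BCH code of block length 2^m − 1: at least n − mt message bits and a minimum distance of at least 2t + 1. The function name and dictionary keys are illustrative, and the values are bounds, not the exact parameters of a constructed code.

```python
def bch_bounds(m, t):
    """Block length and parameter bounds for a binary BCH code (m, t)."""
    if m < 3 or t < 1 or t >= 2 ** (m - 1):
        raise ValueError("need m >= 3 and 1 <= t < 2**(m - 1)")
    n = 2 ** m - 1
    return {
        "n": n,                      # block length
        "k_lower_bound": n - m * t,  # k >= n - m*t (a bound only)
        "d_min_lower_bound": 2 * t + 1,
    }


print(bch_bounds(3, 1))  # matches the (7, 4) Hamming code
print(bch_bounds(4, 2))  # double-error-correcting code of length 15
```

For m = 3 and t = 1 the bounds give n = 7, k ≥ 4 and a minimum distance of at least 3, which are the parameters of the (7, 4) Hamming single-error correcting code.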

3.5.2 Burst Error Correcting Codes

Burst error correcting codes are required in a very large number of applications. An error correcting code designed to correct bursts of length l will correct any error pattern confined to no more than l consecutive bits. This kind of code is known as a complete error correcting code (Konrad J. Kulikowski, 2011). Under burst conditions, if a particular symbol is in error there is a high chance that its immediate neighbors are in error as well. For instance, burst errors occur in mobile communications as a result of fading and in magnetic recording as a result of media defects. Interleavers can be used to convert such errors into independent errors. Simple structures used for burst error correction include Fire codes, cyclic codes and others (Konrad J. Kulikowski, 2011).
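
The effect of an interleaver can be sketched with a simple block interleaver, assumed here to write symbols row by row and transmit them column by column. With three codewords of four symbols each (one per row), a burst that hits three consecutive transmitted symbols ends up as a single error in each codeword after deinterleaving. The layout and names are illustrative.

```python
def interleave(symbols, rows, cols):
    """Write row by row, read column by column."""
    if len(symbols) != rows * cols:
        raise ValueError("length must equal rows * cols")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave."""
    if len(symbols) != rows * cols:
        raise ValueError("length must equal rows * cols")
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]


ROWS, COLS = 3, 4                      # three codewords of four symbols
data = list(range(ROWS * COLS))
sent = interleave(data, ROWS, COLS)

# A burst corrupts three consecutive transmitted symbols (marked as None).
received = sent.copy()
for i in (3, 4, 5):
    received[i] = None

restored = deinterleave(received, ROWS, COLS)
errors_per_codeword = [
    restored[r * COLS:(r + 1) * COLS].count(None) for r in range(ROWS)
]
print(sent)
print(errors_per_codeword)
```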

Almost all linear block codes are either cyclic or closely related to cyclic codes. One of the main advantages of cyclic codes over most other codes is that they are easy to encode. Furthermore, cyclic codes are built on a well-defined mathematical structure, the Galois field, which has led to highly efficient decoding schemes for them. One of the most important subclasses of cyclic codes is the Reed-Solomon codes (Hasan, 2005).
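
The ease of encoding can be illustrated with a small binary example, assuming the (7, 4) cyclic code generated by g(x) = x^3 + x + 1. Polynomials over GF(2) are held as Python integers, one bit per coefficient, and systematic encoding is a single polynomial division. This is a sketch of the general idea and not of a Reed-Solomon encoder, which works over a larger Galois field.

```python
N, K = 7, 4
G = 0b1011  # g(x) = x^3 + x + 1 (assumed example generator)


def poly_mod(dividend, divisor):
    """Remainder of polynomial division over GF(2)."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend


def encode(message):
    """Systematic encoding: message bits followed by the remainder."""
    shifted = message << (N - K)
    return shifted | poly_mod(shifted, G)


def is_codeword(word):
    """A word is valid if g(x) divides it exactly."""
    return poly_mod(word, G) == 0


def cyclic_shift(word):
    """Rotate an N-bit word left by one position."""
    return ((word << 1) | (word >> (N - 1))) & ((1 << N) - 1)


codeword = encode(0b1001)
print(format(codeword, "07b"))
print(is_codeword(cyclic_shift(codeword)))  # shifts stay in the code
print(is_codeword(codeword ^ 0b0000001))    # a single bit error is caught
```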