# Review On Crypt Analysis Techniques Computer Science Essay

**Published:** **Last Edited:**

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

Traditionally, secrecy is required mainly in diplomatic and military communications, but nowadays it plays an important role in our everyday lives, for example, in managing our financial affairs. Cryptography plays a vital role in maintaining the privacy of electronic information against threats. In doing so, a combination of symmetric cryptographic systems for encrypting, decrypting the data and public key for managing the keys are employed by the systems. Assessing the strength of the cryptographic systems is an essential step in employing cryptography for information security. Cryptanalysis plays a key role in this context. The main objective of cryptanalysis is not only to recover the plain text, but also to estimate the strength of the algorithm which is useful in designing a good cryptographic algorithm.

There is an explosive growth in unclassified research in different aspects of cryptology and cryptanalysis which is one of the most active areas. Many crypto systerns that are thought to be secure are broken and a large variety of mathematical tools useful in cryptanalysis are developed.

Different approaches are available in the literature to perform cryptanalysis on either block ciphers or stream ciphers. Classical ciphers are divided into two broad categories: substitution ciphers and transposition ciphers. Cryptanalysis of classical ciphers is a popular crypto logical application for meta-heuristic search research.

In a practical model, one can test the block cipher with different known attacks and assess a certain security level to it. However, it is not possible to predict the security of the underlying block cipher with respect to yet unknown attacks.

## 2.2 REVIEW ON CRYPT ANALYSIS TECHNIQUES

Substitution ciphers represent the basic building blocks of more complex and more secure ciphers that are used today. Hence understanding the vulnerability of simple ciphers is important in using and building more complex ciphers. Many cryptanalysis techniques are available in the literature to break substitution ciphers, each of them having advantages and disadvantages over one another.

While attacking the cipher models, one can consider key recovery attack in which the goal is to derive the secret key or decryption attack in which the goal is to decrypt the cipher text or key recovery from decryption attack. Different techniques were explored in the literature to find the key of the cipher and there by decrypting the entire cipher text. Several possible methods to break a substitution cipher include exhaustive search, simulated annealing, frequency analysis, genetic algorithm, particle swarm optimization, tabu search and relaxation algorithm etc.

The exhaustive search method is the simplest out of all algorithms used to break substitution ciphers. This technique is possible when the cryptographic systems have finite key space and allowing for all possible key combinations to be checked until the correct one is found. This method could be an acceptable technique for breaking a mono alphabetic shift cipher. The first attempt using the exhaustive search is not the best choice, since it is time consuming, but it decrypts the text with 100% accuracy.

Michael Lucks described a method for automatic decryption of simple substitution ciphers. This method uses exhaustive key search and is controlled by constraints imposed on word patterns. This method will not use any statistical analysis or language heuristics. This method uses an exhaustive search in a large online dictionary for words that satisfy certain constraints. These constraints include word length, letter position and letter multiplicity. The method is language independent. This can be applied to any language for which sufficient online dictionary is available.

Failures were observed when none of the longer words were available in the dictionary. Another important failure is observed when none of the plain text words contain repeated letters. This situation mostly occurs when the messages were small.

Brute force method is a way of trying to break simple substitution ciphers, but the number of possible keys that need to be checked is large. Thus, practically, it may be impossible to do an exhaustive search with in a reasonable amount of time. To overcome this, new algorithms have been developed for faster breaking of the cipher.

Ryabko et.al suggested and investigated an attack on block ciphers called gradient statistical attack. The possibility of applying it to ciphers for which no attacks other than the exhaustive key search are available is explored. The described method belongs to a chosen plaintext attack. Analysis of statistical properties of block ciphers is used in the process of cryptanalysis. As an example, the applicability of this method to the RC5 cryptanalysis is analysed.

Raphael et al. presented a framework, which is designed to describe the block cipher cryptanalysis techniques compactly regardless of their individual differences. This framework describes possible attacks on block ciphers and also specifies the technical details of each type of attack and their respective strengths. It is shown how to apply the framework to describe various attacks on popular and recent block ciphers. A comparative study has been done between the proposed frame work and Commutative Diagram Cryptanalysis (CDC). Basically CDC is the framework for expressing certain kinds of attacks on product ciphers where as proposed framework focuses on distinguishing attacks and also on key-recovery attacks that exploit such distinguishers.

Baham studied the effect of multiple modes using chosen cipher text attack, when the underlying cryptosystems are DES and Feal-8. It is shown that in many cases, these modes are weaker than the corresponding multiple ECB mode. In most cases, these modes are not much secured than just one single encryption using the same cryptosystem.

For example, the triple CBC mode i.e. CBC|CBC|CBC-which is encrypted using a single DES and the modes CBC|CBC|ECB, CBC|ECB|CBC, and ECB|CBC|CBC were weaker than triple DES, and their strength is comparable to the strength of a single DES. It is concluded that strong modes of operation which are invulnerable against finding the keys, should not be based on combining different simpler modes, or use of internal feedbacks. It is suggested to use single mode and use multiple encryptions as the underlying cryptosystems of single mode. Multiple mode or any other mode that uses internal feedbacks can be strengthened by eliminating the use of the internal feedbacks.

Automated attack algorithms were developed for which human intervention is not necessary. These methods will be finished either after a predetermined number of iterations or after a message has been successfully decrypted. One such automated attack algorithms is the genetic algorithm which is widely used for cracking substitution ciphers.

Joe Gester proposed and implemented the simplest approach based on searching the more likely used keyword generated key space. The proposed Genetic algorithm involves an iterative process of finding the fitness of the individuals in the population. Then selectively genetic operators are applied to the members of the population to create a new generation and the process is repeated. Each generation is created by selecting members of the previous generation randomly and weighted according to their fitness.

The proposed method uses a simple genetic algorithm approach to search the key space of cryptograms. If this method is not satisfactory, then attempt is made to search a smaller problem space by restricting the key searched to those which are generated by a keyword. In first approach fitness function were used for rating the quality of each individual in the population's Solution, that is based on trigram and bigram counts. Then new populations were constructed by selecting randomly either crossover or mutation. In this approach rapidly the populations reach local maxima or never seem to converge to anything resembling English. The fitness function has evaluated the fitness of a candidate properly. The method used for crossover is more of a crossover coupled with several mutations rather than a simple crossover.

To allow for a simplified crossover mechanism it will be necessary to allow duplicate characters. An ad-hoc heuristic approach may be suitable to automate solving substitution ciphers.

Like other heuristic algorithms, the genetic algorithm will not always produce the exact result. They give solutions which are nearest to the correct one. In case of deciphering, after using the "best" key produced by genetic algorithm, most of the time, it is easy for a human to read the "decrypted" text. Then make small changes to reproduce the correct plain text. The experiments performed using the genetic algorithm method suggested that a fitness of about 0.9 is enough to determine the vowel substitutions and consonant substitutions after which the visual examination by a human can be used to decrypt the entire text.

David Ornchak proposed algorithm for decryption of homophonic substitution ciphers. Mono-alphabetic homophonic ciphers allow each cipher text symbols to map to only one plaintext letter. But homophonic substitution cipher unlike mono alphabetic, maps each plaintext letter to one or more cipher text symbols. Moreover homophonic ciphers conceal language statistics in the enciphered messages and making statistical-based attacks more difficult. So, an approach that uses a dictionary-based attack using genetic algorithm is presented.

Alabassal et al. proposed a method to discover the key of a Feistal Cipher using Genetic Algorithm approach. The possibility of using Genetic Algorithm in key search is attractive due to its ability in reducing the complexity of the search problem. The complexity of the attack is reduced by 50%.

Yean Li et al. performed a study on the effect of an optimization heuristic cryptanalytic attack on block ciphers. A known plaintext is encrypted by the chosen cipher using a randomly chosen key of reduced length. The possible key-solution generated by the heuristic function is used to decrypt the known-cipher text. The resulting plain text is compared with the original text. The fitness value for the solution is obtained by decrypting the known-cipher text and then calculating the percentage of character-location matches in the original text and the retrieved text. The search for the correct key combination will continue until a solution match or closest match is found within the constraints of the test environment.

Tabu search is another optimization technique used for breaking substitution ciphers. It is of iterative search type and is characterized by the use of a flexible memory. It eliminates local minima and searches beyond the local minimum. The experimental results suggest that the genetic algorithm recovers slightly more characters than the other algorithms. Simulated annealing algorithm is much simpler to implement than genetic algorithms and the Tabu search. Tabu search method obtains the desired result faster than the other algorithms.

Tabu Search Algorithm and Genetic Algorithm frame works were applied on three types of ciphers viz. AES, Hill and Columnar Transposition. Genetic Algorithm produced results efficiently in terms of the performance against Tabu Search. However, the Genetic Algorithm generally did not perform well on the Hill Cipher and AES. The results have shown that the transposition cipher is susceptible to the Tabu Search and Genetic Algorithm attacks on weak passwords. Polygraphic Substitution Cipher (Hill Cipher) is vulnerable to the Tabu Search attack as well as the Genetic Algorithm attack.

Simulated annealing is another technique that is similar to the genetic algorithm which is used to break substitution ciphers. The main difference is that the genetic algorithm has a pool of possible keys at each moment, while the simulated annealing keeps one value at a time. When combined with a few other simplifications, simulated annealing makes the approach much simpler than the genetic algorithm. The genetic algorithm matches more letters correctly than the simulation annealing does for any given length of the cipher text. This difference is not too high, so the simulated annealing method is still a good technique for breaking simple ciphers.

To reach the correct solution, simulated annealing takes less iterations, but the total time it takes is more than the time needed by genetic algorithm to reach the same result since the simulated annealing method spends long time in each iteration because of its detailed examination of each possible perturbation. Even though the genetic algorithm seems to be better than simulated annealing in all aspects, the main advantage of the simulated annealing is that it is much easier to implement and therefore, it still has an important role in cracking simple substitution ciphers.

The particle swarm optimization method is another method based on machine learning processes that is used for breaking substitution ciphers. The algorithm starts by selecting a random population of potential solutions, each of which is called a particle. Particle swarm optimization is a good method for breaking simple substitution ciphers as long as bigrams are used to calculate the fitness of particles. Considering unigram as a measure for fitness does not give any useful results for deciphering even the simplest ciphers.

Laskari et al. applied the particle swarm optimization(PSO) method to address the problem introduced by the cryptanalysis of block-cipher crypto systems. PSO originates from the field of evolutionary computation. PSO is a population-based algorithm which exploits a population of individuals, to search promising regions of the function space. The particle swarm optimization method is applied to the problem of locating the key of a simplified version of DES. In this method, the population is called swarm and the individuals are called particles. This method had been proven to be efficient and effective where deterministic optimization methods fail. The work mainly involves investigating the problem of finding missing bits of the key used to a simplified Feistel cipher, the data encryption standard (DES) reduced to four rounds. The method can be readily adapted to handle complex Feistel based ciphers.

Relaxation algorithm is another technique used to break substitution ciphers is the. This is a graph-based technique that relies on iterative and parallel updating of values associated with each node. The nodes of the graph, vi, are elements of the cipher alphabet. Each node has a random variable, li, associated with it which represents the probabilities of the possible characters that this node represents. The probabilities of a node are updated based on the appearance of its two neighbors in the cipher text and the trigram analysis of the original language.

Brute force attacks are successful in solving simple ciphers. But cryptanalysis of complex ciphers require specialized techniques and powerful computing systems. Nalini et al. performed a systematic study of efficient heuristics in successful attacking some block ciphers. For a systematic study of the attack of ciphers using heuristics, it is desirable to have simple ciphers which are tractable and at the same time incorporate representative features of practical ciphers into them.

Two ciphers, Simplified Data Encryption Standard (SDES) and a simple variant of Data Encryption Standard (DES) with 16-bits of key and few rounds were selected for demonstration purpose. Cryptanalysis is performed on these two ciphers using Tabu search, Particle swarm optimization, Genetic Algorithm, Adaptive Genetic Algorithm heuristic approaches. All the three algorithms do not differ significantly as far as the success of the attack is concerned. However, it is observed that as the amount of known cipher text is increases, the tabu search attack performs better than the other methods in case of SDES. In case of modified DES, particle swarm optimization method performed well than other methods.

Algebraic cryptanalysis is a method of cryptanalysis, in which cryptanalysis begins by constructing a system of polynomial equations in terms of plaintext bits, cipher text bits and key bits. This technique uses modern equation solvers to attack cryptographic algorithms. The factors that play important roles in algebraic cryptanalysis are the number of variables, the number of polynomials and the degrees of the polynomials of the system of equations. Another factor is the computing power of the cryptanalyst. This method of creating the polynomials for S-AES is simple. The power of the equation solver and the speed and amount of memory that is being used determines whether the system can be solved or not.

One aspect of differential cryptanalysis that appears to be overlooked is the use of several differences to attack a cipher simultaneously. This is best described by Kwan. The analysis of a cryptosystem should not only measure the strength under the best differential attack, but should take into account the best several attacks. These simultaneous attacks will reduce the search space by a reasonable factor. These simultaneous attacks do result in a significant improvement. The relative costs of encryption, XOR, and memory I/O also need to be taken into consideration. Trial and error will probably be required for an accurate answer.

Schneier explored the different cryptanalytic techniques and the ways to break new algorithms. Breaking a cipher doesn't necessarily mean finding a practical way to recover the plaintext from the cipher text. Breaking a cipher also means as finding a weakness in the cipher with a complexity less than the bruit force attack.

RC5 is a fast block cipher algorithm. Several attempts of cryptanalysis of RC5 cipher were found in the literature. Kaliski and Yin [3] evaluated the strength of the RC5 algorithm with respect to linear and differential attacks. They found a linear attack on RC5 with 6 rounds that uses 257 known plaintexts and whose plaintext requirement is impractical after 6 rounds. The best previously known attack requires 254 chosen plain texts to derive the full set of 25 sub keys for the 12 round RC5 with 32 bit words. In this paper Alex Biryukov et.al proposed a method that drastically improves these results due to a novel partial differential approach. The proposed attack requires 244 chosen plaintexts. Data complexity of cryptanalysis RC5 is bounded by the probability of a good pair.

Blowfish is a Feistel cipher in which the round function F is a part of the private key. A differential cryptanalysis on Blowfish is possible either against a reduced number of rounds or with the piece of information which describes the function F. Vaudenay showed that the disclosure of F allows performing a differential cryptanalysis which can recover all the rest of the key with 248 chosen plaintexts against a number of rounds reduced to eight.

New techniques were developed for cryptanalysis based on impossible differentials and these techniques are used for attack. Biham et al. described the application of these techniques to the block cipher algorithms IDEA and Khufu. The new attacks cover more rounds than currently well-known attacks. This demonstrates the power of the new cryptanalytic techniques.

## 2.3 CRYPT ANALYSIS USING LANGUAGE MODELS

The role of Cryptanalysis is also to study a cryptographic system with an emphasis on exploring the weaknesses of the system. The complex properties of natural languages play an important part in cryptanalysis. Different approaches of cryptanalysis in the literature use language characteristics to understand the strength of cipher system. One such approach deals with frequency statistics. This is based on the assumption that each letter in the plain text is to be substituted by another letter of the original ciphered text.

Frequency analysis is the process of determining at what frequency each symbol of the encrypted text occurs within the cipher text. This information is used along with the knowledge of frequencies of symbols within the language used in the cipher to determine which cipher text symbol maps to the corresponding plain text symbol. The frequency analysis algorithm is the fast approach to decipher the encrypted text. But, it requires the knowledge of the language statistics of the original text. The disadvantage is that it relies on constant human interaction to determine the next move in the process.

Bárbara et al. presented a method for de-ciphering texts in Spanish using the probability of usage of letters in the language. This method is basically to perform Crypto-analysis of a mono alphabetic cryptosystem. The method uses probability and usage of letters in Spanish language to break the encrypted text files. This method assigns weights to different alphabetical letters of Spanish language. The assignment of weights depends on their usage in the language. For this purpose analysis of the frequency of different symbols in the Spanish plain text is done. The same analysis is done on cipher text also. Every encrypted character is mapped to a single character in the original message and vice versa. In this way the original text is retrieved from the cipher text. Few characters vary because there are letters and symbols that have the same frequency.

This method of deciphering cryptograms in Spanish to obtain the original text gave positive results; still the deciphering is not 100% successful as there may be letters and symbols that have the same frequency values.

A simple substitution cipher uses substitution on the set of letters in the plain text alphabet such that different letters in the cipher text correspond to different letters in the plaintext. To encode a text by using character wise substitution, an "infinite key" is used and each letter in the plaintext will be replaced by a letter of the cipher text by means of a one-to-one self-mapping of the set of letters. Then, the knowledge of the key is necessary to reconstruct the plaintext. The work of Mineev et al. is concerned with a similar smoothing effect on the simple substitution cipher resulting from contracting the alphabet by quadratic residues and quadratic non residues in finite fields. As a sample, the Russian alphabets were considered in the proposed work.

A single-letter frequency analysis is helpful if it is used for obtaining an initial key to perform powerful bi-gram analysis. Apart from single character, relation between cipher text and plain text in terms of bi-grams and trigrams also play vital role.

Samuel W. Hasinoff presented a system for the automatic solution of short substitution ciphers. The proposed system operates by using n -gram model of English and stochastic local search over all possible keys of the key space. This method resulted in median of 94% cipher letters correctly decoded. The technique consists of two main components, a generic Stochastic Local Search (SLS) method and a scoring function. The search method is for navigation in the key space and scoring function is to evaluate the goodness of various keys. The scoring function of a particular key is the log likelihood of an n -gram language model that is applied to the cipher text and decrypted using that key. Here n -gram models of characters are considered. Such models can also be extended to the level of words. Most practical systems employ bigrams or trigrams.

Sujith Ravi et al. studied about attacking Japanese syllable substitution cipher. Different Natural language processing techniques were used to attack a Japanese substitution cipher. They made several novel improvements over previous probabilistic methods, and report improved results.

In general the receiver uses the key to convert cipher text to plain text. But a third party who intercepts the message may guess the original plaintext by analyzing the repetition patterns of the cipher text. From a natural language perspective, this cryptanalysis task can be viewed as a kind of unsupervised tagging problem. Language modeling (LM) techniques are used to rank proposed decipherment. This work mainly attacks on difficult cipher systems that have more characters than English, on cipher lengths that are not solved by low-order language models and relate the language-model perplexity to decipherment accuracy.

The work involves processing the input file into syllables and removing typographical errors. The sequence considered is approximately one million syllables in size and contains 65 unique syllable types. This data is split into three parts: LM training data, LM smoothing data and Plaintext messages (various sizes). When 3-gram LM is trained on various data, from the decipherment results it can be concluded that more LM data (i.e., more knowledge about the language) leads to better decipherment. With improved LM smoothing further improvements in accurate decipherment of shorter texts can be achieved. Further algorithms may lead to accurate decipherment of more complex Japanese cipher systems, including translation to other languages.

Jackobsen proposed method for cryptanalysis of substitution ciphers. In this method the initial guess of the key is refined through a number of iterations. In each step the recovered plain text using the current key is evaluated to how close it is to the correct key. To solve the cipher using this method bi gram distribution of letters in cipher text and plain text are sufficient. A distribution matrix is constructed only once and in each iteration the matrix is manipulated. This method is suitable for both mono and poly alphabetic substitution ciphers.

G W Hart proposed a method for solving cryptograms which works well even in difficult cases where only a small sample of text is available and the probability distribution of letters are far from what is expected. This method performs well even on longer and easier cryptograms. An exponential time is required in the worst case, but in practice it is quite fast. This method fails completely when no words of the plain text were in the dictionary.

A deciphering model is developed by Lee et al. to automate the cryptanalysis of mono alphabetic substitution ciphers. The method proposed by Lee et al. uses enhanced frequency analysis technique. The method is a three hierarchical approach. To perform deciphering of mono alphabetic substitution cipher, monogram frequencies, keyword rules and dictionary are used one by one. As a first step, for all of the letters in the cipher text the monogram frequencies are computed. In the second step, the keyword rules are used to decipher the other unknown cipher text letters. For letters that are not yet deciphered in the second step, they will be identified in the third step. In this step, the cipher text letters will be recognized by the dictionary checking. This approach is tested on two short cryptograms and it is observed that both cryptograms achieved successful deciphering results in good computational time. It is observed that this enhanced frequency analysis approach performs faster decryption than the Hart's approach. Due to the combined properties of keyword rules and dictionary checking, the Hart's approach weakness could be hindered.

Knight et al. discussed a number of natural language decipherment problems that use unsupervised learning. These include letter substitution ciphers, phonetic decipherment, character code conversion

and word-based ciphers with importance to machine translation. Basic unsupervised learning techniques often fail on the first trial. So, techniques for understanding errors and increasing the performance were discussed. These include letter substitution, character code conversion, phonetic decipherment, word based decoding etc.

An efficient algorithm that accomplishes the same thing as a naive application of the EM algorithm to break a substitution cipher is implemented. Unsmoothed parameter values for an English letter-bigram P(p) from sample data were estimated and tabulated. Then uniform P(c | p) were set up. The decipherment yielded by EM method results in 68 errors. Using a 1.5-million character data set instead of a 70,000-character data set reduced the number of errors from 68 to 64. Lambda interpolation smoothing is applied to P(p). This reduced errors further to 62.

Ravi and Knight introduced another method that uses low-order letter n-gram models to solve substitution ciphers. This method is based on integer programming which performs an optimal search over the key space. This method guarantees that no key is overlooked. This can be executed with standard integer programming solvers. The proposed method studies the variation of decipherment accuracy as a function of n-gram order and cipher length.

Fifty ciphers each of lengths 2, 4, 8, . . . , 256 were created. These ciphers are solved with 1-gram, 2-gram, and 3-gram language models. The average percentage of cipher text decoded incorrectly is recorded.

It is observed that solution obtained by integer programming is exact in achieving the objective. With a 2-gram model, EM algorithm resulted in 10% error for a 414-letter cipher, where as integer programming method provided a solution with 0.5% error only on the same cipher. For short cipher lengths, much higher improvement is observed when integer programming method is used. For example, on a 52-letter cipher, using a 2-gram model, the solution from integer programming method resulted in 21% error, which is low when compared to 85% error given by the EM method. The 1-gram model works badly in this scenario, which is inline with Bauer's observation for short texts.

The work mainly focuses on letter substitution ciphers which also include spaces. The work compares decipherment on ciphers with spaces and without spaces using different n-gram English model. For ciphers without spaces, English models are retrained on text without spaces. Cracking space-free ciphers is more difficult than ciphers with spaces with simple language models.

## 2.4 INFORMATION THEORITIC APPROACH

The entropy is a statistical parameter that measures how much information is produced on an average for each letter of a text in the language. Redundancy measures the amount of constraint imposed on a text in the language because of its statistical nature. Shannon proposed a new method of estimating entropy and redundancy of a language. This method uses the knowledge of the language statistics possessed by those who speak the language. It also depends on results in predicting the next letter when the preceding text is known. Some properties of an ideal predictor were developed.

An approach for finding n-gram entropy is developed. For this purpose a study has been done on 26 letter English where spaces, punctuation were ignored. The n-gram entropies can be calculated from letter, digram and trigram frequencies. The estimated entropy values for n =1,2,3 are 4.14,3.56,3.3 respectively. Based on the frequencies of symbols in the reduced text, it is possible to set bounds to the n-gram entropy of the original language.

The approach proposed by Shannon deals with the basic mathematical structure of secrecy systems. Shannon's work defines theoretical secrecy which is defined as how immune a system is to cryptanalysis when the cryptanalyst has unlimited time and computing power available for the analysis of cryptograms. This is related to communication systems in which noise is present and entropy, equivocation can be applied to cryptography. The work is also concerned with practical secrecy which can be defined as the level of security that is necessary to make the system secure against an enemy who has a limited amount of time and limited computational resources for attacking the intercepted cryptogram. This leads to methods for constructing systems which will require a large amount of work to solve. Finally, a certain incompatibility among the various desirable qualities of secrecy systems is discussed. An analysis of the basic weakness of secrecy systems is made. Discussion is made regarding certain incompatibilities among the various desirable qualities of secrecy systems.

H Yamamoto presented a survey on different information theoretic approaches in cryptology. Shannon's cipher system, Simmons authentication approach, wire tape channel, Secret sharing communication system approaches were studied. Shannon introduced the concept of perfect secrecy which means that even with unlimited time and computing power information should not be leaked. To achieve this key space should be at least as larger as the message space which is practically impossible.

Diffie and Hellman introduced another approach to achieve practical security based on computational complexity. Trap door functions and one way functions were used. To prevent active attacks authenticity is introduced. Several authentication mechanisms were developed in the literature. Simmons proposed an authentication mechanism which holds for any type of system. The conclusion drawn from this work is that information theoretic approach is as important as computational complexity approach.

Shannon in his information theoretic approach of cryptography assumes that the computational abilities are unlimited. The work proposed by Hellman is an extension to Shannon's theory. The concepts of matching a cipher to a language and of the trade-off between local and global uncertainty is developed. Hellman defined a model in which the messages are divided into two subsets. One set is all meaningful messages, each with the same a priori probability and other with meaningless messages which are assigned a priori probabilities of 0. Shannon theory, approach considered in this work is not, in general, applicable directly to design practical cryptographic systems. But it appears to be useful for gaining insights into the design of practical cryptographic systems.

Borissov and Lee proposed bounds on the theoretical measure for the strength of a system under known plain text attack. Dunham proposed the key equivocation H(K|M,C), which is the conditional entropy of the key given cipher text and corresponding plain text is proposed as a measure of strength of the system. For simple substitution cipher lower and upper bounds were found. This work gave a direction that key recovery in known-plaintext attack for substitution ciphers is more difficult when this has many fixed points.

A study has been done on estimating the key equivocation of secrecy systems using Additive Like Instantaneous Block (ALIB) ciphers. In general when the length of the block is large, it is difficult to find key equivocation. A simplified method for computing the key equivocation for two classes of secrecy systems with ALIB encipherers is developed by Zhang. The criterion here is the key equivocation rather than error probability.

For simple substitution ciphers bounds are derived for the message equivocation in terms of the key equivocation. It is established that the message equivocation approaches faster to the key equivocation. It is observed that the exponential behavior of the message equivocation is not determined by redundancy in the message source but by either the

symbol probabilities which are chest in a certain sense or the sum of the

two smallest symbol probabilities.

The importance of the key appearance equivocation in evaluating and bounding the message equivocation in terms of the key equivocation has been established for general cipher systems. When evaluating the strength of a cryptographic system, the cryptanalyst is assumed to know the general set of transformations and the statistics of both the message and key sources.

Maurer presents a review on the relation between information theory and cryptography. Shannon's approach fixes some lower bound on the size of the security to achieve a particular level of security. Recent models proposed contradiction to Shannon's approach, where in with a short key also it is possible to provide perfect secrecy. Models like wire type and broad cost channels, privacy amplification etc. were considered for illustration.

A parametric evaluation and analysis to study the behaviour of algorithms with respect to cryptographic strength is performed by Prieto. In general Unicity distance has been considered as a parameter for evaluating the strength of a cipher. According to Shannon's information theoretic approach unicity distance is a minimum length of cipher text required to determine the key. But when cipher text length is less than unicity distance, the predicted key will have a non zero error probability for which an upper bound is proposed by Jabri. It is observed that this probability is inversely proportional to logarithm of key size and directly proportional to redundancy of source.

Prieto proposed two more factors to evaluate the quality of the algorithm. One such factor is invulnerability factor and the other is called quality factor. This approach has been applied to five subsets of Spanish that belongs to different environments of normal language using DES with ECB and CBC modes. The proposed approach allows to make some general conclusions by comparing the algorithms and to choose the best algorithm in every case.

## 2.5 PROBLEM STATEMENT

## 2.1 INTRODUCTION:

The objective of Cryptanalysis is

However thefirst appearances of the idea can be found in

the paper of