Encryption is the processes of converting information which is often referred to plain text into to ciphertext. Decryption is the reverse of this process and will change ciphertext to plain text provided the correct key is supplied. Keys are a parameter of the part of the encryption algorithm that is generated in order to make the process unique. (Stallings, 2005). Without a key, the algorithm would not produce valuable results (Paul Reuvers, 2010). There are many different encryption techniques which will be covered in this report. I will critically evaluate these techniques in order to create an encryption algorithm of my own which will be implemented into a program written in Java.
The two types of encryption most commonly used today are Stream ciphers and block ciphers. A block cipher is a type of encryption algorithm that converts a fixed-length block of plain text into a block of ciphertext of the same length. This conversion takes place using a user provided key that is kept secret from individuals who you do not want to decrypt the text. Decryption is achieved by applying the reverse of the conversion technique to encrypt the plain text. Encryption and decryption use the same key. Keys tend to be 64bit, 128bit or 256bits in length. (RSA Laboratories, 2010).
Get your grade
or your money back
using our Essay Writing Service!
An advantage that most block ciphers have over stream ciphers, is that a combination of confusion and diffusion can be used during the encryption which increases overall strength of the algorithm. Stream ciphers will commonly only use confusion. Confusion is making the connection between the key and the ciphertext as complex possible. Diffusion however "refers to the property that the redundancy in the statistics of the plaintext is "dissipated" in the statistics of the ciphertext". (Claude E. Shannon, 1949).
A message longer than the block size (128 bits in the above example) can still be encrypted with a block cipher by breaking the message into blocks and encrypting each block individually. However, in this method all blocks are encrypted with the same key, which degrades security each repetition in the plaintext becomes a repetition in the ciphertext. To overcome this issue, modes of operation are used to make encryption probabilistic. Some modes of operation, despite the fact that their underlying implementation is a block cipher, allow encrypting individual bits which results in a stream cipher.
In addition to linear and differential cryptanalysis, there is a growing catalogue of attacks: truncated differential cryptanalysis, partial differential cryptanalysis, integral cryptanalysis (which encompasses square and integral attacks), slide attacks, boomerang attacks, the XSL attack, impossible differential cryptanalysis and algebraic attacks. For a new block cipher design to have any credibility, it must demonstrate evidence of security against known attacks.
The slide attack is a form of cryptanalysis designed to deal with the prevailing idea that even weak ciphers can become very strong by increasing the number of rounds, which can ward off a differential attack. The slide attack works in such a way as to make the number of rounds in a cipher's algorithm irrelevant. Rather than looking at the data-randomizing aspects of the block cipher the slide attack works by analysing the key schedule and exploiting weaknesses in it to break the cipher. The most common one is the keys repeating in a recurring way (Chalermpong Worawannotai and Isabelle Stanton, 2010)
In overview, the XSL (eXtended Sparse Linearization) attack relies on first analysing the internals of a cipher and stemming a system of quadratic simultaneous equations. "These systems of equations are typically very large, for example 8000 equations with 1600 variables for the 128-bit AES". Several methods for solving such systems are known. In the XSL attack, a specialized algorithm is then applied to solve these equations and recover the key. (Nicolas T. Courtois and Josef Pieprzyk, 2002).
Stream ciphers are an important class of encryption algorithms. They encrypt individual characters (usually binary digits) of a plaintext message one at a time, using an encryption transformation which varies with time.( Imre J. Rudas, János Fodor, 2006). An advantage of stream ciphers is that they are generally faster than block ciphers at encryption and decryption (RSA Laboratories, 2010). (See Image below). An interesting stream cipher is the "one time pad". This cipher is said to offer "perfect security" although it is sometimes impractical due to "key management problems" and also the key needs to be the same length as the plain text which could be impractical.
Always on Time
Marked to Standard
Stream ciphers are not without their problems however. Implementations of stream ciphers can be tricky and the level of security depends highly on how well they are implemented. Stream ciphers do not provide integrity protection or authentication, whereas some block ciphers can provide integrity protection.
William Stallings (2005)
The stream cipher RC4's weak initialization vectors allow an attacker to mount a known-plaintext attack and have been widely used to compromise the security of WEP. ( J. Philip Craiger, 2002)
IDEA which is another stream cipher can have weak keys that are identifiable in a chosen-plaintext attack. They make the relationship between the XOR sum of plaintext bits and ciphertext bits predictable. There is no list of these keys, but they can be identified by their structure. (wordiQ,2010).
Stream ciphers are usually best for cases where the amount of data is either unknown, or continuous - such as network streams. Block ciphers, on the other hand, or more useful when the amount of data is known - such as a file, data fields, or request/response protocols, such as HTTP where the length of the total message is known already at the beginning. (VX Heavens,1999).
A transposition cipher involves keeping the same letter but changing their sequence order (permutation) to prevent someone understanding them. A transposition cipher on its own is generally classed as weak as they are susceptible to frequency analysis attacks. Frequency analysis is a cryptanalysis technique that studies the frequency of letters in the encrypted text in order to arrange them into a readable sequence.
There are many different types of transposition and in modern encryption algorithms they are used as an added stage in the algorithm to increase strength. An example of a transposition cipher is columnar transposition. During this cipher, the message is written in rows of a fixed length and then read by column by column of which are chosen in a scrambled order. Both the width of the rows and the permutation of the columns are usually defined by a keyword. For example, the plain text DAVIDR is 6 characters long so the rows will be of the same length. The permutation is then selected using the alphabetical order of the letters in the keyword.
Since transposition does not affect the frequency of individual symbols, simple transposition can be easily detected by the cryptanalyst completing a frequency count. If the ciphertext exhibits a frequency distribution very similar to plaintext, it is most likely a transposition cipher. This can then often be attacked by anagramming - sliding pieces of ciphertext around, then looking for sections that look like anagrams of English words, and solving the anagrams. Once such anagrams have been found, they reveal information about the transposition pattern, and can consequently be extended onwards and the attack is complete
Simpler transpositions often suffer from the property that keys very close to the correct key will reveal long sections of legible plaintext interspersed by nonsense. Consequently such ciphers may be vulnerable to optimum seeking algorithms such as genetic algorithms.Substitution:
A substitution cipher is a method of encryption which substitutes the plain text with chosen substitutes which in the end will amount to ciphertext. For decryption the process is reversed the ciphertext is substituted with the original plain text.
There are a number of different methods for a substitution ciphers:
A Monoalphabetic ciphers tend to be one of the most basic and weak types of cipher available. They use the same substitution across the full message. For example if the letter A was mapped to T then this would not change throughout the entire encryption process. These types of messages can be cracked by using frequency analysis, educated guessing and trial and error.
Some examples of Monoalphabetic ciphers are:
Pigpen / Masonic Cipher
In a Polyalphabetic cipher, the character substituted can change throughout the encryption process. For example the letter A could be mapped to G then later on A could be mapped to T. Some examples of Polyalphabetic ciphers are:
This Essay is
a Student's Work
Running Key Cipher
These tend to be more secure than the previous examples of Monoalphabeic ciphers as it makes frequency analysis slightly harder.
Polygraphic ciphers work by substituting one character from the plaintext to a group of characters. This has the advantage of hiding the frequency distribution of letters. This makes frequency analysis attacks much more complicated. Some examples of Polygraphic ciphers are:
Padding is used in encryption to make the plaintext long enough so that some functions can operate correctly as designed. For example, if the algorithm was reading in 3 letters but had to operate on 4 to correctly encrypt the letter then a pad of 1 would be added. This tends to be used in block ciphers to provide a valid block size.
Salt is normally a random amount of characters that is added to the password in order to stop hashing attacks where a database of hashes is made up from common words or larger databases containing random characters. This hash is then compared to the hash of the password. A random salt will give a different hash for the same password each time making these attacks much harder.
Selection and justification of techniques used
The program is based on a transposition cipher combined with a substitution cipher to create the encrypted text. I have chosen these two methods combined as transposition ciphers can complement a substitution cipher due to avoiding the weaknesses associated with each cipher as previously described. For example, a substitution cipher when combined with columnar transposition will avoid the weakness of both because it will replace high frequency ciphertext characters with high frequency plaintext characters so it will not reveal chunks of plaintext due to the transposition. Anagramming would fail hereon the transposition because of the substitution.
To start my encryption and decryption process, I will ask the user for a password. Where the password is too short or not strong enough, a salt will be added to the end of the password which will in turn create the key to be used for the algorithm. The salt will also be stored at the beginning of the encrypted text to ensure that decryption is possible. The salt will be distinguishable from the encrypted text as it will be random characters with no meaning or readability.
The first algorithm chosen and applied is a transposition cipher. By running the cipher at this stage, it creates an unreadable format before running any further algorithm, adding to the overall strength; the bulk of the remaining algorithm being completed on a now transposed version. The algorithm to transpose the text is based on shifting the text much like the shift rows stage of AES (Joan Daemen, Vincent Rijmen, 1999). My algorithm will take four sections of the array at a time and then move them by an offset of one in the first round then two in the next until the whole array has been shift by the offsets. For decryption, the offsets are the reverse (two then three). These rounds will add more complexity to the stage whilst also increasing the strength of the overall algorithm. Padding will be applied to the plaintext in order for the transposition cipher to work as it has to be a multiple of four due to the designed implementation.
As this transposition cipher completes four characters at a time, the plain text will need to be a multiple of 4. In the program there is a check to see if this is the case and if not then padding is added in the form of space characters. These will be visible after the decryption process but will not affect how the text will look as they will be located at the end of the plain text and thus not visible to the user.
The next stage in my algorithm will be a modified Vigenère cipher. The Vigener cipher is a form of polyaphabetic substitution which uses a square of all the alphabetical characters which it will then substituted the plain text with depending on the keyword. My version of the Vigenère cipher has a much larger Vigenère square (the substitution box) containing not only the characters from the alphabet but a whole range of upper case and lower case alphabetical characters, numbers and special characters. This will allow for there to be a larger number of possibilities that the plain text could have been substituted with, increasing the overall strength of the algorithm. This part of the algorithm will also be repeated in rounds to increase overall strength.
Flow of Operation
Evidence of Success
Using the first example in appendix C:
Plain Text: My Plain Text To Encrypterty
Decrypted: My Plain Text To Encrypterty
Number of characters: 28
The plain text is "My Plain Text To Encrypterty" which is 28 characters, my program will first check to see if its divisable by 4 which is. Due to being divisiable by 4 an extra 4 pads(y) are added to the plain text plus the number of pads is added to the end. Once this padding process has been finished the plain text gets processed through the transposition algorithm by shifting each row left in the array by 1 then by 3 the next transposition is run.
Transposition 1(Offset 1):
My Plain Text to Encryptertyyyy4
y PMainlTex TotEnc yptrrtyeyy4y
Transposition 2(Offset 3):
y PMainlTex TotEnc yptrrtyeyy4y
PMy nlaix Teot Tc Entrypyert4yyy
The last stage of my algorithm is the subsitution stage. The subsitution box is created using the method makeTable();. Each character from the key and the plain text is the read off one by one which represents a point on the subsitution box which is it then subsituted with to create the cipher text. Finally the salt and a Âµ symbol is added to the cipher text in order to allow decryption. The Âµ symbol is used in the decryption process in order to select out the salt from the cipher text.
Using the same principles as before decryption is just the reverse of the encryption. (See flow of operation)
Argument: -d passworded C:\mydataen.txt C:\mydatade.txt
Open command prompt from the search/run box from windows start button
Use the command prompt window to navigate to the correct location of the .jar file.
For this example I will be using a text file called "mydata" in C:\.
The program runs using these parametres:
(-e for encryption, -d for decryption), (The password. Minimum 10, Maximum 40), (The read file path),(The write file path).
-e mypassword C:\mydata.txt C:\encrypted.txt
This will encrypt the file C:\mydata.txt and output the cipher text in a text file called encrypted in C:\.
This is an exmaple of what the cipher text file will look like.
To decrypted to encrypted text file just use the parameter -d. See above for example.
This is the decrypted cipher text from the encrypted text file.
Test inputs and outputs
Even number of characters:
Plain Text: My Plain Text To Encrypterty
Decrypted: My Plain Text To Encrypterty
Number of characters: 28
Odd number of characters:
Plain Text: My Plain Text To Encryp
Decrypted: My Plain Text To Encryp
Number of characters: 25
Same Key different file:
Due to my algorithm having salt added to the password, they key changes each time a file is encrypted thus I have had to set the key to the required value for this test.
Plain Text: Hello world
Decrypted: Hello world
Number of characters: 25
Encryption Speed Test
The below numbers were achived by taking in the computers clock time in milliseconds at the beginning of the encryption or decryption methods then again and the end and subtracting the last time with the first.
long x = System.currentTimeMillis();
long p = System.currentTimeMillis();
long t = p - x;
long t is the time it had taken to encrypt or decrypt the text.
Plain text size
Time to encrypt (Milliseconds)
Cipher text size
Time to decrypt (Milliseconds)
*These numbers are only an estimate due to other processes running on the system.