Steganography: Hiding Data Within Data



Steganography is the art of hiding data within data. Its main purpose is to store information in a way that hides its existence from a third party. Steganography is different from cryptography. Cryptography uses a 'secret key' to allow the receiver to decrypt the message. Well-known ciphers include the Data Encryption Standard (DES) and Triple DES, which applies DES to the data three times [1]. Cryptography does not conceal the existence of the secret communication. Even though the two are distinct, there are many similarities or analogies between them, and a few authors have categorized steganography as a form of cryptography [2].

Steganography hides the message but communication between parties is still visible. Steganography involves placing a hidden message in a carrier. A steganography key can be used for encrypting the hidden message. In summary:

steganography_medium = hidden_message + carrier + steganography_key

Figure 5.0 shows a common categorization of steganographic techniques:

Technical steganography: scientific methods such as invisible ink are used to conceal messages.

Linguistic steganography hides the secret message in the carrier and is divided into semagrams and open codes.

Semagrams use symbols or signs to hide information. Visual semagrams use simple everyday physical objects to convey a message, for example the positioning of items on a table or on a website. Text semagrams hide messages by making minor changes to the carrier text, such as altering the font size or type, adding a few extra spaces, or varying handwritten text.

Open code conceals the information in a genuine carrier message in a way that is not very obvious to an unsuspecting individual.

Jargon codes use language that makes sense only to a certain group of people and not to others. Examples include warchalking (symbols used to show the existence of a network signal [3]), underground jargon, or even an innocent-looking conversation that conveys a special meaning.

A covered cipher hides a message openly in the carrier medium so that it can be recovered by anyone who knows how it was hidden. A grille cipher uses a template to cover up the carrier message so that only the words of the hidden message show through its openings. A null cipher hides the message according to a rule agreed beforehand, for example, reading every fourth word or the third character of every word.

A lot of information is stored on PCs and transmitted over networks. It does not come as a surprise that steganography has now entered the digital age. Steganography software allows one to hide any sort of binary file in another binary file.

Steganography techniques

Null Ciphers

This method hides information without using any complex algorithm. Only certain letters or words of a null cipher text are significant, for example every fifth word or the second letter of each word; all other letters act as nulls and so produce the disguise. A well-known example was sent by the Germans in World War I [4].


The first letter of every word spells out "Pershing sails from N.Y. June 1". A second message, sent to verify the first, carried the same contents in the second letter of each word.
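The decoding rule for a null cipher is simple enough to sketch in a few lines of Python. The function name `read_null_cipher` is hypothetical, and the cover text is an assumption: it is the wording in which this telegram is commonly quoted, not a verified transcription.

```python
import re

# Assumption: the wording in which the World War I telegram is usually quoted.
COVER_TEXT = (
    "President's embargo ruling should have immediate notice. "
    "Grave situation affecting international law. "
    "Statement foreshadows ruin of many neutrals. "
    "Yellow journals unifying national excitement immensely."
)


def read_null_cipher(text: str, letter_index: int = 0) -> str:
    """Return the letter at letter_index of every word; all other letters are nulls."""
    words = re.findall(r"[A-Za-z']+", text)
    # Words that are too short to have a letter at letter_index are skipped.
    return "".join(w[letter_index].upper() for w in words if len(w) > letter_index)


print(read_null_cipher(COVER_TEXT))  # first letter of every word
```

Reading the result with word breaks restored gives "PERSHING SAILS FROM NY JUNE I", which matches the hidden message described above.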


Digital Carrier Methods


Substitution: Altering/Replacing the LSB

There are a number of ways to hide information. The most used steganography technique in image and audio files is least significant bit (LSB) substitution. The term comes from the numeric significance of the bits in a byte: the most significant bit is the one with the highest arithmetic value (i.e., 2^7 = 128), whereas the least significant bit is the one with the lowest arithmetic value (i.e., 2^0 = 1).

An example of LSB substitution is hiding the character 'G' across the following eight bytes of a carrier file (the least significant bit is the rightmost bit of each byte):

10010101 00001101 11001001 10010110

00001111 11001011 10011111 00010000

The letter 'G' is represented in ASCII by the binary string 01000111. These eight bits can be written to the LSB of each of the eight carrier bytes as follows:

10010100 00001101 11001000 10010110

00001110 11001011 10011111 00010001

In this case only half of the LSBs actually changed (those of the first, third, fifth and eighth bytes). This makes sense: when one set of zeros and ones is substituted with another set of zeros and ones, on average about half of the bits already hold the required value.
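The worked example above can be reproduced with a short Python sketch. The function names `embed_lsb` and `extract_lsb` are hypothetical, and the sketch assumes the message bits are written most significant bit first into consecutive carrier bytes, as in the example.

```python
def embed_lsb(carrier: bytes, message: bytes) -> bytes:
    """Write the message bits (MSB first) into the LSBs of consecutive carrier bytes."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier is too small for the message")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return bytes(stego)


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of message from the LSBs of the first 8 * n_bytes stego bytes."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for b in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)


# The eight carrier bytes from the example in the text.
carrier = bytes([0b10010101, 0b00001101, 0b11001001, 0b10010110,
                 0b00001111, 0b11001011, 0b10011111, 0b00010000])
stego = embed_lsb(carrier, b"G")
print(" ".join(f"{b:08b}" for b in stego))
print(extract_lsb(stego, 1))
```

The printed bytes match the modified carrier shown above, and extraction returns the letter 'G'.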

LSB substitution can also be used to overwrite valid RGB color encodings or palette pointers in BMP and GIF files, or coefficients in JPEG files. Overwriting the LSB changes the numeric value only slightly, so the change is most unlikely to be seen, or heard in the case of an audio file.


Image compression often has an effect on the integrity of the hidden message. There are two types of image compression:

· Lossy - JPEG files use this type of compression, which offers a very high compression ratio.

· Lossless - BMP and GIF files preserve the image data exactly but offer less compression. This makes them 'easier' carriers in which to hide messages.

Converting a BMP or GIF image to a lossy format such as JPEG has the potential to destroy the data hidden in the image. Masking techniques have proven more effective than LSB substitution when JPEG images are used. "By covering, or masking a faint but perceptible signal with another to make the first non-perceptible, we exploit the fact that the human cannot detect slight changes in certain temporal domains of the image" [7]. Humans do not notice small changes in colour. This form of data hiding is more closely associated with watermarking than with steganography.
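Why a lossy conversion wipes out LSB data can be illustrated with a deliberately crude Python sketch. This is not JPEG: real JPEG compression quantizes DCT coefficients, whereas the hypothetical `coarse_requantize` below simply rounds each sample down to a multiple of 4 as a stand-in for any step that discards low-order detail. All function names are assumptions made for this illustration.

```python
def embed_bits(samples, bits):
    """Overwrite the LSB of the first len(bits) samples with the message bits."""
    stego = list(samples)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_bits(samples, n_bits):
    """Read back the LSBs of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]


def coarse_requantize(samples, step=4):
    """Hypothetical stand-in for lossy compression: drop the low-order detail."""
    return [(s // step) * step for s in samples]


pixels = [0x95, 0x0D, 0xC9, 0x96, 0x0F, 0xCB, 0x9F, 0x10]
message_bits = [0, 1, 0, 0, 0, 1, 1, 1]  # ASCII 'G'

stego = embed_bits(pixels, message_bits)
print(extract_bits(stego, 8))                     # message survives in the lossless copy
print(extract_bits(coarse_requantize(stego), 8))  # every LSB is now 0: message lost
```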


When hiding data in an audio file, one normally uses a low-bit encoding technique, which is similar to the LSB technique used in images. The issue with low-bit encoding is that it can be noticeable to the human ear, and it is therefore not the best method of masking information inside an audio file.

Echo data hiding

Echoes are used in sound files to hide information. Data can be hidden by adding extra sound to the audio file in the form of an echo. This method of concealing information can even improve the sound of the audio file.

Spread Spectrum

"Spread-spectrum encoding is the method of hiding a small or narrow-band signal, a message, in a larger cover signal" [8]. In an audio file, this method adds random noise to the signal using a noise generator. The message is hidden inside that noise and spread across as much of the frequency spectrum as possible.
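One common way to realize this idea is direct-sequence spreading, sketched below in Python. This is an illustrative assumption rather than the method of any particular tool: the function names, the use of a seeded pseudo-random generator as the "noise generator", and the `chips_per_bit` and `strength` values are all hypothetical, and the values are chosen so that this small demonstration decodes reliably rather than so that the noise is inaudible.

```python
import math
import random


def chip_sequence(key: int, length: int) -> list:
    """Pseudo-random +1/-1 noise sequence that both parties derive from a shared key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(cover, bits, key, chips_per_bit=2048, strength=0.2):
    """Add key-dependent noise to the cover; each bit decides the sign of its noise run."""
    needed = len(bits) * chips_per_bit
    if needed > len(cover):
        raise ValueError("cover signal is too short")
    chips = chip_sequence(key, needed)
    stego = list(cover)
    for i in range(needed):
        symbol = 1 if bits[i // chips_per_bit] else -1
        stego[i] += strength * symbol * chips[i]
    return stego


def extract(stego, n_bits, key, chips_per_bit=2048):
    """Correlate each run of samples with the same noise; the sign gives the bit."""
    chips = chip_sequence(key, n_bits * chips_per_bit)
    bits = []
    for b in range(n_bits):
        start = b * chips_per_bit
        corr = sum(stego[i] * chips[i] for i in range(start, start + chips_per_bit))
        bits.append(1 if corr > 0 else 0)
    return bits


# Cover signal: a 440 Hz tone sampled at 44.1 kHz (an arbitrary choice for the demo).
cover = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(8 * 2048)]
message_bits = [0, 1, 0, 0, 0, 1, 1, 1]  # ASCII 'G'
stego = embed(cover, message_bits, key=1234)
print(extract(stego, len(message_bits), key=1234))
```

Because each bit is spread over thousands of samples, the receiver can recover it by correlation even though the added noise is small compared with the cover signal at any single sample.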


Steganography in video generally uses the Discrete Cosine Transform (DCT) method of manipulation. A good example is video conferencing, as described by Westfeld and Wolf [9]. Video conferencing requires a high frame rate, which often places great stress on digital networks. To overcome this problem, it uses differential lossy compression, which means that only the differences between successive frames are transmitted across the wire.

Discrete Cosine Transform (DCT)

Embedding messages in an image is seen as an effective way to hide secret data. However, lossy image compression can destroy the integrity of the secret message and make it unrecoverable. The DCT method shows how some modern programs overcome this issue. DCT-based compression applies quantization to the parts of the image that matter least to human vision. Quantization means, for example, that the value 6.763 can be rounded to 7 and therefore be represented with far fewer bits. The human eye under normal conditions does not detect high frequencies in images, which allows larger modifications to be made to these frequencies with little noticeable distortion. The image is divided into smaller areas, and quantization is performed on the frequencies that humans do not normally detect. This is the lossy compression stage. Any secret message is injected at this point, after quantization. The image is then compressed losslessly, which has no impact on the integrity of the secret message.
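The order of operations described here (transform, quantize, then embed) can be sketched in Python on a single row of eight samples. This is a simplified one-dimensional illustration under stated assumptions, not the procedure of any specific program: real JPEG uses a two-dimensional 8x8 DCT with a quantization table, and the function names, the sample values and the single `step` size are hypothetical.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a short block of samples."""
    n = len(samples)
    coefficients = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(samples))
        coefficients.append(scale * total)
    return coefficients


def quantize(coefficients, step):
    """Lossy stage: divide by the step size and round, e.g. 6.763 becomes 7."""
    return [round(c / step) for c in coefficients]


def embed_in_coefficients(quantized, bits, first=1):
    """Write message bits into the LSBs of quantized coefficients, skipping the DC term."""
    stego = list(quantized)
    for offset, bit in enumerate(bits):
        stego[first + offset] = (stego[first + offset] & ~1) | bit
    return stego


def extract_from_coefficients(quantized, n_bits, first=1):
    """Read the message bits back from the stored (losslessly coded) coefficients."""
    return [quantized[first + i] & 1 for i in range(n_bits)]


block = [52, 55, 61, 66, 70, 61, 64, 73]       # hypothetical pixel row
quantized = quantize(dct_1d(block), step=4)    # lossy stage happens first
stego = embed_in_coefficients(quantized, [1, 0, 1])
print(extract_from_coefficients(stego, 3))
```

Because the bits are written after the rounding step and the coefficients are then stored without further loss, the message is not disturbed by the compression that would have destroyed LSB data in the raw pixels.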

Detecting Steganography

The power of steganography lies in the fact that unsuspecting individuals are unaware that a message is concealed in the carrier object. Once an object is revealed to be carrying hidden communication, steganography has lost its purpose, and the protection of the message relies only on the strength of any encryption that was used. Encrypting the information is a cryptographic technique, whereas hiding the data is a steganographic technique. Discovering the existence of a concealed message does not by itself mean that the message has been recovered.

What is Steganalysis?

"Steganalysis is identifying the existence of a message" [10]. Steganalysis is not concerned with decrypting the hidden information inside a file, only with discovering it. Steganalysis techniques can be classified in a similar way to cryptanalysis methods, largely based on how much prior information is known [11]:

Steganography-only attack: The steganography medium is the only item available for analysis.

Known-carrier attack: The carrier and steganography media are both available for analysis.

Known-message attack: The hidden message is known.

Chosen-steganography attack: The steganography medium and algorithm are both known.

Chosen-message attack: A known message and steganography algorithm are used to create steganography media for future analysis and comparison.

Known-steganography attack: The carrier and steganography medium, as well as the steganography algorithm, are known.
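Of the attacks listed above, the known-carrier attack is the easiest to illustrate: with both files available, a byte-by-byte comparison exposes where the carrier was modified. The Python sketch below is a minimal illustration with a hypothetical function name, using the carrier and steganography bytes from the LSB example earlier in this essay.

```python
def differing_positions(carrier: bytes, stego: bytes) -> list:
    """Known-carrier attack sketch: list the offsets where the two files differ."""
    if len(carrier) != len(stego):
        raise ValueError("files differ in length, which is itself suspicious")
    return [i for i, (a, b) in enumerate(zip(carrier, stego)) if a != b]


carrier = bytes([0b10010101, 0b00001101, 0b11001001, 0b10010110,
                 0b00001111, 0b11001011, 0b10011111, 0b00010000])
stego = bytes([0b10010100, 0b00001101, 0b11001000, 0b10010110,
               0b00001110, 0b11001011, 0b10011111, 0b00010001])

changed = differing_positions(carrier, stego)
print(changed)
# If every difference is confined to the lowest bit, LSB substitution is a likely cause.
print(all((carrier[i] ^ stego[i]) == 1 for i in changed))
```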

Steganography methods for digital media can be broadly classified as operating in the image domain or the transform domain. Image domain tools hide the message in the carrier by some sort of bit-by-bit manipulation, such as least significant bit insertion. Transform domain tools manipulate the algorithms and image transforms used in hiding the information, such as the discrete cosine transform (DCT) coefficients in JPEG images [12].

Tools for Steganography Detection

Detecting steganography applications on a suspect computer is vital to the subsequent forensic analysis. Most steganography detection programs work best when there are some clues to the sort of steganography in use. Finding steganography software on a computer usually raises the suspicion that there are steganography files with hidden messages on it. Moreover, the kind of steganography software found has a strong bearing on the steganalysis. For example, S-Tools directs attention to WAV, GIF and BMP files, while JP Hide-&-Seek points to JPEG files.

WetStone Technologies' Gargoyle software is used to identify the presence of steganography applications. It employs a set of hashes of the files in well-known steganography software distributions and compares them to the hashes of the files subject to search. Figure xx shows the output when this software is aimed at a directory where steganography software might be stored. Gargoyle data sets can also be used to detect the presence of cryptography, key logging, Trojan horse, password cracking, and other malicious software.

Fig xx: Gargoyle output from a directory on a hard drive.

AccessData's Forensic Toolkit [13] and Guidance Software's EnCase [14] can use the HashKeeper [15], Maresware [16], and National Software Reference Library hash sets to search for a variety of software. These hash sets are intended to exclude known good files from search indexes when conducting forensic analysis. Gargoyle can import these hash sets.
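The general idea of hash-set matching can be sketched in Python. This is not how Gargoyle or any of the named tools is implemented, and it does not read their data set formats: the function names, the use of SHA-256, and the in-memory `suspect_hashes` and `known_good` collections are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so that large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root: Path, suspect_hashes: dict, known_good: set) -> list:
    """Return (path, label) for every file whose hash appears in the suspect set."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        file_hash = sha256_of(path)
        if file_hash in known_good:
            continue  # known good files are left out of the search
        label = suspect_hashes.get(file_hash)
        if label is not None:
            hits.append((path, label))
    return hits
```

A call such as `scan(Path("/evidence"), suspect_hashes, known_good)`, where the path and both hash collections are hypothetical, would list each file that matches a known steganography program together with the label stored for its hash.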

Detecting steganography applications is becoming more and more difficult because of the small size of the software combined with the growing capacity of storage devices. An example is S-Tools: it needs only 600 KB of disk space and can be executed directly from a USB key with no installation. When this happens, no remnants of the program are left on the hard disk.

Another purpose of steganography detection software is to discover possible carrier files. Ideally, the detection software should also provide clues to the steganography algorithm used to hide information in the suspicious file, so that the forensic investigator might be able to recover the hidden data.

WetStone Technologies' Stego Watch [17] provides a likelihood that a file is a steganography medium, together with the likely algorithm used for the hiding, which in turn gives clues to the software that may have been employed. The analysis uses a mixture of user-selectable tests based on carrier file characteristics that could have been altered by the various steganography methods. Knowing which steganography software is on the computer under investigation helps the forensic investigator select the most appropriate statistical tests.
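As an illustration of what a statistical test on carrier characteristics can look like, the Python sketch below computes a pairs-of-values chi-square statistic over byte values. This is a general technique from the steganalysis literature, offered here as an assumption: the text does not say which tests Stego Watch applies, and the function name is hypothetical. The idea is that LSB substitution with random-looking data tends to equalize the counts of each value pair (2k, 2k+1), so an unusually low statistic is a hint that LSB embedding may have taken place.

```python
from collections import Counter


def pairs_of_values_chi_square(samples: bytes) -> float:
    """Chi-square distance between observed even-value counts and the pair average."""
    histogram = Counter(samples)
    statistic = 0.0
    for even in range(0, 256, 2):
        observed_even = histogram[even]
        observed_odd = histogram[even + 1]
        expected = (observed_even + observed_odd) / 2
        if expected > 0:
            statistic += (observed_even - expected) ** 2 / expected
    return statistic


untouched = bytes([0] * 10 + [2] * 10)     # only even values: pairs are very unequal
equalized = bytes([0, 1] * 5 + [2, 3] * 5)  # each pair perfectly balanced
print(pairs_of_values_chi_square(untouched))  # 10.0
print(pairs_of_values_chi_square(equalized))  # 0.0
```

On real files the statistic would need to be compared against a threshold or converted to a probability, and a low value is only an indication, not proof, that a message is present.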