# Information Hiding In JPEG Images Computer Science Essay

Published:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

Since the establishment of the Internet, information security has always been the primary concern in digital transmission. In fact, every service in the Internet assures its clients of their privacy, security and confidentiality. However, along with the advancement of approaches addressing these issues, are also the development of programming malpractices. Thus, Internet transactions are never really secure and are actually always risky; especially those involving highly sensitive data like in telecommunication, business and banking. Consequently, various researches have been established in the hope of resolving these issues. Data encryption and information hiding are just two of the many methods addressing information security.

Cryptography, a sub-discipline of data encryption, is the art and science of writing in secret code. Unfortunately, encrypting the contents of a secret message is sometimes not enough, it may also be necessary to conceal its existence as well. The technique used to implement this essential secrecy is information hiding. Steganography, digital watermarking and fingerprinting deals with information hiding. Basically, the differences of these information hiding methods are based on the purpose of the hidden data [1]. This study, however, is focused only on steganography.

Steganography, which means 'covered or hidden writing', studies the encoding and the detection of hidden information in digital communication transmissions. Steganographic methods hide the existence of an arbitrary digital message by encoding it into suitable carriers such as images, audio files, text files, videos and signals, hence making it difficult for a potential investigator to discover [2]. Among various carrier options, the scope of this study is narrowed to image steganography, particularly JPEG images.

JPEG images are the most common subjects to image steganography. It is not only for the fact that it is the most common image file format used in the Internet, but also because modifications of its standard JPEG tables can be optimized to increase hiding capacity. This concept is the foundation of the works of Chang et al. [3] and Wang et al. [4], which in turn is the basis of the study of Caandi et al. [5], where this study is adapted from. Moreover, in order to perform Optimized Least Significant Bit Substitution (OLSB), the approach used in this study to image steganography, the Harmony Search Algorithm (HSA) is utilized to find the optimal substitution matrix to be used in the encoding process.

Harmony search is a music-based metaheuristic optimization algorithm, introduced by Geem et al. in 2001. Since then, it has been proven useful to many optimization problems including function optimization, engineering optimization, design of water distribution networks, groundwater modelling, energy-saving dispatch, truss design, vehicle routing, and others [6]. This study then intends to try its efficiency in image steganography using the concept of Optimal Least Significant Bit.

## Image

An image is a two-dimensional picture that resembles the appearance of the subject as it illustrates the likeness seen or produced from something or perception of somebody [10]. It is composed of an array of numbers, ranging from 0 to 255, that represent light intensities at various points. These points, referred to as picture elements or pixels, make up the raster data of the image. Moreover, each pixel uses 3 bytes to represent the primary colors: red, green, and blue (RGB), which is the basic color format used in images. In effect, these pixels contribute to the size of the image file. Furthermore, since this study focuses on digital images, compression of the image would be beneficial for size is a major factor in transmitting files [11].

Generally speaking, digital images come in two types. These are raster (also known as bitmap) and vector images. Raster images are composed of a matrix (grid) or bitmap of digital picture elements (pixels). The only restriction when working with bitmaps is the resolution of the viewing device. While filling information at pixel level, in upsizing or downscaling a pixel, the computer has to interpolate (enlarging), or get rid of information (reducing), inevitably damaging the final result. Vectors, on the other hand, have the totally inverse perspective. The image is not a product of "filling" the pixels with information, it is rather the software's interpretation of a bunch of mathematical calculations. As it is resolution independent, pixels are unharmed; rescaling a vector simply requires modification of parameters for mathematical locations. The software rather distributes the image on whatever size is indicated. Hence, no re-sampling takes part and the quality is always the same. No info is interpolated or lost in the process [14, 15].

Bitmap image enlarged to see the individual pixels.

Although it is inherently not supported, bitmaps can carry other information like alpha channels, Z-Depth and ID's, giving the images special properties as transparency and 3D filtering. On the other hand, vector images are not confined to the square space of their canvas, as they have innate transparency properties.

Vectors have inherent transparency properties. Bitmaps need extra information.

Vector images are unpopular since it is less photorealistic than raster images. Nevertheless, bitmaps used with technical negligence are like bad voodoo for printing workflows. The perfect example of a vector is a font. A 6px font has the same sharpness and quality as a 600px font. On the other hand, the perfect example of a bitmap is a high color depth photograph.

Figure 1 Beautiful shot by Gregory Hugh Davidson

There is not a single command in the code of vector images that gives the power to tell the software toÂ exactly, positively fill the displayed pixels on the monitor. The power of vector graphics is that resizing images needs no re-sampling as it simply makes a rendering of an equation over a given grid of pixels. This versatility and power is surprisingly what makes vectors flaw when scaled down to such extremely small sizes as 24 or 16 pixels, like icons. The same rule applies to fonts. Vector fonts are indispensable in 99% of graphic work, but for 9px fonts and below, bitmapped fonts would be a better option.

Two-dimensional pictures are represented as ones and zeros, the fundamental language of a computer called binary. In this study, this representation of an image is an important factor in embedding while other characteristics of a vector or a bitmap image such as size, color, pixel and resolution contribute to the quality of the resulting image after embedding.

## Image Analysis

The concept of image analysis is widely used in machine or computer vision. It is employed in deriving of meaningful information by analyzing the data that comprise an image, such as contour lengths, areas, shape, size distribution, etc; mainly from digital images by means of digital image processing techniques [16]. It can be as simple as bar code reading or as sophisticated as face recognition. Moreover, analyzing images taken from medical apparatus is a crucial function of image analysis tools.

## Image Compression

Image compression is minimizing the size in bytes of a graphics file without degrading the quality of the image to an unacceptable level. The reduction in file size allows more images to be stored in a given amount of disk or memory space. It also reduces the time required for images to be sent over the Internet or downloaded from Web pages [17]. Compressing an image is significantly different than compressing raw binary data. General purpose compression programs can be used to compress images, however, result is less than optimal. This is because images have certain statistical properties which can be exploited by encoders specifically designed for them. Also, some of the finer details in the image can be sacrificed for the sake of saving a little more bandwidth or storage space. This also means that lossy compression techniques can be used in this area [18].Â

## Lossless and Lossy Image Compression

Lossless and lossy compression are two image compression types that describe whether or not, in the compression of a file, all original data can be recovered when the file is uncompressed. With lossless compression, every single bit of data that was originally in the file remains after the file is uncompressed. All of the information is completely restored. This is generally the technique of choice for text or spreadsheet files, where losing words or financial data could pose a problem. The Graphics Interchange File (GIF) is an image format used on the Web that provides lossless compression.

Figure 2 Illustration of compression effect on file size.

Figure 3 Illustration of JPEG compression. On the left, the original image (270 kb).

Then the compressed file in the right (220 kb). [5]

On the other hand, lossy compression reduces a file by permanently eliminating certain information, especially redundant information. When the file is uncompressed, only a part of the original information is still there (although the user may not notice it). Lossy compression is generally used for video and sound, where a certain amount of information loss will not be detected by most users. The JPEG image file, commonly used for photographs and other complex still images on the Web, is an image that has lossy compression. Using JPEG compression, the creator can decide how much loss to introduce and make a trade-off between file size and image quality [19].

## Image Processing

Image processing involves signal processing of a two-dimensional image as the input of a system. The output of image processing may either be a modified image or, a set of characteristics or parameters related to the image. It involves many transformations such as enlargement, size reduction, linear translation and rotation. This process is from treating the image as a two-dimensional signal to applying standard signal-processing techniques to it. Standard techniques involve simply estimating the missing pixels based on the color of the nearest known pixels. More sophisticated techniques may involve using algorithms to judge the missing pixels usually by factoring in the relative colors of all surrounding pixels. Techniques to align images are also quite straightforward [20].

Image processing has large contribution in the success of implementation of computer vision, feature detection and augmented reality. Perhaps the one most familiar application to most of us is in security and surveillance applications. It has also highly contributed to the field of medicine. Medical image processing and medical image processing are only the few of them. Moreover, as we all know, we are in possession of a wealth of redundant data on the surfaces of planets but need to use image processing to highlight areas of interest for further study. Hence, a third area of significance of image processing is dealing with images obtained through remote imagery such as satellites through which scientists extends efficiency and capability to judge the presence of craters, soil and atmospheric characteristics [20]. However, with modern digital technology, the demand of image processing in diverse application areas is continuously growing like in the field of multimedia computing, secured image data communication, biomedical imaging, biometrics, remote sensing, compression and the like.

## Color Space

Color space is an abstract model used in the method of specifying, creating and visualizing color. It has three main attributes, that is, saturation, brightness and hue. These attributes indicate a position within the color space, which in turn identifies a color, and that is more likely a conversion or combination of colors. There are six different color spaces at the present that are used in image and video processing: RGB (Red Green Blue), CMY (K) (Cyan Magenta Yellow (Black)), HSL (Hue Saturation and Lightness), YUV, YCbCr and YPbPr. Among these color spaces, RGB, YCbCr and YUV have been useful in compression applications [5, 21].

For every application, a certain color spaces suits it best, for example some equipment has limiting factors that dictate the size and type of color space that can be used. Some color spaces as device dependent while others are equally valid on whatever device they are used. Moreover, most computer graphics color spaces are perceptually non-linear, which makes it inefficient for coding color information as some areas will enable expression at too high a precision ,while others at not enough. Another problem is that many can be unintuitive in use is that specification of a desired color can be difficult for the inexperienced eye, like distinguishing brown in an RGB vector) [21].

## RGB

RGB is frequently used in most computer applications since no transform is required to display information on the screen. For this reason it is commonly the base color space for most applications. This is the color space produced on CRT displays and similar monitors when pixel values are applied to a graphics card. RGB space may be visualized as a cube with the three axes corresponding to red, green and blue. The bottom corner, when red = green = blue = 0 is black, while the opposite top corner, where red = green = blue = 255 (for an 8 bit per channel display system), is white [21].

Figure 4 RGB Color Cube [25]

## Luminance and Chrominance

Luminance and chrominance based color spaces correspond to brightness and color. Luminance is the perceived brightness, or grayscale level, of a color produced from the density of RGB colors in a projected image. It is adjustable to produce not only colors, but also blacks and whites of varying intensity. On the other hand, chrominance defines the color component of an image pixel as it sets its hue and saturation [5, 25]. These color spaces are denoted as YUV, YIQ, YCbCr, and YCC.

The YUV space is a European standard, while, the YIQ color space is North American. These two are analogue spaces for NTSC and PAL systems respectively, while YCbCr is a digital standard. These methods are nearly identical, using slightly different RGB color space conversion equations. In both systems, Y is the luminance or brightness component and the U and V (or I and Q, Cb and Cr) are the chrominance, or color, components. The values are bipolar, which goes negative as well as positive. These are the variables that are changed by the brightness, color, and tint controls on an image [21, 22].

The advantage of using these color spaces is that the amount of information needed to define a color image is greatly reduced, based on human visual perception. However, this compression restricts the color range in these images. Many colors that can appear on a computer display cannot be recreated on a television screen. Nevertheless, this project used YCbCr, which is simply a lightened, gamma adjusted version of YUV color space, in the JPEG processing [5, 22, 26].

## YCbCr

Human eye sensitivity to color differences and brightness is the basis of YCbCr color space. It deals only with the digital representation of RGB signals in YCbCr form [21]. The following are the coding conversion equations for non-linear signals used in the JPEG processing:

Y0 = 16 + (0.299) R0 + (0.587) G0 + (0.114) B0

Cb0 = 128 - [(-0.16874) R0 - (0.33126) G0 + (0.5) B0]

Cr0 = 128 + [(0.5) R0 - (0.41869) G0 - (0.08131) B0]

## Joint Photographic Experts Group

JPEG is a well-known standard method of compressing photographic images. It was specified by a committee called "Joint Photographic Experts Group" formed in 1986 and became an international standard since 1992 [5].

JPEG format is one of the two most common formats used on the Web. It has the capability of storing 24 bits-per-pixel and designed for compressing both colored and gray-scale images of real-world scenes. It used in many applications such as satellite, medical and general photography [5].

## JPEG Process

A JPEG process is divided into two parts: the encoding method and the decoding method. The encoding method specifies multi-level coding processes: Color Space Transformation, Block Splitting, Discrete Cosine Transformation, Quantization, and Entropy Coding. On the other hand, the decoding method is the reverse process of the encoding process. It is a repetitive process since an image can be compressed and decompressed several times [5].

JPEG Encoding Algorithm [26]

## JPEG Encoding

The JPEG encoding algorithm performs compression in five phases:

An image is converted from RGB to YCbCr (luminance and chrominance color spaces)

Then, the image is partitioned to separate blocks of 8Ã-8 pixels.

Apply Discrete Cosine Transform (DCT) to each block.

Quantizing the DCT coefficients

Entropy Coding.

## Color Space Transformation

Each component of color images is compressed independently, as with the lossless compression scheme. RGB color space is not the most efficient way to JPEG compress images, as it is particularly susceptible to color changes due to quantization [25].

Color images are converted from RGB components into YCbCr color space, which consists of the luminance (grayscale) and two chroma (color) components. Note that although some literature (including some previously produced by the HyperMedia Unit) states that images are converted into YUV components - or is just very hazy about what format the image is converted to - it is not YUV exactly [25].

## Block Splitting

Each of the three YCbCr color planes are encoded separately but using the same scheme. Eight by eight pixel blocks (64 pixels in total) were chosen as the size for computational simplicity. If the size of the image is not a factor of 8, it is padded out to the required size, with extra space being added on the left and bottom of the image. When the image is decoded, these pad pixels are chopped off [25].

## Discrete Cosine Transformation

DCT separates the frequencies in an image by converting the spatial image representation into a frequency map of DCT coefficients. The low-order or DC coefficient, found at the top-left corner of each block, represents the average value in the block. The rest of the coefficients are the higher-order or AC coefficients, which represent the strength of more and more rapid changes across the width or height of the block. The highest AC coefficient represents the strength of a cosine wave alternating from maximum to minimum at adjacent pixels [27]. The following equation is the idealized mathematical definition of the 8x8 DCT [29]:

Since the DCT works on a particular matrix format, the values are levelled off from unsigned integers (0 to 255) to signed integers (-128 to 127), which is done by simply subtracting 128 to every matrix entry [28].

Figure 5 - Example of Matrix Shifting [28]

From this point on, each color component is processed independently, so a pixel means a single value, even in a color image [27, 28]. To get the matrix form, we need to use the following equation [29]:

Finally, in order to perform DCT proper, matrix multiplication following this equation is used:

Wherein the example used in Figure ??? will result to this matrix:

Figure 6 - DCT coefficients [28]

The DCT calculation is fairly complex; in fact, this is the most costly step in JPEG compression. The point of doing it is for easier discarding of high-frequency data without losing low-frequency information. Since this step is merely replacing pixel blocks values by 64 DCT coefficients, the DCT itself is lossless except for round off errors [27].

## Quantization

The purpose of quantization is to discard information which is not visually significant. This step takes advantage of low sensitivity of human eye to high frequency brightness variation to greatly reduce the amount of information in the high frequency components. This is done by simply dividing each component in the frequency domain by a constant for that component, and then rounding to the nearest integer. This is the main lossy operation in the whole process [30].

To discard an appropriate amount of information, each of the 64 DCT coefficients is uniformly quantized in conjunction with a 64-element Quantization Table, which is specified by the application. and rounds the result to an integer. The larger the quantization coefficient, the more data is lost, because the actual DCT value is represented less and less accurately. Each of the 64 positions of the DCT output block has its own quantization coefficient, with the higher-order terms being quantized more heavily than the low-order terms (that is, the higher-order terms have larger quantization coefficients). Furthermore, separate quantization tables are employed for luminance and chrominance data, with the chrominance data being quantized more heavily than the luminance data. This allows JPEG to exploit further the eye's differing sensitivity to luminance and chrominance [27, 29].

It is this step that is controlled by the "quality" setting of most JPEG compressors. The compressor starts from a built-in table that is appropriate for a medium-quality setting and increases or decreases the value of each table entry in inverse proportion to the requested quality. The complete quantization tables actually used are recorded in the compressed file so that the decompressor will know how to (approximately) reconstruct the DCT coefficients [27].

However, this project used modified quantization tables as proposed by Chang et al. [3], and Li and Wang [4].

Proposed Modified and Optimized Luminance JPEG Quantization Table

8

8

1

1

1

1

1

1

1

1

1

1

1

1

1

1

28

1

1

1

1

1

1

35

28

1

1

1

1

1

44

40

31

1

1

1

1

34

55

52

39

1

1

1

32

41

52

57

46

1

1

39

44

52

61

60

51

1

46

48

49

56

50

52

50

Proposed Modified and Optimized Chrominance JPEG Quantization Table

9

1

1

1

1

1

1

1

1

1

1

1

1

1

1

50

1

1

1

1

1

1

50

50

1

1

1

1

1

50

50

50

1

1

1

1

50

50

50

50

1

1

1

50

50

50

50

50

1

1

50

50

50

50

50

50

1

50

50

50

50

50

50

50

Selection of an appropriate quantization table is something of a black art. Most existing compressors start from a sample table developed by the ISO JPEG committee. It is likely that future research will yield better tables that provide more compression for the same perceived image quality. Implementation of improved tables should not cause any compatibility problems, because decompressors merely read the tables from the compressed file; they don't care how the table was picked [27].

## Entropy Coding

The final processing step of encoder is entropy coding. Entropy coding is a special form of lossless data compression. It involves arranging the image components in a "zigzag" order employing run-length encoding (RLE) algorithm that groups similar frequencies together, inserting length coding zeros, and then using Huffman coding on what is left [30].

Before entropy coding, there are a few processing steps for the quantized coefficients. The resulting coefficients after quantization contain a significant amount of redundant data. Huffman compression will losslessly remove the redundancies, resulting in smaller JPEG data. An optional extension to the JPEG specification allows arithmetic encoding to be used instead of Huffman for an even greater compression ratio. At this point, the JPEG data stream is ready to be transmitted across a communications channel or encapsulated inside an image file format [27].

Note that the DC coefficient is treated separately from the 63 AC coefficients. The DC coefficient is a measure of the average value of the 64 image samples. Because there is usually strong correlation between the DC coefficients of adjacent 8x8 blocks, the quantized DC coefficient is encoded as the difference from the DC term of the previous block in the encoding order, called Differential Pulse Code Modulation ( DPCM ), and the function is as followed [29]:

## DiffDC(i) = DC(i) - DC(i-1)

where i is the i-th block, DC(0) = 0

DPCM can usually achieve further compression due to the smaller range of the coefficient values [29].

The remaining AC coefficients are ordered into the "zig-zag" sequence, which helps to facilitate entropy coding by placing low-frequency coefficients before high-frequency coefficients [29].

Now the outputs of DPCM and "zig-zag" scanning can be encoded by entropy coding separately. It encodes the quantized DCT coefficients more compactly based on their statistical characteristics. The JPEG proposal specifies two entropy coding methods: Huffman coding and arithmetic coding.

Entropy coding can be considered as 2-step process. The first step converts the zig-zag sequnce of quantized coefficients into an intermediate sequence of symbols. The second step converts the symbols to a data stream in which the symbols no longer have externally identifiable boundaries. The form and definition of the intermediate symbols is dependent on both the DCT-based mode of operation and the entropy coding method.

## Â

## Steganography

The rise of internet resulted in many systems that secure the information. Cryptography hides the context of a message but makes no attempt to hide the existence of the message. To secure a message, it may also be necessary to conceal its existence as it will prevent a potential investigator from suspecting that the message exists. The technique used to implement this essential secrecy is steganography. The word is derived from Greek, and literally means "Covered Writing". The process of a typical steganography technique basically includes the following steps [13]:

1. Providing a file (such as a gray-scale image) called Cover Media in which the secret information is supposed to be embedded.

2. Using a stego-key in order to get the output file called as stego-media. It is worth mentioning that without having stego key the extraction procedure of the secret message bits is impossible.

3. Embedding the secret message file inside the cover media that can be any file (a text file).

## Optimal Least Significant Bit Substitution

The Optimal Least Significant Bit Substitution scheme improves the stego-image quality by finding an optimal pixel after performing an adjustment process. Three candidates are picked out for the pixel's value and compared to see which one has the closest value to the original pixel value with the secret data embedded in. The best candidate is then called the optimal pixel and used to conceal the secret data [13].

The embedding algorithm is as follows:

Let be the original pixel value and k-bit(s) of secret data is to be embedded.

Embed k bit(s) of secret data into by using the LSBs method. The stego-image can be obtained.

Generate another two pixel values by adjusting the (k + 1) th bit of .

Therefore, and can be calculated as follows:

Obviously, the hidden data in and is identical to because the last k-bits of them are the same.

The best approximation to the original pixel value, , (i.e. the optimal candidate) is found by the following formula:

Finally, all the optimal candidates for replace the original pixel values and the embedding algorithm come to its end.

## Harmony Search Algorithm

Computer scientists have found a connection between playing music and optimal solution finding. This relationship led to the creation of a new algorithm called Harmony Search. The Harmony Search Algorithm (HSA) was first developed by Geem et al. in 2001. Since then, it has been applied to many optimization problems and has been proven of its effectiveness and advantages as demonstrated in various applications which include design of water distribution networks, groundwater modeling, energy-saving dispatch, truss design, vehicle routing, function optimization, engineering optimization and others [6].

The HSA is a music-based metaheuristic optimization algorithm, which was inspired by the observation that the aim of music is to search for a perfect state of harmony. The effort to find that harmony in music is analogous to the searching of optimality in an optimization process. Derived from an orchestra or a band of jazz musicians, each musician assigned to his own instrument plays a note contributing to the overall quality of the harmony of the music produced [6].

## HSA: Optimum Solution Finding

In searching for the perfect harmony, a musician employs three methods: (1) Playing from memory, (2) Playing modified music from memory and (3) Creating music through random notes. Geem et al. has formalized these three methods into the newly developed optimization algorithm. The three corresponding components of HSA are (1) Memory Consideration, (2) Pitch Adjustment and (3) Randomization. Each of these elements plays a vital role in searching for optimal solution in HSA [6, 12].

For a musician to create a good music, he may consider existing compositions. This composition consideration is related to HSA as the Harmony Memory Consideration. The Harmony Memory ensures that possible solutions are stored as elements in a solution vector.

Another way a musician can play good music is by playing a modified existing composition. The HSA also uses this concept and associated as the Pitch Adjustment, also referred as the exploitation mechanism of HSA, which is responsible for generating solutions that are slightly varied from the existing solutions.

Randomization comes in to play as the third method in HS, also referred as the exploration mechanism of HSA. This ensures that the search for the solution is not limited in the local optima. In effect, the solution set are more diverse.

In the program flow, after initialization, the optimization process starts and terminates until termination condition is reached. Each decision variable in the HSA contributes to the optimization. The value of each decision variable is decided with respect to harmony memory acceptance rate (rhmcr) on each pass. The rhmcr decides if the value of the ith variable will be taken from the values in the harmony memory [6] [12].

## HSA: Pseudo Code

Harmony Search Algorithm

## Begin

Define objective function f(x), x = (x1, x2â€¦ xd) T

Define harmony memory accepting rate (raccept)

Define pitch adjusting rate (rpa) and other parameters

Generate Harmony Memory with random harmonies

While (t < max number of iterations)

While (i <= number of variables)

If (rand < raccept) Choose a value for the variable i

If (rand<rpa) Adjust the value by adding certain amount

## End if

Else Choose a Random Value

## End if

## End while

Accept the new memory if better

## End while

Find current best solution

## End

## HSA: Flow Chart

## Chapter 2

## Literature Review

Information hiding has been the focus of many researches and the techniques employed become more complex. Of all the techniques employed to steganography, spread spectrum methods satisfy most requirements and are especially robust against statistical attacks. On the other hand, least significant bit substitution methods are at the bottom of the hierarchy, however it has its strengths, and are therefore subject to researches to improve imperceptibility [1].

Recently, several steganographic techniques for data hiding in JPEGs have been developed: JSteg, JP Hide&Seek, F5, and OutGuess. All these techniques manipulate the quantized DCT coefficients to embed the hidden message.

Li and Wang [7] contributed a new steganographic method derived from JPEG and Particle Swarm Optimization (PSO) algorithm. They utilized the PSO for selecting the optimal substitution matrix. Moreover, they used the concept of JPEG and Quantization Table Modification (JQTM) method for modifying the JPEG standard quantization table which yielded an increased hiding capacity in the host image. It is duly noted that Optimal Least Significant Bits (OLSB) improves stego-image quality by embedding the secret image in the least significant bits of the pixels of the host image. In totality, the significance of their study is an achievement for addressing the trade-off between payload and image quality. However, their proposed method still exposed some limitations when dealing with longer computational time requirement and larger size of JPEG file.

Wang et al. [8] proposed an OLSB and Genetic Algorithm (GA) approach to image steganography. They employed Peak-Signal-to-Noise Ratio (PSNR) calculation for each substitution matrix used in the GA process. They also computed for the Mean Square Error (MSE) from which they proved that the worst MSE of the optimal substitution is actually identical to half of the worst MSE of the simple substitution. This confirms the effectiveness of OLSB substitution under the worst condition. Moreover, they utilized GA to solve the problem of hiding data in the rightmost k LSBS of the host image, since this technique may require longer computational time to find the optimal result when k is large. This is mainly because the number of possible solutions grows exponentially as k increases. In cases, however, where k â‰¤ 3, the embedding in the k LSBs of the host image should work just fine. Furthermore, they developed an improved hiding technique based on conceptual modelling and obtained a high quality embedding result. Specifically, they used the rightmost four and two LSBs of noisy pixels and smoothing pixels, respectively, to hide the important data. Their experimental results reveal the practicality and superiority of their proposed method.

Bhattacharya, et.al [9] used Session Based Stego-key, Genetic Algorithm (GA) and variable bit replacement technique to improve image-hiding technique. In their approach, the secret image is perturbed in two stages. It starts by entering the 8 bit Stego Key. This Key (K) is a decimal equivalent of first 3 Moderately Significant Bit (MdSB) of KS which ranges from 0 to 7. The encrypted Kth bit of each Byte of perturbed image = [Kth bit of each Byte of Hidden image] XOR [Kth bit of SK]. Then, last step for the first stage perturbation of image, swap the lower and upper 4 bits of each byte to obtain perturbed image. The second stage involves the use of GA in generating Transposition operator P when the encrypted output file is once again perturbed.

Harmony Search Algorithm (HSA) is a new metaheuristic algorithm developed by Geem et al. in 2001. [6] [12]. Since then, it has been applied to many optimization problems and has been proven of its effectiveness and advantages as demonstrated in various applications. Harmony Search has outperformed the other previously existing optimization algorithm which includes ACO, PSO and GA, with respect to the benchmark problems like Traveling Salesman Problem. Moreover, in the case of real world problems, HSA has also outperformed other optimization algorithms such as groundwater modelling. Another notable about HSA is that, in terms of cost, it was able to find a more optimal solution than GA. With its potential and all that HSA has yet to offer, it would be interesting to use this metaheuristic as an underlying algorithm for other fields especially steganography.

Geem and Williams [7] applied Harmony Search on Maximal Covering Species Problem (MCSP), an ecological conservation problem which aims to preserve species and their habitat. It attempts to find the maximum number of species while limiting the number of parcel. The algorithm was tested on the so called Oregon data, which consists of 426 species and 441 parcels. They modified the structure of HSA since their decision variable yields only two possible values. Consequently, they omitted the pitch adjustment operation for HSA to be able to suit to the problem. The results have confirmed the viability of HS on ecological optimization and outperformed Simulated Annealing in solving the said problem.

## Chapter 3

## Statement of the Problem

Along with the advancement of approaches addressing these issues, are also the development of programming malpractices. Thus, Internet transactions are never really secure and are actually always risky; especially those involving highly sensitive data like in telecommunication, business and banking.

Data hiding is one method addressing information security. It is concerned with concealing information in digital media, thus making it difficult for a potential investigator to discover. One sub-discipline of data hiding is steganography. Steganography studies the encoding and the detection of hidden information in digital communication transmissions.

A good steganographic algorithm has to be imperceptible. Steganographic imperceptibility requires invisibility, sufficient payload capacity, robustness against statistical attacks and image manipulation, independence in file format, and carrier files of unsuspicious nature.

Using the concept of Optimal Least Significant Bit Substitution, this study, will test if Harmony Search Algorithm can be used to produce and support efficient image steganography.

## Chapter 4

## Proposed Methodology

## 4.1 Input

In this project, we have standardized the cover image to be a 256 x 256 pixels 24-bit image at 256 gray levels, since the best cover image is a grayscale image [23]. A 256 x 256 pixel cover image with 24 bit gray-level, the resulting 8 x 8 pixels are 1,024 blocks. Also, in the embedding, the secret data used can only be an image or a text file.

The modified quantization table gives the capacity of 36 coefficients in each block to embed data, which yields a reasonable payload of 36,864 coefficients in the entire cover image. In effect, when k = 2, the cover image capacity is 73,728 bits or 9,216 bytes. Correspondingly, when k = 4, capacity is 147,456 bits or 18,432 bytes. This then means a constraint to the secret data file size.

Taken from [5]

Figure 4.1: Lena Image as cover image

## 4.2 Data Preparation

To be able to meet the requirements of the algorithm, the cover image and the secret data will be treated accordingly. The cover image will undergo JPEG re-encoding which is essential for embedding the data, which is helpful for compression as well. On the other hand, the secret data will be converted to its corresponding bytes, which is then decomposed into its k-bit units and treated as single pixel. The k-bit secret messages will be represented as a decimal value ranging from 0 to 2k - 1.

5.4 The Procedure for Embedding

To examine and determine the efficiency of HSA, the proposed embedding procedure is a combination of JPEG compression process and the substitution matrix procedure of Li and Wang. Rather than using PSO as the solution explorer for the optimal substitution matrix, HSA will be utilized. Figure 5.3 describes the method to be employed.

Until the Maximum Cycle is reached, the processes within the dashed box in Figure 5.3 are repeated. The main embedding is done with the logical operator exclusive or (EOR, XOR, EXOR, or âŠ•); it is false when both are true or both are false. The associative and commutative property of XOR makes it a feasible way to embed information. The stego-image with the highest PSNR will be the output.

Cover Data

Secret Data

Image partition to blocks of 8 x 8 pixels

Unsigned Byte

Decompose into

K-bits

RGB - YUV color transform

HSA Matrix Substitution

Matrix Substitution

Translation

## DCT

Quantization

Embedding

PSNR

Entropy encode

MSE

JPEG Stego

Cover Data

Figure 5.3: The embedding procedure diagram

5.4.1 JPEG Steganography

JPEG based Steganography starts when the cover image is partitioned to blocks of 8x8 pixels. Then, to acquire a higher hiding rate, the RGB components of each block is converted to YUV using equations 1-3. Using DCT, transform each value in a block to DCT coefficients and scale each coefficient using the modified and optimized quantization tables (Tables 5.1 and 5.2) based on Huang et.al and Chang et.al works.

After Harmony Search Algorithm (HSA) finds the best-substituted data, the data now can be embedded to DC-to-middle frequency components of the quantized DCT coefficients for each 8x8 blocks specifically in the 36 coefficients using the embedding order in Figure 5.6. Then k secret bits will be embedded to the least k significant bits of each coefficient where k is either 2 or 4.

Figure 5.4: Lena Image Subdivided into Figure 5.5: A block of 8 x 8 pixels

blocks

The next procedure will be employing of JPEG entropy coding (Huffman coding, Run-Length coding and Differential Pulse Code Modulation) in compressing the resultant blocks to obtain a JPEG file. These files will contain important data gathered from compressing an image such as the quantization table and some other compressed data that will meet the requirements of the JPEG standard.

Figure 5.6: DC-to-middle frequency embedding order

Table 5.1: Proposed Modified and Optimized Luminance JPEG Quantization Table

## 8

## 8

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 28

## 1

## 1

## 1

## 1

## 1

## 1

## 35

## 28

## 1

## 1

## 1

## 1

## 1

## 44

## 40

## 31

## 1

## 1

## 1

## 1

## 34

## 55

## 52

## 39

## 1

## 1

## 1

## 32

## 41

## 52

## 57

## 46

## 1

## 1

## 39

## 44

## 52

## 61

## 60

## 51

## 1

## 46

## 48

## 49

## 56

## 50

## 52

## 50

Table 5.2: Proposed Modified and Optimized Chrominance JPEG Quantization Table

## 9

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 1

## 50

## 1

## 1

## 1

## 1

## 1

## 1

## 50

## 50

## 1

## 1

## 1

## 1

## 1

## 50

## 50

## 50

## 1

## 1

## 1

## 1

## 50

## 50

## 50

## 50

## 1

## 1

## 1

## 50

## 50

## 50

## 50

## 50

## 1

## 1

## 50

## 50

## 50

## 50

## 50

## 50

## 1

## 50

## 50

## 50

## 50

## 50

## 50

## 50

## 5.4.2 Substitution Matrix Using HSA

A harmony X will be defined as a substitution matrix of 2k dimensions.

X = x0 x1 â€¦ x2k-1 (17)

where x0 has the value 1 in the row 0 of the substitution matrix, x1 has the value 1 in the row 1 of the substitution matrix and up to the ()th row where the value is 1.

M = M'=

Figure 5.7: Matrix substitution

The harmony is just a one-dimensional representation of the 2k x 2k substitution matrix. From the matrix M, value 1 of the row 0 is found at column 0 so the first row of M' contains 0. Similarly, the second column of M' is 3 because the value 1 in row 1 is located at column 3. In simple terms, M' is just the column index where the value 1 is found for every row in matrix M.

## 5.4.3.1 Harmony Initialization

From the basic HSA, the parameters are redefined to examine the algorithm's performance. We set the MaxC as the maximum iteration, HMS as the number of employed bees, HMCR as the number of onlooker bees and the number of scout bees as PAR. The first three parameters are set to be configurable by the user while the sN takes 10 percent of the employed bees.

## 5.4.3.2 Evaluation Function

The stego-image quality assessment is done through the peak signal to noise ratio (PSNR) evaluation using the mean squared error. For each harmony, the performance will be evaluated under the gray-level using MSE (Equation 7) and PSNR (Equation 8).

In this study, a modification to the HSA's fitness function is necessary since PSNR is the standard performance measure for fitness function used in Steganography.

(20)

where i = 1,2,â€¦N.

## 5.4.3.3 Randomization

## 5.4.3.4 Pitch Adjustment

## 5.4.3.5 Memory Consideration

## 5.4.3.6 Termination Condition

The only termination condition is to reach the number of cycles set by the user. When this value is reached then the stego-image produced by the best PSNRi will be displayed.

## 5.5 The Procedure for Extraction

In the extraction process, take the reverse of the embedding process as shown in Figure 5.7. To extract the kth LSBs, XOR is employed. The transpose of the HSA-selected substitution matrix will be used to decode the exact data embedded into the LSBs of cover-image.

Three inputs are necessary to conduct the extraction procedure, the stego file, the cover image, and the substitution matrix. The JPEG file or stego image is entropy decoded and copies the blocks of quantized DCT coefficients. A particular JPEG component called QDCT Luminance component holds the secret data. The QDCT Luminance values of the stego image are XORed with the QDCT Luminance values of the cover image to get the embedded information.

## JPEG Stego

## Entropy decode

## Compose k-bits into N-bits

## Secret Data Extraction

## Data Translation with Matrix Substitution Transpose

## Cover QDCT Luminance

## Stego QDCT Luminance

## Secret Data

Figure 5.7: The extraction procedure diagram

It is necessary to compose the bits back to its original byte form with respect to the value of k used. Finally, using the transpose of the substitution matrix, the original secret data will be obtained. It is unnecessary to reconstruct the cover image from the stego image since part of the extraction requirements is the cover image itself because XOR is employed. In short, cover data is XORed with the stego data to get the secret data. If the cover data is incorrect then the secret data is also incorrect.