# Preserving Sensitive Information In Shared Documents Computer Science Essay


This paper proposes a mechanism by which a user can secure confidential and sensitive information in a shared document space, where every user can interact with other users while protecting sensitive details such as birth date, account number and health records. The objective is to hide this information using a new variant of the Vigenere cipher that encrypts all characters: alphabetic, numeric and special characters. The cipher text depends on the USER KEY (which may also be of mixed type) rather than on the login/password (as commonly used in practical applications based on numeric data only [2]), so that the administrator cannot read or modify the protected content. The paper also compares the proposed method with other algorithms, such as the classical Vigenere cipher, 3kdec [3] and the θ-Vigenere cipher [1], and argues that it is the stronger encryption technique.

## INTRODUCTION

In the current scenario, the three major aspects of information security are confidentiality, integrity and availability, also known as the CIA triad. The first and most widely used security mechanism is authentication and authorization. When a user wants to access any information, he must first clear the security gate by providing a valid key, i.e. a valid user name with an appropriate password; this process is known as AUTHENTICATION. After clearance, the user is AUTHORISED to access the information concerned. When an outsider or malicious user breaches this level of security, a higher level of security is needed, namely ENCRYPTION of the message. Encryption is the process of encoding a message so that its meaning is not obvious, and decryption is the reverse process, transforming an encrypted message back into its normal form. A good encryption algorithm should fulfill the following requirements:

Encryption and decryption should be transparent to the user, i.e. the user should feel that he is interacting with the original, non-encrypted database when firing the required query.

Sensitive information in the stored document should be encrypted in such a way that it is not directly usable by an adversary.

The length of the encrypted message should not exceed that of the original message.

The strength of the algorithm depends upon the security of the key. The security of the key is maintained using probability and the substitution method. As the number of substitutions increases, the security of the key increases.

There are two kinds of encryption: symmetric key encryption and asymmetric key encryption. In symmetric key encryption, the key on the encryption side and the key on the decryption side are the same. According to Kerckhoffs's principle, "the security of symmetric key encryption depends only upon the secrecy of the key". In asymmetric key encryption there are two types of keys: a public key and a private key. If there are two users A and B and A wants to send data to B, then A encrypts the data with the public key of B, and at the other end B decrypts it with his private key. This paper is concerned only with symmetric keys. Symmetric key encryption is in turn of two types: stream cipher and block cipher. In a stream cipher, each character of the plain text is combined with a key character, and the cipher text is generated from these combinations. The block cipher differs only in that the plain text is divided into equal blocks and one set of key characters is used for one particular block only. Our solution follows the block cipher method in symmetric key encryption.

"Any cryptographic schema is safe if and only if it is unbreakable in reasonable time using feasible resources in spite of the intruders being aware of: (1) the encryption and decryption algorithm and (2) the size of the key…"

This paper presents an algorithm that adds confusion to the Vigenere stream cipher; with a small modification it is implemented as a block cipher. The algorithm is strengthened by changing the key for each block using a particular method.

## Earlier Work

In a shared environment, the security of private information is assured using cryptography. Cryptography is the study of mathematical techniques related to aspects of information security such as the confidentiality and integrity of data.

Generally, in a shared environment the authentication process is carried out using the user name and password method, so under this type of security schema the stored information can be accessed by two persons: the user himself and the Database Administrator. This does not fulfil the requirement of confidentiality at the user level.

The classical cryptographic schema, and the encryption task in particular, depends on the user key ONLY. Some algorithms are limited to encrypting numeric characters to numeric characters [1] or alphabets [9]. Alphabets are encrypted by substituting them with other alphabets or numbers [10].

Cryptography is implemented using various encryption algorithms; one of them is the classical Vigenere algorithm, in which the data is encrypted using a key that is repeated up to the length of the plain text.

Message: sensitive information.

User Key: enigma

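| Plain text | s | e | n | s | i | t | i | v | e | i | n | f | o | r | m | a | t | i | o | n |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Repeated key | e | n | i | g | m | a | e | n | i | g | m | a | e | n | i | g | m | a | e | n |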
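The key repetition in this example can be sketched in Python. This is a minimal illustration of the classical scheme restricted to the 26 lowercase letters; the function names `repeat_key`, `vigenere_encrypt` and `vigenere_decrypt` are illustrative and are not taken from the original work.

```python
import string

ALPHABET = string.ascii_lowercase  # the classical cipher covers 26 letters only


def repeat_key(key: str, length: int) -> str:
    """Repeat the user key cyclically until it covers `length` characters."""
    repeats = length // len(key) + 1
    return (key * repeats)[:length]


def vigenere_encrypt(plain: str, key: str) -> str:
    """Shift each plain letter forward by the matching key letter (mod 26)."""
    expanded = repeat_key(key, len(plain))
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plain, expanded)
    )


def vigenere_decrypt(cipher: str, key: str) -> str:
    """Shift each cipher letter backward by the matching key letter (mod 26)."""
    expanded = repeat_key(key, len(cipher))
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(cipher, expanded)
    )


message = "sensitiveinformation"  # the example message with the space removed
cipher = vigenere_encrypt(message, "enigma")
```

Any character outside `ALPHABET` (a digit, a space, a special character) makes `ALPHABET.index` raise `ValueError`, which is exactly the limitation discussed next.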

The problem arises when the information in the document is not restricted to numerical, alphabetic or alphanumeric data. A document may contain any combination of these. Any ASCII character can appear in the document, i.e. its value lies in the range 0 to 255 (256 characters in total, each represented by 8 bits). Since the ASCII values from 0 to 31 are control characters such as backspace and vertical tab, they are not considered part of the document text.

## Our Proposed Work

This paper describes how to secure confidential information in a shared environment where every user can store information by uploading documents. The focus is on securing the private information of the user. The part of the information that is to be secured is collected from the user and encrypted using the newly proposed algorithm. This algorithm is a modification of the Vigenere algorithm, and various file types such as text (.txt) and Word (.doc/.docx) files can be encrypted and decrypted with it.

The user gives the file, and the fields of the file that are to be encrypted, as input to the encrypting algorithm. The cryptographic system is described by the quintuple $(M, C, K, E_k, D_k)$, where $M$ is the plain text message, $C$ is the cipher text message, $K$ is the set of keys, $E_k$ is the family of encoding functions and $D_k$ is the family of decoding functions indexed by the key.

We have performed encryption using a block cipher [9], and the size of the block is taken according to the size of the key. According to the basic rule of cryptography, the following equation must hold:

$$D_\theta(E_\theta(m)) = m$$

where $D$ is the decoding function of the message under the new algorithm, $E$ is the encoding function of the message under the new algorithm, and $\theta$ is the degree of the function, or size of the datagram.

The value of $\theta$ is determined by the number of characters in the key that the user entered for authentication.

## Encryption and Decryption through extended Vigenere Matrix

Key-Expansion algorithm

Design of Encryption Schema.

Design of Decryption Schema.

## Key-Expansion algorithm

As discussed earlier, repetition of the key is a major advantage for cryptanalysis, because an attacker can recover the plain text from the cipher text within a few seconds. The method developed by Kasiski and Kerckhoffs [] relies on the fact that the key repeats up to the length of the plain text and that, in natural languages, letters are relatively repetitive; the Index of Coincidence (I.C.) reflects how frequently each character is used in a particular language [7]. Finding high-frequency characters or repeated groups of characters, and factoring the distances between the repetitions, is the most promising way to derive the key length [b-9-1]. Once the length of the key is known, the actual key can easily be found by trying combinations.
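The two tools named in this paragraph can be sketched as follows. This is an illustrative Python sketch using the standard textbook definitions, not code from the original work; `index_of_coincidence` and `repeat_distances` are hypothetical names.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn at random from `text` are equal."""
    letters = [c for c in text.lower() if c.isalpha()]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


def repeat_distances(cipher: str, size: int = 3) -> list[int]:
    """Kasiski-style scan: distances between repeated groups of `size` characters."""
    last_seen: dict[str, int] = {}
    distances: list[int] = []
    for i in range(len(cipher) - size + 1):
        gram = cipher[i:i + size]
        if gram in last_seen:
            distances.append(i - last_seen[gram])
        last_seen[gram] = i
    return distances
```

English text has an index of coincidence of roughly 0.067, against about 0.038 for uniformly random letters, and the distances returned by `repeat_distances` tend to be multiples of the key length when the key repeats. Both observations are what the key expansion described below is meant to defeat.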

Our new key-expansion algorithm is the most significant change to the classical Vigenere cipher and to other related algorithms based on Vigenere matrix cryptography. It adds the advantage of confusion, which is provided by expanding the user key through an exclusive OR (XOR) operation on the actual key. The following method is used for key expansion.

Specification of character ASCII values:

| No. | Category | ASCII value range | Bits used to identify range |
| --- | --- | --- | --- |
| 1. | ASCII control characters | 0 to 31 | 5 bits |
| 2. | ASCII keyboard characters | 32 to 127 | 6 or 7 bits |
| 3. | Extended ASCII code | 128 to 255 | 8 bits |

The user can specify key characters from the ASCII keyboard character category, so each may be a 6-bit or a 7-bit character. A 6-bit character is padded with one bit, i.e. a zero is inserted as the most significant bit (MSB). The reason is that when the XOR operation (described below) is performed between binary numbers, they must have the same number of bits. Thus the resultant will be 7 bits in size.
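The padding rule can be illustrated with a short Python sketch. The helper names `to_7bit` and `xor_7bit` are hypothetical; the sketch only assumes that key characters lie in the keyboard range 32 to 127 given in the table above.

```python
def to_7bit(ch: str) -> str:
    """Return the 7-bit binary string of a keyboard character, zero-padded at the MSB."""
    code = ord(ch)
    if not 32 <= code <= 127:
        raise ValueError("key characters must be ASCII keyboard characters (32 to 127)")
    return format(code, "07b")


def xor_7bit(a: str, b: str) -> str:
    """XOR two keyboard characters and return the 7-bit result as a binary string."""
    return format(ord(a) ^ ord(b), "07b")
```

For instance, the digit `0` has ASCII value 48, which needs only 6 bits (`110000`) and is padded to `0110000`, while the letter `a` (97) already occupies 7 bits (`1100001`).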

Here we impose the restriction that the user must enter a key of more than two characters, as the key-expansion algorithm cannot be applied to a one-character key.

The following block diagram (Fig. No. 01) shows how the key is generated.

In this way the key can be expanded according to the plain text message size (i.e. n).

Fig No.01 (Key-Expansion block diagram)

This block diagram shows the key generation method, i.e. XORing and ADDING to the previous key. After passing through the ADDER, the following possibilities may occur:

(i) The value may be less than 32

(ii) The value may be between 32 and 127

(iii) The value may exceed the key range, i.e. lie between 128 and 255

Mapping methods for these possibilities:

Step (i) If the value is less than 32 (a 5-bit number), add 32 to it to shift it into the range 32 to 127. For example, if the adder gives 25, the final value becomes 25 + 32 = 57; if it gives 31, the final value becomes 31 + 32 = 63. Both of these values are 6-bit binary numbers, so go to step (ii).

Step (ii) As we want our result between 32 and 127, no change is needed for a value in this range.

Step (iii) A resultant in the range 128 to 255 is an 8-bit number. These 8-bit values have to be mapped into the range 32 to 127 (the range of keyboard characters).

An 8-bit value, for example 243, is mapped by the following procedure:

First step: 243 - 32 = 211

Second step: 211 % 96 = 19 {127 - 32 + 1 = 96}

Third step: 19 + 32 = 51 {if the result is again a 6-bit number, go to step (ii)}
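The three mapping steps can be collected into one small function. The following Python sketch follows the arithmetic given above; the name `map_to_keyboard_range` is illustrative, and restricting the input to 0 to 255 is an assumption based on the three cases listed.

```python
KEY_LOW, KEY_HIGH = 32, 127          # range of ASCII keyboard characters
KEY_SPAN = KEY_HIGH - KEY_LOW + 1    # 96 values


def map_to_keyboard_range(value: int) -> int:
    """Map an adder output in 0..255 into the keyboard range 32..127."""
    if not 0 <= value <= 255:
        raise ValueError("adder output is assumed to lie in 0..255")
    if value < KEY_LOW:                # step (i): shift control values up by 32
        return value + KEY_LOW
    if value <= KEY_HIGH:              # step (ii): already a keyboard character
        return value
    # step (iii): fold an 8-bit value back into the 96 keyboard values
    return (value - KEY_LOW) % KEY_SPAN + KEY_LOW
```

The worked values from the text come out as expected: 25 maps to 57, 31 maps to 63, and 243 maps to 51.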

Let the plain text be

$$PM_0,\ PM_1,\ PM_2,\ PM_3,\ PM_4,\ \ldots,\ PM_{n-1}$$

where each $PM_i$ represents a character of the plain text and $i$ varies as $0 \le i \le n-1$.

Calculation of the number of blocks:

Given that the number of plain text characters is $n$ and the degree of the function, or size of a block, is $\theta$:

$$\text{No. of blocks} = n / \theta \quad \{\text{value is either an integer or a fraction}\}$$

If the value is an integer, it is taken directly as $x$. If it is a fraction, it is written as $x + w/\theta$, i.e. $n = x\theta + w$, in which the number of blocks is $x$ (applying the block cipher concept) and the remainder is $w$ (the number of characters left over).

For those $w$ characters we directly apply the stream cipher concept. The value of $w$ lies in the range $1 \le w \le \theta - 1$, because $w$ is the remainder when $n$ is divided by $\theta$.
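The block count and the remainder are simply the quotient and remainder of an integer division. A minimal Python sketch, with the hypothetical helper name `split_into_blocks`:

```python
def split_into_blocks(text: str, theta: int) -> tuple[list[str], str]:
    """Split `text` into x full blocks of size theta plus a remainder of w characters."""
    if theta <= 0:
        raise ValueError("block size theta must be positive")
    x, w = divmod(len(text), theta)   # n = x * theta + w, with 0 <= w < theta
    blocks = [text[i * theta:(i + 1) * theta] for i in range(x)]
    remainder = text[x * theta:]      # the last w characters (empty when w == 0)
    return blocks, remainder
```

With the earlier example, the 20-character message `sensitiveinformation` and the 6-character key `enigma` give $x = 3$ full blocks (`sensit`, `iveinf`, `ormati`) and a remainder of $w = 2$ characters (`on`).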

Design of Encryption Schema: In this way we obtain the next $\theta$ expanded key characters, and the algorithm is applied repeatedly until the key reaches the length of the plain text, i.e. $n$. Here $EK_{i-1}$ stands for the extended key character at the $i$th location:

$$EK_0,\ EK_1,\ EK_2,\ EK_3,\ \ldots,\ EK_{n-1}$$

For the purpose of explanation, our plain text is taken as follows:

$$PM_0,\ PM_1,\ PM_2,\ PM_3,\ \ldots,\ PM_{n-1}$$

where $PM_{i-1}$ stands for the plain text character at the $i$th location.

Most of the time the given message is much longer than the key, although the key can be as long as the message for encryption purposes.

Suppose that the character length of the data, i.e. the plain text, is $n$ and the character length of the key is $\theta$. We apply the block cipher concept by constructing blocks of characters (datagram/cryptogram [9]), each of which must contain $\theta$ characters.

The plain text message is divided into blocks:

Characters $PM_0$ to $PM_{\theta-1}$ form $PM_\theta(1)$, the first block of the plain message.

Characters $PM_\theta$ to $PM_{2\theta-1}$ form $PM_\theta(2)$, the second block of the plain message.

And so on, up to characters $PM_{(x-1)\theta}$ to $PM_{x\theta-1}$, which form $PM_\theta(x)$, where the number of blocks is $x$.

The key is divided into blocks in the same way:

Characters $EK_0$ to $EK_{\theta-1}$ form $EK_\theta(1)$, the first block of the extended key.

Characters $EK_\theta$ to $EK_{2\theta-1}$ form $EK_\theta(2)$, the second block of the extended key.

And so on, up to characters $EK_{(x-1)\theta}$ to $EK_{x\theta-1}$, which form $EK_\theta(x)$, where the number of blocks is $x$.

Each plain text block $PM_\theta(x)$ is encrypted using the key block $EK_\theta(x)$, which generates the cipher block $CM_\theta(x)$, as shown in Fig. No. 02; the algorithm is discussed thereafter.

Fig. No.02 (diagram of Block wise encryption)

Encryption algorithm details: Our algorithm is categorised as a substitution algorithm, meaning that each character of the plain text is substituted by the corresponding character of the extended Vigenere matrix. We have increased diffusion in our algorithm by taking all characters (ASCII values 0 to 255) to define the column size. For the substitution at the $i$th position, the plain text character $PM_{i-1}$ locates the column index and, in the same manner, the extended key value $EK_{i-1}$ locates the row index; the element of the extended Vigenere matrix at $[EK_{i-1}, PM_{i-1}]$ then replaces the plain text character with the cipher text character $CM_{i-1}$.

The extended Vigenere matrix (Fig. No. ) is given in the following manner:

[Figure: extended Vigenere matrix]

This process of substitution is done block by block. The remainder section of $w$ characters after the $x$ blocks is simply encrypted as a stream cipher.
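The matrix lookup can be sketched in Python. Since the matrix figure is not reproduced here, the layout below is an assumption: a 256 x 256 table in which row `r` is the identity row cyclically shifted by `r`, which is the natural extension of the classical 26 x 26 Vigenere square. The names `MATRIX` and `encrypt_with_matrix` are illustrative.

```python
SIZE = 256  # all 8-bit character values, 0 to 255

# Assumed layout: entry [row][col] = (row + col) mod 256, so row 0 is the identity row.
MATRIX = [[(row + col) % SIZE for col in range(SIZE)] for row in range(SIZE)]


def encrypt_with_matrix(plain: bytes, extended_key: bytes) -> bytes:
    """Substitute each plain byte by MATRIX[key byte][plain byte]."""
    if len(extended_key) < len(plain):
        raise ValueError("the extended key must cover the whole plain text")
    # zip stops at the end of the plain text, so block and remainder
    # characters are handled by the same per-character lookup.
    return bytes(MATRIX[k][p] for p, k in zip(plain, extended_key))
```

Because every plain character is replaced by exactly one cipher character, the cipher text has the same length as the plain text, in line with the requirement stated in the introduction.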

Design of Decryption Schema: Decryption is performed using the same extended key. The decryption algorithm uses the same extended key generation algorithm that was used at the time of encryption, because this is a symmetric algorithm and the key must be the same in both phases. So we have the encrypted Cipher Message (CM) block wise:

$$CM_0,\ CM_1,\ CM_2,\ CM_3,\ \ldots,\ CM_{n-1}$$

Here each $CM_{i-1}$ represents the $i$th character of the cipher message.

After expanding the key, we have the Extended Key (EK):

$$EK_0,\ EK_1,\ EK_2,\ EK_3,\ \ldots,\ EK_{n-1}$$

where $EK_{i-1}$ is the $i$th character of the extended key.

Characters $CM_0$ to $CM_{\theta-1}$ form $CM_\theta(1)$, the first block of the cipher message.

Characters $CM_\theta$ to $CM_{2\theta-1}$ form $CM_\theta(2)$, the second block of the cipher message.

And so on, up to characters $CM_{(x-1)\theta}$ to $CM_{x\theta-1}$, which form $CM_\theta(x)$, where the number of blocks is $x$.

Decryption algorithm details: To decrypt, we first divide the cipher message into blocks of the same size as at the time of encryption; if there is a remainder for cipher length $n$, we apply the stream cipher concept to decrypt the remaining message, exactly as was done at the time of encryption. Fig. No. 03 shows the decryption process.

Fig No.03 (Block Diagram for decryption block wise)

So we get the same number of blocks $x$ that was calculated at the time of encryption, and if $n/\theta$ is not an integer we get a remainder of $w$ characters, which are deciphered by the stream cipher method. For the decryption of each block, the $i$th extended key character $EK_{i-1}$ selects the row of the extended Vigenere matrix; we then find the column of that row in which the cipher value $CM_{i-1}$ occurs, and the plain text character of the $i$th position is the entry in the first row of the matrix in that column.
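The row search described here can be sketched in Python. As the matrix figure is not available, the sketch assumes a 256 x 256 cyclic-shift matrix whose first row is the identity row; `MATRIX` and `decrypt_with_matrix` are illustrative names.

```python
SIZE = 256  # all 8-bit character values, 0 to 255

# Assumed layout: entry [row][col] = (row + col) mod 256, so row 0 is the identity row.
MATRIX = [[(row + col) % SIZE for col in range(SIZE)] for row in range(SIZE)]


def decrypt_with_matrix(cipher: bytes, extended_key: bytes) -> bytes:
    """Recover each plain byte by searching the key's row for the cipher byte."""
    if len(extended_key) < len(cipher):
        raise ValueError("the extended key must cover the whole cipher text")
    plain = bytearray()
    for c, k in zip(cipher, extended_key):
        column = MATRIX[k].index(c)      # column of row k that holds the cipher value
        plain.append(MATRIX[0][column])  # plain character sits in the first row
    return bytes(plain)
```

Under this assumed layout the search is equivalent to computing `(c - k) % 256` directly, which avoids the linear scan of each row; the explicit lookup is kept here only to mirror the description.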

## Implementation

The implementation of the proposed work is done in three phases: key generation, encryption and decryption. The following sections describe the working of the different modules:

Key Generation Algorithm

In the first step, the plain text is divided into blocks. Each block is equal in size to the key, and one key is used for one block. For the next block, the key is expanded by performing XOR between the key characters and adding the resulting value to each key character. This gives a new key, which is used for the next block. The algorithm used for this is shown in Fig No. 04.

Fig No. 04 (Key Expansion Algorithm)
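Since the figure is not reproduced here, the following Python sketch gives one possible reading of the key generation step, under stated assumptions: the XOR is taken over all characters of the current block key, the result is added to each key character, and the sum is folded back into the keyboard range 32 to 127 using the mapping steps described earlier. The names `map_to_keyboard_range`, `next_block_key` and `expand_key` are hypothetical.

```python
from functools import reduce
from operator import xor


def map_to_keyboard_range(value: int) -> int:
    """Fold a value into the keyboard range 32..127 (mapping steps i to iii)."""
    if value < 32:
        return value + 32
    if value <= 127:
        return value
    return (value - 32) % 96 + 32


def next_block_key(key: str) -> str:
    """Derive the key for the next block: XOR the characters, add to each, re-map."""
    mixed = reduce(xor, (ord(c) for c in key))  # assumption: XOR over the whole key
    return "".join(chr(map_to_keyboard_range(ord(c) + mixed)) for c in key)


def expand_key(user_key: str, n: int) -> str:
    """Concatenate successive block keys until n characters are available."""
    if len(user_key) < 2:
        raise ValueError("a one-character key cannot be expanded")
    parts, total, block_key = [], 0, user_key
    while total < n:
        parts.append(block_key)
        total += len(block_key)
        block_key = next_block_key(block_key)
    return "".join(parts)[:n]
```

Under this reading, the key `enigma` has an XOR value of 9, so the second block key is `nwrpvj`. One consequence of the assumption is that a key whose characters XOR to zero (for example two identical characters) would never change, so a real implementation would need to handle that case.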

Encryption and decryption algorithm

For encryption, the plain text is divided into blocks of length $\theta$, such as 1st block $PM_\theta(1)$: $PM_0, PM_1, PM_2, \ldots, PM_{\theta-1}$; 2nd block $PM_\theta(2)$: $PM_\theta, PM_{\theta+1}, PM_{\theta+2}, \ldots, PM_{2\theta-1}$; etc. Using the key generation algorithm we get a new key for each subsequent block. There is a matrix in which the cipher value corresponding to each key and plain text character is stored. The implemented encryption algorithm is shown in Fig No. 05:

Fig No.05 (Encryption algorithm)

Decryption is just the reverse of the encryption process. The key for each block is taken from the key generation function. The cipher text and the generated key are passed to the decryption function to obtain the decrypted text. The implemented decryption algorithm is shown in Fig No. 06.

Fig No.06 (Decryption Algorithm)
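Putting the three phases together, an end-to-end round trip might look like the following Python sketch. It is not the authors' VB.NET implementation: it assumes the per-block key is derived by XORing the key characters and adding the result to each one, and it replaces the matrix lookup by the equivalent arithmetic `(p + k) % 256`, which holds only if the matrix is a cyclic-shift table. All function names are hypothetical.

```python
from functools import reduce
from operator import xor


def next_block_key(key: str) -> str:
    """Assumed key step: XOR all key characters, add to each, fold into 32..127."""
    mixed = reduce(xor, (ord(c) for c in key))

    def fold(value: int) -> int:
        if value < 32:
            return value + 32
        return value if value <= 127 else (value - 32) % 96 + 32

    return "".join(chr(fold(ord(c) + mixed)) for c in key)


def _transform(data: bytes, user_key: str, sign: int) -> bytes:
    """Process data block by block, changing the key after every block."""
    theta = len(user_key)              # block size equals the key length
    key, out = user_key, bytearray()
    for start in range(0, len(data), theta):
        block = data[start:start + theta]  # the last block may hold only w characters
        out.extend((b + sign * ord(k)) % 256 for b, k in zip(block, key))
        key = next_block_key(key)
    return bytes(out)


def encrypt(plain: bytes, user_key: str) -> bytes:
    return _transform(plain, user_key, +1)


def decrypt(cipher: bytes, user_key: str) -> bytes:
    return _transform(cipher, user_key, -1)


secret = b"Account no: 1234-5678, DOB 01/02/1990"
restored = decrypt(encrypt(secret, "en!gm4"), "en!gm4")
```

The round trip returns the original bytes, and the mixed content (letters, digits, punctuation) shows the point of extending the alphabet to all 256 character values.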

The algorithm is implemented in a Shared Storage application built in Microsoft Visual Studio 2005 (using VB.NET) for the front end, with Microsoft SQL Server 2005 as the back end. A user can browse for a document file in text format (.txt, .doc or .docx), and the requested area of the document is then secured. The snapshot in Fig. No. 07 shows the application.

Fig No.07

## Conclusion

Due to rapid development in the field of computing, information security has become a very challenging task. The classical Vigenere method has many limitations, which this work addresses by using the key expansion method.

This paper proposes a new method for encrypting information in a shared environment, which is an extension of the classical Vigenere algorithm. To a computer the data may be just zeros and ones, but its importance is known only to the expert. Our algorithm also meets Syverson's principle, which guarantees integrity, confidentiality and authentication.