Encryption can provide good security for stored data, but several factors must be taken into consideration while developing an encryption strategy. An optimum balance must be maintained between the security required and the performance needed. Encryption at the database level, rather than at the file or application level, can protect data while also achieving better performance. There are various architectures for this, which fall into two categories:
Decrease Encryption overhead
Limiting number of Encryption operations
Security and performance are also complex issues and should be handled by an expert who has thorough knowledge of the domain. If sensitive data is not handled carefully, an organization may face legal, regulatory, legislative or brand consequences. Depending solely on database access control mechanisms and perimeter security is therefore not adequate. A packaged database solution is another option if cryptography expertise is not available in-house.
This paper focuses on the performance aspects of three major topologies for database encryption.
Various architectures and tools are available that help to manage and ensure an optimum balance between performance and security. Each of these has its own advantages and disadvantages. Database security involves concepts such as statistical database security and intrusion detection. These protect the privacy and confidentiality of the information but by no means address performance. In this report we address one of the most critical issues for successful database encryption: performance. To do so, the different topologies are analyzed first, and then the performance parameters of each are taken into consideration.
DIMENSIONS TO ENCRYPTION SUPPORT
When considering encryption of databases, three dimensions need to be considered.
Granularity of data: The options available are a field, a row or a page. The field is by far the best choice, as it minimizes the number of bytes that are encrypted, but it requires a way to embed encryption within the database or database server. Only a few available databases support row or page level encryption in a way that is worth the associated cost for large data.
Software versus hardware level implementation of algorithms: This choice also has a substantial effect on the performance of the algorithm. Encrypting relational databases at the hardware level involves a significant startup cost for each encryption operation.
Location of the encryption service: The service can be a local service, a remote procedure service or a network attached service. This choice is important, as it affects not only the integration work to be done but also the security model.
Generally it is better to encrypt data as soon as it enters the network, but sometimes this is not advisable because of the distributed business logic of applications and environments. Encryption in the DBMS is very good at protecting data at rest, but data transferred between the database and the application must also be protected for better security.
In database level encryption, data is encrypted as it is written to the database and decrypted as it is read from it. This is generally done at column level in a table. Implemented together with database security and access control mechanisms, it can prevent theft of important and critical data. Database level encryption also protects against threats such as storage media theft, database attacks, storage attacks and malicious database administrators.
Database level encryption does not require any changes to the application, and it also makes it easy to use stored procedures and triggers to implement business logic in the DBMS.
As the encryption is done at database level, the enterprise using it need not be concerned with how the application accesses the encrypted data. Although this solution secures data, various integration activities are needed at database level, including modifications to the schemas currently in use and the implementation of triggers and stored procedures that add encryption and decryption functionality.
Attention also needs to be paid to performance. To improve performance, accelerated index search can be implemented on encrypted data. There are two further alternatives. First, only the sensitive fields can be encrypted. Second, hardware can be used for the encryption, offloading the cryptographic processing load.
The major disadvantage of this type of implementation is that it does not protect against application level attacks, because the encryption is implemented inside the DBMS.
In storage level encryption, data is encrypted in the storage subsystem, either at file level (NAS/DAS) or at block level (SAN). This method is best suited to encrypting files, storage blocks, directories and tape media.
Storage level encryption helps to secure data without the use of LUN (Logical Unit Number) masking, which is helpful in large storage environments.
Although it can be useful for segmenting workgroups and providing security, it has several limitations. First, it protects only against a few threats, such as media theft and storage system attacks, and not against most application and database level attacks, which are direct threats to critical data. Second, it provides only block level encryption and does not allow selective encryption: the entire database can be encrypted, but not specific information within it.
CHOOSING THE TOPOLOGY FOR ENCRYPTION
Encryption can be approached in various ways, for example at software or hardware level and at different granularities. Each approach has advantages and shortcomings that make it suitable for some applications and unsuitable for others.
Basic software level encryption
Various algorithms are available for software encryption. Prominent among them are AES, RSA, Blowfish and DES. Various experiments indicate that AES is better in terms of performance and security than RSA and Blowfish implementations. AES is also faster than algorithms like DES. DES is slower; it is a 64-bit block cipher, which adds overhead because even an 8-bit value is expanded to 64 bits when encrypted. AES first needs to be registered in the database as a user defined function (UDF). After registration, it can be used to encrypt data in one or more fields. Whenever data is inserted into the database, it is first encrypted and then stored. When secure read access is needed, the data is decrypted before being used.
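The register-then-use flow described above can be sketched with SQLite user defined functions from Python's standard library. This is a minimal illustration under stated assumptions, not the setup used in the experiments: the names enc, dec, customer and KEY are hypothetical, and because the standard library has no AES, a toy keyed XOR stream stands in for the cipher. The stand-in is not secure; it only shows where the UDF sits in the insert and read paths.

```python
import hashlib
import os
import sqlite3

KEY = b"demo-key-not-for-production"  # hypothetical key; real keys belong in a key store


def _keystream(nonce: bytes, length: int) -> bytes:
    """Derive `length` bytes from KEY and a nonce (toy construction, NOT AES)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(KEY + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(plaintext: str) -> bytes:
    """Stand-in for an AES encryption UDF: returns nonce + XOR-masked bytes."""
    data = plaintext.encode("utf-8")
    nonce = os.urandom(8)
    stream = _keystream(nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def toy_decrypt(blob: bytes) -> str:
    """Stand-in for the matching decryption UDF."""
    nonce, body = blob[:8], blob[8:]
    stream = _keystream(nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


conn = sqlite3.connect(":memory:")
# Register the functions once; afterwards they can be called from SQL.
conn.create_function("enc", 1, toy_encrypt)
conn.create_function("dec", 1, toy_decrypt)

conn.execute("CREATE TABLE customer (name TEXT, ssn BLOB)")
# The sensitive field is encrypted on the way in ...
conn.execute("INSERT INTO customer VALUES (?, enc(?))", ("Alice", "123-45-6789"))

# ... is stored as an opaque blob ...
stored = conn.execute("SELECT ssn FROM customer").fetchone()[0]
# ... and is decrypted only when read access is needed.
readable = conn.execute("SELECT dec(ssn) FROM customer").fetchone()[0]
print(len(stored), readable)
```

Only the ssn column passes through the UDF; the name column is stored as it is, which keeps the number of cryptographic operations down.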
Basic hardware level encryption
In this approach, Hardware Security Modules (HSM) are generally used with a combination of hardware and software keys. The master key is created, encrypted and decrypted on the HSM and is not accessible outside it. The processing cost consists of a startup cost, which is paid each time a row is processed, and the cost of the encryption or decryption algorithm, which depends on the size of the input. An edit routine is called each time a row is accessed by the DBMS. The edit routine invokes the encryption or decryption algorithm, implemented in hardware, for the entire row. As discussed earlier, column level protection is not available in this approach, which affects overall security.
A query on completely or partly encrypted data always takes longer than the same query on unencrypted data, because extra time is needed for decryption and for the routine or hardware invocations. This increase in time is termed the encryption penalty.
Generally, not all fields in a database have the same sensitivity. In the hybrid approach it is also possible to encrypt only selected fields in selected tables. Encryption will always slow processing down, but with some measures the overhead can be minimized. Sensitive information such as Social Security numbers, credit card information and medical records should be encrypted, but Booleans, other non-sensitive information and small value sets such as the integers 1 through 10 are better left unencrypted, as encrypting them adds considerable overhead. Creating indexes on encrypted fields can also be useful in some cases. Encrypted data is generally stored in binary form, so searching it requires range scanning. This in turn requires table scans that decrypt every row of the given column, which is not advisable for sensitive data. Using an accelerated search index is therefore the best practice.
PERFORMANCE OF DIFFERENT ENCRYPTION TOPOLOGIES
There are basically three topologies:
Network attached Encryption Device (NAED)
Pure software
Combination of Hardware Security Module (HSM) and Software (hybrid)
Each has its own advantages and disadvantages. Adding only centralized security measures and encryption is not enough, since it not only penalizes the system but also opens security holes. Aspects of database security include intrusion detection, statistical database security and privacy preserving data mining, as well as designing information systems that preserve the confidentiality of data without obstructing the movement of information.
Users can thus access the data from any application without the overhead of different solutions for different applications and data storage systems. This removes the problem of maintaining and administering security software at the level of each specific application or database.
There are many technological issues associated with database privacy as a part of enterprise IT infrastructure. The most important among them is encryption key management. For most organizations, data is a valuable asset, so key management should be sophisticated enough to protect the distributed use of encryption keys. A combination of hardware and software based encryption is one possible solution to this problem. A distributed policy and audit capability can be used to manage the various encryption keys. Encryption also increases the overhead of interactions between databases and the IT infrastructure, which reduces performance; the source of this overhead must be identified and acted upon.
Network Attached Encryption
A Network Attached Encryption Device (NAED) is a hardware device deployed on the network that stores the encryption keys and performs all cryptographic operations. This provides additional security by separating the keys from the data, but it can degrade performance by a factor of 10 to 100 because of the overhead of the encryption operations. To understand the overhead, consider a user request for 500,000 encrypted rows. Whenever a user requests particular data, the security system authenticates the user, fetches the data from the database and then decrypts it. In this topology, the request for and retrieval of encrypted data is handled by an encryption agent. The encrypted data is sent to the NAED, where the keys and algorithms needed to decrypt it are stored. After decryption, the clear text data has to be sent back to the database server over the network, so it must be protected by a secure communication mechanism such as SSL. Once the data is received on the database server, it must be returned to clear text and then presented to the application and, in turn, to the user who requested it.
In the NAED topology, cryptographic operations occur at three places. In the example above, 500,000 rows are sent over the network to the NAED for decryption; the NAED then encrypts the clear text using SSL and sends it to the database server, which decrypts the SSL traffic and provides the data in clear text to the application.
Transferring 500,000 rows over the network to and from the NAED causes a lot of network overhead.
The NAED is a stateless device, i.e. it needs to be initialized every time a row is to be decrypted; in this example it has to be set up 500,000 times.
Because of this overhead, other alternatives seem to be better. It has been observed that processing takes roughly 1 millisecond per row. In this example the NAED therefore takes about 500 seconds, while other alternatives tend to take around 25 seconds. It is generally presumed that a NAED offloads work from the database, but in practice the SSL decryption has to be performed on the database server itself, which is comparable to the decryption work in the other alternatives.
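The arithmetic behind these figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch that uses the numbers quoted above (500,000 rows, roughly 1 millisecond per row for the NAED, around 25 seconds for the alternatives); the variable names are illustrative.

```python
ROWS = 500_000              # rows in the example request
NAED_MS_PER_ROW = 1.0       # per-row processing time quoted for the NAED
ALTERNATIVE_TOTAL_S = 25.0  # total time quoted for the other alternatives

# Total NAED time: one setup-plus-decrypt cycle per row.
naed_total_s = ROWS * NAED_MS_PER_ROW / 1000.0

# Per-row cost implied by the 25 second figure.
alternative_ms_per_row = ALTERNATIVE_TOTAL_S * 1000.0 / ROWS

# How many times slower the NAED is in this example.
slowdown = naed_total_s / ALTERNATIVE_TOTAL_S

print(f"NAED total: {naed_total_s:.0f} s")
print(f"Alternative per-row cost: {alternative_ms_per_row:.2f} ms")
print(f"Slowdown factor: {slowdown:.0f}x")
```

The resulting factor of 20 lies inside the range of 10 to 100 given for this topology.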
The Hybrid System
This topology combines software, which provides enhanced performance, with hardware, which is responsible for better security. In some cases an HSM is the best way to add protection for the encryption keys, the most important elements of any security solution. HSM devices are fast and tamper proof, so they are an ideal place to store the encryption keys.
The performance of the hybrid topology is almost the same as that of the software solution, with only sporadic transfers of data to the HSM to refresh and retrieve the master encryption keys. During most of the working time, performance is therefore similar to that of a pure software solution.
Consider the earlier 500,000 row example for this topology. In the NAED case, all 500,000 rows are transferred over the network to the NAED and then transferred back. The hybrid system, by contrast, is implemented as a distributed system that scales with the available number of processors and servers. In the pure software topology, the database server itself acts as the encryption/decryption agent. When an application requests encrypted information, the needed data is decrypted locally at the database server and clear text data is returned to the requesting application. Here, transport encryption such as SSL is not needed, and the setup and running time overhead of the hardware is eliminated. As the entire decryption process is handled at the database server, processor, network and setup overhead are eliminated, which increases performance. The request now takes around 25 seconds at 0.05 ms per row, i.e. on the order of 3,000 to 32,000 rows decrypted per second, depending on column and table level encryption and on caching.
Performance in this topology increases linearly with the number of database servers. It has been observed that a system with twelve database servers decrypted 2,100,000 rows per second.
Additional tuning to limit the number of crypto operations
Any cryptographic operation adds overhead, and the total depends on the number of operations performed. The number of operations can be reduced by various techniques for which mature solutions exist. These techniques can reduce the overhead significantly, especially when a large number of operations is involved.
Search on encrypted data without first decrypting to clear text
Techniques have been developed that allow searching on the encrypted data itself, without decrypting it first. Only the result obtained is then converted to clear text. Many vendors do not provide this functionality, which results in a large number of cryptographic operations and reduces performance to a great extent. Search can be accelerated further by indexing the encrypted column itself, which reduces response time by a factor of 10 to 30 in some databases, such as Oracle, and increases throughput significantly compared to solutions that do not use indexing.
Exact-match searching is also possible if the encrypted column uses the same initialization vector for the entire column. Partial matches can be tricky and may need full table scans unless they are tuned by accelerated index search on the data. An encrypted column can be a primary key or part of one, because encryption always gives the same result for the same data, and two different values always produce different cipher text, provided that the key and initialization vector are used consistently. Note that if an entire column is to be encrypted as part of a data migration, existing primary keys and referencing keys may have to be dropped and rebuilt after the encryption. Encrypting a column that is part of a primary key is therefore generally not advised unless accelerated index search is used. Migration must also be done carefully: if a table is to be converted to hold encrypted data, all tables that it references through constraints must be converted first. Sometimes these referential constraints have to be disabled for the conversion to proceed and can be re-enabled once the data in all related tables has been encrypted. Because of this complexity, encrypting a column that is part of a foreign key is generally not recommended. Unlike encrypted indexes and primary keys, however, encrypted foreign keys do not have a significant impact on performance.
Indexes are created to speed up the search for a particular record in the database. An index must be encrypted with care, as it may degrade search performance in a large database: each lookup incurs the additional overhead of decrypting the index field, especially if accelerated database indexing is not used. DBAs generally do not like this trade-off, as indexes are usually built on fields such as SSNs or credit card numbers, which are exactly the fields that need to be encrypted.
It is therefore generally advised to avoid encrypting an indexed column unless accelerated index search on encrypted data is used.
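The exact-match case can be illustrated with a small Python and SQLite sketch. It assumes a hypothetical customer table and a single IV shared by the whole column, and it uses a toy keyed XOR stream in place of a real cipher, since Python's standard library has no AES. The stand-in is not secure, and a fixed IV is especially unsafe for a stream construction like this one; the only property it is meant to show is that equal plaintexts give equal ciphertexts, so the search term can be encrypted and compared against an ordinary index on the encrypted column.

```python
import hashlib
import sqlite3

KEY = b"demo-key-not-for-production"  # hypothetical key
COLUMN_IV = b"ssn-column-iv"          # hypothetical IV shared by the whole column


def toy_encrypt(value: str) -> bytes:
    """Deterministic stand-in cipher (NOT AES): same input gives same output."""
    data = value.encode("utf-8")
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(KEY + COLUMN_IV + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, ssn_enc BLOB)")
# An ordinary index on the ciphertext column supports equality lookups.
conn.execute("CREATE INDEX idx_ssn_enc ON customer (ssn_enc)")

rows = [("Alice", "123-45-6789"), ("Bob", "987-65-4321"), ("Carol", "555-12-3456")]
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(name, toy_encrypt(ssn)) for name, ssn in rows],
)


def find_by_ssn(ssn: str) -> list:
    """Encrypt the search term and compare ciphertexts; no stored row is decrypted."""
    cur = conn.execute(
        "SELECT name FROM customer WHERE ssn_enc = ?", (toy_encrypt(ssn),)
    )
    return [row[0] for row in cur.fetchall()]


match = find_by_ssn("987-65-4321")
miss = find_by_ssn("000-00-0000")
print(match, miss)
```

A partial match such as a prefix search has no equivalent shortcut in this sketch, which is consistent with the remark above that partial matches may fall back to full table scans.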
Limit the types and amount of data (e.g. column level) that are encrypted
It is important to identify which data needs protection and which does not and can be stored in clear text. In many organizations, more data is encrypted than is needed. Prima facie this may seem harmless, but it will surely degrade performance. It is generally advised to implement encryption at column level, in support of the organization's security policy.
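The effect of encrypting more than is needed can be estimated with a short calculation. The row layout, column widths and row count below are assumptions made up for illustration; the sketch only compares the volume of data passed through the cipher under row level and column level policies.

```python
# Hypothetical row layout: column name -> width in bytes.
ROW_LAYOUT = {"name": 40, "ssn": 11, "card_number": 16, "is_active": 1, "rating": 1}
# Columns that the (assumed) security policy marks as sensitive.
SENSITIVE = {"ssn", "card_number"}
ROWS = 500_000

# Row level encryption touches every byte of every row.
row_level_bytes = ROWS * sum(ROW_LAYOUT.values())

# Column level encryption touches only the sensitive columns.
column_level_bytes = ROWS * sum(
    width for column, width in ROW_LAYOUT.items() if column in SENSITIVE
)

fraction = column_level_bytes / row_level_bytes
print(f"row level:    {row_level_bytes:,} bytes encrypted")
print(f"column level: {column_level_bytes:,} bytes encrypted")
print(f"column level encrypts {fraction:.0%} of the row level volume")
```

With this assumed layout, the column level policy encrypts well under half of the data while still covering every sensitive field.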
Perform operations without decrypting the data
As with search, other advanced operations such as joins can be performed directly on the encrypted data, without converting it to clear text, which maintains system performance.
Query rewrite to improve encryption overhead
Common sub-expression elimination (CSE) can be applied to expensive user defined functions in a query. Common sub-expression detection and elimination are compiler level optimization techniques. An occurrence of an expression is a common sub-expression (CS) if there is another occurrence of the expression whose evaluation always precedes this one in execution order, and if the operands of the expression remain unchanged between the two evaluations.
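The saving can be made concrete with a small Python sketch. It models a query in which a hypothetical decrypt UDF appears three times with the same operand, and compares it with a rewritten form that evaluates the call once per row. The decrypt function here merely reverses bytes and counts its invocations; it stands in for an expensive decryption routine and is not a real cipher.

```python
calls = {"decrypt": 0}


def decrypt(blob: bytes) -> str:
    """Hypothetical expensive UDF; it only reverses the bytes and counts calls."""
    calls["decrypt"] += 1
    return blob[::-1].decode("utf-8")


# "Encrypted" column values (reversed bytes, for illustration only).
rows = [s.encode("utf-8")[::-1] for s in ("123-45-6789", "987-65-4321", "555-12-3456")]


def naive(encrypted_rows):
    """decrypt(r) occurs three times although its operand never changes."""
    out = []
    for r in encrypted_rows:
        starts = decrypt(r).startswith("9")  # first occurrence
        ends = decrypt(r).endswith("1")      # common sub-expression
        if starts and ends:
            out.append(decrypt(r))           # common sub-expression
    return out


def rewritten(encrypted_rows):
    """After CSE: the expression is evaluated once per row and reused."""
    out = []
    for r in encrypted_rows:
        ssn = decrypt(r)
        if ssn.startswith("9") and ssn.endswith("1"):
            out.append(ssn)
    return out


calls["decrypt"] = 0
naive_result = naive(rows)
naive_calls = calls["decrypt"]

calls["decrypt"] = 0
rewritten_result = rewritten(rows)
rewritten_calls = calls["decrypt"]

print(naive_calls, rewritten_calls)
```

Both forms return the same rows, but the rewritten one needs one decryption per row instead of two or three.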
ENCRYPTION KEY MANAGEMENT
Key management is an important part of any security solution involving keys. It includes the generation and maintenance of keys throughout their life. As keys are used for encryption as well as decryption of data, protecting the keys is as important as protecting the data. Security depends on two major factors:
Where the keys are stored and
Who has rights to access them
Keys must therefore be generated and managed securely. This can be done by centralizing the task of key management, which provides not only greater efficiency but also reduced operational costs. Any security solution must include automated mechanisms for key rotation, replication and backup. To balance security, cost and performance, the solution deployed must combine software with specialized cryptographic hardware such as an HSM. Storing all the keys in a single database with restricted access may seem a good option, but it is generally not advised, as anybody who has access to the keys can thereby access the data.
Multilevel security is one possible solution: at the higher level there is a tamper proof hardware device, and at the lower level there is software security, depending on the sensitivity of the data. This combines the best features of hardware and software based security solutions. Software solutions perform well when encrypting small blocks but may expose the keys, while hardware solutions perform less well but tend not to expose the keys stored in them.
Secure encryption of short blocks
When CBC (Cipher Block Chaining) is used as the block encryption mode, a random initialization vector (IV) is used for encryption and decryption. The IV need not be stored encrypted, but it is needed in order to decrypt the data. Sometimes one IV is associated with each column, and in that case storing it in a separate table is advisable. A more secure solution is to generate an IV for each row and store it with the data itself. When multiple columns are encrypted and space is limited, the same IV can be used for each value in a row, although the encryption keys may differ between columns.
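The role of the IV can be seen in a minimal CBC sketch. Python's standard library has no AES, so the block cipher below is a toy four-round Feistel construction built on SHA-256; it is an assumption made for illustration, is not secure, and should be replaced by a vetted AES implementation in practice. The chaining logic, the padding, and the way a per-row IV is stored in front of the ciphertext are the points of interest, and all function names are hypothetical.

```python
import hashlib
import os

BLOCK = 16
KEY = b"demo-key-not-for-production"  # hypothetical key


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(half: bytes, rnd: int) -> bytes:
    """Feistel round function: 8 bytes derived from key, round number and input."""
    return hashlib.sha256(KEY + bytes([rnd]) + half).digest()[: BLOCK // 2]


def toy_block_encrypt(block: bytes) -> bytes:
    """Toy 16-byte block cipher (stand-in for AES, NOT secure)."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _round(right, rnd))
    return left + right


def toy_block_decrypt(block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round(left, rnd)), left
    return left + right


def cbc_encrypt(plaintext: bytes, iv: bytes) -> bytes:
    pad = BLOCK - len(plaintext) % BLOCK          # PKCS#7-style padding
    data = plaintext + bytes([pad]) * pad
    prev, out = iv, b""
    for i in range(0, len(data), BLOCK):
        # Each plaintext block is XORed with the previous ciphertext block (or the IV).
        prev = toy_block_encrypt(_xor(data[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(ciphertext: bytes, iv: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += _xor(toy_block_decrypt(block), prev)
        prev = block
    return out[:-out[-1]]                         # strip the padding


def encrypt_row_value(value: str) -> bytes:
    """Per-row IV: generated fresh and stored in clear in front of the ciphertext."""
    iv = os.urandom(BLOCK)
    return iv + cbc_encrypt(value.encode("utf-8"), iv)


def decrypt_row_value(stored: bytes) -> str:
    iv, body = stored[:BLOCK], stored[BLOCK:]
    return cbc_decrypt(body, iv).decode("utf-8")


card = "4111-1111-1111-1111"
row_a = encrypt_row_value(card)
row_b = encrypt_row_value(card)
column_iv = bytes(BLOCK)  # hypothetical shared column IV
same_iv_equal = cbc_encrypt(card.encode(), column_iv) == cbc_encrypt(card.encode(), column_iv)
print(row_a != row_b, same_iv_equal, decrypt_row_value(row_a))
```

With a per-row IV, two rows holding the same value produce different stored bytes, whereas a shared column IV produces identical ciphertexts for identical values. That is the trade-off between the more secure per-row option and the searchable per-column option.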
Controlling the use of encryption keys
There are numerous methods available for authentication, such as PKI, password based authentication, host based authentication and third party authentication like Kerberos. Most solutions use a password as the method of authentication. This password can be changed by the associated user as well as by the DBA. The DBA may change a user's password in response to a request from the user. In some cases the DBA can also temporarily disable a user or alter the password without detection, as the DBA has the necessary rights.
A separated security policy
The Data Directory consists of various views and catalogs. The contents of these catalogs are updated by the database server through system commands and are generally not meant to be altered manually by any user, even the DBA. The DBA is nevertheless able to alter them. To prevent such changes, a security catalog can be implemented with the following properties:
It cannot be manually altered and
Access to it is controlled by strict access and authentication control policies.
Giving entire control of security to the DBA can be dangerous: if the DBA is compromised, the entire system is compromised. To avoid this, each user is given a certain level of control over security. This access control is implemented through the concept of privileges. Views and stored procedures can be used to give users this limited access. The DBA is generally responsible for keeping the entire system running efficiently and also has the capacity to damage the system the most.
Separating the security directory is therefore one possible approach, in which a security administrator (SA) assigns a particular set of permissions to the users. In an enterprise solution, the SA operates through separate middleware called the Access Control System, which works closely with the DBMS and is responsible for verification, authentication, audit, encryption and decryption. This kind of solution separates the DBA from the SA. For example, when database services are outsourced, the DBA would be an external party while the SA would be kept within the company. The DBA in this case would not be able to access the decrypted content but could retain privileges for the routine tasks associated with the database.
We started by analyzing the performance issue and evaluated various solutions in order to improve performance. As we have seen, the hybrid solution is by far the best. Deploying a hybrid solution poses challenges such as additional overhead for searching and for the management of IT infrastructure components. But these overheads are unavoidable when implementing encryption, and they are tolerable.
Implementing this kind of model also requires sophisticated hardware and software solutions, as well as professionals skilled in key management and related tasks.
Database privacy as an infrastructure service can thus be a good model, with the potential to succeed in most applications.