The Different Types Of Normalization Computer Science Essay


Definition. Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.

The Normal Forms

The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) upwards; each higher normal form builds on the requirements of the forms below it.

Types of normalization

First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:

Eliminate duplicative columns from the same table.

Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept of removing duplicative data:

Meet all the requirements of the first normal form.

Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

Create relationships between these new tables and their predecessors through the use of foreign keys.

Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:

Meet all the requirements of the second normal form.

Remove columns that are not dependent upon the primary key (that is, eliminate transitive dependencies).
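The normal forms above can be illustrated with a small sketch. The schema below (a customer/orders pair, with invented table and column names) shows the end result of normalizing a flat table: each fact is stored exactly once, and rows in one table reference rows in the other.

```python
import sqlite3

# Hypothetical example: a flat "orders" table normalized into separate
# customer and order tables (names are illustrative, not from a real schema).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized design: customer data lives in one place only.
cur.execute("""CREATE TABLE customer (
    cust_id INTEGER PRIMARY KEY,   -- unique key identifying each row (1NF)
    name    TEXT NOT NULL,
    city    TEXT NOT NULL          -- depends only on the customer, not the order
)""")
cur.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    cust_id  INTEGER NOT NULL REFERENCES customer(cust_id),  -- foreign key (2NF)
    item     TEXT NOT NULL
)""")

cur.execute("INSERT INTO customer VALUES (1, 'Smith', 'Bristol')")
cur.execute("INSERT INTO orders VALUES (10, 1, 'chess set')")
cur.execute("INSERT INTO orders VALUES (11, 1, 'clock')")

# The customer's city is stored once, however many orders they place.
row = cur.execute("""SELECT name, city, COUNT(*)
                     FROM customer JOIN orders USING (cust_id)
                     GROUP BY cust_id""").fetchone()
print(row)  # ('Smith', 'Bristol', 2)
```

If the city were stored on every order row instead, updating it would mean touching every order, which is exactly the redundancy normalization removes.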

Benefits of Normalization

Normalization provides numerous benefits to a database. Some of the major benefits include the following:

Greater overall database organization

Reduction of redundant data

Data consistency within the database

A much more flexible database design

A better handle on database security

SELECT lastname, bcfgrade
FROM club, player
WHERE club.clubid = player.clubid
AND name = 'Horfield & Redland'



SELECT name, CONCAT(contact, ' ', lastname) AS contact_lastname, CONCAT(contact, ' ', firstname) AS contact_firstname, noclocks
FROM player, club
WHERE player.clubid = club.clubid


SELECT lastname, bcfgrade, bcfgrade * 5 AS eolgrade
FROM club, player
WHERE club.clubid = player.clubid
AND name = 'bristol & clifton'
AND bcfgrade < 216
ORDER BY eolgrade ASC


SELECT lastname, firstname, role
FROM club, officer, player
WHERE club.clubid = officer.clubid
AND player.playerid = officer.playerid
AND year LIKE '%2006%'


Q6(a)-Referential integrity.

Definition: Referential integrity is a database concept that ensures that relationships between tables remain consistent. When one table has a foreign key to another table, the concept of referential integrity states that you may not add a record to the table that contains the foreign key unless there is a corresponding record in the linked table. It also includes the techniques known as cascading update and cascading delete, which ensure that changes made to the linked table are reflected in the primary table.

Referential integrity simply means that the values of one column in a table depend on the values of a column in another table. For instance, in order for a customer to have a record in the ORDERS_TBL table, there must first be a record for that customer in the CUSTOMER_TBL table. Integrity constraints can also control values by restricting a range of values for a column. The integrity constraint should be created at the table's creation. Referential integrity is typically controlled through the use of primary and foreign keys.

In a table, a foreign key, normally a single field, directly references a primary key in another table to enforce referential integrity. In the preceding paragraph, the CUST_ID in ORDERS_TBL is a foreign key that references CUST_ID in CUSTOMER_TBL.
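The CUSTOMER_TBL / ORDERS_TBL relationship described above can be sketched as follows. The exact column definitions are assumptions for the demonstration; the point is that once the foreign key is declared, the database itself rejects an order for a customer who does not exist.

```python
import sqlite3

# Sketch of referential integrity using the CUSTOMER_TBL / ORDERS_TBL
# example from the text (column definitions are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
cur = conn.cursor()

cur.execute("CREATE TABLE customer_tbl (cust_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE orders_tbl (
    order_id INTEGER PRIMARY KEY,
    cust_id  INTEGER NOT NULL REFERENCES customer_tbl(cust_id)
)""")

cur.execute("INSERT INTO customer_tbl VALUES (1, 'Smith')")
cur.execute("INSERT INTO orders_tbl VALUES (100, 1)")  # OK: customer 1 exists

try:
    cur.execute("INSERT INTO orders_tbl VALUES (101, 99)")  # no customer 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True: the RDBMS refuses the orphan foreign key
```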

Importance in database design, construction and operation. Referential and entity integrity are crucial to preserving valid relationships between tables and data within a database. SQL queries will begin to fail if the key values that link tables do not match. If an entity or table relies on the keys in another entity or table, the relationship between the two can be lost if bad data is entered into one location.

For instance, referential integrity can be used to ensure foreign key values are valid. A database table listing all the parts installed on a specific aircraft should have referential integrity connecting the part numbers to a table of valid part numbers for that aircraft, so that if a bad part number is "fat-fingered" into the database, the RDBMS will return an error concerning the bad data (IBM 2001).

Data Integrity

Enforcing data integrity guarantees the quality of the data in the database. For example, if an employee is entered with an employee ID value of 122, the database should not permit another employee to have an ID with the same value. If you have an employee rating column intended to have values ranging from 1 to 5, the database should not accept a value outside that range. If the table has a column that stores the department number for the employee, the database should permit only values that are valid for the department numbers in the company.

Two important steps in planning tables are to identify valid values for a column and to decide how to enforce the integrity of the data in the column. Data integrity falls into the following categories:

Entity integrity

Domain integrity

Referential integrity

User-defined integrity
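The employee examples above map directly onto these categories, as the sketch below shows (the table definition is invented for the demonstration): the primary key enforces entity integrity on the employee ID, and a CHECK constraint enforces domain integrity on the rating column.

```python
import sqlite3

# Illustrative sketch of the integrity categories, using the employee
# example from the text (schema invented for demonstration).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,                     -- entity integrity: unique ID
    rating INTEGER CHECK (rating BETWEEN 1 AND 5)   -- domain integrity: 1..5 only
)""")
cur.execute("INSERT INTO employee VALUES (122, 3)")

duplicate_rejected = out_of_range_rejected = False
try:
    cur.execute("INSERT INTO employee VALUES (122, 4)")  # duplicate emp_id 122
except sqlite3.IntegrityError:
    duplicate_rejected = True
try:
    cur.execute("INSERT INTO employee VALUES (123, 9)")  # rating outside 1..5
except sqlite3.IntegrityError:
    out_of_range_rejected = True
print(duplicate_rejected, out_of_range_rejected)  # True True
```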

Q6(b)-Technical issues in managing databases


There is always a limit to the kind of recovery that can be provided. If a failure corrupts not only the ordinary data but also the recovery data (redundant data maintained to make recovery possible), complete recovery may be impossible. As described by Randell [RAND78], a recovery mechanism will only cope with certain failures. It may not cope with failures, for example, that are rare, that have not been thought of, that have no effects, or that would be too expensive to recover from. For example, a head crash on a disk may destroy not only the data but also the recovery data. It would therefore be preferable to maintain the recovery data on a separate device. However, there are other failures which may affect the separate device as well, for example failures in the machinery that writes the recovery data to that storage device. Recovery data can itself be protected from failures by yet further recovery data, which allows restoration of the primary recovery data in the event of its corruption. This progression could go on indefinitely. In practice, of course, there must be reliance on some ultimate recovery data (or rather, acceptance that such recovery data cannot be totally reliable).

Techniques and utilities that can be used for recovery, crash resistance, and maintaining consistency after a crash are described in this survey. Included are descriptions of how data structures should be constructed and updated, and how redundancy should be retained to provide recovery facilities.
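As a toy illustration of retaining redundancy to provide recovery facilities, the sketch below (entirely invented, not any real system's mechanism) records before-images of updated values so that a failed set of updates can be undone:

```python
# Two kinds of data: the current data structure, and recovery data
# (before-images of updated values) that allows restoring previous state.

def update_with_recovery(data, updates):
    """Apply updates to a dict, keeping before-images so they can be undone."""
    undo_log = []                                  # the recovery data
    for key, new_value in updates:
        undo_log.append((key, data.get(key)))      # record the previous value
        data[key] = new_value                      # update the current structure
    return undo_log

def recover(data, undo_log):
    """Restore previous values from the recovery data (undo in reverse order)."""
    for key, old_value in reversed(undo_log):
        if old_value is None:
            data.pop(key, None)                    # key did not exist before
        else:
            data[key] = old_value

db = {"balance": 100}
log = update_with_recovery(db, [("balance", 40), ("fee", 5)])
recover(db, log)    # simulate a failure: roll the updates back
print(db)  # {'balance': 100}
```

Real systems layer many refinements on this idea (logging to stable storage, checkpoints, redo as well as undo records), but the core redundancy is the same.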

This survey deals with recovery for data structures and databases, not with other issues that are also important when processes operate on data, such as locking, security, and protection [LIND76]. One possible approach to recovery is to distinguish different kinds of failures based on two criteria: (1) the extent of the failure, and (2) the cause of the failure. The three kinds of failures typically distinguished in many database systems are: a failure of a program or transaction, a failure of the total system, or a hardware failure. Different "recovery procedures" are then used to cope with the different failures. A good description of such an approach to recovery has been given by Gray [GRAY77]. However, recovery is approached from a different angle in this paper. To be able to recover, two functionally different kinds of data are distinguished: (1) data structures to keep the current values, and (2) recovery data to make the restoration of previous values possible. This paper examines:

• how these data structures and recovery data can be structured, organized and manipulated to make recovery possible (all of this is referred to as the recovery technique);

• how the data structure can interact with the structure of the recovery data;

• what kinds of failures can be coped with by the different organizations;

• what kind of recovery (e.g., restoration of the state at the time of failure, the previous state, and so on) can be provided using these organizations;

• and how different techniques can be combined in one system to cope with different failures or to provide different kinds of recovery (e.g., one technique may be used as a fall-back for another).

Concurrency

Concurrency definition: Database concurrency controls ensure that transactions occur in an ordered fashion. The main job of these controls is to protect transactions issued by different users/applications from the effects of each other. They must preserve the four characteristics of database transactions: atomicity, consistency, isolation and durability.
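Atomicity, the first of the four transaction properties listed above, can be sketched with a transfer between two accounts (the account table is invented for illustration): either every statement in the transaction takes effect, or none do.

```python
import sqlite3

# Sketch of atomicity: a transfer that fails halfway is rolled back,
# so the database never shows a half-finished state.
conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
cur = conn.cursor()
cur.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
cur.execute("INSERT INTO account VALUES (1, 100), (2, 0)")

cur.execute("BEGIN")
cur.execute("UPDATE account SET balance = balance - 60 WHERE id = 1")
# Suppose a failure occurs here, before the matching credit is applied;
# rolling back restores the consistent pre-transaction state.
cur.execute("ROLLBACK")

balances = cur.execute("SELECT balance FROM account ORDER BY id").fetchall()
print(balances)  # [(100,), (0,)] -- the partial debit was undone
```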


Security is defined as:

• (Availability) Access to information of authorized users

• (Secrecy) No access to information to unauthorized users

• (Integrity) Only authorized users should be able to modify data

Attacks on databases: the "Trojan horse"

• Consider two users of a database, marek and mirek. The user marek is a "bad guy", while the user mirek possesses information that marek wants. Marek wants this information on a continuing basis.

• Marek does not want to use cron, because the administrator raphael would see it at once.

• Marek breaks into mirek's account once and does three things. First, he installs a hidden application program. Second, using the active component of the DBMS, he installs a rule that executes the application program, which copies mirek's updates into a table owned by marek. Third, he gives mirek permission to write to marek's table.


1-Optimization is all about designing and maintaining a database to ensure speed of retrieval.

2-The most important factor is basic design.

i-Understand the network

ii-Know what system is doing

3-Next, understand the basics of query indexes and performance statistics.

As the sophistication of a database increases, the need to optimize performance also increases.

Goals of optimisation.

Optimisation attempts to achieve the greatest efficiency possible in operations.

4-It also strives to identify best methodology to use and best solution to problems.

The database is optimised in terms of file storage, data access speed, query performance, and the efficiency of its indexes.

5- Environment.

Optimisation requires understanding of the entire system.

The environment is an important aspect of optimisation: hardware, networking connections, server performance, coding, and indexes.

Indexing in a relational database creates a performance trade-off that is often overlooked: an index speeds up reads on the indexed columns, but every insert and update must also maintain the index.