Normalization Dimensional And Nosql Computer Science Essay

Normalization ensures that the relations derived from the data model do not display data redundancy, which can cause update anomalies when implemented (Connolly and Begg, 2005). A normal form is a way of measuring the level, or depth, to which a database has been normalized (Churcher, 2012). Three normal forms are most commonly applied when designing an Oracle 11g database, namely:

First normal form (1NF): 1NF removes repeating groups from the tables, so that each column holds a single value in each row.

Second normal form (2NF): All data that depend on only part of a composite primary key should be moved to another table, together with the part of the primary key they depend on.

Third normal form (3NF): A table is said to be in the third normal form if it is in 2NF and no non-key field depends on a field that is not the primary key. Any such field is moved to a separate table.
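The effect of moving a dependent field into its own table can be sketched in a few lines. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the tables emp_flat, emp and dept and their columns are hypothetical names chosen for the example.

```python
import sqlite3

# SQLite stands in for Oracle here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalized: the department name is repeated on every employee row.
    CREATE TABLE emp_flat (emp_id INTEGER PRIMARY KEY, emp_name TEXT,
                           dept_id INTEGER, dept_name TEXT);
    INSERT INTO emp_flat VALUES (1, 'Scott', 10, 'Sales'),
                                (2, 'Adams', 10, 'Sales'),
                                (3, 'Ford',  20, 'Research');

    -- 3NF: dept_name depends only on dept_id, so it moves to its own table.
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
    CREATE TABLE emp  (emp_id INTEGER PRIMARY KEY, emp_name TEXT,
                       dept_id INTEGER REFERENCES dept(dept_id));
    INSERT INTO dept SELECT DISTINCT dept_id, dept_name FROM emp_flat;
    INSERT INTO emp  SELECT emp_id, emp_name, dept_id FROM emp_flat;
""")

# Renaming a department now updates one row instead of one row per employee.
cur = conn.execute("UPDATE dept SET dept_name = 'Field Sales' WHERE dept_id = 10")
rows_touched = cur.rowcount

names = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM emp e JOIN dept d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
print(rows_touched, names)
```

Because the department name is stored once, the update anomaly described above cannot arise: there is no second copy of the name that could be left unchanged.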

2.2 Dimensional

According to Eric (2002), dimensional modeling is the process and outcome of designing logical database schemas used to support online analytical processing (OLAP) and data warehousing solutions. An Oracle 11g dimensional model consists of a fact table surrounded by several dimension tables, an arrangement called a star schema. The star schema makes reporting easy and efficient, as each dimension table can be queried independently, which is an added capability in the Oracle 11g database. Similarly, according to Thornthwaite (2008), SQL Server makes use of the star schema to speed up queries and to make reporting easier.
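A star schema and a typical report over it might look like the following sketch. It again uses Python's sqlite3 module rather than Oracle, and the fact table fact_sales, the dimension tables dim_product and dim_date, and the sample figures are all assumptions made for illustration.

```python
import sqlite3

# SQLite stands in for Oracle; the schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);
    -- The fact table holds the measures plus one foreign key per dimension.
    CREATE TABLE fact_sales  (product_id INTEGER REFERENCES dim_product(product_id),
                              date_id    INTEGER REFERENCES dim_date(date_id),
                              amount     INTEGER);
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO dim_date    VALUES (1, 2011), (2, 2012);
    INSERT INTO fact_sales  VALUES (1, 1, 100), (1, 2, 150), (2, 2, 70);
""")

# A report joins the fact table to whichever dimensions it needs.
report = conn.execute("""
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d    ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
""").fetchall()
print(report)
```

Each dimension table can also be queried on its own, for example to list the available categories, without touching the fact table.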

2.2.1 Data warehouse and Business Intelligence (BI)

According to Peter (2006), a data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management decision making. According to Michelle (2007), BI is a business management term that refers to the applications and technologies used to gather, provide access to, and analyze data and information about a company's operations.

A data warehouse gathers data from various sources. Depending on the BI report needed, the data warehouse can be divided into multidimensional databases to suit the needs of BI. A BI architecture with the appropriate BI analytic packages can access and analyze data, generate reports and provide detailed information for a decision support system (DSS) from the data warehouse. The Oracle 11g database provides various functionalities and capabilities to aid the development of a data warehouse and to support BI. Some of these are described below.

Scalability: Oracle 11g provides different forms of scalability. These include partitioning, which splits the database into different units while still maintaining consistency, and parallelism, which makes sure an operation gets the resources needed to perform a particular task while maximizing the use of database resources. SQL Server provides similar forms of scalability to those offered by Oracle 11g.

Analyze: Oracle 11g provides OLAP, data mining and statistics to perform analytic operations within its data warehouse to support BI. The major strength of Oracle 11g is that all the analytic packages are embedded within it. This means an organization can save the cost of acquiring different servers and infrastructure and of maintaining the same level of security across different analytic packages. In contrast, SQL Server provides analytic packages to support BI but they are not embedded within it, meaning that it is more costly to use SQL Server for BI than Oracle 11g.

Performance: Oracle 11g offers various capabilities to improve the performance of its database. These capabilities include star query optimization, advanced indexing and aggregation techniques for reporting, parallelized queries, high input/output bandwidth and many more. SQL Server also provides some level of performance, though not up to that of Oracle 11g.

2.3 NoSQL (Key Pair Value and Document Stores)

Organizations in the past have found it difficult to store big data, and retrieving those data in a timely manner became an issue. With the emergence of NoSQL key pair value (KPV) and document stores, accessing big data in a timely fashion became a possibility. Oracle has its own version of NoSQL, called Oracle NoSQL, which runs a version of KPV built on Oracle Berkeley DB. Oracle NoSQL has its performance and functionalities designed to support atomicity, consistency, isolation and durability (ACID).

2.3.1 Key-Pair-Value (KPV)

The Oracle Berkeley KPV store uses a key to uniquely identify a value, which is an item of data within the database. A key consists of a Major Key Path and a Minor Key Path, both of which are specified by the application. The Major and Minor Key Paths provide fast, indexed lookup. The operation of finding the value associated with a key is called a lookup or indexing, and the relationship between a key and its value is called a mapping or binding.
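The idea of a key made of a major and a minor path can be sketched with nested dictionaries. The class below is a toy in-memory model, not the Oracle NoSQL API; the class name KeyValueStore, its methods and the sample keys are hypothetical.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy in-memory key-value store; not the Oracle NoSQL API."""

    def __init__(self):
        # major key path -> {minor key path -> value}
        self._data = defaultdict(dict)

    def put(self, major, minor, value):
        """Bind a value to the key formed by the major and minor paths."""
        self._data[tuple(major)][tuple(minor)] = value

    def get(self, major, minor):
        """Look up one value; returns None when the key is not bound."""
        return self._data.get(tuple(major), {}).get(tuple(minor))

    def multi_get(self, major):
        """Return every minor-path binding stored under one major path."""
        return dict(self._data.get(tuple(major), {}))


store = KeyValueStore()
store.put(["users", "scott"], ["profile", "email"], "scott@example.com")
store.put(["users", "scott"], ["profile", "phone"], "555-0100")

email = store.get(["users", "scott"], ["profile", "email"])
everything_for_scott = store.multi_get(["users", "scott"])
missing = store.get(["users", "nobody"], ["profile", "email"])
print(email, len(everything_for_scott), missing)
```

Grouping all minor paths under one major path means that related values can be fetched together with a single lookup on the major path.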

Oracle Berkeley KPV stores provide various capabilities. The in-memory variants retain data in memory for improved performance (useful as a distributed cache mechanism), the on-disk versions save data directly to disk (useful for data storage), and partitioning and sharding give high availability and low latency. Unlike other NoSQL KPV stores, the Oracle Berkeley KPV store supports ACID functionality, thereby providing data integrity. In contrast, SQL Server does not have key-pair-value functionality for storing and retrieving data in a timely fashion. However, SQL Server makes use of indexing. Although indexing makes data access fast, every update must also maintain the index, which makes retrieving freshly updated data slower.

2.3.2 Document Stores

Document stores are an extension of the KPV. Values in a document store are stored in a structured format, a document with named fields, which is understood by the database. Since the data are transparent to it, the database is able to perform additional tasks without having to translate the data into a format it understands. In addition, in document stores queries are not limited to the key alone, and using document stores allows an entire page of data to be fetched with a single query. On the other hand, SQL Azure does not have a document store to facilitate easy retrieval of large volumes of data.
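The difference from a plain key-value store, namely that a query can match on fields inside the value, can be sketched as follows. The collection, its field names and the helper find are hypothetical and do not correspond to any particular product's API.

```python
# A toy document collection: key -> document (a dictionary of named fields).
orders = {
    "order:1": {"customer": "Scott", "status": "shipped", "total": 120},
    "order:2": {"customer": "Adams", "status": "open", "total": 75},
    "order:3": {"customer": "Scott", "status": "open", "total": 40},
}


def find(collection, **criteria):
    """Return the keys of all documents whose fields match every criterion."""
    return sorted(
        key
        for key, doc in collection.items()
        if all(doc.get(field) == value for field, value in criteria.items())
    )


by_key = orders["order:2"]                                   # lookup by key
scotts_orders = find(orders, customer="Scott")               # query by field
scotts_open = find(orders, customer="Scott", status="open")  # two fields
print(by_key["total"], scotts_orders, scotts_open)
```

The store can evaluate these criteria only because it understands the structure of each document; with an opaque value, the application would have to fetch and decode every value itself.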

3.0 SECURITY AND USER ACCOUNTS

According to Neagu (2012), user account security probably raises the most controversy and is the most difficult aspect of database security. Before a user can access the database, an account is created with specific access privileges.

3.1 Attributes of User Accounts

User accounts consist of several attributes, each defined when the account is created. The attributes include:

Username: A username uniquely identifies a user of a database. For example, Scott.

Authentication method: A user must be authenticated before connecting to the account. The most commonly used form of authentication is a password. For example, Tiger.

Default tablespace: Each database user is assigned a default tablespace. Schema objects that the user creates, including tables and indexes, are stored in the default tablespace.

Tablespace quota: This is the amount of space in a tablespace that a user is allocated. A user by default has no quota on the tablespace. However, if the user has a privilege to create objects, a quota must be allocated to the user.

Temporary tablespace: A temporary tablespace is used when a user runs an SQL statement that needs more working space than is available in memory, for example for a large sort. The database stores the temporary segments in the temporary tablespace of the user.

User profile: Profiles are a mechanism for setting limits on the level of database resources that a database user can access.

Account status: A user account can be in various states, including OPEN, LOCKED, EXPIRED and many more.

3.2 SECURITY FEATURES

The Oracle 11g database provides various security capabilities to help safeguard the database and its content. The security comes in the form of limiting the access of users and detecting violations of access. Below are some security mechanisms associated with user accounts within the Oracle environment:

3.2.1 Privileges: Privileges control what a user can actually do within the database. They are issued with the GRANT statement and withdrawn with the REVOKE statement. Privileges come in two different forms, object and system privileges. Object privileges enable a user to execute SELECT, INSERT, UPDATE and DELETE statements against a specific object, while system privileges cover actions that affect the data dictionary, for example creating a table, user, session or tablespace and many more.

GRANT privilege ON schema.object TO username; (object privilege statement)

GRANT privilege [, privilege...] TO username; (system privilege statement)

3.2.2 Roles: A role is a collection of privileges that are grouped together and can be granted and revoked at the same time. Roles are created and granted using the statements below:

CREATE ROLE rolename;

GRANT rolename TO username [WITH ADMIN OPTION];

The WITH ADMIN OPTION clause implies that the grantee can in turn grant the role to other users or roles. After the role has been created, the intended privileges (either object or system) are granted to the role, depending on what level of resources users of the role should be able to access.
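The grouping behaviour of roles can be modelled in a few lines. This is a toy model under stated assumptions, not how Oracle implements roles: the class AccessControl, its method names, the role clerk and the privilege strings are all hypothetical.

```python
class AccessControl:
    """Toy model of roles as named sets of privileges; not Oracle's implementation."""

    def __init__(self):
        self.role_privileges = {}  # role name -> set of privilege names
        self.user_roles = {}       # user name -> set of role names

    def create_role(self, role):
        self.role_privileges.setdefault(role, set())

    def grant_privilege_to_role(self, privilege, role):
        self.role_privileges[role].add(privilege)

    def grant_role(self, role, user):
        self.user_roles.setdefault(user, set()).add(role)

    def revoke_role(self, role, user):
        self.user_roles.get(user, set()).discard(role)

    def privileges_of(self, user):
        """Union of the privileges of every role the user currently holds."""
        privileges = set()
        for role in self.user_roles.get(user, set()):
            privileges |= self.role_privileges[role]
        return privileges


acl = AccessControl()
acl.create_role("clerk")
acl.grant_privilege_to_role("CREATE SESSION", "clerk")
acl.grant_privilege_to_role("SELECT ON hr.emp", "clerk")

acl.grant_role("clerk", "scott")
granted = acl.privileges_of("scott")       # both privileges arrive together

acl.revoke_role("clerk", "scott")
after_revoke = acl.privileges_of("scott")  # and both are withdrawn together
print(sorted(granted), after_revoke)
```

Granting or revoking the single role changes every privilege it contains at once, which is the convenience the paragraph above describes.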

3.2.3 Profiles: Profiles ensure that password policies are enforced and make sure a user does not take more than the allocated resources in a session.

PASSWORD_LIFE_TIME: the number of days it takes for a password to expire.

RESOURCE_LIMIT: an instance parameter that must be set to TRUE before the resource limits defined in a profile are enforced.

FAILED_LOGIN_ATTEMPTS: the number of consecutive failed login attempts allowed before the account is locked, and many more.

3.2.4 Database Triggers: A trigger is a block of code which is run just before or just after a data manipulation language (DML) event occurs on a particular database table (Craig, 2011). The statement below shows how a trigger can be created.

CREATE TRIGGER schema.trigger_name
BEFORE DELETE OR INSERT OR UPDATE
ON schema.table_name
pl/sql_block

The code above fires the trigger before the DML statement is executed. However, the trigger could instead be made to run after the DML statement by using AFTER in place of BEFORE.
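A working trigger of this kind can be tried out without an Oracle instance. The sketch below uses SQLite through Python's sqlite3 module, so the trigger body is SQLite syntax rather than a PL/SQL block; the tables dept and dept_audit and the trigger name are hypothetical.

```python
import sqlite3

# SQLite stands in for Oracle; its trigger body is plain SQL, not PL/SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept       (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE dept_audit (action TEXT, dept_id INTEGER);

    -- Runs once for each row, just before that row is deleted from dept.
    CREATE TRIGGER dept_before_delete
    BEFORE DELETE ON dept
    BEGIN
        INSERT INTO dept_audit VALUES ('DELETE', OLD.dept_id);
    END;

    INSERT INTO dept VALUES (10, 'Sales'), (20, 'Research');
    DELETE FROM dept WHERE dept_id = 10;
""")

audit_rows = conn.execute("SELECT action, dept_id FROM dept_audit").fetchall()
remaining = conn.execute("SELECT dept_id FROM dept").fetchall()
print(audit_rows, remaining)
```

The DELETE statement never mentions dept_audit, yet a row appears there, because the database runs the trigger automatically as part of the DML event.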

3.2.5 Database Auditing: Auditing is enabled through the AUDIT_TRAIL parameter and captures attempted access by users, both failed and successful logins, and users' activities within the database. Below are examples of privilege auditing set up with the AUDIT statement.

SQL> audit create any trigger;

SQL> audit select any table by session;

The first command audits any use of the CREATE ANY TRIGGER privilege, while the second command audits any SELECT statement that relies on the SELECT ANY TABLE privilege.

4.0 MANAGING TABLES AND OTHER STRUCTURES

4.1 Table Types: The relational table is the most commonly used Oracle table type and has both columns and rows. The other available table types within an Oracle database are object tables and XML tables.

4.2 Table column attributes: Columns are where data are stored, and the database administrator (DBA) needs to specify their attributes when the columns are created.

Data type: This defines the type of data to store in the column. The DBA must specify the type of data to be stored in each of the created columns. Examples include DATE, NUMBER, VARCHAR2 (variable-length character), INTEGER and many more.

NOT NULL column constraint: This ensures that a value is always supplied for the column.

Default value: This value is automatically stored in the column whenever a new row is inserted and no value is specified for the column.

Encryption: The DBA can encrypt the column data by enabling automatic encryption.

4.3 Table-level constraint: The DBA can apply specific rules to preserve data integrity. If an SQL statement does not comply with a specified constraint, an error message is given and the statement is rolled back. Examples of constraints include:

Primary key: uniquely identifies a row and does not accept a NULL value. Below is the syntax for a primary key constraint:

COLUMN [data type] [CONSTRAINT <constraint name> PRIMARY KEY] (Column level)

CONSTRAINT [constraint name] PRIMARY KEY [column (s)] (Table level)

Foreign key: Every value in the foreign key column(s) of the child table must be present in the referenced column(s) of the parent table.

COLUMN [data type] [CONSTRAINT] [constraint name] [REFERENCES] [table name (column name)] (Column level);

CONSTRAINT [constraint name] [FOREIGN KEY (foreign key column name) REFERENCES] [referenced table name (referenced column name)] (Table level)

Unique key: Specifies that no two rows can have duplicate values in the column(s).

COLUMN [data type] [CONSTRAINT <name>] [UNIQUE] (Column level);

CONSTRAINT [constraint name] UNIQUE (column name) (Table level)

Check: Requires that the value of a column(s) satisfies a specified condition in every row.

COLUMN [data type] CONSTRAINT [name] [CHECK (condition)] (Column level)

CONSTRAINT [name] CHECK (condition) (Table level)

NOT NULL: This prevents the inclusion of a NULL value in a column.

[Column Name] [data type] [CONSTRAINT (constraint name)] [not null]
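The five constraint types above can be exercised together in a small experiment. The sketch below runs against SQLite through Python's sqlite3 module as a stand-in for Oracle; the tables, the constraint names and the helper try_insert are hypothetical, and SQLite needs foreign key checking switched on explicitly.

```python
import sqlite3

# SQLite stands in for Oracle; constraint and table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not check foreign keys by default
conn.executescript("""
    CREATE TABLE dept (
        dept_id   INTEGER CONSTRAINT dept_pk PRIMARY KEY,
        dept_name TEXT    CONSTRAINT dept_name_nn NOT NULL
                          CONSTRAINT dept_name_uq UNIQUE
    );
    CREATE TABLE emp (
        emp_id  INTEGER CONSTRAINT emp_pk PRIMARY KEY,
        salary  INTEGER CONSTRAINT emp_salary_ck CHECK (salary > 0),
        dept_id INTEGER,
        CONSTRAINT emp_dept_fk FOREIGN KEY (dept_id) REFERENCES dept (dept_id)
    );
    INSERT INTO dept VALUES (10, 'Sales');
""")


def try_insert(sql):
    """Run one INSERT; report whether a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = {
    "valid row":   try_insert("INSERT INTO emp VALUES (1, 3000, 10)"),
    "primary key": try_insert("INSERT INTO emp VALUES (1, 2500, 10)"),
    "check":       try_insert("INSERT INTO emp VALUES (2, -5, 10)"),
    "foreign key": try_insert("INSERT INTO emp VALUES (3, 2000, 99)"),
    "unique":      try_insert("INSERT INTO dept VALUES (20, 'Sales')"),
    "not null":    try_insert("INSERT INTO dept VALUES (30, NULL)"),
}
print(results)
```

Only the first insert complies with every rule; each of the others violates exactly one constraint and is rolled back, leaving the tables unchanged.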

4.4 Creating a Table

A table can be created in Oracle 11g through the table management window of Database Control, from the database home page. Once created, a table can also be modified using SQL commands. Below are some SQL commands for modifying a table:

alter table dept add (NEW Number);

alter table dept modify (NEW Float);

alter table dept drop column NEW;

The first command adds a new column called NEW with the data type NUMBER. The second changes the data type of the column from NUMBER to FLOAT. The last command drops the column NEW.

4.5 Indexes

An index is simply a data structure that provides a fast access path to rows in a table based on the values in one or more columns, the index key (Steelman, 2012). This provides a very good way of searching for values, rather than scanning every row of the table. Below are the types of indexes supported by the Oracle 11g database.

4.5.1 B-Tree (Default index): A B-Tree index provides excellent retrieval performance by associating a key with a row or range of rows. A B-Tree index has two different kinds of blocks, leaf blocks and branch blocks. Branch blocks store the minimum key prefix needed for making a branching decision between two keys. This technique enables the database to fit a lot of data on each branch block. The leaf blocks contain every indexed data value and a corresponding RowID used to locate the actual row. Each entry is sorted by key and RowID. Within a leaf block, each key and RowID pair is linked to its left and right sibling entries.

4.5.2 Bitmap Index: A bitmap index stores a bitmap for each index key, and each index key points to multiple rows. Bitmap indexes are mainly designed to support data warehousing, where queries reference many columns in an ad hoc fashion. A bitmap index is suitable when the number of distinct values is small compared to the number of table rows (that is, the column has low cardinality) and the indexed table is either read-only or not subject to significant modification by DML statements.
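The principle behind a bitmap index can be sketched with Python integers used as bit strings. This is a simplified model, not Oracle's storage format: the row position in the list plays the part of the RowID, and the column data and function names are hypothetical.

```python
# Two low-cardinality columns; the list position plays the role of the RowID.
gender = ["M", "F", "F", "M", "F"]
region = ["N", "N", "S", "S", "N"]


def build_bitmap_index(values):
    """Map each distinct value to an integer whose set bits mark matching rows."""
    index = {}
    for rowid, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << rowid)
    return index


def rowids(bitmap):
    """Turn a bitmap back into the list of row positions whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


gender_index = build_bitmap_index(gender)
region_index = build_bitmap_index(region)

# WHERE gender = 'F' AND region = 'N' becomes a single bitwise AND.
matches = rowids(gender_index["F"] & region_index["N"])
print(matches)
```

Combining conditions on several columns costs only a bitwise operation per condition, which is why the structure suits ad hoc warehouse queries. The same design explains the warning about DML: changing one row means rewriting bits in the bitmaps of every affected key.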

The syntax for creating an index is:

CREATE [UNIQUE | BITMAP] INDEX [ schema.]indexname

ON [schema.]tablename (column [, column...] ) ;

However, in order to modify certain attributes of an index, the index has to be dropped and recreated. Below is the syntax for dropping and re-creating an index.

drop index [index name];

create index [index name] on [table name] (column [, column...]);

4.6 Temporary Tables: A temporary table has a definition that is visible to all sessions, but the rows within it are private to the session that inserted them (Watson, 2008). Temporary tables are used most often to provide workspace for intermediate results when processing data within a batch or procedure. In SQL Server they are also used to pass a table from a table-valued function, to pass table-based data between stored procedures or, more recently in the form of table-valued parameters, to send whole read-only tables from applications to SQL Server routines. Once they are finished with, their rows are discarded automatically.

CREATE GLOBAL TEMPORARY TABLE temp_tab_name

(column datatype [,column datatype…] )

[ON COMMIT {DELETE | PRESERVE} ROWS]

5.0 DATABASE PERFORMANCE AND MANAGEMENT AND QUERY OPTIMIZATION

5.1 Database Performance and Management

According to Poder (2011), Oracle 11g adds several functionalities that the DBA can use to improve performance and scalability. The most significant of these new features is the introduction of the server result cache. The cache increases the performance of PL/SQL functions by returning results from the cache instead of running the SQL code again. Below are other functionalities for improving the performance level of a database.

5.1.1 Caching Functionality: Oracle Database 11g introduces various new caching capabilities that let DBAs utilize memory more efficiently, which results in faster query processing. There are two types of caching: the server result cache, which caches SQL query results as well as PL/SQL function results in the SGA, and the Oracle Call Interface (OCI) consistent client cache, which lets the DBA cache query results on the client.
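The principle of a result cache, returning a stored answer instead of repeating the work, can be illustrated with Python's functools.lru_cache. This is an analogy only and not Oracle's mechanism: the function dept_headcount and its data are hypothetical, and the sketch makes no attempt to invalidate cached results when the underlying data change.

```python
from functools import lru_cache

executions = 0  # counts how often the "expensive" body actually runs


@lru_cache(maxsize=None)
def dept_headcount(dept_id):
    """Pretend to run an expensive query; the decorator caches its result."""
    global executions
    executions += 1
    headcounts = {10: 2, 20: 1}  # hypothetical data standing in for a table
    return headcounts.get(dept_id, 0)


first = dept_headcount(10)   # body runs, result is stored
second = dept_headcount(10)  # same argument: served from the cache
third = dept_headcount(20)   # new argument: body runs again
print(first, second, third, executions)
```

Three calls cause only two executions, because the repeated call with the same argument is answered from memory.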

5.1.2 Optimizer statistics: Oracle Database 11g enhancements improve optimizer statistics by making them faster, better, and safer. First, Oracle Database 11g performance enhancements make the collection of optimizer statistics faster, decreasing the total amount of time required to gather and compute them. Secondly, the computed optimizer statistics are more thorough, providing better information to the cost-based optimizer (CBO) by correlating statistics, such as Number of Distinct Values (NDV) and histograms, on multiple columns. Lastly, Oracle Database 11g makes gathering statistics safer.

5.1.3 SQL Plan Management: Oracle's Cost-Based Optimizer (CBO) generates SQL Execution Plans based on multiple factors including: CBO version, CBO parameters, object statistics, system settings, etc. Changes in any one of these components can cause the optimizer to generate different execution plans for a particular SQL statement. Various features, such as optimizer hints and stored outlines, have been available to assist in plan stability. Oracle Database 11g takes plan stability to the next level with the advent of the SQL Plan Management capability. With this new feature, Oracle automatically maintains a history of past execution plans and uses this information to ensure that dynamic plan changes don't affect SQL performance adversely. This functionality will help ensure that applications continue to perform consistently.

5.1.4 Automatic SQL Tuning: The first step in automatic SQL tuning is to find the bad SQL statements to tune. Once the Automatic SQL Tuning Advisor task generates a list of candidate SQL statements, the advisor orders the candidate SQL list in the order of importance. The SQL Tuning Advisor then tunes each of the candidate SQL statements in the order of importance and recommends SQL profiles to improve the performance of the SQL statements.

5.2 QUERY OPTIMIZATION

When an SQL query is submitted to an Oracle database, the Oracle query optimizer decides how to access the data. The process of making this decision is called query optimization (Media, 2008). Oracle 11g looks for the best path for retrieving data from the database, expressed as what is called the execution plan.

5.2.1 Query Optimizer: When an SQL query is run in Oracle, the database decides on the fastest execution path and how to access the data. There are two different kinds of query optimizer, namely the rule-based optimizer and the cost-based optimizer. The rule-based optimizer uses a set of predefined rules to determine query optimization. The cost-based optimizer, on the other hand, selects the execution path that requires the least number of logical input and output (I/O) operations and thus has the lowest cost for the completion of the query.

5.2.2 Choosing an Optimizer goal: By default, the goal of the query optimizer is the best throughput. This means using the least amount of resources necessary to process all rows accessed by the statement. Oracle can also optimize a statement with the goal of best response time. This means that it uses the least amount of resources necessary to process the first row accessed by an SQL statement.

5.2.3 Optimizer Operations: SQL statements can be executed in many different ways. These include full table scans, index scans, nested loops, and hash joins. The query optimizer determines the most efficient way to execute the SQL statement after considering many factors related to the objects referenced and the conditions specified in the query. This determination is an important step in the processing of any SQL statement and can greatly affect execution time.

5.2.4 Access Path: The optimizer first determines which access paths are available by examining the conditions in the statement's WHERE clause and its FROM clause. The optimizer then generates a set of possible execution plans using the available access paths and estimates the cost of each plan, using the statistics for the indexes, columns, and tables accessible to the statement. Finally, the optimizer chooses the execution plan with the lowest estimated cost. Examples of access paths include the following:

Full scan: This type of scan reads all rows from a table up to the high water mark (HWM). The HWM marks the last block in the table that has ever had data written to it. During a full table scan, all blocks in the table that are under the high water mark are scanned.

RowId Scan: The rowid of a row specifies the datafile and data block containing the row and the location of the row in that block. Using a RowID to locate a row is the fastest way to retrieve a single row.

Index Scan: Using an index scan, a row is retrieved by traversing the index. Oracle 11g searches the index for the indexed column values accessed by the statement. If the statement accesses only columns of the index, then Oracle reads the indexed column values directly from the index, rather than from the table. Bitmap indexes, index joins and index skip scans are all examples of index scans.
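The choice between a full scan and an index scan can be sketched with a deliberately crude cost model. The formulas below are assumptions made for illustration and are far simpler than the estimates Oracle's optimizer actually computes; the function choose_plan and its parameters are hypothetical.

```python
def choose_plan(table_blocks, table_rows, index_height, selectivity):
    """Pick the cheaper access path under a simplified block-read cost model.

    Assumed costs (not Oracle's real formulas):
      full scan  = every block below the high water mark is read once
      index scan = walk down the index, then one block read per matching row
    """
    matching_rows = round(table_rows * selectivity)
    costs = {
        "FULL SCAN": table_blocks,
        "INDEX SCAN": index_height + matching_rows,
    }
    best = min(costs, key=costs.get)
    return best, costs


# A highly selective predicate: very few rows match, so the index wins.
selective_plan, selective_costs = choose_plan(
    table_blocks=1000, table_rows=100_000, index_height=3, selectivity=0.0001)

# A predicate matching half the table: reading every block once is cheaper.
broad_plan, broad_costs = choose_plan(
    table_blocks=1000, table_rows=100_000, index_height=3, selectivity=0.5)

print(selective_plan, broad_plan)
```

The same index is chosen in one case and ignored in the other, which shows why the optimizer depends on accurate statistics about row counts and value distribution.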

5.3 Joins

Joins are statements that retrieve data from more than one table. A join is characterized by multiple tables in the FROM clause, and the relationship between the tables is defined through the existence of a join condition in the WHERE clause. In a join, one row set is called inner, and the other is called outer. Examples include hash joins, outer joins, full outer joins and many more.
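Two common ways of executing the same join, a nested loops join and a hash join, can be sketched over small in-memory row sets. The sample rows and function names are hypothetical, and real implementations work on blocks and memory areas rather than Python lists.

```python
# Rows are tuples: emp = (emp_id, emp_name, dept_id), dept = (dept_id, dept_name).
emps = [(1, "Scott", 10), (2, "Adams", 10), (3, "Ford", 20)]
depts = [(10, "Sales"), (20, "Research"), (30, "Operations")]


def nested_loops_join(outer, inner):
    """For every outer row, scan the whole inner row set for matches."""
    return [(e[1], d[1]) for e in outer for d in inner if e[2] == d[0]]


def hash_join(build, probe):
    """Build a hash table on one row set, then probe it once per row of the other."""
    table = {}
    for d in build:
        table.setdefault(d[0], []).append(d)
    return [(e[1], d[1]) for e in probe for d in table.get(e[2], [])]


via_nested_loops = nested_loops_join(emps, depts)
via_hash = hash_join(depts, emps)
print(via_nested_loops)
print(via_hash)
```

Both methods return the same rows. The nested loops version compares every pair of rows, while the hash version reads each row set once, which is why the optimizer tends to prefer it for large row sets joined on equality.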

6.0 BACKUPS, RESTORE AND RECOVER

6.1 Backups

According to Margeret (2009), backup is the activity of copying files or databases so that they will be preserved in case of equipment failure or other catastrophe. Below are the different kinds of backups in Oracle 11g.

6.1.1 Consistent Backups

A backup is said to be consistent (also called offline or closed) if it is carried out while the database is shut down (Watson, 2008). The datafiles of an open database are mostly inconsistent, because some blocks have been copied into the buffer cache and updated but not yet written to disk. To make the datafiles consistent, the database has to be shut down cleanly, so that all updated blocks are written to disk and the datafiles are closed. In contrast, according to Orin, Peter et al. (2012), SQL Server does not permit any form of offline backup. If one filegroup of the database is offline when a full backup is performed, the backup will fail.

User-Managed Consistent Backups: These are backups that are carried out using operating system utilities while the database is shut down. They involve copying the control files, the datafiles and, optionally, the online redo log files.

Server-Managed Consistent Backups: These are made with the Recovery Manager (RMAN) and can only be carried out when the database is in the mount state. This is because access to the controlfile is needed in order to locate the datafiles. At this stage, there is a possibility that the controlfile is written to, thereby bringing about inconsistency. RMAN avoids this inconsistency by taking a read-consistent snapshot of the controlfile and backing up the snapshot.

6.1.2 Oracle Incremental Backup

Incremental backups include only those things that have changed since the previous backup and save those things into a separate, additional backup file or location (Leo, 2008). Oracle 11g incremental backups are divided into three different types.

Incremental level 0: This is the starting point for all incremental backups. It copies all blocks containing data in the datafiles into a backup set. The command below is used to take an incremental level 0 backup.

backup as backupset incremental level 0 database;

Incremental level 1: This backs up all blocks that have changed since the most recent level 0 or level 1 backup. By default, an incremental level 1 backup is also known as a differential backup. Below is the command to take a level 1 incremental backup.

backup as backupset incremental level 1 database;

SQL Server offers a comparable feature in the form of differential backups.

Cumulative: This backs up all blocks that have changed since the most recent level 0 incremental backup. Cumulative incremental backups reduce the work needed for a restore by ensuring that you only need one incremental backup from any particular level. The command below is used to take a cumulative incremental backup.

backup as backupset incremental level 1 cumulative database;
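The difference between the differential and cumulative forms of a level 1 backup comes down to which earlier backup is used as the baseline. The sketch below models this with change numbers per block; it is a simplified illustration rather than a description of how RMAN tracks blocks, and the block names, numbers and function are hypothetical.

```python
def blocks_to_back_up(block_change_scn, history, cumulative=False):
    """Select the blocks a level 1 backup would copy.

    block_change_scn: block name -> change number of its latest modification
    history: list of (level, change number) for backups already taken
    Differential uses the most recent level 0 or 1 backup as its baseline;
    cumulative uses the most recent level 0 backup only.
    """
    baseline_levels = (0,) if cumulative else (0, 1)
    baseline = max(scn for level, scn in history if level in baseline_levels)
    return sorted(b for b, scn in block_change_scn.items() if scn > baseline)


# A level 0 backup was taken at change 100 and a level 1 backup at change 200.
history = [(0, 100), (1, 200)]
block_change_scn = {"A": 50, "B": 150, "C": 250}

differential = blocks_to_back_up(block_change_scn, history)
cumulative = blocks_to_back_up(block_change_scn, history, cumulative=True)
print(differential, cumulative)
```

The differential backup copies only block C, while the cumulative backup also copies block B again. The cumulative backup is larger, but a restore then needs only the level 0 backup and this one incremental backup.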

6.2 Data Recovery and Restore

Oracle supports several forms of restore and recovery. The variants below are the most important for backup and recovery work.

Datafile Media Recovery: Restore Datafiles, Apply Redo: This is one of the basic forms of recovery. It can be used to recover from lost or damaged current datafiles or control files. Datafile media recovery is also used to recover changes that are in the redo logs but have not yet been applied to the datafiles of a tablespace that went offline. The process involves the manual restoration of the datafile from the backup. After the restoration, the database automatically detects that this datafile is out of date and must undergo media recovery. It is important to know that for a datafile to be ready for media recovery, the database that the datafile belongs to must not be open, or, if the database is open, the datafile must be offline.

Complete, Incomplete Point-In-Time Recovery: A recovery is said to be complete if the database is brought back to the most recent point in time without the loss of any committed transaction. Incomplete or point-in-time recovery restores the database to its state at some previous target point, such as a time or system change number (SCN). Point-in-time recovery is also the only option if you have to perform a recovery and discover that you are missing an archived log covering the time between the backup you are restoring from and the target SCN for the recovery.

Automatic Recovery after Instance Failure: Crash Recovery: A crash recovery happens the first time an Oracle database instance is started after a crash. The goal of a crash recovery is to bring the database to a transaction-consistent state, preserving all the committed changes up to the point when the instance failed. Crash recovery uses both the online redo log files and the current online datafiles as left on the disk.
