Query optimization is the activity of choosing an efficient execution strategy for processing a query. It is a very large area within the database field. This report describes the query optimization features that enable Oracle's customers and users to attain higher performance. Oracle's query optimization techniques are extensive in their functionality. This paper describes the important parts of query optimization, together with the techniques used to process complex queries.
In a database, a user writes a query in a high-level language and expects the DBMS to return the result efficiently. There are many plans that a database management system (DBMS) can follow to process a query and produce its answer. Every plan gives the same final result, but the plans differ in how much they cost, how much memory they use, and how quickly they return the answer. The query optimizer deals with these concerns: it is the part of a DBMS that attempts to decide the most efficient way to execute a query. A cost-based query optimizer assigns an estimated cost to each possible query plan and chooses the plan with the lowest cost. Costs approximate the runtime cost of evaluating the query in terms of the number of I/O operations required, the memory requirements, the CPU requirements, and other factors determined from the data dictionary. When the relational model was first developed commercially, its biggest drawback was the poor performance of queries. To tackle this problem, much research has been carried out, and it has helped to create efficient algorithms for query processing. Queries can be simple or complex, and each one must be examined to find the most cost-effective way to process it. There are many ways to process a query, but query optimization relies on two main techniques: heuristic rules, which order the operations in a query, and cost-based techniques, which compare different strategies on their relative costs and select the one that minimizes resource usage.
WHAT IS QUERY OPTIMIZATION?
One of the great qualities of a relational database is its ability to access data without predefined access paths. When a SQL query is submitted to an Oracle database, Oracle must decide how to access the data. The process of making this decision is called query optimization, because Oracle looks for the best way to retrieve the data. Query optimization is essential for good relational database performance, especially when executing complex SQL statements. The query optimizer decides, for example, whether or not to use an index for a given query and which join technique to use when joining multiple tables. These decisions have a great effect on SQL performance, which makes query optimization a key technology for applications such as content management, data warehousing, and analysis systems. The query optimizer is completely transparent to applications and end users: applications generate complicated SQL, and the optimizer rewrites these queries into forms that are equivalent to the original but perform better. The query optimizer uses a cost-based strategy. In this strategy, multiple execution plans are generated for a given query, a cost is estimated for each plan, and the plan with the lowest estimated cost is chosen.
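The cost-based strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Plan` class, the two candidate plans, their resource estimates, and the weights are all hypothetical and are not taken from Oracle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    io_operations: int  # estimated disk reads and writes
    cpu_units: int      # estimated CPU work, in arbitrary units


# Hypothetical weights: one I/O is assumed to cost as much as 100 CPU units.
IO_WEIGHT = 100
CPU_WEIGHT = 1


def estimated_cost(plan: Plan) -> int:
    """Combine the resource estimates into a single comparable number."""
    return IO_WEIGHT * plan.io_operations + CPU_WEIGHT * plan.cpu_units


def choose_plan(plans: list[Plan]) -> Plan:
    """Return the candidate plan with the lowest estimated cost."""
    return min(plans, key=estimated_cost)


# Two made-up candidate plans for the same query.
candidates = [
    Plan("full table scan", io_operations=1000, cpu_units=5000),
    Plan("index range scan", io_operations=40, cpu_units=800),
]
best = choose_plan(candidates)
print(best.name, estimated_cost(best))
```

With these made-up numbers the index range scan is chosen, at an estimated cost of 4800 against 105000 for the full scan. A real optimizer derives its estimates from statistics rather than from fixed constants.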
Main steps in processing a high-level query
A query can be processed in one of two ways. In the first, it follows the steps in the diagram every time it is run. In the second, the first three steps are performed only once; the code they produce is stored in the database and is later executed by the query processor. The query parser parses a high-level query and translates it into an internal form such as a relational algebra expression. The parser checks the syntax of the whole query and also checks its semantics. A parse tree of the query is created and translated into a relational algebra expression.
The relational algebra expression for a query, together with instructions on how to evaluate each operation, describes how the query will be executed. For example, consider the query:
SELECT salary FROM STAFF WHERE salary<=40000;
A relational algebra operation can be executed using various strategies or algorithms. For example, to implement the preceding selection, we can do a linear search of the STAFF file to retrieve the tuples with salary<=40000. However, if an index is available on the salary attribute, we can use the index to locate the tuples. To specify how a query is to be evaluated, the system constructs a query execution plan built from relational algebra operations. The process of choosing a suitable query execution plan is known as query optimization, and it is performed by the query optimizer.
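The two ways of evaluating this selection can be sketched as follows. The rows are invented for illustration, and a sorted Python list stands in for an index on the salary attribute; a real DBMS would typically use a B-tree.

```python
from bisect import bisect_right

# Hypothetical STAFF rows: (staff_no, salary).
staff = [(1, 52000), (2, 38000), (3, 40000), (4, 61000), (5, 29000)]


def linear_search(rows, limit):
    """Examine every row and keep the salaries at or below the limit."""
    return sorted(salary for _, salary in rows if salary <= limit)


# A sorted list of salaries stands in for an index on the salary attribute.
salary_index = sorted(salary for _, salary in staff)


def index_search(index, limit):
    """Binary-search for the cut-off, then read only the qualifying entries."""
    return index[:bisect_right(index, limit)]


print(linear_search(staff, 40000))
print(index_search(salary_index, 40000))
```

Both functions return the same salaries, but the linear search touches every row, while the index search finds the cut-off in a logarithmic number of steps and reads only the matching entries.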
TYPES OF OPTIMIZER
Oracle has supported two query optimizers: the rule-based optimizer and the cost-based optimizer. In Oracle 10g, the rule-based optimizer is desupported. The rule-based optimizer works according to fixed rules; for example, if an index exists, it is used for access regardless of the values in the WHERE condition. With the rule-based optimizer, the strategy for processing a SQL statement is decided at parse time. The cost-based optimizer determines the best strategy with the help of statistical information about the sizes of tables and columns. It estimates a cost for each of the access options and chooses the best strategy depending on the values specified in the WHERE condition. Therefore, the final search strategy can only be determined at execution time. Oracle introduced the cost-based optimizer in Oracle 7; it selects the strategy that requires the minimum resources to process all the rows accessed by the query. The cost-based optimizer also takes into account any hints that the user provides.
SELECT S.FNAME, S.DEPTNO, B.DEPTNO, B.LNAME FROM STAFF S, BRANCH B WHERE S.DEPTNO=B.DEPTNO;
A SQL statement such as this one goes through three phases as it is parsed and executed.
Parse- In the first phase, the SQL query is checked for syntax (is the query syntactically correct?) and for semantics (do the tables and columns exist, and do you have permission to access the objects?), and a query processor tree is created that defines the logical steps needed to execute the SQL. All of this is done in the library cache of the SGA. This step is also called the 'algebrizer'.
Optimize- The optimizer is used in the next phase. It takes data statistics, such as how many rows a table has, how many unique values it contains, and whether the table spans more than one page. In other words, it takes data about the data. Using these statistics and the query processor tree, it prepares a cost-based plan that accounts for resources such as CPU and I/O. The optimizer generates and evaluates many plans and chooses the best one. The plan that comes out of the optimizer is the estimated plan; the actual plan is the one produced once the query is actually executed.
Execute- The final phase is to execute the plan sent by the optimizer.
Query optimization by Oracle
The Oracle optimizer is one of the most successful optimizers on the market. The cost-based optimizer was introduced in 1992 with Oracle 7 and has been improved in the years since through experience with real-world customers. Its quality was not developed in a laboratory or from theoretical assumptions alone; it was developed by adapting to actual customers' requirements. More database applications use Oracle's query optimizer than any other, so it has repeatedly benefited from real-world application input. The Oracle optimizer has four major components: SQL transformations, execution plan selection, the cost model and statistics, and dynamic runtime optimization. Each is introduced briefly below.
The first component is SQL transformation. During query optimization, Oracle uses different techniques to transform SQL statements. The basic aim is to convert the original SQL statement into a semantically equivalent SQL statement that can be processed more efficiently. These transformations are entirely transparent to end users and applications: they are not requested by the user or the application but occur automatically during query optimization. Oracle implements a range of SQL transformations, which fall mainly into two categories.
1. Heuristic Query Transformation/Heuristic Rules Techniques.
2. Cost Based Query Transformation/Cost-based Query Optimization Techniques.
A heuristic rule is a rule that works well in most cases, but not in all of them. Heuristic rules determine the order in which the operations in a query are executed, using transformation rules to convert a relational algebra expression into an equivalent form. Oracle applies a heuristic query transformation where it knows that the transformation will not degrade performance.
A simple example of view merging
CREATE VIEW TEMP_VIEW AS
SELECT EMP.EMPNAME, DEPT.DEPTNAME, EMP.SALARY FROM EMPLOYEE EMP, DEPARTMENT DEPT;
SELECT EMPNAME, DEPTNAME FROM TEMP_VIEW WHERE SALARY > 4000;
If no query transformation is applied, the database joins all the rows of the EMPLOYEE table with all the rows of the DEPARTMENT table and then selects the rows that satisfy the condition on salary.
After the transformation, the query looks like this:
SELECT EMP.EMPNAME, DEPT.DEPTNAME FROM EMPLOYEE EMP, DEPARTMENT DEPT WHERE EMP.SALARY > 4000;
With the transformation, the condition salary > 4000 can be applied before the EMPLOYEE and DEPARTMENT tables are joined. This is the better approach because it reduces the number of records to be joined. It is a very simple example of the value of query transformation.
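The benefit of applying the salary condition before the join can be illustrated with a small sketch. The rows below are invented, and the join on a department number (`dept_no`) is an assumption made only for this illustration, since the example above does not state a join column.

```python
# Hypothetical rows: employees are (name, dept_no, salary),
# departments are (dept_no, dept_name). The dept_no join column is assumed.
employees = [("Ann", 10, 5200), ("Bob", 20, 3100), ("Cy", 10, 3900), ("Dee", 20, 4500)]
departments = [(10, "Sales"), (20, "Research")]


def nested_loop_join(emps, depts):
    """Join on dept_no and also report how many row pairs were compared."""
    pairs_compared = 0
    joined = []
    for name, emp_dept, salary in emps:
        for dept_no, dept_name in depts:
            pairs_compared += 1
            if emp_dept == dept_no:
                joined.append((name, dept_name, salary))
    return joined, pairs_compared


# Without the transformation: join everything, then filter on salary.
joined_all, cost_join_first = nested_loop_join(employees, departments)
result_join_first = [(n, d) for n, d, s in joined_all if s > 4000]

# With the transformation: filter on salary, then join the smaller input.
well_paid = [e for e in employees if e[2] > 4000]
joined_filtered, cost_filter_first = nested_loop_join(well_paid, departments)
result_filter_first = [(n, d) for n, d, _ in joined_filtered]

print(result_join_first == result_filter_first, cost_join_first, cost_filter_first)
```

Both orderings return the same rows, but filtering first compares 4 row pairs instead of 8. The saving grows with the size of the tables and the selectivity of the condition.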
Execution Plan Selection:- For every SQL statement, the optimizer selects an execution plan, which can be viewed using Oracle's EXPLAIN PLAN statement or the V$SQL_PLAN view. The execution plan describes all the steps performed when the SQL is processed, including how tables are joined together and whether tables are accessed through an index. The optimizer considers many execution plans and chooses the best one for each SQL statement.
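To see plan selection in action without an Oracle installation, the sketch below uses SQLite (through Python's standard sqlite3 module) as a stand-in. SQLite's EXPLAIN QUERY PLAN is not Oracle's EXPLAIN PLAN, and its output format differs between versions; the table, the data, and the index name `idx_salary` are assumptions for the illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff (staff_no, salary) VALUES (?, ?)",
    [(n, 20000 + 500 * n) for n in range(1, 101)],
)

QUERY = "SELECT salary FROM staff WHERE salary <= 40000"


def plan_for(sql):
    """Return the descriptive last column of SQLite's EXPLAIN QUERY PLAN rows."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# The same query is planned before and after an index exists on salary.
plan_without_index = plan_for(QUERY)
conn.execute("CREATE INDEX idx_salary ON staff (salary)")
plan_with_index = plan_for(QUERY)

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a scan of the table, while the second mentions the new index, which shows the optimizer changing its choice once a cheaper access path becomes available.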
Cost model and Statistics:- Oracle's optimizer estimates the cost of the individual operations that make up the execution of a SQL statement. To choose the best execution plan, the optimizer needs the best possible cost estimates. These estimates are based on the I/O and memory resources needed to process the query, on statistical information about tables and indexes, and on performance information about the hardware server platform. The process for gathering these statistics and this performance information needs to be both highly efficient and automatic.
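As a rough illustration of how statistics feed a cost estimate, the sketch below uses the common textbook simplification that the values of a column are spread evenly across its rows. This is an assumption for illustration, not a description of Oracle's actual cost model, and the BRANCH statistics shown are hypothetical.

```python
def estimate_rows(num_rows: int, num_distinct: int) -> float:
    """Estimate rows matching `column = value`, assuming values are spread evenly."""
    if num_distinct <= 0:
        raise ValueError("num_distinct must be positive")
    return num_rows / num_distinct


# Hypothetical statistics for a BRANCH table: 50 rows over 10 distinct cities.
branch_stats = {"num_rows": 50, "distinct_city": 10}
estimated = estimate_rows(branch_stats["num_rows"], branch_stats["distinct_city"])
print(estimated)
```

With these figures the optimizer would expect about 5 rows for a condition such as city = 'London'. If the statistics are stale, the estimate and therefore the chosen plan can be badly wrong, which is why gathering them needs to be automatic.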
Dynamic runtime optimization:- Not every aspect of SQL execution can be planned in advance. Oracle therefore makes dynamic adjustments to its processing strategies at runtime, so that queries perform well in the applications that use them.
Cost-based rules techniques
The aim of query optimization is to find the most efficient way to process a query. According to Connolly and Begg (p. 620), cost-based rules support this choice by estimating the cost of each of the possible options and selecting the one with the lowest processing cost. Query optimizers are usually cost-based. The cost-based optimizer relies on statistics for all the tables, clusters, and indexes referenced in the query. Keeping these statistics accurate and up to date is the user's responsibility. In PL/SQL, the statistics for tables, indexes, and other schema objects are gathered with the DBMS_STATS package. Oracle can gather these statistics in parallel whenever they are required, but index statistics are gathered serially.
Example: we are searching for all managers who work in a London branch:
SELECT * FROM STAFF S, BRANCH B
WHERE S.BRANCHNO = B.BRANCHNO AND (S.POSITION = 'MANAGER' AND B.CITY = 'LONDON');
We now look at the different possible strategies for answering this query and compare their costs. Three equivalent relational algebra expressions are:
(1) σ (position='manager') ∧ (city='London') ∧ (staff.branchno=branch.branchno) (STAFF × BRANCH)
(2) σ (position='manager') ∧ (city='London') (STAFF ⋈ staff.branchno=branch.branchno BRANCH)
(3) (σ position='manager' (STAFF)) ⋈ staff.branchno=branch.branchno (σ city='London' (BRANCH))
In this example there are two tables, Staff and Branch. We assume that there are 1000 records in the Staff table and 50 records in the Branch table, that there are 50 managers (one for each branch), and that there are 5 London branches. We compare the three strategies by the number of disk accesses each one needs to answer the query. We also assume that there are no indexes or sort keys on either relation, that the results of each intermediate operation are stored on disk, and that records are accessed one at a time.
The first strategy computes the Cartesian product of the Staff and Branch tables. This needs (1000 + 50) = 1050 disk accesses to read the relations and creates a relation with (1000 * 50) = 50000 tuples, each of which must be written to disk. Each of these tuples must then be read again and tested against the selection condition, at the cost of one more disk access per tuple. The total cost of this strategy is:
(1000 + 50) + 2*(1000 * 50) = 101050
The second strategy joins the Staff and Branch tables, which needs (1000 + 50) disk accesses to read the relations and produces a relation with 1000 tuples, one for each member of staff. Writing this result to disk and reading it back for the selection, which keeps only the managers at London branches, costs 2*1000 disk accesses. The total cost of this strategy is:
2*1000 + (1000 + 50) = 3050
The third strategy significantly reduces the size of the relations being joined. It first reads all the records in the Staff table to select the manager records, which needs 1000 disk accesses and produces 50 records. It then reads all the records in the Branch table to select the London branches, which needs 50 disk accesses and produces 5 records. Finally, it joins the reduced Staff and Branch relations, which needs (50 + 5) disk accesses. The total cost of this strategy is:
1000 + 2*50 + 5 + (50 + 5) = 1160
The totals make clear which plan is best. The third strategy needs 1160 disk accesses, compared with 3050 for the second and 101050 for the first, so applying the selections before the join is about 87 times cheaper than starting with the Cartesian product.
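The arithmetic for the three strategies can be checked with a short script. It uses only the figures and assumptions stated in the example (1000 staff, 50 branches, 50 managers, 5 London branches, intermediate results written to disk); the variable names are chosen here for illustration.

```python
STAFF_TUPLES = 1000     # tuples in Staff
BRANCH_TUPLES = 50      # tuples in Branch
MANAGERS = 50           # one manager for each branch
LONDON_BRANCHES = 5

read_both = STAFF_TUPLES + BRANCH_TUPLES

# Strategy 1: Cartesian product, then selection.
product_size = STAFF_TUPLES * BRANCH_TUPLES
cost_1 = read_both + 2 * product_size       # write the product, then read it back

# Strategy 2: join, then selection.
join_size = STAFF_TUPLES                    # one joined tuple per member of staff
cost_2 = 2 * join_size + read_both          # write the join result, then read it back

# Strategy 3: selections first, then join.
cost_3 = (
    STAFF_TUPLES + MANAGERS                 # read Staff, write the manager tuples
    + BRANCH_TUPLES + LONDON_BRANCHES       # read Branch, write the London branches
    + (MANAGERS + LONDON_BRANCHES)          # read both reduced relations for the join
)

print(cost_1, cost_2, cost_3)
print(round(cost_1 / cost_3))
```

The script prints 101050, 3050 and 1160 disk accesses, and a ratio of about 87 between the first and third strategies.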
Query optimization is an important component in achieving good performance. This report has examined query optimization and focused on the techniques most commonly used by commercial systems in their various modules. Many of the strategies discussed in this project have not yet found their way into practical systems, but they could certainly do so in the future. We believe that the next twenty years will be as active as the previous twenty and will bring many advances in query optimization technology, changing many of the approaches currently in use. In addition, Oracle will continue to provide its customers with leading performance and manageability.