Overview of On Line Analytical Processing

OLAP is an acronym for On Line Analytical Processing. OLAP performs multidimensional analysis of business data and provides the capability for complex calculations, trend analysis, and sophisticated data modeling. It is quickly becoming a foundation for Intelligent Solutions including Business Performance Management, Planning, Budgeting, Forecasting, Financial Reporting, Analysis, Simulation Models, Knowledge Discovery, and Data Warehouse Reporting. OLAP enables end users to perform ad hoc analysis of data in multiple dimensions, thereby providing the insight and understanding they need for better decision making.[1]


Knowledge is the foundation of all successful decisions. Successful businesses continuously plan, analyze and report on sales and operational activities in order to maximize efficiency, reduce expenditures and gain greater market share. Statisticians will tell you that the larger the sample, the more reliable the resulting statistic tends to be. Likewise, the more data a company can access about a specific activity, the more likely it is that a plan to improve that activity will be effective. All businesses collect data using many different systems, and the challenge remains how to bring all of that data together to produce accurate, reliable and timely information about the business. A company that can turn reliable information into shared knowledge, accurately and quickly, is better positioned to make successful business decisions and rise above the competition.[1]

The Main Benefits of OLAP

One main benefit of OLAP is consistency of calculations. No matter how fast data is processed through OLAP software or servers, the resulting reports follow a consistent layout, so executives always know where to find what they are looking for. This is especially helpful when comparing information from previous reports with information in new reports and in projections. "What if" scenarios are among the most popular uses of OLAP software, and multidimensional processing makes them far more practical. Another benefit of multidimensional data presentation is that it allows a manager to pull data from an OLAP database in broad or specific terms. In other words, reporting can be as simple as comparing a few lines of data in one column of a spreadsheet or as complex as viewing all aspects of a large body of data. Multidimensional presentation can also reveal relationships that were not previously recognized. All of this can be done with rapid response times.[1]

The OLAP Cube

OLAP (online analytical processing) cubes can be thought of as extensions of the two-dimensional array of a spreadsheet. For example, a company might wish to analyze some financial data by product, by time period, by city, by type of revenue and cost, and by comparing actual data with a budget. These additional ways of analyzing the data are known as dimensions.[2]
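The spreadsheet analogy can be made concrete with a minimal sketch. It assumes a cube stored as a plain Python dictionary keyed by one member per dimension; the products, periods, cities and figures are hypothetical and serve only as an illustration, not as how any particular OLAP product stores data.

```python
# Hypothetical revenue figures keyed by (product, period, city).
cube = {
    ("Widget", "2023-Q1", "Paris"): 120.0,
    ("Widget", "2023-Q2", "Paris"): 135.0,
    ("Gadget", "2023-Q1", "Paris"): 80.0,
    ("Gadget", "2023-Q1", "Lyon"): 60.0,
}


def spreadsheet_view(cube, city):
    """Fix the city dimension and return a product-by-period table."""
    table = {}
    for (product, period, cell_city), value in cube.items():
        if cell_city == city:
            # Rows are products, columns are periods.
            table.setdefault(product, {})[period] = value
    return table


paris = spreadsheet_view(cube, "Paris")
```

Fixing the third dimension leaves an ordinary two-dimensional table, which is the sense in which a cube extends a spreadsheet.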

The OLAP Cube Functionalities

The OLAP cube consists of numeric facts called measures, which are categorized by dimensions. The cube metadata (structure) may be created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables. OLAP functionality is characterized by dynamic multi-dimensional analysis of consolidated enterprise data supporting end user analytical and navigational activities including:

calculations and modeling applied across dimensions, through hierarchies and/or across members

trend analysis over sequential time periods

slicing subsets for on-screen viewing

drill-down to deeper levels of consolidation

reach-through to underlying detail data

rotation to new dimensional comparisons in the viewing area [3]
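As a rough illustration of how measures come from a fact table and dimensions from dimension tables, the sketch below aggregates a small fact table into a cube. The table contents, key values and the build_cube helper are all hypothetical; a real OLAP server would do this with its own storage engine.

```python
from collections import defaultdict

# Hypothetical star schema: one fact table plus two dimension tables.
product_dim = {1: "Widget", 2: "Gadget"}
city_dim = {10: "Paris", 20: "Lyon"}
fact_rows = [
    # (product_id, city_id, sales_amount)
    (1, 10, 100.0),
    (1, 10, 20.0),
    (2, 10, 80.0),
    (2, 20, 60.0),
]


def build_cube(fact_rows, product_dim, city_dim):
    """Sum the sales measure for each (product, city) member pair."""
    cube = defaultdict(float)
    for product_id, city_id, amount in fact_rows:
        # Dimension tables translate foreign keys into member names.
        key = (product_dim[product_id], city_dim[city_id])
        cube[key] += amount
    return dict(cube)


cube = build_cube(fact_rows, product_dim, city_dim)
```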

The OLAP Server

An OLAP server is a high-capacity, multi-user data manipulation engine specifically designed to support and operate on multi-dimensional data structures. A multi-dimensional structure is arranged so that every data item is located and accessed based on the intersection of the dimension members which define that item. The design of the server and the structure of the data are optimized for rapid ad-hoc information retrieval in any orientation, as well as for fast, flexible calculation and transformation of raw data based on formulaic relationships. The OLAP Server may either physically stage the processed multi-dimensional information to deliver consistent and rapid response times to end users, or it may populate its data structures in real-time from relational or other databases, or offer a choice of both. Given the current state of technology and the end user requirement for consistent and rapid response times, staging the multi-dimensional data in the OLAP Server is often the preferred method.[3]

1. Analysis, Multi-dimensional

The objective of multi-dimensional analysis is for end users to gain insight into the meaning contained in databases. The multi-dimensional approach to analysis aligns the data content with the analyst's mental model, hence reducing confusion and lowering the incidence of erroneous interpretations. It also makes it easier to navigate the database, screen for a particular subset of data, ask for the data in a particular orientation and define analytical calculations. Furthermore, because the data is physically stored in a multi-dimensional structure, these operations are many times faster and more consistent than is possible in other database structures. This combination of simplicity and speed is one of the key benefits of multi-dimensional analysis.[3]

2. Array, Multi-dimensional

A multi-dimensional array is a group of data cells arranged by the dimensions of the data. For example, a spreadsheet exemplifies a two-dimensional array with the data cells arranged in rows and columns, each being a dimension. A three-dimensional array can be visualized as a cube with each dimension forming a side of the cube, including any slice parallel with that side. Higher dimensional arrays have no physical metaphor, but they organize the data in the way users think of their enterprise. Typical enterprise dimensions are time, measures, products, geographical regions, sales channels, etc.[3]

3. Calculated member

A calculated member is a member of a dimension whose value is determined from other members' values (e.g., by application of a mathematical or logical operation). Calculated members may be part of the OLAP server database or may have been specified by the user during an interactive session. A calculated member is any member that is not an input member.[3]
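A small sketch may help here. It assumes a measures dimension with two input members, Revenue and Cost, and a hypothetical calculated member Margin defined as their difference; the names and figures are illustrative only.

```python
# Input members: values entered or loaded directly (hypothetical figures).
inputs = {
    ("Revenue", "Q1"): 500.0,
    ("Cost", "Q1"): 320.0,
}


def margin(cells, period):
    """Calculated member: Margin = Revenue - Cost for the given period."""
    return cells[("Revenue", period)] - cells[("Cost", period)]


q1_margin = margin(inputs, "Q1")
```

Margin never appears in the input data; its value exists only through the operation applied to the other members.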

4. Cell

A single datapoint that occurs at the intersection defined by selecting one member from each dimension in a multi-dimensional array.[3]

5. Consolidate

Multi-dimensional databases generally have hierarchies or formula-based relationships of data within each dimension. Consolidation involves computing all of these data relationships for one or more dimensions, for example, adding up all Departments to get Total Division data. While such relationships are normally summations, any type of computational relationship or formula might be defined.[3]
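The Departments-to-Division example can be sketched as a recursive roll-up over a hierarchy. The department names, values and the consolidate helper are assumptions made for illustration; the combine parameter reflects the point that the relationship need not be a summation.

```python
# Hypothetical hierarchy within one dimension: parent -> children.
hierarchy = {"Total Division": ["Dept A", "Dept B", "Dept C"]}
values = {"Dept A": 10.0, "Dept B": 15.0, "Dept C": 5.0}


def consolidate(member, hierarchy, values, combine=sum):
    """Compute a member's value from its children, recursively."""
    children = hierarchy.get(member)
    if children is None:
        # Leaf member: return the stored input value.
        return values[member]
    return combine(
        consolidate(child, hierarchy, values, combine) for child in children
    )


division_total = consolidate("Total Division", hierarchy, values)
division_max = consolidate("Total Division", hierarchy, values, combine=max)
```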

6. Dense

A multi-dimensional database is dense if a relatively high percentage of the possible combinations of its dimension members contain data values. This is the opposite of sparse.[3]

7. Derived data

Derived data is produced by applying calculations to input data at the time the request for that data is made, i.e., the data has not been pre-computed and stored in the database. The purpose of using derived data is to save storage space and up-front calculation time, particularly for calculated data that may be infrequently called for or that is susceptible to a high degree of interactive personalization by the user. The tradeoff is slower retrievals.[3]

8. Dimension

A dimension is a structural attribute of a cube that is a list of members, all of which are of a similar type in the user's perception of the data. For example, all months, quarters, years, etc., make up a time dimension; likewise all cities, regions, countries, etc., make up a geography dimension. A dimension acts as an index for identifying values within a multi-dimensional array. If one member of a dimension is selected, then the remaining dimensions, in which a range of members (or all members) are selected, define a sub-cube. If all but two dimensions have a single member selected, the remaining two dimensions define a spreadsheet (or a "slice" or a "page"). If all dimensions have a single member selected, then a single cell is defined. Dimensions offer a very concise, intuitive way of organizing and selecting data for retrieval, exploration and analysis.[3]
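The way dimensions act as an index can be sketched as follows. The three dimension names, the members and the select helper are hypothetical; with three dimensions, fixing one member leaves a two-dimensional page, and fixing all three leaves a single cell.

```python
DIMENSIONS = ("time", "geography", "product")

# Hypothetical cube: one member per dimension identifies each cell.
cube = {
    ("Jan", "Paris", "Widget"): 10,
    ("Jan", "Lyon", "Widget"): 4,
    ("Feb", "Paris", "Widget"): 12,
    ("Feb", "Paris", "Gadget"): 7,
}


def select(cube, **fixed):
    """Keep the cells whose members match every fixed dimension."""
    positions = {DIMENSIONS.index(name): member for name, member in fixed.items()}
    return {
        key: value
        for key, value in cube.items()
        if all(key[i] == member for i, member in positions.items())
    }


# One dimension fixed: a page over time and geography.
page = select(cube, product="Widget")
# Every dimension fixed: a single cell.
cell = select(cube, time="Jan", geography="Lyon", product="Widget")
```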

9. Drill Down / Up

Drilling down or up is a specific analytical technique whereby the user navigates among levels of data ranging from the most summarized (up) to the most detailed (down). The drilling paths may be defined by the hierarchies within dimensions or other relationships that may be dynamic within or between dimensions.[3]
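A drilling path defined by a hierarchy can be sketched with a simple child-to-parent mapping. The time hierarchy and the two helper functions below are assumptions for illustration only.

```python
# Hypothetical time hierarchy: child member -> parent member.
parent_of = {
    "Jan": "Q1",
    "Feb": "Q1",
    "Mar": "Q1",
    "Q1": "2023",
}


def drill_up(member):
    """Move to the more summarized level, or None at the top."""
    return parent_of.get(member)


def drill_down(member):
    """Move to the more detailed level: list the member's children."""
    return [child for child, parent in parent_of.items() if parent == member]


months = drill_down("Q1")
year = drill_up("Q1")
```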

10. Formula

A formula is a database object, which is a calculation, rule or other expression for manipulating the data within a multi-dimensional database. Formulae define relationships among members. Formulae are used by OLAP database builders to provide great richness of content to the server database. Formulae are used by end users to model enterprise relationships and to personalize the data for greater visualization and insight.[3]

11. Member, Dimension

A dimension member is a discrete name or identifier used to identify a data item's position and description within a dimension.[3]

12. Member Combination

A member combination is an exact description of a unique cell in a multi-dimensional array, consisting of a specific member selection in each dimension of the array.[3]

13. Missing Value, Missing Data

A special data item which indicates that the data in this cell does not exist. This may be because the member combination is not meaningful or has never been entered.[3]

14. Multi-dimensional Query Language

A computer language that allows one to specify which data to retrieve out of a cube. The user process for this type of query is usually called slicing and dicing. The result of a multi-dimensional query is either a cell, a two-dimensional slice, or a multi-dimensional sub-cube.[3]

15. Navigation

Navigation is a term used to describe the processes employed by users to explore a cube interactively by drilling, rotating and screening, usually using a graphical OLAP client connected to an OLAP server.[3]

16. Nesting (of Multidimensional Columns and Rows)

Nesting is a display technique used to show the results of a multi-dimensional query that returns a sub-cube, i.e., more than a two-dimensional slice or page. The column/row labels will display the extra dimensionality of the output by nesting the labels describing the members of each dimension.[3]

17. OLAP Client

End user applications that can request slices from OLAP servers and provide two-dimensional or multi-dimensional displays, user modifications, selections, ranking, calculations, etc., for visualization and navigation purposes. OLAP clients may be as simple as a spreadsheet program retrieving a slice for further work by a spreadsheet-literate user or as high-functioned as a financial modeling or sales analysis application.[3]

18. Page Dimension

A page dimension is generally used to describe a dimension which is not one of the two dimensions of the page being displayed, but for which a member has been selected to define the specific page requested for display. All page dimensions must have a specific member chosen in order to define the appropriate page for display.[3]

19. Page Display

The page display is the current orientation for viewing a multi-dimensional slice. The horizontal dimension(s) run across the display, defining the column dimension(s). The vertical dimension(s) run down the display, defining the contents of the row dimension(s). The page dimension-member selections define which page is currently displayed. A page is much like a spreadsheet, and may in fact have been delivered to a spreadsheet product where each cell can be further modified by the user.[3]

20. Pre-consolidated, Pre-calculated Data

Pre-calculated data is data in output member cells that are computed prior to, and in anticipation of, ad-hoc requests. Pre-calculation usually results in faster response to queries at the expense of storage. Data that is not pre-calculated must be calculated at query time.[3]
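The contrast between pre-calculated data and data calculated at query time can be sketched as follows. The member names, figures and the query helper are hypothetical: Total is computed once and stored, while Average is kept only as a formula and evaluated on request.

```python
# Hypothetical input data for two leaf members.
leaf_sales = {"Dept A": 10.0, "Dept B": 15.0}

# Pre-calculated: computed once at load time and stored alongside the inputs.
stored = dict(leaf_sales)
stored["Total"] = sum(leaf_sales.values())

# Not pre-calculated: held as a formula and evaluated only when requested.
derived = {
    "Average": lambda: stored["Total"] / len(leaf_sales),
}


def query(member):
    """Return (value, source) so the caller can see how it was obtained."""
    if member in stored:
        return stored[member], "stored"
    return derived[member](), "derived"


total = query("Total")
average = query("Average")
```

Storing Total costs space but makes the lookup immediate; Average costs nothing to store but is recomputed on every request.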

21. Reach through

Reach through is a means of extending the data accessible to the end user beyond that which is stored in the OLAP server. A reach through is performed when the OLAP server recognizes that it needs additional data and automatically queries and retrieves the data from a data warehouse or OLTP system.[3]
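One way to picture reach through is a lookup that falls back to detail records when the OLAP store has no value. The two in-memory structures and the get_amount helper below are hypothetical stand-ins for an OLAP server and a data warehouse, not a real API.

```python
# Hypothetical OLAP store holding staged values.
olap_store = {("Widget", "Q1"): 120.0}

# Hypothetical warehouse detail rows that the OLAP store does not hold.
warehouse_rows = [
    {"product": "Gadget", "period": "Q1", "amount": 20.0},
    {"product": "Gadget", "period": "Q1", "amount": 25.0},
]


def get_amount(product, period):
    """Answer from the OLAP store, reaching through to the warehouse if needed."""
    key = (product, period)
    if key in olap_store:
        return olap_store[key], "olap"
    # Reach through: query the underlying detail data automatically.
    rows = [r for r in warehouse_rows if (r["product"], r["period"]) == key]
    return sum(r["amount"] for r in rows), "warehouse"


staged = get_amount("Widget", "Q1")
reached = get_amount("Gadget", "Q1")
```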

22. Rotate

To change the dimensional orientation of a report or page display. For example, rotating may consist of swapping the rows and columns, or moving one of the row dimensions into the column dimension, or swapping an off-spreadsheet dimension with one of the dimensions in the page display.[3]
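The simplest rotation, swapping rows and columns, can be sketched as a transpose of the displayed page. The page layout used here (row labels, column labels and a list of cell rows) is an assumed representation for illustration.

```python
# Hypothetical page: products down the rows, quarters across the columns.
page = {
    "rows": ["Widget", "Gadget"],
    "columns": ["Q1", "Q2"],
    "cells": [
        [1, 2],
        [3, 4],
    ],
}


def rotate(page):
    """Swap the row and column dimensions of a two-dimensional page."""
    return {
        "rows": page["columns"],
        "columns": page["rows"],
        # zip(*cells) turns each original column into a row.
        "cells": [list(column) for column in zip(*page["cells"])],
    }


rotated = rotate(page)
```

Rotating twice returns the original orientation, since only the presentation changes and not the underlying data.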

23. Scoping

Restricting the view of database objects to a specified subset.

24. Selection

A selection is a process whereby a criterion is evaluated against the data or members of a dimension in order to restrict the set of data retrieved.[3]
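A selection can be sketched as a filter over the members of one dimension. The cities, sales figures, threshold and select_members helper are hypothetical.

```python
# Hypothetical sales value per member of a geography dimension.
sales_by_city = {"Paris": 120.0, "Lyon": 60.0, "Nice": 15.0}


def select_members(values, criterion):
    """Return the members for which the criterion holds."""
    return [member for member, value in values.items() if criterion(member, value)]


# Criterion evaluated against the data: keep cities with sales of at least 50.
large_cities = select_members(sales_by_city, lambda member, value: value >= 50)
# Criterion evaluated against the members themselves.
n_cities = select_members(sales_by_city, lambda member, value: member.startswith("N"))
```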

25. Slice

A slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset.[3]

26. Slice and Dice

The user-initiated process of navigating by calling for page displays interactively, through the specification of slices via rotations and drill down/up.[3]

27. Sparse

A multi-dimensional data set is sparse if a relatively high percentage of the possible combinations (intersections) of the members from the data set's dimensions contain missing data. The total possible number of intersections can be computed by multiplying together the number of members in each dimension. Data sets containing one percent, 0.01 percent, or even smaller percentages of the possible data exist and are quite common.[3]
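The arithmetic behind this definition is short enough to show directly. The member counts and the number of populated cells below are hypothetical figures chosen for illustration.

```python
from math import prod

# Hypothetical number of members in each dimension.
members_per_dimension = {"time": 12, "product": 50, "city": 20}

# Hypothetical count of intersections that actually contain data.
populated_cells = 600

# Total possible intersections: the product of the member counts.
total_cells = prod(members_per_dimension.values())

# Fraction of possible intersections that hold data.
density = populated_cells / total_cells
```

Here 12 x 50 x 20 gives 12,000 possible intersections, so 600 populated cells means only 5 percent of the cube holds data and the remaining 95 percent is missing.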