The Data Warehouse Components Computer Science Essay



A data warehouse is a set of resources for accessing and retrieving the information an organization holds in its stored data files, and it is designed for reporting and data analysis. Retrieval and analysis depend on the means to extract, transform and load data and to manage the data dictionary, which are also considered essential components of data warehousing. A full definition of data warehousing therefore includes business intelligence tools, tools to extract, transform and load data into the repository, and tools to manage and retrieve metadata. Data warehousing helps meet an organization's need for reliable, consolidated, unique and integrated analysis and reporting of its data at different levels of aggregation.

Data warehousing is an essential element of decision support. It aims to ensure that the people who make decisions have accurate information quickly enough to meet daily business needs. To do this it supplies a decisional database that contains metadata and allows simultaneous access from, and communication between, the various functional areas of the data warehouse. An ETL tool is required to complete the warehousing process.

1.2) Data Warehouse Components:

The construction of a data warehouse is divided into two stages, known as the back room and the front room. The first builds up the warehouse database. The second provides the restitution of data from the data marts in order to fulfil analysts' demands. According to the standard data warehouse architecture, a data warehouse system is composed of:

ETL or Warehousing tools

Restitution Tools

Meta Data

1.3) ETL Tools

ETL stands for extract, transform, and load. These processes are used in database work, especially in data warehousing. The tool that performs them most accurately, quickly and conveniently is the best among the contenders. The three processes are:

Extract data from outside sources

Transform the data to fit the company's needs (this step can also manage data quality levels)

Load the data into the specified target

Loading data into the data warehouse is generally the most time-consuming job needed to make data warehousing, reporting and business intelligence decisions a success. Extracting data into the data warehouse involves several steps.

An ETL architecture involves the following choices:

Data mapping

Extracting data to a staging area

Cleansing transformations

Consistency transformations

Data loading
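The five choices listed above can be pictured as small functions chained together. The following is only a minimal sketch in Python: the column names, the COLUMN_MAP dictionary and every function name are hypothetical, the order of the stages is an assumption, and a plain list stands in for the target table.

```python
# Hypothetical source-to-target column mapping (the data mapping choice).
COLUMN_MAP = {"cust_nm": "customer_name", "amt": "amount", "ctry": "country"}


def stage(rows):
    """Staging: copy the rows so later steps never touch the source."""
    return [dict(row) for row in rows]


def map_columns(rows):
    """Data mapping: rename source columns to warehouse column names."""
    return [{COLUMN_MAP.get(k, k): v for k, v in row.items()} for row in rows]


def cleanse(rows):
    """Cleansing: trim whitespace and drop rows that have no amount."""
    cleaned = []
    for row in rows:
        row = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        if row.get("amount") not in (None, ""):
            cleaned.append(row)
    return cleaned


def make_consistent(rows):
    """Consistency: enforce one type and one spelling per column."""
    return [
        {**row, "amount": float(row["amount"]), "country": row["country"].upper()}
        for row in rows
    ]


def load(rows, target):
    """Loading: append the finished rows to the target (a list in this sketch)."""
    target.extend(rows)
    return len(rows)


source_rows = [
    {"cust_nm": " Alice ", "amt": "10.50", "ctry": "lk"},
    {"cust_nm": "Bob", "amt": "", "ctry": "uk"},  # no amount, dropped by cleanse
]
warehouse_table = []
staged = stage(source_rows)
loaded = load(make_consistent(cleanse(map_columns(staged))), warehouse_table)
```

Keeping each stage as its own function means a stage can be replaced or tested without touching the others, which is one reason ETL tools tend to model the process as separate steps.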

The extract, transformation and loading process includes a number of steps:

1.3.1) Extract

Reading data from the source systems is the first part of the ETL process. In most data warehousing projects the data is taken from several different source systems. These separate systems may organize their data differently and use different data formats. Common source formats are relational databases and flat files.

The extracted data can be streamed from the data source and loaded on the fly into the destination database. This is one way of performing extract, transform, and load when no intermediate data storage is required. In general, the goal of the extraction phase is to convert the data into a single format that is appropriate for the transformation process.
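As an illustration of converting several sources into a single format, the sketch below reads a flat file and a relational table and turns both into the same kind of record. It assumes Python's standard csv and sqlite3 modules as stand-ins for real source systems. The table name stock, the field names and the helper functions are hypothetical.

```python
import csv
import io
import sqlite3


def extract_flat_file(text):
    """Read a CSV flat file (given here as a string) into dict records."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def extract_relational(conn, query):
    """Run a query against a relational source and return dict records."""
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def to_common_format(record, source):
    """Convert one source record into the single format used downstream."""
    return {
        "product_id": int(record["product_id"]),
        "quantity": int(record["quantity"]),
        "source": source,
    }


# Two hypothetical source systems: a flat file and an in-memory database.
csv_text = "product_id,quantity\n1,5\n2,7\n"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (product_id INTEGER, quantity INTEGER)")
conn.execute("INSERT INTO stock VALUES (3, 9)")

extracted = [to_common_format(r, "flat_file") for r in extract_flat_file(csv_text)]
extracted += [
    to_common_format(r, "database")
    for r in extract_relational(conn, "SELECT product_id, quantity FROM stock")
]
conn.close()
```

The flat file delivers every value as text while the database delivers integers. After to_common_format both arrive as integers, so the transformation step does not need to know which source a record came from.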

1.3.2) Transform

The transform stage applies a set of rules or functions to the data extracted from the source in order to derive the data to be loaded into the end target. Some data sources need no manipulation at all, or very little processing. In other cases, business and technical requirements of the target database call for transformations.
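A set of rules applied to extracted data might look like the following minimal Python sketch. The three rules (selecting columns, decoding a code value, deriving a total) are assumed examples of typical transformations, not rules taken from a particular tool, and all names are hypothetical.

```python
def select_columns(row):
    """Keep only the columns the target table needs."""
    return {k: row[k] for k in ("order_id", "gender", "unit_price", "quantity")}


def decode_gender(row):
    """Translate source codes into the values used in the warehouse."""
    codes = {"M": "Male", "F": "Female"}
    return {**row, "gender": codes.get(row["gender"], "Unknown")}


def derive_total(row):
    """Derive a new value from existing ones."""
    return {**row, "total": round(row["unit_price"] * row["quantity"], 2)}


def transform(rows, rules):
    """Apply each rule in order to every row; an empty rule list passes data through."""
    result = []
    for row in rows:
        for rule in rules:
            row = rule(row)
        result.append(row)
    return result


rows = [{"order_id": 1, "gender": "F", "unit_price": 2.5, "quantity": 4, "note": "x"}]
transformed = transform(rows, [select_columns, decode_gender, derive_total])
untouched = transform(rows, [])  # a source that needs no manipulation
```

Passing an empty rule list covers the case of a source that needs no manipulation, while a longer list covers sources with heavier business or technical requirements.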

1.3.3) Load

The load phase loads the data into the destination, which in most cases is the data warehouse. The extracted data is usually updated on a daily, weekly or monthly basis. Some data warehouses overwrite existing information with new cumulative information. Others add the latest data in historicized form, for example every hour. Consider a data warehouse that is required to maintain inventory records for the last two years: it can overwrite data that is older than two years. More complex data warehouse systems can keep an audit trail of all changes to the data loaded into the warehouse.
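The difference between overwriting and historicized loading, and the two-year retention example, can be sketched with Python's standard sqlite3 module standing in for the warehouse. The table names, the dates and the helper functions are hypothetical. Note that this sketch deletes expired history rows rather than overwriting them in place, which is a simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory_current (item TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("CREATE TABLE inventory_history (item TEXT, qty INTEGER, loaded_on TEXT)")


def load_overwrite(conn, rows):
    """Overwrite: the new cumulative value replaces the stored one."""
    conn.executemany("INSERT OR REPLACE INTO inventory_current VALUES (?, ?)", rows)


def load_history(conn, rows, loaded_on):
    """Historicized load: every batch is appended with its load date."""
    conn.executemany(
        "INSERT INTO inventory_history VALUES (?, ?, ?)",
        [(item, qty, loaded_on) for item, qty in rows],
    )


def purge_older_than(conn, cutoff):
    """Retention: delete history rows loaded before the cutoff date (ISO text)."""
    conn.execute("DELETE FROM inventory_history WHERE loaded_on < ?", (cutoff,))


load_overwrite(conn, [("bolt", 100)])
load_overwrite(conn, [("bolt", 80)])  # replaces the earlier row
load_history(conn, [("bolt", 100)], "2008-01-15")
load_history(conn, [("bolt", 80)], "2010-06-01")
purge_older_than(conn, "2008-06-01")  # keep roughly the last two years

current = conn.execute("SELECT item, qty FROM inventory_current").fetchall()
history = conn.execute("SELECT item, qty, loaded_on FROM inventory_history").fetchall()
conn.close()
```

After these calls the current table holds only the latest quantity, while the history table keeps one row per load that is still inside the retention window.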

1.4) ETL Tool Functions

A database and a hardware platform with the required capacity are a must for this work. Selecting an ETL tool is highly recommended, but it is not a must. If ETL tools are to be evaluated, the following characteristics should be considered:

Functional capability

Ability to read directly from the data source

Metadata support

2.0) Commercial tools for Data Warehouse

2.1) Microsoft Office PerformancePoint Server

One warehouse tool is Microsoft Office PerformancePoint Server, business intelligence software developed by Microsoft Corporation for the business sector. Version 1.0 was officially released in November 2007. It enabled Microsoft to add deep analytical features to the reports created by its PerformancePoint Monitoring Server. The Planning component of PerformancePoint Server 2007 was to be discontinued on April 1, 2009, and Microsoft was to stop offering the product as an independent product, moving the dashboard, analytic report and scorecard features into SharePoint Server. The software was expected to bring a major improvement in meeting a company's business intelligence needs. The latest stable release of Microsoft Office PerformancePoint Server was version 1.0 SP2 in 2008. It runs on the Microsoft Windows platform and is licensed under an end-user license agreement (EULA).

PerformancePoint Server 2007 works with other Microsoft products such as Word, PowerPoint, Excel, SQL Server, Visio and SharePoint Server. It supports the planning and budgeting side of a business through a component that is integrated with Excel and SQL Server Analysis Services. This integration allows PerformancePoint to connect to the systems an enterprise already uses, so that information is kept accurate across all of them. The required information is managed through the data cubes that PerformancePoint uses. In 2007 the market for business intelligence software, also called BPM (business performance management) or CPM (corporate performance management) systems, was growing strongly owing to the increasing amount of information businesses collect about their customers. PerformancePoint has three components, given below.

Monitoring Server Operation

Planning Server Operation

Management Reporter

The Monitoring Server has many monitoring and analytical features, including dashboards, scorecards, KPIs, strategy maps, filters and reports. Dashboard Designer saves content and security information to a SQL Server database that is managed through the Monitoring Server. Data source connections are also made through this server.

Planning Server is built on the SQL Server stack and makes extensive use of Microsoft Excel for line-of-business reporting and analysis. PerformancePoint Planning Server supports a variety of management processes. These include the ability to define, modify and maintain logical business models integrated with business rules, workflows and enterprise data.

Management Reporter is a component designed specifically for financial reporting. It can read the Planning Financial Models (PFM) directly. A development kit is also available that allows the component to report off other repositories.

2.2) Oracle Business Intelligence Suite Enterprise Edition

Another warehouse tool is Oracle Business Intelligence Suite Enterprise Edition, also known as OBI EE Plus. It is developed by Oracle Corporation. A stable release was issued on 1 September 2009. It is written in C++ and Java and runs on Windows, Linux, Solaris, HP-UX, AIX and Mac OS X. This set of Oracle business intelligence tools combines two product lines:

Former Siebel business intelligence

Hyperion business intelligence

The former Siebel products were first marketed by Oracle Corporation as Oracle Business Intelligence Suite Enterprise Edition. The suite is used together with Oracle Business Intelligence Applications. Its main competitors in the industry are IBM Cognos, SAP BusinessObjects, Microsoft Business Intelligence (BI) and SAS.

A complete software deployment of OBIEE contains the following major components:

Oracle Business Intelligence Publisher

Oracle Business Intelligence Scheduler

Oracle Business Intelligence Systems Management

Oracle Business Intelligence Administration Tool

Oracle Business Intelligence Client software

Oracle Business Intelligence JDBC Driver

Oracle Business Intelligence Catalog Manager

Oracle Business Intelligence Job Manager

The product has many components. The major ones are:

Oracle Business Intelligence Admin Tool

Oracle Business Intelligence Answers

Oracle Business Intelligence Server

Oracle Business Intelligence Marketing

Oracle Business Intelligence Interactive Dashboards

Hyperion Web Analysis

These major components are described here in more detail. Oracle BI Administration Tool is an administrative tool used to build repositories consisting of a Physical layer, a Business Model and Mapping layer, and a Presentation layer, which is the abstract end-user view that is subsequently visible in Oracle BI Answers. Oracle BI Answers is an ad hoc query and analysis tool that works with data from multiple data sources in a fully web-based environment. Users are shielded from the complexity of the underlying data structures and view and work with a logical view of the information. Oracle BI Server is the analysis server that provides the calculation and aggregation engine and integrates data from multiple relational, OLAP, unstructured and other sources. Oracle BI Marketing, formerly known as Segmentation Server, serves marketing needs. The last major component is Hyperion Web Analysis.

3.0) Open Source tools for Data Warehouse

3.1) Eclipse BIRT Project.

The BIRT project gives applications and programs a way to produce a variety of reports for their audience. The tool supports report generation and business intelligence capabilities. It is open source and is developed under the not-for-profit Eclipse Foundation.

The software is based on Java and Java EE and is platform independent. The project targets online analytical processing and aims to let developers design reports easily and integrate them into their applications. The report designer can also be used standalone to create charts in an application.

The software contains two major components:

A visual report designer, integrated with Eclipse, for creating reports.

A runtime component for generating those reports in a Java environment.

The Eclipse BIRT Project supports a variety of data sources, such as SQL databases and JFire.

BIRT is an open source reporting system. It allows users to lay out reports and publish them in user-defined formats. It is capable of matrix and real-time reporting. Reporting features are built in, and further capabilities such as report scheduling, end-user interactivity and ad hoc reporting can be added.

Release versions of Eclipse BIRT

The very first version of the Eclipse BIRT Project was version 1.0, released on 2005-03-01. It contained the report engine, the chart engine and the report designer. The system has been improved version by version. The latest release is version 2.6, the stable release of 24 June 2010, which is part of the Eclipse Helios simultaneous release. It adds polar and radar charts, pie chart rotation, palette hashing, sort locale and strength, native PDF drawing from SVG, chart background images and more, and it fixes over 500 bugs. Version 2.6.1 is planned for 24 September 2010 as SR 1, and version 2.6.2 for 25 February 2011 as SR 2.

Technical background of Eclipse BIRT Project

The minimum infrastructure required for the BIRT project is given below:


Project Mail List

Component Mail List

General Mail List

Bug Database


Source Repository

Conclusion for the BIRT Project

BIRT is a powerful reporting tool capable of real-time reporting and chart display, with new reporting features built into the product. It is flexible and supports enterprise-level requirements.


3.2) Pentaho

Pentaho is an open source BI system and a pioneer in commercial open source BI. It provides a wide range of capabilities, including enterprise reporting, data analysis, dashboards, data integration (ETL), data mining and workflow. The project was initiated by a team of open source veterans in 2004. It is widely recognized as the leading open source business intelligence solution. It has a large referenceable customer base with a variety of BI and data warehouse deployments. MySQL, Motorola, Honeywell and DivX are some of its leading customers.

Pentaho Components summary

Pentaho Reporting

Access and format data from different sources (e.g. RDBMS, XML, OLAP)

Produce reports in many popular formats (e.g. Word, Excel, TXT, PDF, HTML)

Multiple report types (e.g. operational, analytical, and financial)

Run directly against sources or through a centralized metadata layer

Pentaho Analysis

ROLAP architecture which works with all popular open source and proprietary DBs

Ability to view data dimensionally (e.g. by region, by channel, by time)

Ability to navigate and explore easily

Web based or Excel based front ends.
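Viewing data dimensionally in a ROLAP style means that the aggregates are computed by the relational database itself, typically with GROUP BY queries. The sketch below illustrates that general idea with Python's standard sqlite3 module. It is not Pentaho's API: the sales table, its columns and the view_by helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, channel TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("East", "Web", 2009, 100.0),
        ("East", "Store", 2009, 50.0),
        ("West", "Web", 2009, 70.0),
        ("West", "Web", 2010, 30.0),
    ],
)

# Only these column names may be placed into the generated SQL.
ALLOWED_DIMENSIONS = {"region", "channel", "year"}


def view_by(conn, *dimensions):
    """Total sales grouped by the chosen dimensions, computed by the database."""
    if not dimensions or not set(dimensions) <= ALLOWED_DIMENSIONS:
        raise ValueError("unknown dimension")
    cols = ", ".join(dimensions)
    sql = f"SELECT {cols}, SUM(amount) FROM sales GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()


by_region = view_by(conn, "region")
by_region_channel = view_by(conn, "region", "channel")  # drill down one level
conn.close()
```

Adding a dimension to the call corresponds to drilling down, and removing one corresponds to rolling up, with no separate cube storage involved.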

Pentaho Dashboards

Tight business process integration. This includes an embedded workflow and also can receive or trigger events in external systems.

Comprehensive auditing of user activity, performance and data access

Context sensitive drilling to reports or analysis

Integrated security

Modern standards based architecture (J2EE, AJAX, JDBC 2.0)

Pentaho Data Integration

Rich Feature Set

Enterprise-class performance and scalability

Broad Database Support

100% Meta-data Driven

Graphical, model-driven design

Commercially, Pentaho is sold as an annual subscription with Gold and Platinum technical support levels and no software licensing fee. Upgrades are also available without any extra fee. Pentaho claims that it can reduce BI licensing costs by up to 90%. The latest available version of the Pentaho BI Suite is "Pentaho BI Suite 3.6 GA".