Designing Dimension Relationships
In addition to designing dimensions appropriately, it is important to configure relationships between measure groups and dimensions, and between dimension attributes, so that all aggregations are calculated correctly.
In this lesson, you will review different relationship types that can be created between dimensions and measure groups. You will learn the considerations for creating relationships between different attributes of the same dimension and learn how attribute relationships can affect performance. This lesson will cover guidelines for using hierarchies effectively and the concept of ragged hierarchies. You will learn how to design relationships between dimension attributes and work with many-to-many relationships.
Discussion: Reviewing Dimension Usage
Question: What are the relationship types that can be created between dimensions and measure groups?
Considerations for Creating Dimension Relationships
You need to consider the following for creating dimension relationships:
Materialize reference dimensions. To improve performance, you should materialize reference dimensions. Materializing a relationship causes SSAS to store, in the MOLAP structure, the attribute member in the intermediate dimension that links the attribute in the reference dimension to the fact table. Materializing the relationship is the default behavior to maximize query performance, but at the expense of an increase in processing time and storage space. The only exception to this is when you are creating a reference dimension relationship between a local dimension and a linked measure group.
Use fact relationships for grouping together related fact table rows. When you define a dimension based on a fact table item, it is called a fact dimension. Fact dimensions are also known as degenerate dimensions. Fact dimensions are useful for grouping together related fact table rows, such as all rows that are related to a particular invoice number. Although you can put this information in a separate dimension table in the relational database, doing so provides no benefit. This is because the dimension table would grow at the same rate as the fact table, and would just create duplicate data and unnecessary complexity.
Using SSAS, you can determine whether to duplicate the fact dimension data in a MOLAP dimension structure for increased query performance or to define the fact dimension as a ROLAP dimension and save storage space at the expense of query performance. When you store a dimension with the MOLAP storage mode, all dimension members are stored in an SSAS instance in a compressed MOLAP structure. The dimension members are also stored in the partitions of the measure groups. When you store a dimension with the ROLAP storage mode, only the dimension definition is stored in the MOLAP structure. The dimension members themselves are queried from the underlying relational fact table at query time. You should decide the appropriate storage mode based on how frequently the fact dimension is queried, the number of rows returned by a typical query, the performance of the query, and the processing cost. Defining a dimension as ROLAP does not require that all cubes that use the dimension also be stored with the ROLAP storage mode. This is different from SSAS 2000.
Use many-to-many relationships to expand the dimensional model and to support complex analytics. When you define a dimension, usually, each fact joins to one and only one dimension member, whereas a single dimension member can be associated with many different facts. In relational database terminology, this is referred to as a one-to-many relationship. However, sometimes, a single fact can connect to multiple dimension members. In relational database terminology, this is referred to as a many-to-many relationship. In SSAS 2008, you should define a many-to-many relationship between a dimension and a measure group by specifying an intermediate fact table that is joined to the dimension table. An intermediate fact table is joined, in turn, to an intermediate dimension table to which the fact table is joined. To define a many-to-many relationship between a dimension and a measure group through an intermediate measure group, the intermediate measure group must share one or more dimensions with the original measure group. In a many-to-many dimension, values do not aggregate more than once to the All Member.
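The effect of a many-to-many relationship on totals can be illustrated with a small sketch. The data and the function names below (amount_by_customer, grand_total) are hypothetical and are not part of SSAS; the code only mimics how a fact row reaches several dimension members through an intermediate fact table while still being counted once in the grand total.

```python
from collections import defaultdict

# Hypothetical sample data: fact rows keyed by account, plus an
# intermediate (bridge) fact table linking accounts to customers.
transactions = [
    ("A1", 100.0),
    ("A1", 50.0),
    ("A2", 200.0),
]
account_holders = [
    ("A1", "Alice"),
    ("A1", "Bob"),   # A1 is a joint account
    ("A2", "Bob"),
]


def amount_by_customer(transactions, account_holders):
    """Slice transaction amounts by customer through the bridge table."""
    holders = defaultdict(set)
    for account, customer in account_holders:
        holders[account].add(customer)
    totals = defaultdict(float)
    for account, amount in transactions:
        # A fact row contributes to every customer linked to its account.
        for customer in holders[account]:
            totals[customer] += amount
    return dict(totals)


def grand_total(transactions):
    """The All member counts each fact row once, however many holders exist."""
    return sum(amount for _, amount in transactions)


by_customer = amount_by_customer(transactions, account_holders)
```

In this sketch the per-customer values add up to more than the grand total, because the joint account appears under both holders, while the grand total still counts each transaction only once.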
Question: An OLAP solution used by a bank has two dimension tables, Customer and Account, and a fact table containing a list of transactions performed against accounts. Each customer can have more than one account, and one account can have more than one customer (joint account holders). Each account and each customer can have many transactions. In order to slice transaction data by Customer, would you create a referenced relationship or a many-to-many relationship between the Customer dimension and the measure group?
Identifying Correct Attribute Relationships
In relational data warehouses, each dimension table usually contains a primary key, attributes, and foreign key relationships to other tables in certain instances. SSAS analyzes the relationships between your attributes to correctly aggregate data, effectively store and retrieve data, and create useful aggregations. To help you create these associations among your dimension attributes, SSAS provides a feature called attribute relationships.
When you initially create a dimension, SSAS auto-builds a dimension structure with many-to-one attribute relationships between the primary key attribute and every other dimension attribute. With this design, whenever you issue a query that includes an attribute from this dimension, data is always summarized from the primary key and then grouped by the attribute. In addition, SSAS uses the fact data to identify meaningful member combinations. To optimize this dimension design, you should understand how your attributes are related to each other and then take steps to let SSAS know what the relationships are.
Whenever you add an attribute relationship between two attributes, it is important to first verify that the attribute data strictly adheres to a many-to-one relationship. As a rule, you should create an attribute relationship from attribute A to attribute B if and only if the number of distinct (A, B) pairs is no larger than the number of distinct members of A; in other words, every member of A maps to exactly one member of B.
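The distinct-count rule can be checked against the dimension data before the relationship is created. The following is a minimal sketch, assuming the dimension rows are available as Python dictionaries; the function name is_many_to_one and the sample rows are hypothetical.

```python
def is_many_to_one(rows, source, related):
    """Return True if every member of `source` maps to a single `related` member.

    Compares the number of distinct (source, related) pairs with the
    number of distinct source members, as the rule describes.
    """
    source_members = {row[source] for row in rows}
    pairs = {(row[source], row[related]) for row in rows}
    return len(pairs) <= len(source_members)


# Hypothetical dimension rows.
geography = [
    {"City": "Seattle", "State": "Washington"},
    {"City": "Tacoma", "State": "Washington"},
    {"City": "Portland", "State": "Oregon"},
]
dates = [
    {"Month Name": "January", "Calendar Year": 2007},
    {"Month Name": "January", "Calendar Year": 2008},
    {"Month Name": "February", "Calendar Year": 2008},
]

city_to_state = is_many_to_one(geography, "City", "State")
month_to_year = is_many_to_one(dates, "Month Name", "Calendar Year")
```

Here the City to State data passes the check, whereas Month Name to Calendar Year fails it because the same month name occurs in more than one year.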
How Attribute Relationships Affect Performance
You can optimize performance by defining relationships supported by the data. Attribute relationships help performance in two significant ways:
- Indexes are built and cross products need not go through the key attribute.
- Aggregations built on attributes can be reused for queries on related attributes.
Typically, many-to-one relationships follow data hierarchies such as the hierarchy of products, subcategories, and categories. While a data hierarchy can commonly suggest many-to-one relationships, do not automatically assume that this is always the case. Whenever you add an attribute relationship between two attributes, it is important to first verify that the attribute data strictly adheres to a many-to-one relationship.
An attribute that is not directly related to another attribute may still be indirectly related to it through a chain of attribute relationships. This chain of attribute relationships is also called cascading attribute relationships. Using cascading attribute relationships, SSAS can make better performance decisions concerning aggregation design, data storage, data retrieval, and MDX calculations. Apart from using attribute relationships for performance considerations, you can also use them to enforce dimension security and to join measure group data to nonprimary key granularity attributes. The core principle behind designing effective attribute relationships is to create the most efficient dimension model that best represents the semantics of your business.
In SSAS, you can expose attributes to users by using two types of hierarchies:
- Attribute hierarchies. From a performance perspective, attributes that are only exposed in attribute hierarchies are not automatically considered for aggregation. Without the benefit of aggregations, query performance against these attribute hierarchies can be somewhat slow. To enhance performance, it is possible to flag an attribute as an aggregation candidate by using the Aggregation Usage property. However, before you modify the Aggregation Usage property, you should consider whether you can take advantage of user hierarchies.
- User hierarchies. In user hierarchies, attributes are arranged into predefined multilevel navigation trees to facilitate end user analysis. SSAS helps you build natural and unnatural hierarchies, each with different design and performance characteristics.
- Note: Each of these hierarchies has a different impact on the query performance of your cube.
Flexible and Rigid Relationships
The relationships between members of some attributes, such as dates in a given month or the gender of a customer, are not expected to change. Other relationships, such as salespeople in a given region or the marital status of a customer, are more prone to change over time. You should set RelationshipType to Flexible for those relationships that are expected to change and set RelationshipType to Rigid for relationships that are not expected to change. When you set RelationshipType appropriately, the server can optimize the processing of changes and the rebuilding of aggregations. By default, the user interface always sets RelationshipType to Flexible.
Question: A geography dimension contains attribute relationships between Country, State, and City. Should these be defined as rigid or flexible relationships?
Benefits of Cascading Attribute Relationships
Whenever you define a new attribute relationship, it is critical that you remove any redundant relationships for performance and data accuracy. To help you identify redundant attribute relationships, Business Intelligence Development Studio (BIDS) provides a visual warning to alert you about the redundancy; however, it does not require you to eliminate the redundancy. It is a best practice to always manually remove the redundant relationship. After you remove the redundancy, the warning disappears.
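To make the idea of a redundant relationship concrete, the sketch below treats attribute relationships as directed edges and reports every direct relationship whose related attribute can also be reached through a longer chain. This is only an illustration of the concept, not the check that BIDS performs; the function names and the sample relationships are hypothetical.

```python
def reachable(graph, start, skip_edge):
    """Return the attributes reachable from `start` without using `skip_edge`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if (node, nxt) == skip_edge or nxt in seen:
                continue
            seen.add(nxt)
            stack.append(nxt)
    return seen


def redundant_relationships(relationships):
    """List direct relationships that are already implied by a chain."""
    graph = {}
    for source, related in relationships:
        graph.setdefault(source, set()).add(related)
    return sorted(
        edge for edge in relationships
        if edge[1] in reachable(graph, edge[0], skip_edge=edge)
    )


# Hypothetical relationships: Product -> Category duplicates the chain
# Product -> Subcategory -> Category.
relationships = {
    ("Product", "Subcategory"),
    ("Subcategory", "Category"),
    ("Product", "Category"),
}
redundant = redundant_relationships(relationships)
```

With these sample relationships, only the direct Product to Category relationship is reported, because Category is already reachable from Product through Subcategory.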
Although an attribute is no longer directly related to some other attribute, it may still be indirectly related to that attribute through a chain of attribute relationships. This chain of attribute relationships is also called cascading attribute relationships.
With cascading attribute relationships, SSAS can make better performance decisions concerning aggregation design, data storage, data retrieval, and MDX calculations. Beyond performance considerations, attribute relationships are also used to enforce dimension security and to join measure group data to nonprimary key granularity attributes. For example, if you have a measure group that contains sales data by Product Key and forecast data by Subcategory, the forecast measure group will only know how to roll up data from Subcategory to Category if an attribute relationship exists between Subcategory and Category.
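The forecast example can be sketched as a simple roll-up that depends on a member mapping between the two attributes. The mapping, the forecast values, and the roll_up function are all hypothetical; the point is only that without the Subcategory to Category mapping there is nothing to tell the roll-up where each value belongs.

```python
# Hypothetical attribute relationship expressed as a many-to-one member mapping.
subcategory_to_category = {
    "Road Bikes": "Bikes",
    "Mountain Bikes": "Bikes",
    "Helmets": "Accessories",
}

# Hypothetical forecast data stored at Subcategory granularity.
forecast_by_subcategory = {
    "Road Bikes": 120.0,
    "Mountain Bikes": 80.0,
    "Helmets": 15.0,
}


def roll_up(values, mapping):
    """Aggregate values to the related attribute by following the mapping."""
    totals = {}
    for member, value in values.items():
        if member not in mapping:
            # Without a relationship, the parent of this member is unknown.
            raise KeyError(f"no attribute relationship defined for {member!r}")
        parent = mapping[member]
        totals[parent] = totals.get(parent, 0.0) + value
    return totals


forecast_by_category = roll_up(forecast_by_subcategory, subcategory_to_category)
```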
Guidelines for Using Hierarchies Effectively
The guidelines for using hierarchies effectively are:
Use an unnatural hierarchy to create drill-down paths of commonly viewed attributes. In an unnatural hierarchy, the hierarchy consists of at least two consecutive levels that have no attribute relationships. Usually, these hierarchies are used to create drill-down paths of commonly viewed attributes that do not follow any natural hierarchy. For example, users may want to view a hierarchy of Size Range and Category or the other way around.
Use natural hierarchies for better performance. From a performance perspective, natural hierarchies behave very differently from unnatural hierarchies. In natural hierarchies, the hierarchy tree is materialized on disk in hierarchy stores. In addition, all attributes participating in natural hierarchies are automatically considered to be aggregation candidates. Unnatural hierarchies are not materialized on disk, and the attributes participating in unnatural hierarchies are not automatically considered as aggregation candidates. Rather, they provide users with easy-to-use drill-down paths for commonly viewed attributes that do not have natural relationships. By assembling these attributes into hierarchies, you can also use a variety of MDX navigation functions to easily perform calculations such as percent of parent. An alternative to using unnatural hierarchies is to cross-join the data by using MDX at query time. The performance of unnatural hierarchies is similar to that of cross-joins at query time. Unnatural hierarchies provide the added benefit of reusability and central management.
Ensure that you have correctly set up cascading attribute relationships. To take advantage of natural hierarchies, ensure that you have correctly set up cascading attribute relationships for all attributes participating in the hierarchy. It is common to inadvertently miss an attribute relationship at some point in the hierarchy, because creating attribute relationships and creating hierarchies are two separate operations. If a relationship is missing, SSAS classifies the hierarchy as an unnatural hierarchy, even if you intend it to be a natural hierarchy.
Look out for the BIDS warning icon for missing attribute relationships. To help you verify the type of hierarchy that you have created, BIDS displays a warning icon whenever you create a user hierarchy that is missing one or more attribute relationships. The purpose of the warning icon is to help you identify situations where you have intended to create a natural hierarchy but have missed attribute relationships. After you create the appropriate attribute relationships for the hierarchy in question, the warning icon disappears. If you are intentionally creating an unnatural hierarchy, the hierarchy continues to display the warning icon to indicate the missing relationships. In such an instance, you should ignore the warning icon.
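The distinction between natural and unnatural hierarchies can be expressed as a check over consecutive levels. The sketch below is an approximation under one stated assumption: a pair of levels counts as related when the lower level reaches the upper level through a direct or cascading relationship. The function names and sample data are hypothetical, and the code does not reproduce the actual BIDS validation.

```python
def is_related(relationships, source, target):
    """True if `target` is reachable from `source` via attribute relationships."""
    graph = {}
    for src, rel in relationships:
        graph.setdefault(src, set()).add(rel)
    seen, stack = set(), [source]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def classify_hierarchy(levels, relationships):
    """Classify a user hierarchy whose levels are listed from top to bottom."""
    for upper, lower in zip(levels, levels[1:]):
        if not is_related(relationships, lower, upper):
            return "unnatural"
    return "natural"


# Hypothetical relationships for a product dimension.
relationships = {
    ("Product", "Subcategory"),
    ("Subcategory", "Category"),
    ("Product", "Size Range"),
}
product_categories = classify_hierarchy(
    ["Category", "Subcategory", "Product"], relationships
)
size_by_category = classify_hierarchy(["Size Range", "Category"], relationships)
```

In this sample, the Category, Subcategory, Product hierarchy is classified as natural, while Size Range over Category is unnatural because no relationship leads from Category to Size Range.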
Demonstration: How To Design Relationships Between Dimension Attributes
Perform the following steps to design relationships between dimension attributes:
- On the Start menu, click Microsoft Visual Studio 2008, and then click Microsoft Visual Studio 2008. The Microsoft Visual Studio (Administrator) window appears.
- In the Microsoft Visual Studio window, on the File menu, point to Open, and then click File.
- In the Open File dialog box, browse to the D:\Demofiles\Mod04\Starter\AdventureWorks folder, click Adventure Works.sln, and then click Open.
- In the Solution Explorer pane, under the Dimensions folder, right-click Date.dim, and then click Open.
- On the Attribute Relationships tab, in the Attribute Relationships pane, right-click the attribute relationship between Date and Calendar Year, and then click Delete.
- In the Delete Objects message box, click OK.
- Right-click the blank area of the Attribute Relationships tab, and then click New Attribute Relationship. The Create Attribute Relationship dialog box appears.
- In the Create Attribute Relationship dialog box, under Source Attribute, in the Name list, click Month Name.
- Under Related Attribute, in the Name list, click Calendar Year.
- In the Relation type list, click Rigid (will not change over time), and then click OK.
- Note: While creating attribute relationships, ensure that there is a many-to-one relationship between the attributes and not a many-to-many relationship. This means that many members of the attribute on the left side of the relationship should be present under just one member of the attribute on the right side of the relationship. However, this is not the case with the relationship we just created, because each month name is present in every year.
- Note: You should create user hierarchies corresponding to the attribute relationships created to help users in navigation and also to be able to use navigation functions in MDX.
What Are Ragged Hierarchies?
A hierarchy can be either symmetrical or asymmetrical (ragged). Usually, each member in a hierarchy in SSAS 2005 has the same number of members above it as any other member at the same level. When a parent-child hierarchy, which is essentially an unbalanced hierarchy, is converted to a level-based hierarchy, it becomes a ragged hierarchy. A ragged hierarchy is a hierarchy in which the parent of a member can come from any level above the level of the member, not just from the level immediately above. For example, one Product could roll up into Product Sub-Group and then Product Group, and another could roll up into Product Group and Product Family.
Ragged hierarchies are quite easy to support in a recursive relational structure, such as a Bill of Materials (BOM).
The recursive structures of the ragged hierarchy do not allow optimum performance in a relational environment.
Not every dimensional hierarchy is balanced. For example, an account executive might report to a Market Unit or a Business Unit. In a star schema, a dimension table usually requires fixed columns for each level. In such a situation, the star schema will have considerable difficulty handling unbalanced or ragged hierarchies. BI tools circumvent this with code.
Supporting ragged hierarchies is difficult in a pure dimensional model, especially one that uses a star schema, because the star schema flattens out the hierarchy, which makes it difficult to store and interpret the levels. Online Analytical Processing (OLAP) tools work around this with rules and code rather than with the dimensional structure.
Working with Ragged Hierarchies
In a ragged hierarchy, for many members, the parent members are not present in the immediate level above, and you need to put placeholder members as parents in that level. When this occurs, the hierarchy descends to different levels for different drill-down paths. Expanding through every level for every drill-down path is complex.
For client applications that support the display of ragged hierarchies, you can configure hierarchies to hide logically missing members. Depending on whether you are configuring a regular hierarchy or a parent/child hierarchy, two different properties can be set by using Dimension Designer.
In a ragged dimension's table, the logically missing members can be represented in different ways. The table cells can contain null strings or empty strings, or they can contain the same value as their parent to serve as the placeholder.
The HideMemberIf property of a level in a hierarchy is used to hide these placeholders or missing members from client applications. However, in the client applications, these placeholder members do not show properly.
You can display the hierarchy in the client application by using the MDX Compatibility property in the connection string. For example, Provider=msolap.3;Datasource=MySSASServerName;Initial Catalog=MySSASDBName;MDX Compatibility=2.
The MDX Compatibility property helps you determine how placeholder members in a ragged or unbalanced hierarchy are treated. If you set the MDX Compatibility property value to 1, you expose a placeholder member in a ragged hierarchy.
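The way placeholder members are hidden can be simulated outside SSAS. The sketch below assumes a drill-down path listed from the top level down, with a logically missing member represented either by an empty value or by a repeat of its parent's name. The setting names NoName and ParentName mirror HideMemberIf options, but the visible_path function itself is hypothetical and only imitates the visible result, not the server's implementation.

```python
def visible_path(path, hide_member_if="Never"):
    """Return the drill-down path a user would see after hiding placeholders.

    hide_member_if:
      "Never"      - keep every member, including placeholders
      "NoName"     - hide members whose name is None or empty
      "ParentName" - hide members whose name repeats the parent's name
    """
    visible = []
    for name in path:
        blank = name is None or name == ""
        repeats_parent = bool(visible) and name == visible[-1]
        if hide_member_if == "NoName" and blank:
            continue
        if hide_member_if == "ParentName" and repeats_parent:
            continue
        visible.append(name)
    return visible


# Hypothetical ragged paths: the middle level is logically missing.
blank_placeholder = ["Accessories", "", "Helmet"]
parent_placeholder = ["Accessories", "Accessories", "Helmet"]

shown_no_name = visible_path(blank_placeholder, "NoName")
shown_parent_name = visible_path(parent_placeholder, "ParentName")
shown_never = visible_path(parent_placeholder, "Never")
```

With the placeholders hidden, both sample paths collapse to two levels, which is the shorter drill-down path a user would expect for that product.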
Implementing Cubes, KPIs, and Actions
The ability to browse a multidimensional structure and to pivot and drill down into data helps users analyze business data. However, additional functionality such as key performance indicators (KPIs), actions, and stored procedures can enhance the utility of a cube.
In this lesson, you will learn about the benefits of perspectives, KPIs, and actions. You will also learn about the considerations for implementing these features in an SSAS cube.
Considerations for Implementing Cubes
BIDS supports two methods of working with relational schemas when defining OLAP objects such as cubes and dimensions within an SSAS project or a database. Usually, you define dimension and cube objects based on a logical data model. This data model is constructed in a data source view within an SSAS project or a database. This data source view is defined based on schema elements from one or more relational data sources, as customized in the data source view.
If you have already created a relational data warehouse, you can create a cube based on the existing fact and dimension tables in your data warehouse. This is the most common approach used to create cubes.
If there is no underlying relational database, you can define the facts and dimensions required as you build a cube, and have SSAS generate the relational database schema for you.
Sometimes you might prefer to work out the cube design before building and populating the relational database used to load the cube with data. You will not create a relational database structure for your cube until after you design the cube. After you understand your project's requirements so that you can build the cube properly, you can create the cube, allow SSAS to create an empty source database, import the appropriate data from the relational database, and finally process the cube so that you can see the results from the cube. This approach is sometimes called top-down design and is frequently used for prototyping and analysis modeling. When you use this approach, you use the Schema Generation Wizard to create the underlying data source view and data sources objects in the SSAS project and the relational schema in an underlying database based on the OLAP objects defined in an SSAS project or database.
A database dimension is a collection of related objects, called attributes, which can be used to provide information about fact data in one or more cubes. Cubes contain all dimensions on which users base their analyses of fact data. An instance of a database dimension in a cube is called a cube dimension and relates to one or more measure groups in the cube. A database dimension can be used multiple times in a cube. For example, a fact table can have multiple time-related facts, and a separate cube dimension can be defined to analyze each of them. However, only one time-related database dimension needs to exist, which also means that only one time-related relational database table needs to exist to support multiple cube dimensions based on time. You can either create dimensions independently of cubes, or you can create them when you run Cube Wizard. You can then use Dimension Designer to edit the dimensions created by the wizard.
Question: An SSAS cube is used as a data source for a set of reports created by using SSRS. These reports display summarized high-level data. The reports are rendered in the PDF format and then distributed by e-mail to a selected set of users in the organization. There has been a suggestion to allow users to drill into the summarized data when they receive the reports. Would a drill-down action created in the cube help in this situation?
Considerations for Reducing UDM Complexity
A perspective is a viewable subset of a cube. You can use a perspective to focus users on relevant objects within the cube. The main consideration for using perspectives is processing efficiency. You can present the same information to users by using multiple cubes or by using perspectives. Perspectives are often a simpler and more efficient solution than multiple cubes. Perspectives control the visibility of objects within a cube, including measure groups, measures, dimensions, hierarchies, attributes, KPIs, actions, and calculations. They do not restrict access, and you can still reference invisible objects by using XML for Analysis, MDX, or Data Mining Extensions (DMX) statements. The perspective sets the visibility of objects. However, all other settings are taken from the underlying cube. Alternatively, rather than using a large cube with perspectives, you could create smaller cubes. Individually, the smaller cubes would process faster, but if you must process them together, the processing cost is likely to be greater than that of a single larger cube.
Considerations for Using KPIs
Using SSAS, you can add KPIs to your cubes to help decision-makers quickly locate trends and business performance issues.
Because a KPI is a graphical representation of how the current performance of the company compares with targets that have been set, KPIs can represent various objectives through different graphical representations.
Some examples include:
- A traffic light icon that represents the sales for the current year. The color of the traffic light indicates how current-year sales compare with the total sales target: green when sales are more than 10 percent above the target, yellow when they are within 10 percent of the target, and red when they are more than 10 percent below the target.
- A cylinder icon that is used to represent how company profits compare to the target. The cylinder becomes fuller as profits approach the target.
- A face icon that a school administrator can use to indicate the number of students that progress to college compared with the national average. A smiling face indicates that this school sends more than the national average number of students to college; a neutral face indicates that the school matches the national average; and a frowning face indicates that the school sends fewer than the national average number of students to college.
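The traffic light example can be written down as a small status function. This is a sketch under stated assumptions: the thresholds are read as more than 10 percent above the target, within 10 percent of it, and more than 10 percent below it, and the status is encoded as 1, 0, or -1, the range that SSAS status expressions conventionally return. The function name kpi_status and the icon mapping are hypothetical.

```python
# Hypothetical mapping from a status value to a traffic light color.
TRAFFIC_LIGHT = {1: "green", 0: "yellow", -1: "red"}


def kpi_status(actual, target, tolerance=0.10):
    """Return 1, 0, or -1 depending on how `actual` compares with `target`."""
    if target <= 0:
        raise ValueError("target must be positive")
    ratio = actual / target
    if ratio > 1 + tolerance:
        return 1    # more than 10 percent above the target
    if ratio < 1 - tolerance:
        return -1   # more than 10 percent below the target
    return 0        # within 10 percent of the target


above = TRAFFIC_LIGHT[kpi_status(115.0, 100.0)]
near = TRAFFIC_LIGHT[kpi_status(95.0, 100.0)]
below = TRAFFIC_LIGHT[kpi_status(85.0, 100.0)]
```

Keeping the thresholds in one place, as the tolerance parameter does here, makes it easier to review whether a KPI could present misleading information.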
It is important to know how to design effective KPIs that show how the organizational business is performing against goals for specific business metrics. Ill-designed KPIs can produce misleading information on which business-critical decisions might be based, leading to errors.
While KPIs can be created in external applications such as PerformancePoint Server (PPS), it is helpful to create them in SSAS.
KPI vs. Calculated Members
When the user defines KPI object properties such as a KPI value, a KPI goal, and a KPI trend, the user can specify an MDX expression for each. Behind the scenes, SSAS creates a hidden calculated measure for each of these properties and assigns the MDX expression specified in the property to that calculated measure. Later, when the user queries KPIs, the KPI browser generates queries by using MDX functions, which return the hidden calculated measures.
Each KPI formula is defined as a hidden calculated measure, except when it is just a reference to another measure. For better performance, you should consider creating real calculated measures with properly designed calculations and then use these measures in KPI definitions.
KPIs may create hidden calculated measures, not just for the KPI value, but for other properties too. Creating real calculated measures is not going to help performance by itself. If the calculated measures are created manually in the MDX Script, the cube designer has more control. For example, it is possible to specify performance-related properties such as NON_EMPTY_BEHAVIOR, or non-performance properties such as FORMAT_STRING. It is also possible to use the calculated measure inside a SCOPE statement. Therefore, this approach gives flexibility both in functionality and in making performance optimizations.
Question: A cube contains the measures Cost and Sales Amount. Users would like to see a scorecard displaying profit percentages across time periods, product categories, or geographical locations. To achieve these requirements, you need to create a Profit Margin KPI. Will you write the MDX expression to calculate the profit margin as a separate calculated measure and then use it in the Value expression for the KPI, or write the MDX expression in the KPI's Value expression itself?
Considerations for Implementing Actions
An action is an end user-initiated operation upon a selected cube or portion of a cube. The operation can start a URL or a report with the selected item as a parameter, or it can retrieve information about the selected item. Instead of focusing on sending data as input to operational applications, end users can close the loop on the decision-making process. This ability to transform analytical data into decisions is crucial to a successful business intelligence application.
For example, a business user browsing a cube notices that the current stock of a certain product is low. The client application provides the business user with a list of actions, all related to low product stock value, that are retrieved from the SSAS cube. The business user selects the Order action for the member of the cube that represents the product. The Order action initiates a new order by calling a stored procedure in the operational database. This stored procedure generates the appropriate information to send to the order entry system.
You can exercise flexibility when you create actions.
An action can launch an application, or retrieve more detailed information from a database. You can configure an action to be triggered from almost any part of a cube, including dimensions, levels, members, and cells, or create multiple actions for the same portion of a cube. You can also pass string parameters to the launched applications and specify the captions displayed to end users as the action runs.
Following are the considerations for implementing actions:
- Performance. Some actions have various ways to achieve similar results. You must therefore consider performance, flexibility, and user friendliness. For example, slow performance associated with a fact dimension that returns a large number of rows can frustrate users. When you define a drill-through action, you can limit the number of rows it returns, which improves performance. When you choose this method, the user may only see a subset of the data that matches the request.
- Security. You should take care when you define actions because a user who is able to browse the cube is also able to execute the action. Command-line and HTML actions require particular caution, because these types of actions access system resources and may fail due to security limitations beyond SSAS.
- Client application requirements. In addition to performance and security, you should consider the following:
- What support is provided in the client application?
- How is the action invoked from the client application?
- What target is attached to the action? You must be careful to attach the action to a target that produces the correct information. For example, an action that redirects the user to the job description of an employee on the corporate Web site should be attached to the members of the employee dimension, not the department level.
- What format should reports that are generated by report actions be rendered in? For example, do users have Microsoft Office Excel installed, or would an HTML or PDF format better suit their requirements?
Global Considerations for an SSAS Solution
Most cubes or OLAP solutions cater to users that are spread across different geographical locations. Delivering the entire cube in just one language like English may not be very useful in such a situation. Hence, there is a need to consider the special factors for globalizing an SSAS solution so that users from all geographical locations and cultures can derive the maximum benefit from it.
In this lesson, you will learn about the considerations for working with languages and collations. You will also learn the guidelines to increase the portability of an SSAS solution by keeping in mind limitations of client applications.
You can define translations in Business Intelligence Development Studio by using the appropriate designer for the Analysis Services object to be translated.
You should handle translations associated with attributes in database dimensions differently from other translations.
Following are the ways in which you can handle translations associated with attributes:
- You can bind the CaptionColumn property to a column, instead of to an explicit literal value, so that the member names for that attribute can be translated.
- You can use a Windows collation other than the collation specified for the instance so that members in the attribute can be appropriately sorted for the language specified in the translation.
If a client application requests information in a specified language identifier, the SSAS instance attempts to resolve data and metadata for SSAS objects to the closest possible language identifier. If the client application does not specify a default language, or specifies the neutral locale identifier or the process default language identifier, SSAS uses the default language for the instance to return data and metadata for SSAS objects.
If the client application specifies a language identifier other than the default language identifier, the instance iterates through all available translations for all available objects. If the specified language identifier matches the language identifier of a translation, SSAS returns that translation. If a match cannot be found, SSAS attempts to return translations with a language identifier closest to the specified language identifier.
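The lookup order described above can be sketched in a few lines of Python. This is only a simplified model, not the actual SSAS algorithm: it assumes that "closest" means a translation that shares the same primary language (the low 10 bits of a Windows language identifier), and the function name and sample captions are hypothetical.

```python
# Simplified model of translation lookup; NOT the actual SSAS algorithm.
# Assumption: "closest" means the same primary language, which is stored in
# the low 10 bits of a Windows language identifier (LCID).

PRIMARY_LANGUAGE_MASK = 0x3FF


def resolve_translation(requested_lcid, translations, default_lcid):
    """Return a caption using exact match, then closest match, then default."""
    # 1. Exact match on the language identifier.
    if requested_lcid in translations:
        return translations[requested_lcid]
    # 2. Closest match: a translation that shares the primary language.
    requested_primary = requested_lcid & PRIMARY_LANGUAGE_MASK
    for lcid in sorted(translations):
        if lcid & PRIMARY_LANGUAGE_MASK == requested_primary:
            return translations[lcid]
    # 3. No related translation: fall back to the default language.
    return translations[default_lcid]


# Hypothetical captions for one dimension, keyed by LCID.
captions = {
    1033: "Product",  # English (United States), the default
    1036: "Produit",  # French (France)
    1031: "Produkt",  # German (Germany)
}

exact = resolve_translation(1036, captions, 1033)     # French (France) requested
closest = resolve_translation(3084, captions, 1033)   # French (Canada) requested
fallback = resolve_translation(1041, captions, 1033)  # Japanese (Japan) requested
print(exact, closest, fallback)
```

In this model, a request for French (Canada) resolves to the French (France) caption because both identifiers share a primary language, while a request for Japanese falls back to the default caption.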
Question: A cube contains translations defined for four languages: Chinese (Hong Kong SAR, PRC), French (France), English (United States), and Hindi (India). English (United States) is the default translation. A client application requests information in the Chinese (Taiwan) language. Which of the four translations will SSAS use in this situation?
Considerations for Using Languages and Collations
Following are the considerations for using languages and collations:
Impact on sort order. SSAS uses Windows collations to specify the selected collation for SSAS instances and objects. A Windows collation identifier corresponds to a combination of code page and sort order information. Binary collations sort data based on the sequence of coded values defined by the locale and data type. A binary collation in SQL Server defines the language locale and the American National Standards Institute (ANSI) code page to be used, enforcing a binary sort order. Binary collations are useful in achieving improved application performance due to their relative simplicity. For non-Unicode data types, data comparisons are based on the code points defined in the ANSI code page. For Unicode data types, data comparisons are based on the Unicode code points. For binary collations on Unicode data types, the locale is not considered in data sorts. Previous binary collations in SQL Server performed an incomplete code-point-to-code-point comparison for Unicode data. Binary collations in SQL Server 2005 onwards include a new set of pure code-point comparison collations. The new BIN2 sort order identifies collation names that implement the new code-point collation semantics. In addition, a new comparison flag is added corresponding to BIN2 for the new binary sort. The advantage of using a BIN2 sort order is that no data resorting is required in applications that compare sorted data.
Default language and collation. The default language and collation settings for an SSAS instance specify the settings used for data and metadata when a translation for a specific language identifier is not provided for an SSAS object, or when a client application does not specify a language identifier when connecting to the SSAS instance.
You should ensure that the Collation property of OLAP objects is consistent with the collation of the relational source data when dealing with multilingual data.
Collation controls both sorting and equivalence of strings. Therefore it is important to let the SSAS server use the appropriate collation for the data. If you use the wrong collation, you might see incorrect results, especially if the column is used for a distinct count measure.
Collations can be set at the level of the server, database, cube, and dimension, and also on individual column bindings.
EnableFast1033Locale. If you use English (United States) as the default language for SSAS instances, you can get additional performance benefits by setting the EnableFast1033Locale configuration property. This is an advanced configuration property that is available only for the English (United States) language identifier. Setting the value of this property to True enables SSAS to use a faster algorithm for string hashing and comparison.
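The effect of collation on sort order and on distinct counts can be illustrated with a short Python sketch. It is an analogy only: Python's default string ordering (by code point) stands in for a binary collation, and str.casefold() stands in for a case-insensitive collation. The sample values are hypothetical and no SSAS API is involved.

```python
# Analogy only: code-point ordering stands in for a binary collation, and
# str.casefold() stands in for a case-insensitive collation.

names = ["apple", "Banana", "cherry", "APPLE"]

# Binary-style ordering compares code points, so uppercase letters sort
# before lowercase letters.
binary_order = sorted(names)

# Case-insensitive ordering compares the folded values instead.
case_insensitive_order = sorted(names, key=str.casefold)

# Collation also decides which strings are equal, which changes the result
# of a distinct count over the same column.
binary_distinct = len(set(names))
case_insensitive_distinct = len({name.casefold() for name in names})

print(binary_order)            # ['APPLE', 'Banana', 'apple', 'cherry']
print(case_insensitive_order)  # ['apple', 'APPLE', 'Banana', 'cherry']
print(binary_distinct, case_insensitive_distinct)  # 4 3
```

The same four values produce a distinct count of 4 under one set of rules and 3 under the other, which is why a collation mismatch between the relational source and the OLAP objects can surface as incorrect distinct count results.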
Considerations for Working with Client Applications
When working with client applications in multiple languages for SSAS, the following general guidelines allow you to increase the portability of your business intelligence solution:
Translations provide display information for the names of SSAS objects. However, the identifiers for the same objects are not translated. Whenever possible, you should use the identifiers and keys for SSAS objects instead of the translated captions and names.
Handling Date and Time Values
When you perform month and day-of-week comparisons and operations, use the numeric date and time parts instead of date and time part strings. The strings are determined by the language identifier specified for the instance and by the current translation that the instance provides for the members of the time dimension. Use the date and time functions in MDX for navigating time dimensions, and use the VB.NET date and time functions to return numeric date and time parts instead of name strings. Use the literal date and time part strings only when returning results to be displayed to a user, because the strings are frequently more meaningful than a numeric representation. Do not code any logic that depends on the displayed names being in a specific language.
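The same guideline applies in any client language. The following Python sketch is an analogy to the MDX and VB.NET functions mentioned above, not a replacement for them: the comparison logic uses numeric date parts, and the day name is produced only for display. The function names and sample dates are hypothetical.

```python
from datetime import date

# date.weekday() numbers the days from Monday (0) to Sunday (6),
# regardless of the language in which day names are displayed.
SATURDAY, SUNDAY = 5, 6


def is_weekend(order_date):
    """Decide by numeric day-of-week, never by the displayed day name."""
    return order_date.weekday() in (SATURDAY, SUNDAY)


def display_label(order_date):
    """Day name for display only; the text depends on the active locale."""
    return order_date.strftime("%A")


# Hypothetical order dates: Friday through Monday.
orders = [date(2008, 7, 4), date(2008, 7, 5), date(2008, 7, 6), date(2008, 7, 7)]

weekend_orders = [d for d in orders if is_weekend(d)]
weekend_share = len(weekend_orders) / len(orders)

for d in orders:
    print(d.isoformat(), display_label(d), is_weekend(d))
print(weekend_share)
```

Because is_weekend never inspects the day name, the result stays the same when the display language changes.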
Lab 4: Designing an OLAP Solution Architecture by Using Microsoft SQL Server 2008 Analysis Services
After completing the lab, you will be able to:
- Design dimensions.
- Create dimensions and hierarchies.
- Create a time dimension.
- Create a cube.
You are a lead database designer at Adventure Works. Adventure Works stores sensitive data in its databases. Adventure Works sells its products through a Web site and retail outlets in different geographical locations. Your organization has been tracking sales in its online transaction processing (OLTP) database. The company wants to analyze sales data and compare the results of the two sales channels. You need to identify the key sales trends by analyzing the given interview transcripts of the Sales Managers of the Reseller Division and the Internet Division, and of the Marketing Director.
Interview Transcript of Stephen Yuan Jiang, Sales Manager—Reseller Division
The Reseller Sales division covers all of our worldwide resellers, and we're responsible for setting sales quotas for all of our sales employees, and for tracking sales to resellers. The Sales Management team doesn't really do much in the way of detailed analysis, other than reviewing printed reports.
The exception to that is a small team of sales analysts, led by José Saraiva. Usually they show us sales totals for each quarter by region and by product category or subcategory, but they're able to drill down into specific resellers and products, so for example if one region shows a particularly low sales total one quarter, we can find out whether sales were low across all resellers and products. The sales are totaled by quarter, but I'd like for us to be able to break that down into monthly totals and sometimes even daily totals if necessary. Occasionally, we may also want to see a breakdown of sales figures by Sales Order Number.
We define sales quotas on a quarterly basis for our sales people, and I'd like to be able to track each salesperson's performance against the quota to see whether I need to provide any additional support or training to help them meet their target.
Interview Transcript of Linda Mitchell, Sales Manager—Internet Division
The Internet Sales division processes all orders placed through our Web site. I need to be able to see what products are bought through our Internet site, and information about the customers who buy the products. At the moment, I get a monthly report showing me order totals for each product that month. I can also use Excel to export a list of all orders for the month, but the list is generally too big to be useful for anything other than grouping by customer to find high-spending repeat customers. It also takes ages to export the data, and I often just cancel part way through.
I know that we sell products all over the world, but I'd like to be able to compare order patterns and figures in geographical locations. I'd be really interested in being able to see the different results by country or region, but I'd also like to be able to drill down to state or province level, then into cities, and right to postal code levels.
The monthly order totals are useful, but I'd like to be able to find patterns in Internet orders at a lower level of granularity. For example, it would be useful to know what proportion of our Internet orders are placed at weekends, and to be able to see how Internet order totals are affected by holidays. I'd also like to be able to see the cost associated with each sale, so that we can get an idea of which items are most profitable.
Interview Transcript of David M. Bradley, Marketing Director
The Marketing department tries to promote sales for both the Internet and reseller channels. Our main strategy is to create time-limited special offer campaigns to promote sales of specific products, but we'd like to be able to gather customer data that would help us take a more direct marketing approach.
We currently look at marketing campaigns by financial periods. For example, we need to see aggregated sales totals for the company's financial year, and be able to drill into financial quarters, months, and right down to individual days. We need to compare these campaigns based on the financial period, whether the sales were from the Retail or Internet Sales division, several different customer demographic properties and products. To justify marketing campaign costs we also want to see the average revenue earned per customer in a particular time period or for a new product launch. The product information needs to be broken down by category and subcategory. Currently, we are using several complex, static reports in Excel in addition to Reporting Services. No one report can give us the option to easily slice and dice all of the properties we need to determine the effectiveness of different campaigns.
We'd like to be able to analyze Internet sales by customer, based on demographics such as age, address, education level, marital status, gender, occupation, number of children, yearly income, car ownership, and so on. We know that some of our Internet customers have provided this information, but it's “locked up” somewhere in the sales system. We'd like to be able to get at that data and use it to identify buying patterns and trends that will help us create effective direct marketing campaigns. For example, we have recently identified that a customer's gender and the commute distance between home and office play a very important role in buying patterns. In the near future we would like to see most of our sales figures sliced by commute distance and gender.
For the reseller channel, we need to be able to see a breakdown of sales by region, and we need to be able to drill into individual resellers to see how the type of reseller (such as specialty bike shop, value-added reseller, or warehouse) affects what the reseller buys from us.
Additionally, when we analyze customer sales, we'd like to be able to see the total number of discrete sales, not just the total amount of money spent. Also we may frequently want to see consolidated sales figures of the reseller as well as the internet sales department.
Exercise 1: Designing an OLAP Solution
In this exercise, you will review the interview transcripts and the scenario provided.
The main tasks for this exercise are as follows:
- Organize the Data Source View by using Diagrams.
- Design the OLAP solutions.
Task 1: Organize the data source view.
- Review the tables in the AdventureWorks Data Source View in D:\Labfiles\Mod04\Starter1\Module4_Starter.sln.
- In Diagram Organizer, create three diagrams, Reseller Sales Division, Internet Sales Division, and Marketing Division.
- Add the tables that relate to each division to the respective diagram.
Task 2: Design the OLAP solution.
- Compare the tables in the Data Source View with the requirements mentioned. Compare your answers with DSV_Finished.xls.
- Open D:\Labfiles\Mod04\Starter1\Dimensions.xls and fill out the list of attributes present for each Dimension and the values for the properties mentioned. Compare your answers with D:\Labfiles\Mod04\Solution\Dimensions_Finished.xls.
- Open D:\Labfiles\Mod04\Starter1\Relationships.xls and fill out the relevant information regarding the Attribute Relations and User Hierarchies to be created for each Dimension. Compare your answers with D:\Labfiles\Mod04\Solution\Relationships_Finished.xls.
- Open D:\Labfiles\Mod04\Starter1\Measures.xls and fill out the relevant information regarding the measures to be included in the cube. Compare your answers with D:\Labfiles\Mod04\Solution\Measures_Finished.xls.
- Review the Data Source View to see whether you require any additional relationships between Dimensions and Measure Groups apart from the default Regular Relationships.
Results: After completing this exercise, you should have organized the Data Source View and begun to design the OLAP solution.
Exercise 2: Implementing the Data Source View
In this exercise, you will work with the Data Source View created in the Analysis Services project and implement the calculated columns. You will also create and add the required named queries.
The main tasks for this exercise are as follows:
- Create calculated columns.
- Create named queries.
Task 1: Create calculated columns.
- Open adventureworks.dsv from D:\Labfiles\Mod04\Starter1\Module4_Starter.sln.
- Right-click the DimCustomer table and click New Named Calculation. Specify the column name as FullName and the following query as the Expression.
CASE
  WHEN MiddleName IS NULL THEN
    FirstName + ' ' + LastName
  ELSE
    FirstName + ' ' + MiddleName + '.' + ' ' + LastName
END
3. Repeat step 2 for the other calculated columns that you designed as part of step 1 in Task 3 of the previous exercise.
Task 2: Create named queries.
1. Specify FactSalesSummary as the Name of the query.
Results: After completing this exercise, you should have created calculated columns and named queries.
Exercise 3: Implementing Dimensions
In this exercise, you will add the required attributes to the dimensions in the project created in the previous exercise. You will also create the User Defined Hierarchies and Attribute Relationships as defined in Exercise 1.
The main tasks for this exercise are as follows:
- Add Attributes.
- Set Attribute Properties.
- Create User Defined hierarchies and Attribute Relationships.
Task 1: Add attributes.
- Open Product.dim from D:\Labfiles\Mod04\Starter2\Module4_Starter.sln.
- Add attributes to the Product Dimension as designed in Task 3 of Exercise 1.
- Assign a value of Key to the Usage property of the ProductKey attribute.
- Set the NameColumn property of the ProductKey attribute to EnglishProductName.
- Repeat the process for the following dimensions:
- Sales Territory
- Reseller Sales Order Details
- Internet Sales Order Details
Task 2: Set attribute properties.
1. Modify the following properties of Product Dimension as per the design decisions taken in Task 3 of Exercise 1:
- GroupingBehavior of each Attribute.
2. Repeat the process for other dimensions.
Task 3: Create user-defined hierarchies and attribute relationships.
- Open Product.dim from D:\Labfiles\Mod04\Starter2\Module4_Starter.sln.
- Drag the CategoryName attribute from the Attributes pane to the Hierarchies pane.
- Drag the SubCategoryName attribute from the Attributes pane to the Hierarchies pane and drop it below the CategoryName attribute.
- Drag the Product attribute from the Attributes pane to the Hierarchies pane and drop it below the SubCategoryName attribute.
- Review the attribute relationships created and verify them against Relationships.xls.
- Review your attribute relationship design.
- Repeat the process for other dimensions.
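Before you rely on an attribute relationship such as Product to SubCategoryName to CategoryName, it can help to confirm that the source data really is many-to-one at each level. The following Python sketch shows one way to express that check. It is a minimal illustration: the function name and the sample rows are hypothetical, and in practice the rows would come from the dimension table.

```python
def violates_many_to_one(rows, child, parent):
    """Return the child values that map to more than one parent value."""
    parents_by_child = {}
    for row in rows:
        parents_by_child.setdefault(row[child], set()).add(row[parent])
    return sorted(c for c, parents in parents_by_child.items() if len(parents) > 1)


# Hypothetical sample rows from a product dimension table.
products = [
    {"Product": "Mountain-100", "SubCategoryName": "Mountain Bikes", "CategoryName": "Bikes"},
    {"Product": "Road-150", "SubCategoryName": "Road Bikes", "CategoryName": "Bikes"},
    {"Product": "Sport-100 Helmet", "SubCategoryName": "Helmets", "CategoryName": "Accessories"},
]

# Each level of the hierarchy must roll up to exactly one member above it.
product_issues = violates_many_to_one(products, "Product", "SubCategoryName")
subcategory_issues = violates_many_to_one(products, "SubCategoryName", "CategoryName")

# A row that places an existing subcategory under a second category breaks
# the relationship and is reported by the check.
bad_rows = products + [
    {"Product": "Generic Helmet", "SubCategoryName": "Helmets", "CategoryName": "Clothing"}
]
bad_subcategory_issues = violates_many_to_one(bad_rows, "SubCategoryName", "CategoryName")

print(product_issues, subcategory_issues, bad_subcategory_issues)
```

An empty result at every level suggests that the levels can be related as a natural hierarchy; any value that is reported needs to be corrected in the source data or modeled differently.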
Results: After completing this exercise, you should have added required attributes to the dimensions and set attribute properties. You should have also created User Defined Hierarchies and Attribute Relationships.
Exercise 4: Creating a Cube
In this exercise, you will add a cube to the existing Analysis Services project and include the dimensions created. You will also add the relevant measures to the cube and then review the Dimension Usage. You will need to create a new Referenced Relationship and a new Fact Relationship.
The main tasks for this exercise are as follows:
- Create a Cube and add Dimensions and Measures.
- Create relations between Dimensions and Measure Groups.
- Deploy and test the cube.
Task 1: Create a cube and add dimensions and measures.
- Open the AdventureWorks cube from D:\Labfiles\Mod04\Starter2\Module4_Starter.sln.
- Add cube dimension for AdventureWorks node.
- In the Measures pane, right-click and then click New Measure.
- In the Source Table list, select the FactInternetSales table.
- In the Source Column list, select the SalesAmount column.
- Select the Sum function in the Usage list.
- Repeat the process for other measures.
Task 2: Create relations between dimensions and measure groups.
- In the Dimension Usage tab of the Cube Designer, view the Regular relationships and the Fact relationships for the Internet Sales Order Details and Reseller Sales Order Details dimensions.
- Allow slicing of measures in the Reseller Sales measure group by the Geography dimension.
- Set the relationship type as Referenced.
- Select the Reseller dimension as the intermediate dimension, and select Geography Key as both the reference dimension attribute and the intermediate dimension attribute.
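The referenced relationship configured in the steps above lets reseller sales be sliced by geography even though the fact table carries no geography key. The following Python sketch illustrates the idea with plain dictionaries. The keys, amounts, and function name are hypothetical sample values, and the code models only the lookup path, not how SSAS stores or processes the relationship.

```python
from collections import defaultdict

# Hypothetical sample data.
geography = {1: "United States", 2: "France"}  # GeographyKey -> country
reseller = {10: 1, 11: 1, 12: 2}               # ResellerKey -> GeographyKey
reseller_sales = [                             # (ResellerKey, SalesAmount)
    (10, 100.0),
    (11, 250.0),
    (12, 75.0),
    (10, 50.0),
]


def sales_by_country(facts, reseller_to_geography, geography_to_country):
    """Total the fact rows by country through the intermediate dimension."""
    totals = defaultdict(float)
    for reseller_key, amount in facts:
        # The fact row knows only its reseller; the reseller row supplies
        # the Geography Key that links to the reference dimension.
        geography_key = reseller_to_geography[reseller_key]
        totals[geography_to_country[geography_key]] += amount
    return dict(totals)


totals = sales_by_country(reseller_sales, reseller, geography)
print(totals)
```

Here the Reseller mapping plays the role of the intermediate dimension, and Geography Key is the attribute that joins it to the reference dimension.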
Task 3: Deploy and test the cube.
- Build and deploy the Module4_Starter project.
- Reconnect the Cube Designer, review the interview transcripts, and test the cube.
Results: After completing this exercise, you should have added a cube to the project and included the dimensions and measures. You should have also created a new Referenced Relationship and a new Fact Relationship.
Module Review and Takeaways
- How can surrogate keys benefit an OLAP solution?
- What are the advantages of creating a Time dimension table versus using a Server Time dimension?
- When could you use Fact Relationships?
- What are the two types of attribute relationships you can create?
The following are the key points to take away from this module:
The main function of SSAS is to create a data store that users can query to answer business-related questions. Users can access the information provided by SSAS to:
- Understand the business through analysis of current performance and key metrics.
- Make business decisions based on past and present correlations between business contexts (dimensions) and results (facts).
- In a data warehouse, the most common OLAP design is implementing a combination of star and snowflake schemas. Where a dimension would benefit from reduced redundancy and referential integrity, you can use a snowflake schema and continue to use a star schema for all other dimensions.
- You should use surrogate keys in dimensional models, instead of relying on operational production codes.
- To design your dimensions, you should:
- Select the business process to follow.
- Declare the grain.
- Choose the dimensions that apply to each fact row and identify the facts.
- Attributes add to the complexity and storage requirements of a dimension, and the number of attributes in a dimension can significantly affect performance. This is especially true of attributes that have AttributeHierarchyEnabled set to True.
- SSAS must understand the relationships between your attributes to correctly aggregate data, effectively store and retrieve data, and create useful aggregations. To optimize your dimension design, you must understand how your attributes are related to each other and then take steps to let SSAS know what those relationships are.
- In SSAS, attributes can be exposed to users by using two types of hierarchies: attribute hierarchies and user hierarchies. From a performance perspective, attributes that are only exposed in attribute hierarchies are not automatically considered for aggregation. In user hierarchies, attributes are arranged into predefined multilevel navigation trees to facilitate end-user analysis.
- In a natural hierarchy, all attributes participating as levels in the hierarchy have direct or indirect attribute relationships from the bottom of the hierarchy to the top of the hierarchy. In an unnatural hierarchy, the hierarchy consists of at least two consecutive levels that have no attribute relationships.
- In natural hierarchies, the hierarchy tree is materialized on disk in hierarchy stores. In addition, all attributes participating in natural hierarchies are automatically considered to be aggregation candidates.
- A ragged hierarchy is a hierarchy in which the logical parent of at least one member is not in the level immediately above that member.
- You can present the same information to users by using perspectives or using multiple cubes. Perspectives are often a simpler and more efficient solution than multiple cubes.
- Each KPI formula is defined as a hidden calculated measure, except when it is just a reference to another measure. For better performance, you should create real calculated measures with properly designed calculations and then use these measures in KPI definitions.
- To determine what types of actions to implement, consider the functionality that is required and any limitations in the client application.
- If a client application requests information in a specified language identifier, the SSAS instance attempts to resolve data and metadata for SSAS objects to the closest possible language identifier.
- Translations provide display information for the names of SSAS