Database Choice And Download And Population Computer Science Essay

Published: Last Edited:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

I have chosen Microsoft SQL Server as my choice of DBMS for this coursework. I have chosen this DBMS over Oracle, MYSQL and Postgres as this DBMS is a Microsoft product, there is great integration with other Microsoft products, such as: Sharepoint, ASP.NET, Excel, to name a few. But the swinging factor that made me choose this DBMS is, because of the lab sessions at the institution that I am studying for this module, is using Microsoft SQL Server, thus I feel most comfortable using this DBMS.

I used this guide, found online, to help me decide which package of Microsoft SQL Server 2012 express, to download and install onto my system; Microsoft SQL Server 2012 express can be downloaded from here,

Of the different packages available, I chosen to download and install the Express with Tools package. Following the guide, I did not encounter any problems arising from the installation or from the setup process.

Populating with Mondial database

From the link given in the coursework,, the mondial database was not difficult to search using Google. Mondial database, The archived file was downloaded and extracted. Three separate .sql files exist in the archive.

As the three files are .sql extensions, double-clicking it opens Microsoft SQL Server 2012 management studio.

First the 'mondial-schema-sqlserver2008' file was opened and executed. This file contains all the instructions on creating a new mondial database and also populating it with new tables that would hold data.

Next, the 'mondial-inputs-sqlserver2008' was opened and executed. This file contains all the data that would fill the newly created tables. Now, the mondial database contains information.

Lastly, the 'modial_latlong' file was opened and executed. This file updates the database by adding a new column to the city, island and mountain table.

A2 User's Manual Summary

Chapter 1: Getting Started (SQL Server 2012)

1.01. Product Specifications for SQL Server 2012:

Compute Capacity Limits by Edition of SQL Server: discusses on the compute capacity limits of different editions of SQL Server 2012 and how they would differ in a physical and virtualized environment with hyper-threaded processors

Editions and Components of SQL Server 2012: different editions of SQL Server 2012 caters to the requirements of the organization or individual, this section provides the information that would help the organization/user to pick the edition of SQL Server 2012 that would suite their requirements

Features Supported by the Editions of SQL Server 2012: details of the features that different edition of SQL Server 2012 provides and supports

Maximum Capacity Specifications for SQL Server: specifies the maximum sizes and numbers of various objects defined in SQL Server components

1.02 What's New in SQL Server 2012:

New and enhanced features of SQL Server 2012: Cross-Cluster Migration of AlwaysOn Availability Groups for OS Upgrade, Selective XML Index, DBCC SHOW_STATISTICS works with SELECT permission, New function returns statistics properties, New function returns statistics properties, SSMS Complete in Express; SlipStream Full Installation, Business Intelligence highlights (with SQL Server 2012 SP1, Office and SharePoint Server 2013), Management Object Support Added for Resource Governor DDL.

1.03 Quick-Start Installation of SQL Server 2012:

A quick how to guide on setting-up SQL Server 2012 onto your machine.

1.04 Get Started with Product Documentation for SQL Server:

Instructions on how to install, access and navigate through the product documentation.

1.05 Getting Assistance (SQL Server 2012):

Different avenues of information and assistance on SQL Server 2012

Accessibility for People with Disabilities: Accessibility Features for SQL Server 2012, Alternative Formats of the Microsoft documentation for customers with disabilities

Providing Feedback for SQL Server 2012: report usage data for SQL Server 2012

1.06 Backward Compatibility:

Backward compatibility information for SQL Server components.

Deprecated SQL Server Features in SQL Server 2012: features that are scheduled to be removed in a future release of SQL Server

Discontinued SQL Server Features in SQL Server 2012: features that are no longer included in SQL Server 2012

Breaking Changes to SQL Server Features in SQL Server 2012: describes about the changes in SQL Server 2012 that might cause the functions, scripts and applications that are based on previous versions of SQL Server to not work normally in this version

Behaviour Changes to SQL Server Features in SQL Server 2012: describes about the behaviour changes that would affect how features works or interacts in SQL Server 2012 as compared to earlier versions of SQL Server, SQL Server Multi-Subnet Failover Cluster, SQL Server Failure detection in SQL Server Failover Cluster.

1.07 About the SQL Server License Terms:

The license terms is installed onto system once SQL Server is installed but only a single copy of the license terms is installed when there is more than one edition of SQL Server installed onto the machine and the license applies to all the editions.

1.08 Microsoft SQL Server Privacy Statement:

Explains about the data collection and use practices for the Microsoft SQL Server 2012.

1.09 SQL Server Windows Logo Certification:

Details about the SQL Server certification for the Windows Server logo program.

1.10 SQL Server Files Left After Uninstall:

List of files left after the uninstallation of SQL Server and their location directories.

1.11 Legal Notice for Documentation (SQL Server Books Online):

The legal notice for this user manual.

Chapter 2: SQL Server Database Engine

2.01 What's New (Database Engine):

Enhancements to the database engine in SQL Server 2012.

Availability Enhancements (Database Engine): ensures that when a downtime occurs, data is not lost

Manageability Enhancements (Database Engine): improved manageability of the database engine

Programmability Enhancements (Database Engine): making it simpler to program database

Scalability and Performance Enhancements (Database Engine): the new data warehouse query acceleration feature, improves data warehouse query performance and SQL Server 2012 has more partition support as compared to previous versions

Security Enhancements (Database Engine): includes provisioning during setup, new SEARCH PROPERTY LIST permissions, new user-defined server roles, and new ways of managing server and database roles

Resource Governor Enhancements (Database Engine): enables effective govern performance in multi-tenancy environments, includes support for 64 resource pools, better CPU usage control and greater resource pool affinity for partitioning of physical resources.

2.02 SQL Server Database Engine Backward Compatibility:

This topic describes the backward compatibility in versions of SQL Server 2012.

Deprecated Database Engine Features in SQL Server 2012: eg. Ability to return result sets from triggers

Discontinued Database Engine Functionality in SQL Server 2012: eg. 80 compatibility levels; Databases must be set to at least compatibility level 90

Breaking Changes to Database Engine Features in SQL Server 2012: eg. CREATE LOGIN WITH PASSWORD = 'password' HASHED option cannot be used with hashes created by SQL Server 7 or earlier

Behaviour Changes to Database Engine Features in SQL Server 2012: eg. The LOG function now has an optional base parameter

Full-Text Search Backward Compatibility: FULLTEXTCATALOGPROPERTY ('LogSize') is not supported in the next release of SQL Server, there is no full-text search features are discontinued in SQL Server 2012, Inline functions with a full-text operator (in SQL Server 2005: both columns appear but in SQL Server 2008 & later: only one column appear), updates all the word breakers and stemmers used by Full-Text Search and Semantic Search.

2.03. SQL Server Management Tools Backward Compatibility:

This topic describes the changes in behaviour between versions of SQL Server.

Deprecated Management Tools Features in SQL Server 2012: eg. SQL Server 2005 Registered Server API to be removed in the future release of SQL Server

Discontinued Management Tools Features in SQL Server 2012: eg. ActiveX subsystem for SQL Server Agent has been removed in SQL Server 2012. There is no replacement functionality.

Breaking Changes to Management Tools Features in SQL Server 2012: eg. Not able to create a utility control point on a SQL Server 2008 R2 instance of SQL Server in SQL Server 2012 Management Tools

2.04. Database Features and Tasks:

Database Engine Instances: An instance of the Database Engine is a copy of the sqlservr.exe executable that runs as an operating system service and each instance manages several system databases and one or more user databases. Each computer can run multiple instances of the Database Engine

Database Features: features and tasks associated with databases, database objects, data types, and the mechanisms used to work with or manage data

Database Engine Cross-Instance Features: tools and tasks associated with managing and monitoring servers and database instances

High Availability Solutions (SQL Server): high-availability solutions masks the effects of a hardware or software failure and maintains the availability of applications so that the perceived downtime for users is minimized

Security and Protection (Database Engine): SQL Server uses operating system files for operation and data storage, thus the best practice is to restrict access to these files; two types of authentication modes for Database Engine, Windows Authentication mode and mixed mode; SQL Server passwords can contain up to 128 characters, including letters, symbols, and digits.

2.05. Technical Reference:

Command Prompt Utility Reference (Database Engine): Command prompt utilities enable you to script SQL Server operations, eg. Profiler Utility, start SQL Server Profiler from a command prompt; <drive>:\Program Files\Microsoft SQL Server\110\ Tools\Binn

Database Engine PowerShell Reference: SQL Server includes a set of Windows PowerShell 2.0 cmdlets for performing common actions in the Database Engine. In addition, Query Expressions and Uniform Resource Names (URN) can be converted to SQL Server PowerShell paths, or used to specify one or more objects in the Database Engine. Eg. To evaluate whether a Database Engine object complies with a Policy-based Management policy; Cmdlet: Invoke-PolicyEvaluation cmdlet

Errors and Events Reference (Database Engine): Each error message has a unique error number, contains diagnostic information about the cause of the error and most messages have substitution variables in which information, such as the name of the object generating the error, is inserted. There are levels of severity for the errors that are raised by the SQL Server Database Engine

Showplan Logical and Physical Operators Reference: the query plan can be viewed graphically by using SET SHOWPLAN statements in the SQL Server Management Studio.

SQL Server Event Class Reference: events can be recorded as they occur using the SQL Server Profiler and these recorded events are instances of the event class in the trace definition

2.06. Transact-SQL Reference:

Transact-SQL is a dialect of the SQL language where all applications use in SQL Server to communicate with the server.

Reserved Keywords (Transact-SQL): are used to define, manipulate, and access database

Other statements of T-SQL includes: BACKUP and RESTORE Statements, Collation, Data Definition Language (DDL) Statements, Data Manipulation Language (DML) Statements, SET Statements

2.07. XQuery Language Reference (SQL Server):

Transact-SQL supports a subset of the XQuery language that is used for querying the xml data type. This sub-topic explains about XQuery language and its usage in SQL Server 2012

Chapter 3: Analysis Services

Analysis Services provides a range of solutions for building and deploying analytical databases used for decision support in Excel, PerformancePoint, Reporting Services, and other business intelligence applications.

3.01. What's New (Analysis Services):

Some examples; PowerPivot supports deeper integration with Excel, DAX Functions in SQL Server 2012, hierarchies and images in tabular models

3.02. Analysis Services Backward Compatibility:

Deprecated Analysis Services Functionality in SQL Server 2012: eg. the Create Action statement is included for backwards compatibility and is replaced by the Action object

Discontinued Analysis Services Functionality in SQL Server 2012: eg. Decision Support Objects (DSO) library that provided compatibility with SQL Server 2000 Analysis Services databases is also discontinued and no longer part of SQL Server

Breaking Changes to Analysis Services Features in SQL Server 2012: Setup commands removed for a PowerPivot for SharePoint installation

Behavior Changes to Analysis Services Features in SQL Server 2012: eg. Analysis Services, Multidimensional Mode; cube browser in Management Studio and Cube Designer has been removed, PowerPivot for SharePoint; higher permission requirements for using a PowerPivot workbook as an external data source

3.03. Analysis Services Features and Tasks:

Analysis Services foundational documentation is organized by modelling and server mode so that you can focus on just the tools, tasks, and features that are available in the mode you installed.

Analysis Services Instance Management: An instance of Analysis Services is a copy of the msmdsrv.exe that runs as an operating system service and each instance is fully independent of other instances on the same server, having its own configuration settings, permissions, ports, start-up accounts, file storage, and server mode properties. This sub-topic includes information on how to determine the server mode of an analysis services instance, configure service accounts and how to register an analysis services instance in a server group

Tabular Modeling (SSAS Tabular): Tabular models are in-memory databases in Analysis Services and it supports data access through the cache mode and the DirectQuery mode.

Multidimensional Modeling (SSAS): The primary reason to use multidimensional model is to achieve fast performance of ad hoc queries against business data. The other benefit to use Analysis Services multidimensional database, is its integration with commonly used business intelligence reporting tools such as Excel, Reporting Services, and PerformancePoint, as well as custom applications and third-party solutions

Data Mining (SSAS): Data mining will help to discover patterns in the data and it allows you to make intelligent decisions about complex problems

PowerPivot for SharePoint (SSAS): PowerPivot for SharePoint provides centralized data access to PowerPivot workbooks in a SharePoint environment

3.04. Technical Reference (SSAS):

Data Mining Stored Procedures (Analysis Services - Data Mining): SystemGetCrossValidationResults; this partitions the mining structure into the specified number of cross-sections, trains a model for each partition, and returns accuracy metrics for each partition, SystemGetClusterCrossValidationResults; this statement will return similar results as the above command but the statement only works if the mining structure contains at least one clustering model, SystemGetAccuracyResults; this will return cross-validation accuracy metrics for a mining structure and all related models, SystemGetClusterAccuracyResults; this will return cross-validation accuracy metrics for a mining structure and related clustering models

Errors and Events Reference (PowerPivot for SharePoint): example of an error; 'An error occurred during an attempt to establish a connection to the external data source. The following connections failed to refresh: PowerPivot Data', this occurs when query is done on a machine that does not have PowerPivot for SharePoint installed on it

Analysis Services PowerShell Reference: allows the usage of Windows PowerShell to navigate, administer, and query Analysis Services objects with cmdlets

PowerPivot Reference for SharePoint PowerShell: list of PowerShell cmdlets used to configure or administer a PowerPivot for SharePoint installation. Examples: Get-PowerPivotServiceApplication cmdlet will return the global properties of the SQL Server Analysis Services object in the farm

Query and Expression Language Reference (Analysis Services): Sub-topic includes: Multidimensional Expressions (MDX) Reference, Data Analysis Expressions (DAX) Reference and Data Mining Extensions (DMX) Reference

User Interface Reference (Analysis Services): help topics for Microsoft SQL Server Analysis Services

Chapter 4: SQL Server Integration Services

4.01. What's New (Integration Services): new project deployment model allows projects to be deployed onto the Integration Services server, Development Enhancements includes: the ability to create connection managers at the project level that can shared by multiple packages in the project and the ability to undo and redo up to twenty actions in the SSIS Designer

4.02. Integration Services Backward Compatibility:

Deprecated Integration Services Features in SQL Server 2012: there are no deprecated Integration Services features in SQL Server 2012

Discontinued Integration Services Functionality in SQL Server 2012: options for connecting shapes on the design surface of the Control Flow tab and Data Flow tab have been discontinued in SQL Server 2012, the Execute DTS 2000 Package task has also been discontinued

Breaking Changes to Integration Services Features in SQL Server 2012: There are no breaking changes in SQL Server 2012 Integration Services (SSIS) features

Behavior Changes to Integration Services Features in SQL Server 2012: There are no behavior changes in SQL Server 2012 Integration Services (SSIS) features

4.03. Integration Services Features and Tasks:

Integration Services (SSIS) and Studio Environments: includes two studios for working with Integration Services, SQL Server Data Tools (SSDT) for developing the Integration Services packages that a business solution requires (able to create packages that include complex control flow, data flow, event-driven logic, and logging) and SQL Server Management Studio for managing packages in a production environment (able to create folders to organize packages in a way that is meaningful to your organization)

Integration Services (SSIS) Packages: able to combine a collection of connections, control flow elements, data flow elements, event handlers, variables, parameters, and configurations into a package

Integration Services (SSIS) Connections: able to connect to source and destination data stores such as text, XML, Excel workbooks, and relational databases to extract and load data

Integration Services (SSIS) Projects: an Integration Services project stores and groups the files that are related to the package

Integration Services (SSIS) Parameters: allow you to assign values to properties within packages at the time of package execution. Project parameters are created at the project level and package parameters are created at the package level

Integration Services (SSIS) Queries: provides the tool, query builder, to create a query for use int he Execute SQL task, the OLE DB source and the OLE DB destination, and the Lookup transformation. Query builder may also; join related tables, query or update databases, view and edit results immediately

Integration Services (SSIS) Expressions: provides the graphical tool, expression builder, for building expressions. The expression builder tool helps create and edit a property expression or write the expression that sets the value of a variable using a graphical user interface that lists variables and provides a built-in reference to the functions, type casts, and operators that the Integration Services expression language includes.

Integration Services (SSIS) Variables: variables in Integration Services packages can be used for; updating properties of package elements at run time, including an in-memory lookup table, Building expressions that include variable values

Integration Services (SSIS) Event Handlers: custom event handlers can be created to manage an OnError event. Event hadlers may perform these tasks as well; clean up temporary data storage when a package or task finishes running, retrieve system information to assess resource availability before a package runs

Integration Services Service (SSIS Service): this is a Windows service for managing Integration Services packages.0020 SQL Server 2012 supports the Integration Services service for backward compatibility with earlier releases of Integration Service. The Integration Services service provides these management capabilities; start/stop/monitor remote and locally stored packages, importing and exporting packages, managing package storage

Integration Services (SSIS) Server: the Integration Services server is an instance of the SQL Server Database Engine that hosts the SSISDB database. The database stores the following objects: packages, projects, parameters, permissions, server properties, and operational history

Deployment of Projects and Packages: supports two deployment models, project deployment model and package deployment model

Monitoring for Package Executions and Other Operations: Integration Services (SSIS) Logging (monitor logs), Performance Counters (monitor the performance of the data flow engine)

Security Overview (Integration Services): Security in SQL Server Integration Services consists of several layers that provide a rich and flexible security environment. These includes, the use of digital signatures, package properties, SQL Server database roles, and operating system permissions

4.04. Technical Reference (Integration Services):

Errors and Events Reference (Integration Services): an example, DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER (this indicates that the package cannot run because a specified connection failed)

Transact-SQL Reference (Integration Services Catalog): this sub-topic describes the Transact-SQL API for administering Integration Services projects that have been deployed to an instance of SQL Server

Chapter 5: Data Quality Services

5.01. Introducing Data Quality Services:

DQS provides assistance to manage the quality and integrity of data sources. It also helps in the discovery, building and managing knowledge about your data. With the knowledge it would be easier to perform, data cleansing, matching and profiling activities

5.02. Data Quality Services Concepts:

Knowledge Management Concepts: DQS knowledge management includes the processes used to create and manage the knowledge base, both in a computer-assisted manner and interactively

Data Quality Project Concepts: The data steward performs data-quality operations (cleansing and matching) using a data quality project in the Data Quality Client application

Data Quality Administration Concepts: DQS administrator can perform variety of administrative tasks using the Data Quality Client application

5.03. Data Quality Services Features and Tasks:

Data Quality Client Application: This enables performing of data quality operations using a standalone tool. This application enables creation of knowledge bases, create and run data quality projects, and perform administrative tasks.

DQS Knowledge Bases and Domains: DQS enables the use of both computer-assisted and interactive processes to create, build, and update your knowledge base, which helps in the process to cleanse data.

Data Quality Projects (DQS): DQS uses the knowledge base to improve the quality of the source data by performing data cleansing and data matching activities. Then, exporting the resultant data to a SQL Server database or a .csv file

Data Cleansing: Data Cleansing is the process of analysing the quality of data in a data source, manually approving/rejecting the suggestions by the system and making changes to the data. It identifies incomplete or incorrect data in the data source, provides a two-step process to cleanse the data, computer-assisted and interactive, it standardizes and enriches customer data by using domain values, domain rules and reference data

Data Matching: Data Matching helps to reduce data duplication and improve data accuracy in a data source. Data matching eliminates differences between data values that should be equal, determining the correct value and reducing the errors that data differences can cause

Reference Data Services in DQS: This function enables subscription to third-party reference data providers, to easily cleanse and enrich your business data by validating it against their high-quality data

Data Profiling and Notifications in DQS: Data profiling in DQS is the process of analysing the data in an existing data source and displaying statistics about the data in DQS activities

DQS Administration: This allows administration and managing various DQS activities performed on Data Quality Server, configure server-level properties related to DQS activities, configure the Reference Data Service settings and configure DQS log settings

DQS Security: The security infrastructure is based on the SQL Server security infrastructure, where the database administrator grants a user a set of permissions by associating the user with a DQS role. This ensures that the user has access to and the functional activities that the user is allowed to perform

Chapter 6: SQL Server Replication

6.01. What's New (Replication):

Replication Support these features for AlwaysOn Availability Groups:

A database that is enabled for Change Data Capture (CDC) can be part of an availability group

A database enabled for Change Tracking (CT) can be part of an availability group

Replication does support extended events but however, only for internal use and it supports up to 15000 partitions for tables and indexes

6.02. Replication Backward Compatibility:

Deprecated Features in SQL Server Replication:

The RMO API is deprecated. New applications can be build using Transact-SQL

Heterogeneous replication to non-SQL Server subscribers is deprecated. To move data, create solutions using change data capture and SSIS

Oracle Publishing is deprecated. To move data, create solutions using change data capture and SSIS

Breaking Changes in SQL Server Replication:

Transactional Replication: Initializing a transactional subscription from a backup. To initialize a subscription from a backup in SQL Server 2008, a user must be a member of the dbcreator server role

Merge Replication: New identity ranges assigned. For tables that use automatic identity range management, replication might assign new identity ranges during upgrade. If any tables have a larger identity range assigned to the Subscriber than to the Publisher, replication assigns a range to the Publisher equal to that of the Subscriber

6.03. Replication Features and Tasks:

Types of Replication:

Snapshot Replication- distributes data exactly as it appears at a specific moment in time and does not monitor for updates to the data

Transactional Replication- typically starts with a snapshot of the publication database objects and data. As soon as the initial snapshot is taken, subsequent data changes and schema modifications made at the Publisher are usually delivered to the Subscriber as they occur (in near real time)

Merge Replication- typically starts with a snapshot of the publication database objects and data. Subsequent data changes and schema modifications made at the Publisher and Subscribers are tracked with triggers.

Heterogeneous Database Replication: SQL Server 2012 supports these heterogeneous scenarios for transactional and snapshot replication,

Publishing data from Oracle to SQL Server

Publishing data from SQL Server to non-SQL Server Subscribers

Replication Agents: Replication uses many stand-alone programs (agents) to perform the tasks associated with tracking changes and distributing data

Republish Data: In a republishing model, the Publisher sends data to a Subscriber, which then republishes the data to any number of other Subscribers. Especially useful when a Publisher must send data to Subscribers over a slow or expensive communications link. If there are a number of Subscribers on the far side of that link, using a republisher shifts the bulk of the distribution load to that side of the link

Replication over the Internet: this method allows remote, disconnected users to access data when they need it using a connection to the Internet. Replication of data over the internet can be done though, Virtual Private Network or web synchronization option for merge replication

Security and Protection (Replication): Reduce threats to a replication topology through encryption of data, authentication and authorization to control access to replicated database objects and to the computers and agents involved in replication processing, management of logins and passwords in Replication

Administration (Replication): This sub-topic contains information on administering replication topologies.

Best Practices for Replication Administration

Develop and test a backup and restore strategy

Script the replication topology

Create thresholds and alerts

Monitor the replication topology

Establish performance baselines and tune replication if necessary

Validate data periodically

Adjust agent parameters through profiles

Adjust publication and distribution retention periods

Understand how to change article and publication properties if application requirements change

Understand how to make schema changes if application requirements change

Monitoring (Replication): This is an important aspect of deploying replication. As replication activity is distributed, it is essential to track the activity and status across all computers involved in replication. Tools that can be used to monitor replication:

Microsoft SQL Server Replication Monitor

Microsoft SQL Server Management Studio

Transact-SQL and Replication Management Objects (RMO)

Alerts for replication agent events

System Monitor

Publish Data and Database Objects:

The following objects can be publish using replication:


Partitioned Tables

Stored Procedures - Definition (Transact-SQL and CLR)

Stored Procedures - Execution (Transact-SQL and CLR)


Indexed Views

Indexed Views as Tables

User-Defined Types (CLR)

User-Defined Functions (Transact-SQL and CLR)

Alias Data Types

Full text indexes

Schema Objects (constraints, indexes, user DML triggers, extended properties, and collation)

Provide these information to create a publication:

The distributor

Location of the snapshot files

The publication database

Type of publication to create

The data and database objects to include in the publication

Static row filters and column filters for all types of publications, and parameterized row filters and join filters for merge publications

The Snapshot Agent schedule

Accounts under which the following agents will run: the Snapshot Agent for all publications; the Log Reader Agent for all transactional publications; the Queue Reader Agent for transactional publications that allow updating subscriptions

Name and description for the publication

Configure Distribution: This sub-topic deals with the process of configuring the Publisher to a distribution database server (distributer). While only one publisher can be assigned to a distributer, a distributer can serve many publishers.

Subscribe to Publications: A subscription is a request for a copy of the data and database objects in a publication and it defines which publication will be received, and where and when it will be received.

Types of subscriptions:

Push Subscription; publisher will notify subscriber that there is a change and the changes will be pushed, when the subscriber demands for it, continuously or on a scheduled basis.

Pull Subscription; subscriber requests changes made at the publisher. This allows the user at subscriber to determine when the data changes are synchronized

Provide the following information to create a subscription:

Name of the publication

Name of the Subscriber and the subscription database

Whether the Distribution Agent or Merge Agent runs at the Distributor or at the Subscriber

Whether the Distribution Agent or Merge Agent runs continuously, on a scheduled basis or on demand only

Whether the Snapshot Agent should create an initial snapshot for the subscription and whether the Distribution Agent or Merge Agent should apply that snapshot at the Subscriber

Accounts under which the Distribution Agent or Merge Agent will run

For merge replication, the type of subscription. Either server or client

Initialize a Subscription: Subscribers must be initialized. This is so that they would have a copy of the schema from each article in the publication that they have subscribed to and any replication objects that are required.

Initialization methods:

Initialize a Subscription with a Snapshot (default initialization method):

Initialize a Transactional Subscription Without a Snapshot

Reinitialize Subscriptions

Synchronize Subscriptions (Replication): Replication Agents synchronize subscriptions, Distribution Agent synchronizes subscriptions to transactional and snapshot publications and Merge Agent synchronizes subscriptions to merge publications

Synchronize Data: This is the process of data and schema changes being updated between the Publisher and Subscribers after the initial snapshot has been applied at the Subscriber

Synchronization can occur:

Continuously, typical for transactional replication

On demand, typical for merge replication

On a schedule, typical for snapshot replication

Validate Replicated Data: Validation can be performed for specific subscriptions or for all subscriptions to a publication. By specifying the validation types, the Distribution Agent or Merge Agent will validate data the next time it runs

Scripting Replication: As part of a disaster recovery plan, all replication components should be scripted. Scripts can be stored with backup files so as, when there is a need for a replication topology to be reconfigured, this can happen

6.04. Technical Reference (Replication):

Replication Views (Transact-SQL): These views contain information that is used by replication in SQL Server. When the database is removed from a replication topology, all the replication objects are also removed from user databases

Replication Tables (Transact-SQL): Replication adds system tables to the user database when the database is configured as a Publisher or a Subscriber. When the user database is removed from the replication topology these tables will be removed too

Replication Stored Procedures (Transact-SQL): These procedures are documented and available as a method for accomplishing one-time tasks; implementing replication is an example. The Replication Management Objects (RMO) aids in the tasks of adding programmatic control of replication to an application or carrying out repeated replication tasks

Properties Reference (Replication): This section contains information about the properties of replication wizards and dialog boxes

Tools Reference (Replication): This sub-topic is about the tools SQL Server provides to implement, administer and to troubleshoot replication

Errors and Events Reference (Replication): This topic is on the cause and resolution information for errors related to replication

Chapter 7: Reporting Services (SSRS)

7.01. What's New (Reporting Services):

Support for Power View in Microsoft Excel 2013 and Power View in Microsoft SharePoint 2013

The Reporting Services report server in SharePoint mode supports SharePoint 2013

There is a new version of the Reporting Services add-in for SharePoint that supports SharePoint 2013 and SharePoint 2010

Reports can be viewed interactively in Apple Safari on iOS devices

7.02. Reporting Services Backward Compatibility:

Deprecated Features in SQL Server Reporting Services in SQL Server 2012:

HTML Rendering Extension Device Information Settings










Reporting Services BIFF8 rendering extensions Reporting Services reports to the Microsoft Word and Microsoft Excel 1997-2003 binary interchange file format

Report Definition Language (RDL) 2005 and earlier

Custom Report Items (CRI) compiled for SQL Server Reporting Services 2005 and earlier

Semantic modeling language (SMDL) report models

Discontinued Functionality to SQL Server Reporting Services in SQL Server 2012: There is none at the moment

Breaking Changes in SQL Server Reporting Services in SQL Server 2012:

SharePoint Mode Server References Require the SharePoint Site

Not able to browse or reference directly to the Report Server using the virtual directly name in the URL path

Changes to SharePoint Mode Command-Line Installation

Input settings, /RSINSTALLMODE only works with Native mode installations and it does not work for SharePoint mode installations

The Reporting Services WMI Provider no longer supports Configuration of SharePoint Mode

Report Model Designer is not available in SQL Server Data Tools

Behaviour Changes to SQL Server Reporting Services in SQL Server 2012:

View Items permission will not download Shared Datasets (SharePoint Mode)

Report Server trace logs are in a new location for SharePoint mode (SharePoint Mode)

GetServerConfigInfo SOAP API is no longer supported (SharePoint Mode)

Report Server Configuration and Management Tools, Configuration Manager is not used for SharePoint Mode

Changing the server from one mode to another is disallowed

7.03. Reporting Services Concepts (SSRS):

Report Server Concepts: A report server is a computer that has an instance of Reporting Services installed. The report server stores items such as reports, report-related items, resources, schedules and subscriptions and it can be configured as a stand-alone single server, as a scale out farm or integrated with SharePoint Server

Reports and Related Item Concepts

Reports and report definitions

Report data connections and data sources

Report Datasets

Report parameters

Report items

Data regions and maps

Report parts

Data alerts

Types of Reports

Drilldown reports


Main/detail reports and drillthrough reports

Linked reports

History reports

Cached reports


Model reports and Clickthrough reports

Saved reports

Published reports

Upgraded reports

Stages of Reports

Report definition

Compiled report and intermediate report format

Snapshot or Report History

Processed report

Rendered report

Exported report

7.04. Reporting Services Features and Tasks (SSRS):

Reporting Services Report Server (SSRS): The report server is the central component of a Reporting Services installation and it supports two modes of deployment; Native mode and SharePoint mode.

Reporting Services Reports (SSRS):

Reports are XML based report definitions that includes report data and report layout elements

On a client file system, report definitions have the file extension .rdl

After a report is published, the report item is stored on the report server or on the SharePoint site


Able to view reports, publish them to a report server or SharePoint site to share with your team or organization

Able to manage report data sources separately from the report definition

Report Data (SSRS):

Comes from multiple sources of data in your organization

Creating data sources and datasets that represent the underlying report data would be the first step in designing a report

Report Parameters (Report Builder and SSRS):

Enables control of report data, connect related reports together and vary report presentation

Understand how parameters and dataset queries are related, how to include parameters in expressions and how to manage parameters independently from a report definition on the report server or SharePoint site are the keys to design a report that uses parameters

Report Parts in Report Designer (SSRS):

Enables publishing of created tables, charts, and other report items in a project as report parts to a report server or SharePoint site integrated with a report server it can be reused in other reports

Report Parts function the same way in Report Designer and in Report Builder

Schedules: Two types of schedules are provided in Reporting Services; shared schedules and report-specific schedules

Subscriptions and Delivery (Reporting Services):

Is a standing request to deliver a report at a specific time or in response to an event and in an application file format that is specified in the subscription

Data Alerts (SSRS):

Data Alerts informs user about report data that is interesting or important at a relevant time

Power View (SSRS):

A feature of SQL Server 2012 Reporting Services

Provides an interactive data exploration, visualization and presentation experience

Provides intuitive ad-hoc reporting for business users such as data analysts, business decision makers and information workers

Security and Protection (SSRS):

Support for Extended Protection for Authentication through the use of channel binding and service binding

Support for different authentication types for users and client applications to authenticate with the report server. Helps achieve the appropriate level of security required by your organization

URL Access (SSRS):

Enables sending of commands to a report server through a URL request

Extensions (SSRS):

Helps to modularize the types of input or output it accepts for authentication, data processing, report rendering and report delivery

Enables existing Reporting Services installations to utilize new software standards in the industry; such as new authentication scheme

Supports custom authentication extensions, data processing extensions, report processing extensions, rendering extensions and delivery extensions

Available extensions are configurable in the RSReportServer.config configuration file

Tools (SSRS):

Graphical and scripting tools which supports the development and use of rich reports in a managed environment

Tool set includes development tools, configuration and administration tools and report viewing tools

7.05. Technical Reference (SSRS):

Errors and Events Reference (Reporting Services)

Contains information about errors and events for SQL Server Reporting Services

Error information can be found in the Reporting Services log files

Feature Reference (Reporting Services)

SQL Server Reporting Services includes several tools and applications that can be used to create, manage and view reports

Chapter 8: Master Data Services

8.01. What's New (Master Data Services):

Use Excel to Manage Master Data

Match Data before Loading

Load Data into MDS Using Entity-Based Staging

New Model Deployment Tools

Redesigned and Higher-Performance Web User Interface

SharePoint Integration Introduced

Support for Multi-level Recursive Hierarchies

Improved Many to Many Mapping

Codes Automatically Generated

Security Simplified

Installation is Part of SQL Server

8.02. Backward Compatibility (Master Data Services):

Deprecated Master Data Services Features in SQL Server 2012:

Staging Process used in SQL Server 2008 R2 is no longer available in the Master Data Manager web application but is still available in SQL Server Management Studio

Metadata model should not be used as it will be removed in the future release of SQL Server

Discontinued Master Data Services Features in SQL Server 2012:


model object permissions can no longer be assigned to the Derived Hierarchy

Staging Process:

Create or delete collections, Add members to or remove members from collections and Reactivate members and collections can no longer be done

Model Deployment Wizard:

Packages that contain data can no longer be created and deployed by using the wizard in the Master Data Manager web application

Code Generation Business Rules:

Entity must be edited to enable automatically-generated Code values in System Administration

Bulk Updates and Exporting:

In the Master Data Manager web application, bulk updates of attribute values for multiple members can no longer be done


In the Explorer functional area, users can no longer revert their own transactions

Web Service:

Master Data Services web service is now enabled automatically, as required by Silverlight

PowerShell Cmdlets:

MDS no longer includes PowerShell cmdlets

8.03. Master Data Services Features and Tasks

Usage of Master Data Services to manage your organization's master data

Create and manipulate model objects

Learn how to organize and maintain the master data

Import and export master data, as well as share models between test and production environments

Develop custom applications that access the Master Data Services web service

8.04. Technical Reference (Master Data Services):

Master Data Services Configuration Manager

Create or configure a Master Data Services database

Create a web application and to enable integration with Data Quality Services

Associate an MDS database with an MDS web application

Master Data Services Database: Contains all of the information for the Master Data Services system

Stores the settings, database objects, and data required by the Master Data Services system

Contains staging tables that are used to process data from source systems

Provides a schema and database objects to store master data from source systems

Supports versioning functionality, including business rule validation and e-mail notifications

Provides views for subscribing systems that need to retrieve data from the database

Master Data Manager Web Application

Used primarily by administrators to perform administrative tasks

A3 Data description

Screen grabs of basic schema and its cardinality

Fig A3(I) borders table

Fig A3(II) city table

Fig A3(III) continent table

Fig A3(IV) country table

Fig A3(V) desert table

Fig A3(VI) economy table

Fig A3(VII) encompasses table

Fig A3(VIII) ethnic_group table

Fig A3(IX) geo_desert table

Fig A3(X) geo_island table

Fig A3(XI) geo_lake table

Fig A3(XII) geo_mountain table

Fig A3(XIII) geo_river table

Fig A3(XIV) geo_sea table

Fig A3(XV) island table

Fig A3(XVI) is_member table

Fig A3(XVII) lake table

Fig A3(XVIII) language table

Fig A3(XIX) located table

Fig A3(XX) merges_with table

Fig A3(XXI) mountain table

Fig A3(XXII) organization table

Fig A3(XXIII) politics table

Fig A3(XXIV) population table

Fig A3(XXV) province table

Fig A3(XXVI) religion table

Fig A3(XXVII) river table

Fig A3(XXVIII) sea table

B1 On-line information vs databases: What do we mean by the word "database"?

Online information, as the scenario has given, 'chemistry textbook which is available on-line, and searchable via a standard search engine', is a front-end application that interfaces the back-end services, which is the database that stores all the information that end user would care about; the chemistry textbook, to the end user so that they may view their information online.

A database is essentially a computer structure that is able to store large amounts of data and meta-data that would be of interest to the end user. These data could be texts, documents, images, audio or even video. But having just a database would not be enough as the database just stores data. We need a way to manage, edit, query, associate and retrieve the data that was stored in the database.

The solution is a database management system (DBMS). A DBMS is a piece of software that manages the database. So in this scenario, if a student is to use the search feature in the online chemistry textbook to find more information about the element Chromium, this series of sequence would happen;

The front-end application informs the DBMS that data retrieval is going take place

The application would have an engine that understands human language and translates it to SQL language. This is so that a query can be performed

The DBMS interprets the SQL query and optimizes it

The DBMS finds the related information in the database

All the related information to the query is delivered to the application

The application optimizes these information and presents it to the end user

The end user has found the information regarding Chromium

The DBMS does not only manages the database it provides other services. Such services are security, access control, enforced integrity, performance and reliability.

B2 Looking at data: I

Answers to questions (a) to (e)

No. It is not a skilled job. It does not need any technical skill to sit on front of the computer and type in information.

Errors from computer data entry, 27211.

The Phobos mission failed due to the poor design of the command system. There is no error detecting and correcting of codes method being implemented into the system to ensure the correctness of input data

According to this paper, the more skilled the user becomes, less attention they would pay on the routine outcomes, thus this errors goes unnoticed until a problem arises.

The accuracy of OCR varied from 71% to 98%.

B3 Looking at data: II

Why we have 'constraints' in our schemas

Number of characters/digits to be entered into the database=

No. of tables X Avg. rows of each tables X Avg. fields in each row X Avg. width of characters in each field=

10 X 500000 X 8 X 8= 320,000,000 characters/digits

Best error rate in Klemmer's (1962) paper: 0.02%

Characters/digits entered erroneously =

Number of characters/digits to be entered into the database X Klemmer's (1962) best error rate=

320,000,000 X 0.02% = 6,400,000 characters/digits

Error rate cited by Baddeley & Longman [1973]: 0.5%

Characters/digits entered erroneously =

Number of characters/digits to be entered into the database X Baddeley & Longman [1973] error rate=

320,000,000 X 0.5% = 160,000,000 characters/digits

B4 How to minimize entering bad data into our databases: I

Steps to check for valid credit card numbers:

Double every second digit, from the rightmost number

Sum all the individual digits (digits in parentheses are the products from Step 1)

If the sum is a multiple of 10, the account number is possibly valid.

(1) 4657-4686-6987-0760


credit card number





































































8+6+1+7+8+6+7+6+3+9+7+7+0+7+3+x= 85



As 85 is not a multiple of 10, 4657-4686-6987-0760 is not a valid credit card number.

(2) 4658-4686-6987-0760


credit card number





































































8+6+1+8+8+6+7+6+3+9+7+7+0+7+3+0= 86



As 86 is not a multiple of 10, 4658-4686-6987-0760 is not a valid credit card number.

(3) 4659-4998-6988-0760


credit card number








































































As 95 is not a multiple of 10, 4659-4998-6988-0760 is not a valid credit card number.

B5 How to minimize entering bad data into our databases: II

(a) Entity Integrity Constraint

Fig B5(ai) Sea table

Fig B5(aii) Rows in sea table

In fig 5b(a), the column 'Name' is assigned to be the primary key for this table. According to the entity integrity rule, no column that is assigned to be the primary key is to be null and its values must be unique. Fig B5(ai) shows that the property of column cannot be null and in Fig B5(aii) it shows that all the values in the 'Name' column is unique as no two values are the same. Therefore, there is enforcement for entity integrity rule.

(b) Attribute Integrity Constraint

Fig B5(bi)

The attribute integrity constraint ensures that all data that is to be stored in the columns of a table complies with the defined data set that the particular attribute is to represent. According to Fig B5(bi), the sea table does have check constraints for its attributes. Therefore there is enforcement for attribute integrity constraint in this sea table.

(c) Referential Integrity

Having referential integrity implemented in a database, ensures that the relationships between one table with another remains consistent. This would mean that if a table has a foreign key to another table, a record cannot be inserted, updated or deleted in that table if the similar actions was not to occur in the associated table. Since all of the tables in the mondial database does not have any foreign key assigned, referential integrity is not enforced in this database.

B6 Using what we know about data

To: Peter Wilkes

From: Asyraf Salim

Date: 20th December 2012

Subject: Minimizing incorrect data in the database

This memorandum is written to address my concerns in regards to the conversation we had earlier.

Topics that I want to address:

Data entry errors is not rare and training does not help reduce the occurrences of these errors nor does it prevent any errors from happening

Optical character recognition is not a fool proof method to prevent human errors

How should the database be designed

Topic (i): According to the fifth annual study by U.S. Pharmacopeia[1], it was found that computer entry was the fourth-leading cause of medication errors. This accounts to 13% of medication errors.

Providing more training does not help decrease these numbers. This is true according to paper published in the Journal of The Royal Society[2]. The reason given was; users will automate their actions to routine tasks or actions when they become skilled at these tasks so that their attention can be used more selectively.

As the data that will be entered into the database is made up mainly of numeric data alphanumeric codes this problem will quickly arise and this will severely affect the quality of data in the database.

Topic (ii): According to an article published in the March/April 2009 D-lib magazine[3], it was found that OCR accuracy varies from 71% to 98.02% when newspaper collections from 1803-1954 were digitized. With a percentage of 71% accuracy, this means that 29% of all information that can be found in the original newspaper collection will be incorrect in the digital version. Obtaining a 100% accuracy, requires a human operator to verify the digital data with the data on the physical object.

Topic (iii): The database should be designed with error checking functions to ensure that only correct data can be entered into the database. A thoroughly thought out integrity constraints to apply in the design of the database ensures the accuracy and consistency of data the database.

I strongly believe that without enough planning and review to ensure that all possibilities of data integrity constraints are implemented into the design of this database, this database will ultimately fail.


B7 Dependency Diagrams

1) Horse => Groom

Groom => Horse

2) Horse =>> Groom

Groom =>> Horse

3) Groom => Horse

Horse => Groom

4) Horse =>> Groom

Groom =>Horse

B8 Answer on Primary Key choice with examples

Primary Key: Student

The student can only take one subject at a time. Examinations grade for that subject will be over written when the student picks up another subject and sits for exam.

Primary Key: Student Number + Subject

Student can only take the subject once even though the student failed the exam. No retest is allowed

Primary Key: Student Number + Subject + Date

Student can only take one exam per day

Primary Key: Student Number + Subject + Date + Grade

Student can take 1 exam per day. If the grades of an exam that the student took on a different day is the same as the grades the student received on a different exam on a different day, the new exam result cannot be saved

All of these choices for primary keys are wrong, the design of the table is wrong, with this design; there will entity duplication and dependencies. The choice of primary key must be unique to ensure integrity in the data.

B9 E/R diagram and relational schema

E/R diagram

Fig B9(a) E/R diagram for Charles Babbage College

Relational schema

Fig B9(b) Relational schema for Charles Babbage College

Nine relational tables were derived from normalizing the ER diagram Fig B9(a). These nine tables will be implemented in SQL Server.

The following are screen grabs of the tables implemented in SQL Server:


Fig B9(b) Students table

Primary Key: StudentNumber

Foreign Key: None


Fig B9(c) Courses table

Primary Key: CourseCode

Foreign Key: None


Fig B9(d) Teachers table

Primary Key: EmployeeNumber

Foreign Key: None


Fig B9(e) Rooms Table

Primary Key: RoomNumberID

Foreign Key: None


Fig B9(f) Course_Head table

Primary Key: CourseCode

Foreign Key: CourseCode references CourseCode in Courses table

EmployeeNumber references EmployeeNumber in Teachers table


Fig B9(g) Exam table

Primary Key: ExamID

Foreign Key: CourseCode references CourseCode in Courses table


Fig B9(h) Room_PhoneExtension Table

Primary Key: None

Foreign Key: RoomNumberID references RoomNumberID in Rooms table


Fig B9(i) RoomOccupance table

Primary key: None

Foreign key: EmployeeNumber references EmployeeNumber in Teachers table


Fig B9(j) Students_Courses table

Primary Key: None

Foreign Key: CourseCode references CourseCode in Courses table

StudentNumber references StudentNumber in Students table

ExamID references ExamID in Exam table

Query examples

a) All courses student S0274 took?

b) All exams student R8303 took?

c) Whose the teacher in charge of each course?

d) Which teacher(s) occupies room H18 and what's their phone extension?

e) What is the extension number for Mrs Lee room?

B10 Essays on MUMPS and NOSQL


MUMPS is the acronym for Massachusetts General Hospital Utility Multi-Programming Systems and was developed by A. Neil Pappalardo and his colleagues in lat