Measuring certain features during the software development process is of core importance. This paper presents a model to compute the complexity and effort of small scale relational database applications on the basis of their components and the features of those components. The model will help developers and project managers estimate the effort, complexity, delivery time and cost of developing database applications.
The importance of software measurement has been recognized in improving processes, products and services. Software metrics are measurement-based techniques in the software development process that help derive the necessary information at the software industry level. Effective software project management requires measuring attributes of the software from the initial stage through to delivery. Such measurements make project managers better informed about issues such as delivery time, effort, cost and the resources required to successfully deliver the application. Measurements provide the basis for software metrics, and a number of software metrics have evolved from such measurements to estimate cost, effort and complexity.
A variety of metrics have been developed to evaluate the complexity of software: McCabe's cyclomatic complexity (code metrics), Henry and Kafura's information flow (structure metrics), hybrid metrics (SLOC) and object-oriented metrics (CK). All of the above-mentioned software metrics attempt to measure the structural and design-level complexities of software, and are termed static metrics. However, we are not aware of any research study that computes a complexity metric for relational database systems based on the detailed features of database components, such as tables and constraints like assertions and triggers; such additional requirements increase the implementation effort. In this paper we propose an architecture for a Software Metric Analyzer for Relational Database Systems (SMARtS). We also propose new weights for the different factors of each database component.
The proposed SMARtS system will be developed using VC++ and MS Access and will provide the facility to measure the metric for small scale relational database applications developed in relational database management systems and Visual Studio. We evaluate the soundness of the proposed weights by applying them to small scale student projects and business applications, and compare the estimates with the actual results.
This paper is organized as follows: Section 2 gives a detailed introduction to the major database components; Section 3 presents the architecture of SMARtS; Section 4 presents the categorization of components; Section 5 describes the working of the effort estimator; Section 6 discusses the categorization of applications; and Section 7 concludes the paper and proposes future directions.
COMPONENTS OF THE RELATIONAL DATABASE SYSTEMS
The complexity of a conventional database system developed in any relational DBMS depends upon five major components: tables, relationships, queries, forms and reports. These components can be classified as simple, average or complex. The characteristics of these components vary based on complexity determining factors (CDFs) such as constraints; for example, a simple table becomes complex if indexes, assertions, or triggers are applied to it. We assign weights to these CDFs, and the total weight of the CDFs is used to place any database component into one of the three classifications, i.e. simple, average or complex. To simplify these calculations the weights for CDFs are chosen as 0.25, 0.50 and 0.75. These weights are assigned based on experience of developing database applications.
A brief description of each database component and its CDF weight assignment strategy is given below.
2.1 TABLES AND THEIR CDF
A table object is the fundamental structure used to store information about subjects related to the application. The main CDFs affecting the classification of a table include the number of fields in the table, the types of those fields, the number of fields in the primary key, the number of foreign keys, constraints, indexes, assertions and triggers.
The greater the number of fields in a table, the greater its complexity, and fields with complex data types increase the complexity further. A foreign key is a correspondence between a set of columns in one table and the set of primary key columns in another table. Cascade relationships between tables can be established using foreign keys. Foreign key constraints increase table complexity and the time needed to create the table; the complexity of a table therefore increases with the number of foreign keys it contains.
An assertion is used to enforce a semantic integrity constraint: it specifies a condition that must hold true in every database state. Writing code for assertions increases complexity. A trigger is a special kind of stored procedure that goes into effect when you modify data in a specified table using one or more data modification operations. Triggers are also useful for enforcing referential integrity, which preserves the defined relationships between tables when you add, update, or delete rows in those tables; triggers are therefore also considered one of the major CDFs of a table.
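The kind of integrity-preserving trigger described above can be sketched with Python's built-in sqlite3 module. This is an illustrative example only, not part of the proposed system; the table and trigger names are invented.

```python
# Illustrative sketch: a trigger that fires on a data-modification
# operation (DELETE) and enforces an integrity rule between two tables.
# Names (dept, emp, no_orphan_emps) are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept(dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emp(emp_id INTEGER PRIMARY KEY, dept_id INTEGER);

-- Block deletion of a department that still has employees.
CREATE TRIGGER no_orphan_emps
BEFORE DELETE ON dept
WHEN EXISTS (SELECT 1 FROM emp WHERE emp.dept_id = OLD.dept_id)
BEGIN
    SELECT RAISE(ABORT, 'department still has employees');
END;
""")
conn.execute("INSERT INTO dept VALUES (1, 'Sales')")
conn.execute("INSERT INTO emp VALUES (10, 1)")

try:
    conn.execute("DELETE FROM dept WHERE dept_id = 1")
    deleted = True
except sqlite3.IntegrityError:
    deleted = False

print(deleted)  # False -- the trigger aborted the delete
```

Writing and testing such trigger code is exactly the extra implementation effort that makes a table with triggers count as more complex in the model.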
The weights assigned to each CDF of a table to calculate its classification are given in Table 1:
Table 1: Weights of CDFs of Table
Simple Data Type
Complex Data Type
Each field in primary key
Each field in foreign key
In practice, relationships can hardly be avoided in a relational database application: they are used to extract information from multiple tables by joining them. Primary and foreign keys are used to join the tables, and the cardinality and degree of a relationship affect its complexity.
The permissible cardinalities of a relationship are one-to-one, one-to-many and many-to-many. RDBMSs do not support many-to-many relationships directly; such a relationship must be broken down into two one-to-many relationships. Referential integrity (RI) can be enforced on any relationship; it is used to prevent orphan records and to keep references synchronized. Cascade delete means that deleting a record from the primary table also deletes the matching records in the related table. Cascade update ensures that an update of the primary key propagates to all matching records in the detail table. The degree of a relationship is the number of entity types participating in it. Implementing these additional items in relationships increases the complexity of the component as well as the effort.
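Cascading referential integrity of the kind described above can be demonstrated with SQLite's foreign-key support through Python's sqlite3 module. A minimal sketch with invented table names:

```python
# Sketch: ON DELETE CASCADE keeps references synchronized by deleting
# matching child rows when the parent row is deleted.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE parent(id INTEGER PRIMARY KEY);
CREATE TABLE child(
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES parent(id) ON DELETE CASCADE
);
""")
conn.execute("INSERT INTO parent VALUES (1)")
conn.execute("INSERT INTO child VALUES (100, 1)")

conn.execute("DELETE FROM parent WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
print(remaining)  # 0 -- the matching child row was deleted as well
```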
The weights assigned to relationships on the basis of the above CDFs are shown in the following matrix:
Table 2: Weights of CDFs of Relationship
One-to-many (RI Enforced)
A ternary or n-ary relationship is given the weight 0.75.
Queries are a necessary part of any information system. They are used to retrieve and filter data, calculate summaries, and update, move and delete records. Queries are categorized into simple select queries, nested select queries, action queries, and crosstab queries.
A simple select query retrieves records or summaries across records; a nested select query contains a nested select within it; an action query updates a table; and a crosstab query performs calculations and presents data in a spreadsheet format.
One may take a long time to work out how to write a query, and consequently it needs more effort; e.g. crosstab queries are more difficult than simple select queries. We assigned a weight to each query type depending on its complexity. The query weights are shown in Table 3:
Table 3: Weights of CDFs of Queries
Simple select query
Nested select query
The easiest and most accurate way of entering data into the database is to use forms. Forms can be simple data entry forms, custom dialogue boxes, subforms, switchboards, or customized forms. Some forms contain special effects.
A simple data entry form is used to add data to the database, or view, edit, and delete existing data in one table. Custom dialogue boxes are used when user input is used for any action. Subforms are used to view, edit, and delete existing data in more than one table. Switchboards are used to simplify the process of starting the various forms and reports in a database. Customized forms are the forms that are modified to change the appearance and functionality of forms.
Our experience leads us to assign a weight to each form based on its type. The weights assigned to each form type are given in Table 4:
Table 4: Weights of CDFs of Forms
Simple data entry
Custom dialogue boxes
Form with Special Effect
The last component of a database in our proposed system is reports. Reports are ready-to-print documents of the desired database information, so that one can view it away from a computer. From the client's point of view, reports can be simple reports, grouped reports, summary reports, reports using expressions, subreports, and custom reports.
A simple report is based on one table or query. A grouped report presents information by dividing it into groups and contains various sections. A summary report presents totals, averages, percentages, or running sums to make the data more understandable. A report using expressions performs mathematical calculations, combines or extracts text, or validates data. A subreport is used to view information from more than one table or query on the same report. Custom reports are designed for a specific purpose and saved so that one can generate a report in that custom format at any time.
We interviewed database application developers to gather information about the complexity of, and time required to create, different reports. Based on this summary and our own experience, we assigned the weights to each report type shown in Table 5:
Table 5: Weights of CDFs of Reports
Report using expressions
PROPOSED ARCHITECTURE FOR SMARtS
The proposed Software Metric Analyzer for Relational Database Systems (SMARtS) comprises three layers: the Presentation Layer, the Processing Layer and the Repository Layer. The architecture of SMARtS is shown in Figure 1.
Figure 1: Architecture of SMARtS (Graphical User Interface; Complexity Determining Factors Entry Module with Table, Relationship, Query, Form and Report CDF modules; Application Effort Estimator; Component and Application Classifier)
The presentation layer is the topmost layer, the Graphical User Interface (GUI). It is used to take the various CDFs as input and to present the output in the form of the effort estimate and the categorization of the application. An application is categorized as Light, Moderate or Heavy based on the time and effort required to develop it.
The second layer is the processing layer. It consists of three modules: Complexity Determining Factors (CDF) Entry, Application and Component Effort Estimator (AEE), and Component Category Classifier (CCC). The estimator takes the CDF values for each component, entered by the user through the presentation layer, and passes them to the CCC. The CCC operates on the CDF values to categorize each component, using the weighted sums and ranges stored in its repository.
The third layer is the data repository layer, used for data storage. We store three types of data in the repository: the weight data repository contains the stored weights and the category ranges for the different database components and complexity determining factors; the metric repository stores and retrieves the calculations required for effort estimation of each component and of the overall application; and the CDF repository stores the user-entered CDF values. The estimator and classifier use the weight repository, and the Application Effort Estimator uses the metric repository, to estimate the effort and the application category.
The user enters the number of each CDF through the Graphical User Interface (GUI) using interactive dialogue boxes. A sample dialogue box for the table object is shown in Figure 2.
Figure 2: CDF entry form for Table
Figure 3: Classification of Components
These inputs are passed to the estimator, which uses the weighted sums of the corresponding factors and the stored ranges to determine the classification of each component, and displays it to the user as shown in Figure 3.
CLASSIFICATION OF COMPONENTS
A database component can fall into the category of simple, average or complex, which in turn determines the category of the application: Light, Moderate or Heavy.
CLASSIFICATION OF TABLES
The complexity of a table depends upon the types and numbers of the CDFs given in Table 1. Each CDF is given a static weight (Wi) according to the complexity of that type of factor. The classification of the table component is computed using the total weight of the table (Ws) by the following formula:

Ws = Σ (Nfi * Wi)

where Nfi is the number entered by the user for each CDF and Wi is the corresponding weight of that CDF.
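The weighted sum can be sketched as follows. The per-CDF weight values here are assumptions (0.25 per field, per simple data type, and per primary-key field, matching the worked bounds example in Section 4.1; the remaining weights are invented for illustration, since Table 1's values did not survive extraction).

```python
# Sketch of the table weighted sum Ws = sum(Nf_i * W_i).
# Weight values are assumptions, not the paper's actual Table 1.
WEIGHTS = {
    "fields": 0.25,
    "simple_types": 0.25,
    "complex_types": 0.50,   # assumed: complex data types weigh more
    "pk_fields": 0.25,
    "fk_fields": 0.50,       # assumed
}

def table_ws(cdf_counts):
    """Return Ws for one table from its CDF counts (Nf_i values)."""
    return sum(cdf_counts.get(name, 0) * w for name, w in WEIGHTS.items())

# A table with 2 fields, both of simple type, and a 1-field primary key:
ws = table_ws({"fields": 2, "simple_types": 2, "pk_fields": 1})
print(ws)  # 1.25
```

The resulting Ws is then compared against the category ranges to classify the table.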
The table is characterized as simple, average or complex based on the computed ranges shown in Table 6.
Table 6: Category Ranges for Table Component
Range of Ws
We propose the statistics in Table 7 for assigning the ranges to the three categories.
Table 7: Factors for Classification
Type of factor
No of Fields
No of Simple Data Types
No of Complex Data Types
No of Fields In P.K.
No of F.K.
The lower and upper bounds for the simple category, for example, are computed as follows.
Lower Bound: 2 fields * 0.25 + 2 simple data types * 0.25 + 1 P.K. field * 0.25 = 1.25
Upper Bound: 5 fields * 0.25 + 5 simple data types * 0.25 + 1 P.K. field * 0.25 = 2.75
4.2 CLASSIFICATION OF RELATIONSHIPS
Relationship classification is based on the cardinality and degree of the relationship, both entered by the user for each relationship. If the degree is other than ternary or n-ary, the weight (Wo) is obtained from the matrix shown in Table 2; otherwise Wo is 0.75. The relationship is then classified using the ranges given in Table 8.
Table 8: Category Ranges for Relationship
Range of Wo
Simple: 0.25 - 0.49
Average: 0.50 - 0.74
Complex: 0.75 and above
CLASSIFICATION OF QUERIES, FORMS AND REPORTS
The user enters the number of instances of each CDF (Nf) of the corresponding component (query, form, report). The CDF types of queries, forms and reports and their static weights are given in Tables 3, 4 and 5 respectively. Each Nf is multiplied by its corresponding static weight (Wi) to obtain the computed weight for that factor (Cw). The weighted sum is obtained by summing all Cw:

Ws = Σ Cwi, where i = 1 to 4 for queries, i = 1 to 5 for forms, and i = 1 to 6 for reports.

The total number of instances of a component (Rt) is obtained by adding the numbers entered by the user for that component. The obtained weight (Ow) is then computed as:

Ow = Ws / Rt
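The per-component computation can be sketched for queries as follows. The four query weights are assumptions (Table 3's values did not survive extraction); 0.25/0.50/0.50/0.75 is one plausible assignment consistent with the paper's 0.25/0.50/0.75 weighting scheme.

```python
# Sketch of Section 4.3 for the query component:
# Ws = sum(Nf_i * W_i), Rt = total query count, Ow = Ws / Rt.
# The weight values are assumptions, not the paper's actual Table 3.
QUERY_WEIGHTS = {"simple_select": 0.25, "nested_select": 0.50,
                 "action": 0.50, "crosstab": 0.75}

def query_ow(counts):
    """Return the obtained weight Ow for the query component."""
    ws = sum(counts.get(t, 0) * w for t, w in QUERY_WEIGHTS.items())
    rt = sum(counts.values())          # total number of queries
    return ws / rt

counts = {"simple_select": 3, "crosstab": 1}
print(round(query_ow(counts), 3))  # (3*0.25 + 1*0.75) / 4 = 0.375
```

Forms and reports are handled identically, with five and six CDF types respectively.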
The classification of reports, forms and queries is determined using the ranges given in Table 9:
Table 9: Category Ranges for Queries, Forms and Reports
Range of Ow
Simple: 0.25 - 0.49
Average: 0.50 - 0.74
Complex: 0.75 and above
The effort required to build a relational database application is expressed in person-days. It can be calculated using the expression

Effort = DBP * (0.2 + 0.01 * ΣFi) ………….. (1)

where DBP is the number of database points, Fi is the weight given to each adjusting factor (i = 1, 2, ..., 8), and 0.2 and 0.01 are empirical constants set by choosing suitable weights to compute DBP and applying best-fit techniques to the effort curve. DBP is computed through the DBP matrix in Table 10. We use the same DBP computing matrix, with the addition of vertical sums, to extend it to compute the overall category of the application.
Table 10: DBP Matrix
Category: Simple / Average / Complex
Tables: ( __ * 7 ) + ( __ * 10 ) + ( __ * 15 ) = __
Relationships: ( __ * 2 ) + ( __ * 3 ) + ( __ * 5 ) = __
Queries: ( __ * 5 ) + ( __ * 7 ) + ( __ * 10 ) = __
Forms: ( __ * 4 ) + ( __ * 5 ) + ( __ * 7 ) = __
Reports: ( __ * 4 ) + ( __ * 5 ) + ( __ * 15 ) = __
DBP (category sum) = __
The number of components in each category is obtained through the classifier, and the values are entered into the matrix accordingly. These numbers are multiplied by their corresponding weights to get the weight for each component (Cw). The DBP sum is then obtained as

DBP(sum) = Σ Cw, where the summation runs from 1 to 5 (one term per component).
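The DBP computation from the Table 10 matrix can be sketched as follows; the component counts in the example are invented, but the category weights are those given in Table 10.

```python
# DBP: each matrix cell is (# components in that category) * (category
# weight); DBP is the sum over all five components.
DBP_WEIGHTS = {            # (simple, average, complex) weights
    "tables":        (7, 10, 15),
    "relationships": (2,  3,  5),
    "queries":       (5,  7, 10),
    "forms":         (4,  5,  7),
    "reports":       (4,  5, 15),
}

def dbp(category_counts):
    """category_counts maps component -> (n_simple, n_average, n_complex)."""
    return sum(n * w
               for comp, weights in DBP_WEIGHTS.items()
               for n, w in zip(category_counts.get(comp, (0, 0, 0)), weights))

# Hypothetical application: 3 simple + 1 average table, 4 simple +
# 1 complex query, 2 simple forms.
counts = {"tables": (3, 1, 0), "queries": (4, 0, 1), "forms": (2, 0, 0)}
print(dbp(counts))  # 3*7 + 1*10 + 4*5 + 1*10 + 2*4 = 69
```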
Adjusting factors also increase the effort and complexity of the application. They are summarized as follows:
F1: Does the system have an interface with Visual Basic or any other interface?
F2: What is the programmer's level of experience in Access?
F3: Did the programmer use wizards, and to what extent?
F4: Does the system take networking into consideration?
F5: Does the system provide help for the user?
F6: Is the system designed for centralized or distributed processing?
F7: Did the programmer use visual queries or issue queries directly?
F8: Does the Access application use Windows APIs?
The sum of all the factors is used in equation (1). This sum can vary between 0 and 45.
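Putting the pieces together, equation (1) can be sketched directly; the eight adjusting-factor ratings and the DBP value used in the example are invented.

```python
# Effort = DBP * (0.2 + 0.01 * sum(F_i)), from equation (1).
def effort(dbp_points, factors):
    """Return estimated effort in person-days.

    factors: the eight adjusting-factor ratings F1..F8; per the paper,
    their sum varies between 0 and 45.
    """
    assert len(factors) == 8 and 0 <= sum(factors) <= 45
    return dbp_points * (0.2 + 0.01 * sum(factors))

f = [3, 4, 2, 0, 1, 2, 3, 0]       # hypothetical ratings, sum = 15
print(round(effort(100, f), 2))    # 100 * (0.2 + 0.15) = 35.0 person-days
```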
CATEGORIZATION OF APPLICATION
The complexity category of the application can be calculated from the classification of its components.
Heavy application: the vertical weighted sum of the complex category is greater than that of the average category and greater than that of the simple category.
Moderate application: the vertical weighted sum of the average category is greater than that of the simple category and greater than that of the complex category.
Light application: the vertical weighted sum of the simple category is greater than that of the average category and greater than that of the complex category.
The weighted sum of each category is obtained by summing the weighted values of the DBP matrix vertically.
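The categorization rule can be sketched as a simple comparison of the three vertical sums; how ties are resolved is not specified in the paper, so falling back to Light is an assumption here.

```python
# Categorize the application from the vertical (per-category) weighted
# sums of the DBP matrix: the category with the largest sum wins.
def categorize(simple_sum, average_sum, complex_sum):
    if complex_sum > average_sum and complex_sum > simple_sum:
        return "Heavy"
    if average_sum > simple_sum and average_sum > complex_sum:
        return "Moderate"
    return "Light"   # assumption: default when simple dominates or ties

print(categorize(21, 10, 0))  # Light
print(categorize(5, 7, 30))   # Heavy
```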
CONCLUSION AND FUTURE WORK
In this paper we have proposed an analyzer that calculates the effort and complexity level of relational database applications developed in MS Access. SMARtS can be enhanced to work on any platform and DBMS, and can also be generalized for real-time relational database applications. In the future, an analyzer could be developed that takes an ERD/EERD as input, converts it into a physical design and, with some assistance from the designer, calculates the complexity and effort of any database application.