Fuzzy Based Approach For Predicting Software Biology Essay


Software development effort prediction is one of the most significant activities in software project management. There are several algorithmic cost estimation models, such as COCOMO, Albrecht's Function Point Analysis, Putnam's SLIM and ESTIMACS, but each has its own pros and cons in estimating development cost and effort. This is because the project data available in the initial stages of a project is often incomplete, inconsistent, uncertain and unclear. Accurate effort prediction therefore remains a challenge in software project management. Fuzzy logic-based estimation models are more appropriate when vague and imprecise information has to be used. In this paper we compare software development effort prediction using Intermediate COCOMO with a Takagi-Sugeno Fuzzy Inference System (FIS) using the triangular membership function (TMF) and the GBell membership function (GBellMF). A case study based on the COCOMO81 database compares the proposed fuzzy models with Intermediate COCOMO. The analysis shows that the fuzzy model using TMF provided considerably better effort estimates than either the fuzzy model using GBellMF or Intermediate COCOMO.

KEY WORDS: Development Effort, EAF, Cost Drivers, Fuzzy identification, Membership functions, Fuzzy Rules, COCOMO81 53 projects


In algorithmic cost estimation, costs and efforts are predicted using mathematical formulae derived from historical data. The best known algorithmic cost model, COCOMO (COnstructive COst MOdel), was published by Barry Boehm in 1981. It was developed from the analysis of sixty three (63) software projects. Boehm proposed three levels of the model, called Basic COCOMO, Intermediate COCOMO and Detailed COCOMO. Our case study focuses mainly on the Intermediate COCOMO.

1.1 Intermediate COCOMO

The Basic COCOMO model is based on the relationship:

Development Effort, DE = a * (SIZE)^b

where SIZE is measured in thousands of delivered source instructions (KDSI) and DE is measured in man-months. The constants a and b depend on the 'mode' of development of the project. Boehm proposed 3 modes of projects:

1. Organic mode - simple projects that engage small teams working in known and stable environments.

2. Semi-detached mode - projects that engage teams with a mixture of experience. It is in between organic and embedded modes.

3. Embedded mode - complex projects that are developed under tight constraints with changing requirements.

The accuracy of Basic COCOMO is limited because it does not consider factors such as hardware, personnel, use of modern tools and other attributes that affect the project cost. Boehm therefore proposed the Intermediate COCOMO, which improves on the accuracy of Basic COCOMO by multiplying the equation by a new variable derived from 'Cost Drivers', the EAF (Effort Adjustment Factor), so that DE = a * (SIZE)^b * EAF.

The EAF term is the product of the multipliers of the 15 Cost Drivers that are listed in Table II. Each cost driver is rated as Very Low, Low, Nominal, High, Very High or Extra High, and each rating corresponds to a multiplier.

For example, if for a project RELY is Very Low, DATA is High, CPLX is Extra High, TIME is Very High, STOR is High and the remaining cost drivers are Nominal, then

EAF = 0.75 * 1.08 * 1.65 * 1.30 * 1.06 * 1.0 ≈ 1.84

If the category values of all the 15 cost drivers are "Nominal", then EAF is equal to 1.
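The EAF calculation in the example above can be sketched in a few lines of Python. This is only an illustration: the function name `effort_adjustment_factor` and the dictionary layout are assumptions made here, and the multipliers are the ones quoted in the example, with every unlisted cost driver treated as Nominal (1.0).

```python
from math import prod

# Multipliers quoted in the worked example; drivers that are not
# listed are assumed to be Nominal, i.e. to contribute a factor of 1.0.
example_multipliers = {
    "RELY": 0.75,
    "DATA": 1.08,
    "CPLX": 1.65,
    "TIME": 1.30,
    "STOR": 1.06,
}


def effort_adjustment_factor(multipliers):
    """Return the EAF as the product of the given cost-driver multipliers."""
    return prod(multipliers.values())


eaf = effort_adjustment_factor(example_multipliers)
print(f"EAF = {eaf:.4f}")  # EAF = 1.8417
```

An empty dictionary, meaning all drivers are Nominal, gives an EAF of 1, in line with the statement above.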

The 15 cost drivers are broadly classified into four categories.

1. Product:
RELY - required software reliability
DATA - database size
CPLX - product complexity

2. Platform:
TIME - execution time constraint
STOR - main storage constraint
VIRT - virtual machine volatility
TURN - computer turnaround time

3. Personnel:
ACAP - analyst capability
AEXP - applications experience
PCAP - programmer capability
VEXP - virtual machine experience
LEXP - language experience

4. Project:
MODP - modern programming practices
TOOL - use of software tools
SCED - required development schedule

Depending on the project, the multipliers of the cost drivers will vary, so the EAF may be greater than or less than 1, which raises or lowers the estimated Effort accordingly.
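A minimal sketch of the Intermediate COCOMO calculation is given below. The essay does not list the constants a and b, so the values used here are the commonly published Intermediate COCOMO 81 coefficients and should be read as an assumption; the function and variable names are likewise illustrative.

```python
# (a, b) per development mode. These are the commonly published
# Intermediate COCOMO 81 coefficients; they are NOT stated in this essay.
INTERMEDIATE_COEFFICIENTS = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def intermediate_cocomo_effort(size_kdsi, mode, eaf=1.0):
    """Return the effort in man-months: a * SIZE**b * EAF."""
    a, b = INTERMEDIATE_COEFFICIENTS[mode]
    return a * size_kdsi ** b * eaf


# A hypothetical 32 KDSI organic project, first with all cost drivers
# Nominal (EAF = 1) and then with an EAF of 1.84.
nominal = intermediate_cocomo_effort(32, "organic")
adjusted = intermediate_cocomo_effort(32, "organic", eaf=1.84)
print(f"nominal: {nominal:.1f} man-months, adjusted: {adjusted:.1f} man-months")
```

Because the EAF enters as a plain multiplier, an EAF above 1 scales the nominal effort up and an EAF below 1 scales it down by the same proportion.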


A fuzzy model is used when a system is not suitable for analysis by conventional approaches or when the available data is uncertain, inaccurate or vague. The point of fuzzy logic is to map an input space to an output space, and to do this we write a list of if-then statements called rules. All rules are evaluated in parallel, and the order of the rules is unimportant. All the inputs and outputs of the system must be specified before writing the rules.

To obtain a fuzzy model from the available data, the steps to be followed are:

1. Select a Sugeno type FIS.

2. Define the input variables and the output variable.

3. Set the type of the membership functions (trimf or gbellmf) for the input variables.

4. Set the type of the membership function as linear for the output variable.

5. Translate the data into a set of if-then rules written in the Rule editor.

6. Create a model structure, and tune the parameters of the input and output variables to get the desired output.

2.1 Fuzzy Approach for Prediction of Effort

We have used the Intermediate COCOMO model for developing the Fuzzy Inference System. The inputs to this system are MODE and SIZE. The output is Fuzzy Nominal Effort. The framework is shown in Fig 1.

The fuzzy approach specifies the SIZE of a project as a range of possible values rather than a specific number. We also specify the MODE of development as a fuzzy range. The advantage of using fuzzy ranges is that we are able to predict the effort for projects that do not fall under one precise mode, i.e. that lie between 2 modes. This is not possible using COCOMO. The output of this FIS is the Fuzzy Nominal Effort. The Fuzzy Nominal Effort multiplied by the EAF gives the Estimated Effort. The FIS needs appropriate membership functions and rules.

2.2 Fuzzy Membership Functions

A membership function (MF) is a curve that defines how each point in the input space is mapped to a membership value (or degree of membership) between 0 and 1. The input space is also called the universe of discourse. For our problem, we have used 2 types of membership functions:

Triangular membership function

Gauss Bell membership function

Triangular membership function (TMF):

It is a three-point function, defined by minimum (α), maximum (β) and modal (m) values, that is, TMF(α, m, β), where α ≤ m ≤ β.
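As a rough sketch, the triangular membership function can be written as follows. The piecewise-linear form (rising from α to m, falling from m to β) is the usual reading of TMF(α, m, β); the function name and the example set TMF(10, 20, 30) are hypothetical and are not taken from the figures of this essay.

```python
def triangular_mf(x, alpha, m, beta):
    """Degree of membership of x in TMF(alpha, m, beta), alpha <= m <= beta."""
    if x <= alpha or x >= beta:
        # Outside the support. The only exception is a vertical edge
        # (m == alpha or m == beta), where the peak sits on the boundary.
        return 1.0 if x == m else 0.0
    if x <= m:
        return (x - alpha) / (m - alpha)  # rising edge
    return (beta - x) / (beta - m)  # falling edge


# Hypothetical fuzzy set for SIZE, in KDSI.
for size in (5, 15, 20, 25, 30):
    print(size, triangular_mf(size, 10, 20, 30))
```

Membership is 1 at the modal value, falls linearly to 0 at the minimum and maximum, and is 0 everywhere outside that range.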

The fuzzy set definitions for the MODE of development appear in Fig 3, and the fuzzy sets for SIZE appear in Fig 4.

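Gauss Bell membership function (GBellMF):

It is a three-parameter function (the generalized bell function, gbellmf), GBellMF(a, b, c), where a sets the width of the curve, b sets the steepness of its sides and c is the center: GBellMF(x; a, b, c) = 1 / (1 + |(x - c)/a|^(2b)). Please refer to Figure 5 for a sample Gauss Bell membership function.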

The fuzzy sets for MODE, SIZE and Effort are obtained for GBellMF in the same way as for the triangular method; the only difference is the shape of the curves.
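The bell-shaped curve can be sketched as below, assuming the standard generalized bell form 1 / (1 + |(x - c)/a|^(2b)) with width a, slope b and center c. This parameterization is an assumption, since the essay does not reproduce the formula, and the example parameters are hypothetical.

```python
def gbell_mf(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c) / a| ** (2 * b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


# Hypothetical fuzzy set for SIZE centered on 20 KDSI with width 10.
for size in (0, 10, 20, 30, 40):
    print(size, round(gbell_mf(size, a=10, b=2, c=20), 4))
```

Unlike the triangular function, the curve never reaches exactly 0: membership is 1 at the center, 0.5 at a distance of a from the center, and decays smoothly beyond that.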

2.3 Fuzzy Rules

Our rules based on the fuzzy sets of MODE, SIZE and EFFORT appear in the following form:

IF MODE is organic AND SIZE is s1 THEN EFFORT is EF1

IF MODE is semidetached AND SIZE is s1 THEN EFFORT is EF2

IF MODE is embedded AND SIZE is s1 THEN EFFORT is EF3

IF MODE is organic AND SIZE is s2 THEN EFFORT is EF4

IF MODE is semidetached AND SIZE is s2 THEN EFFORT is EF5

IF MODE is embedded AND SIZE is s3 THEN EFFORT is EF5

IF MODE is embedded AND SIZE is s4 THEN EFFORT is EF3

IF MODE is organic AND SIZE is s3 THEN EFFORT is EF4

IF MODE is embedded AND SIZE is s5 THEN EFFORT is EF6

IF MODE is organic AND SIZE is s4 THEN EFFORT is EF4
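To show how rules of this form produce an estimate, the sketch below implements a small first-order Sugeno inference step in Python. Everything numeric in it is hypothetical: the 1 to 3 scale for MODE, the two SIZE sets, the six rules and their linear consequent coefficients were made up for illustration and are not the fuzzy sets or rules of this study. Combining the two conditions of a rule with a product is also an assumption, as the essay does not name the AND operator.

```python
def tri(x, alpha, m, beta):
    """Triangular membership degree of x in TMF(alpha, m, beta)."""
    if x <= alpha or x >= beta:
        return 1.0 if x == m else 0.0
    if x <= m:
        return (x - alpha) / (m - alpha)
    return (beta - x) / (beta - m)


# Hypothetical fuzzy sets: MODE on a 1..3 scale, SIZE in KDSI.
MODE_SETS = {
    "organic": (1.0, 1.0, 2.0),
    "semidetached": (1.0, 2.0, 3.0),
    "embedded": (2.0, 3.0, 3.0),
}
SIZE_SETS = {
    "s1": (0.0, 0.0, 100.0),
    "s2": (0.0, 100.0, 100.0),
}

# Each rule is (MODE set, SIZE set, (p, q, r)) with the linear consequent
# EFFORT = p * mode + q * size + r. The coefficients are made up.
RULES = [
    ("organic", "s1", (0.0, 3.0, 0.0)),
    ("semidetached", "s1", (0.0, 4.0, 0.0)),
    ("embedded", "s1", (0.0, 5.0, 0.0)),
    ("organic", "s2", (0.0, 3.5, 0.0)),
    ("semidetached", "s2", (0.0, 4.5, 0.0)),
    ("embedded", "s2", (0.0, 6.0, 0.0)),
]


def sugeno_nominal_effort(mode, size):
    """Weighted average of the rule outputs, weighted by firing strength."""
    weighted_sum = 0.0
    total_weight = 0.0
    for mode_set, size_set, (p, q, r) in RULES:
        # Firing strength: product of the two membership degrees (assumed AND).
        w = tri(mode, *MODE_SETS[mode_set]) * tri(size, *SIZE_SETS[size_set])
        weighted_sum += w * (p * mode + q * size + r)
        total_weight += w
    if total_weight == 0.0:
        raise ValueError("no rule fires for this input")
    return weighted_sum / total_weight


# A 50 KDSI project that is purely organic, and one halfway between
# organic and semidetached.
organic = sugeno_nominal_effort(1.0, 50.0)
in_between = sugeno_nominal_effort(1.5, 50.0)
print(organic, in_between)

# Estimated Effort = Fuzzy Nominal Effort * EAF (1.84 is an example value).
print(in_between * 1.84)
```

A project with a MODE value between two modes fires the rules of both neighbouring modes at once, so its nominal effort falls between the two single-mode estimates.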



3. Various Criteria for Assessment of Software Cost Estimation Models

The criteria used are Variance Accounted For (VAF), Mean Absolute Relative Error (MARE), Variance Absolute Relative Error (VARE), Prediction (Pred(n)) and Balance Relative Error (BRE).

A model is better when it gives a higher VAF or a higher Pred(n), and when it gives a lower MARE, a lower VARE or a lower BRE.
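The formulas for these criteria are not reproduced in this text, so the sketch below uses definitions that are common in the effort estimation literature. They are assumptions and may differ in detail from the ones used in the study: VAF as one minus the ratio of residual variance to the variance of the actual effort, MARE and VARE as the mean and variance of the absolute relative error, Pred(n) as the percentage of projects whose relative error is at most n%, and BRE as the absolute error divided by the smaller of actual and estimated effort, averaged over projects. Population variance is used throughout, which is also an assumption.

```python
from statistics import mean, pvariance


def relative_errors(actual, estimated):
    """Absolute relative error |E - E_hat| / E for each project."""
    return [abs(a - e) / a for a, e in zip(actual, estimated)]


def vaf(actual, estimated):
    """Variance accounted for, in percent (assumed definition)."""
    residuals = [a - e for a, e in zip(actual, estimated)]
    return (1.0 - pvariance(residuals) / pvariance(actual)) * 100.0


def mare(actual, estimated):
    """Mean absolute relative error, in percent."""
    return mean(relative_errors(actual, estimated)) * 100.0


def vare(actual, estimated):
    """Variance of the absolute relative error, times 100."""
    return pvariance(relative_errors(actual, estimated)) * 100.0


def pred(actual, estimated, n):
    """Percentage of projects whose relative error is at most n percent."""
    errors = relative_errors(actual, estimated)
    return 100.0 * sum(err <= n / 100.0 for err in errors) / len(errors)


def bre(actual, estimated):
    """Mean balance relative error: |E - E_hat| / min(E, E_hat)."""
    return mean(abs(a - e) / min(a, e) for a, e in zip(actual, estimated))


# Made-up efforts in man-months, for illustration only.
actual = [100.0, 200.0, 400.0]
estimated = [110.0, 180.0, 400.0]
print(vaf(actual, estimated), mare(actual, estimated), pred(actual, estimated, 25))
```

With these definitions a perfect estimate gives a VAF of 100 and a MARE, VARE and BRE of 0, which matches the direction of the comparisons stated above.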

4. Experimental Study

The COCOMO81 database consists of data on 63 projects, of which 28 are Embedded Mode projects, 12 are Semi-Detached Mode projects and 23 are Organic Mode projects. Thus, the projects are not distributed uniformly over the different modes. In carrying out our experiments, we have chosen the 53 of these 63 projects whose size (lines of code) is less than 100 KDSI.



Referring to Table IV, we see that the fuzzy method using TMF yields better results on most criteria when compared with the other methods. Thus, based on VAF, MARE and BRE, we conclude that the fuzzy method using TMF (triangular membership function) is better than the fuzzy method using GBellMF or Intermediate COCOMO. No method can be expected to give 100% VAF, but by suitably adjusting the values of the parameters in the FIS we can optimize the estimated effort.