Early Software Defect Prediction Computer Science Essay

1. Abstract

High-quality software is now a universal expectation, yet problems such as cost overruns, schedule slippage and poor reliability are usually detected too late in a failing project to be corrected. Defect density is an important measure of software quality, and this paper presents a fuzzy model for early software defect prediction. Software defects are influenced by reliability-relevant metrics that reflect three perspectives of software development: product, process and organization. These metrics are therefore used as inputs for early defect prediction. The proposed model supports early corrective decisions on resource allocation, quality, cost and reliability.

Keywords: Fuzzy logic, Software Defect, Software Reliability, Software Metrics

2. Introduction

Human dependence on software has increased continuously over the last three decades, as software has come to play a vital role in everyday systems. A great deal of money and effort is wasted on unsuccessful development projects, and producing high-quality products has become a challenge for the industry. Projects often fail because problems are detected too late to be corrected. It is therefore necessary to predict software defects in the early phases of the software development process.

Software reliability is an important factor of software quality. IEEE defines software reliability as "the probability that software will not cause failure of a system for a specified time under specified condition" [1]. Reliability must be assured in almost all safety-critical instrumentation and control systems. Software reliability models were designed to quantify the likelihood of software failure [2][3]. A failure is defined as "the departure of external result of program operation from requirements", whereas a fault is defined as "the defect in the program that when executed under particular condition, cause a failure" [3].

Software reliability plays an important role in every early phase of software development [4]. To meet a target defect estimate, the number of defects must be predicted at each phase of development. Various software reliability models have been developed in the last three decades [3]. These models fall into two categories: software reliability prediction models (SRPM) and software reliability estimation models (SRGM). Software reliability can be estimated and predicted using failure data collected during the testing phase, but failure data are not available in the early phases of the software development life cycle. It is therefore necessary to identify reliability-relevant metrics and use them to predict software faults in the early phases. Because the information available in the early phases is ambiguous and vague, and fuzzy sets deal with imprecision and uncertainty, a fuzzy model is an appropriate choice.

3. Related Works

Early prediction models allow early identification of problems such as budget overruns, schedule slippage and poor quality. Many studies have addressed software reliability estimation and prediction with various models [3][7]. Gaffney and Davis [5][6] developed a phase-based model that predicts reliability from the fault statistics found during the reviews of the various software development phases. The Air Force's Rome Laboratory developed a model for early software reliability prediction [8][9], selecting factors related to fault density in the requirements, design and coding phases. Agresti and Evanco [10] developed a prediction model for Ada programs based on process and product characteristics. In addition, various studies have been conducted using size and complexity metrics.

Various factors affect software reliability in the initial phases. Hang and Pham [11] identified 32 such factors. Li and Smidts [12] subsequently ranked the software metrics relevant to reliability.

More recently, Kumar and Misra [13] explored a model for early software reliability prediction that considers only product metrics and ignores process metrics. Pandey and Goyal [15] proposed an early fault prediction model using process maturity and software metrics. They considered fuzzy profiles of various metrics on different scales but did not explain the criteria used to develop these profiles. The proposed model addresses these issues and predicts the number of defects at every phase of software development.

4. Proposed model

The top reliability-relevant software metrics, as ranked by M. Li [12], are the inputs to the proposed fuzzy model. In the requirements phase, defects are predicted from fault density, requirement specification change requests, and requirement inspection and walkthrough. The defects predicted at the end of the requirements phase are taken as an input to the design phase, together with cyclomatic complexity and design review effectiveness, to predict the defects at the end of the design phase.

Similarly, the defects predicted at the end of the design phase are taken as an input to the coding phase, together with programmer capability and process maturity.

Fig. 1 Proposed Model Architecture

The output at the end of the coding phase is the total number of faults predicted for the software before the testing phase.
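The chaining of phases described above can be sketched as follows. This is only an illustration of the data flow, not the fuzzy model itself: the metric key names, the `previous_phase_defects` key and the `placeholder_predict` function are hypothetical, and each phase's fuzzy system is represented by a function passed in as `predict`.

```python
# Each phase lists its own metrics; names are illustrative, taken from the text.
PHASES = [
    ("requirements", ["fault_density", "change_requests", "inspection_walkthrough"]),
    ("design", ["cyclomatic_complexity", "design_review_effectiveness"]),
    ("coding", ["programmer_capability", "process_maturity"]),
]


def run_pipeline(metrics, predict):
    """Run the phases in order, feeding each phase's output into the next.

    `predict(phase, inputs)` stands in for the fuzzy system of one phase.
    """
    carried = None
    results = {}
    for phase, names in PHASES:
        inputs = {name: metrics[name] for name in names}
        if carried is not None:
            # Output of the previous phase becomes an extra input.
            inputs["previous_phase_defects"] = carried
        carried = predict(phase, inputs)
        results[phase] = carried
    return results


def placeholder_predict(phase, inputs):
    # Hypothetical stand-in, NOT the fuzzy model: mean of the phase inputs.
    return sum(inputs.values()) / len(inputs)


# Assumed example: every metric at the midpoint of a normalized [0, 1] range.
example_metrics = {name: 0.5 for _, names in PHASES for name in names}
print(run_pipeline(example_metrics, placeholder_predict))
```

With this wiring, every phase after the first receives three inputs: its own two metrics and the prediction carried over from the preceding phase.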

4.1. Selection of software metrics

A study conducted by Li and Smidts [12] identified thirty software metrics that influence software reliability and ranked them by their ability to predict it. Using all of the reliability-relevant metrics to predict software reliability is impractical because of the computational complexity.

For this reason, the proposed model considers the top three metrics of the requirements phase, the top two metrics of the design phase and the top two metrics of the coding phase.

4.2 Defining membership function of input and output variable

There are many methods of membership value assignment, such as intuition, inference and rank ordering [15]-[18]. The intuition method relies on common human intelligence and understanding. The inference method uses knowledge to perform deductive reasoning: given a body of facts and knowledge, a conclusion is inferred. Rank ordering assigns membership values based on preferences expressed by a single user, a committee, a poll or other opinion-gathering methods.

In this model, triangular and trapezoidal membership functions [18][19] represent the linguistic states low, medium and high (L, M, H) of each input metric and very low, low, medium, high and very high (VL, L, M, H, VH) of each output metric. The range of every input metric is normalized.
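A minimal sketch of triangular and trapezoidal membership functions for the three input states is given below. The breakpoints of the L, M and H sets, the [0, 1] normalization range and the helper names `normalize` and `fuzzify` are assumptions made for illustration; the essay does not state the actual parameters used.

```python
def normalize(value, low, high):
    """Min-max scale a raw metric value into [0, 1], clamping out-of-range input."""
    scaled = (value - low) / (high - low)
    return max(0.0, min(1.0, scaled))


def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], linear on [a, b] and [c, d].

    Setting a == b or c == d gives a left or right shoulder.
    """
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Assumed breakpoints on the normalized range; not taken from the essay.
INPUT_SETS = {
    "L": lambda x: trapezoidal(x, 0.0, 0.0, 0.2, 0.4),
    "M": lambda x: triangular(x, 0.2, 0.5, 0.8),
    "H": lambda x: trapezoidal(x, 0.6, 0.8, 1.0, 1.0),
}


def fuzzify(x):
    """Return the degree of membership of x in each linguistic state."""
    return {label: mu(x) for label, mu in INPUT_SETS.items()}


# Hypothetical raw metric value 15 on an assumed raw scale of 0 to 50.
print(fuzzify(normalize(15, 0, 50)))
```

A value between two breakpoints belongs partly to two neighbouring states, which is how the model represents the vagueness of early-phase information.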

4.3 Defining Fuzzy Rule Base

Rules may be provided by experts or extracted from numerical data [15]-[18]. A fuzzy rule is a conditional statement of the form

IF X is A THEN Y is B

The IF part is known as the antecedent and the THEN part as the consequent. In the proposed model, the value of the consequent is assessed by a domain expert. Every early phase has twenty-seven rules, because each phase has three inputs with three linguistic states each (3 x 3 x 3 = 27).
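The count of twenty-seven rules per phase follows from enumerating every combination of the three linguistic states over three inputs. The sketch below generates the antecedents for the requirements phase; the input names and the helper functions are illustrative, and the consequents are left empty because in the model they are assigned by a domain expert.

```python
from itertools import product

STATES = ("L", "M", "H")
# Illustrative names for the three requirements-phase inputs.
REQUIREMENTS_INPUTS = ("fault_density", "change_requests", "inspection_walkthrough")


def build_rule_base(inputs, states=STATES):
    """Enumerate every antecedent; consequents stay None until an expert fills them."""
    rules = []
    for combo in product(states, repeat=len(inputs)):
        rules.append({"if": dict(zip(inputs, combo)), "then": None})
    return rules


def format_rule(rule, output_name="defects"):
    """Render one rule in the IF ... THEN ... form."""
    antecedent = " AND ".join(
        f"{name} is {state}" for name, state in rule["if"].items()
    )
    consequent = rule["then"] if rule["then"] is not None else "<expert-assigned>"
    return f"IF {antecedent} THEN {output_name} is {consequent}"


rules = build_rule_base(REQUIREMENTS_INPUTS)
print(len(rules))
print(format_rule(rules[0]))
```

The same enumeration applies to the design and coding phases, since each of them also has three inputs once the defects carried over from the previous phase are counted.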

Requirements phase fuzzy rules


Design phase fuzzy rules


Coding phase fuzzy rules


4.4 Inference and Defuzzification

The inference engine of a fuzzy system evaluates every rule and combines the results, much as humans use many different inferential procedures to understand a situation and make decisions. The inference engine maps fuzzy sets into a fuzzy set, but many applications need a crisp number at the output of the fuzzy system. Defuzzification methods such as centroid, max-min and bisection map a fuzzy set into a crisp number [15]-[18]. The centroid method is the most popular.
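The sketch below illustrates one common way to carry out these two steps, assuming Mamdani-style inference: the firing strength of a rule is the minimum of its antecedent memberships, the clipped consequents are combined with the maximum, and the centroid of the combined set is the crisp output. The essay does not specify the inference type, so this choice is an assumption, as are the membership breakpoints, the 0 to 100 output range and the three example rules, which are not the expert's actual rule base.

```python
def tri(x, a, b, c):
    """Triangular membership with peak at b; a == b or b == c gives a shoulder."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed input sets on a normalized [0, 1] range.
IN_SETS = {
    "L": lambda x: tri(x, 0.0, 0.0, 0.5),
    "M": lambda x: tri(x, 0.0, 0.5, 1.0),
    "H": lambda x: tri(x, 0.5, 1.0, 1.0),
}
# Assumed output sets on a hypothetical 0 to 100 defect scale.
OUT_SETS = {
    "VL": lambda y: tri(y, 0, 0, 25),
    "L": lambda y: tri(y, 0, 25, 50),
    "M": lambda y: tri(y, 25, 50, 75),
    "H": lambda y: tri(y, 50, 75, 100),
    "VH": lambda y: tri(y, 75, 100, 100),
}
# Three illustrative rules (FD = fault density, RSCR = change requests,
# RIW = inspection and walkthrough); hypothetical, not the expert's rules.
RULES = [
    ({"FD": "H", "RSCR": "H", "RIW": "L"}, "VH"),
    ({"FD": "M", "RSCR": "M", "RIW": "M"}, "M"),
    ({"FD": "L", "RSCR": "L", "RIW": "H"}, "VL"),
]


def infer(inputs, rules=RULES):
    """Min-max inference followed by centroid defuzzification on a sampled grid."""
    grid = range(101)
    aggregated = [0.0] * len(grid)
    for antecedent, label in rules:
        # Firing strength: minimum membership over the rule's conditions.
        strength = min(IN_SETS[s](inputs[name]) for name, s in antecedent.items())
        for i, y in enumerate(grid):
            clipped = min(strength, OUT_SETS[label](y))
            aggregated[i] = max(aggregated[i], clipped)
    area = sum(aggregated)
    if area == 0:
        return None  # no rule fired
    return sum(y * mu for y, mu in zip(grid, aggregated)) / area


print(infer({"FD": 0.5, "RSCR": 0.5, "RIW": 0.5}))
```

Because the output depends entirely on the membership functions and the rule consequents, the numbers produced by this sketch are not expected to match the results reported for the proposed model.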

5. Prediction Result of Proposed model

The proposed model is validated with the worst-case, best-case and average-case values of all the metrics. The number of defects predicted in each early phase of the software life cycle is shown in the table below.



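Metric    Worst case    Best case    Average case

Requirements phase
RSCR      1             0            0.5
FD        1             0            0.5
RIW       0             1            0.5
DEFECT    70            1.33         20

Design phase
CC        1             0            0.5
DRE       0             1            0.5
RPD       1             0            0.5
DEFECT    80            3            45

Coding phase
PC        0             1            0.5
PM        0             1            0.5
DPD       1             0            0.5
DEFECT    85            6            40

RSCR = requirement specification change requests, FD = fault density, RIW = requirement inspection and walkthrough, CC = cyclomatic complexity, DRE = design review effectiveness, PC = programmer capability, PM = process maturity. RPD and DPD are the defects predicted at the end of the requirements and design phases.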

From the above table it is clear that, in the worst case, 85 defects are predicted in total by the end of the coding phase:

Defects in requirements phase = 70 (82% of total defects)

Defects in design phase = 80 - 70 = 10 (12% of total defects)

Defects in coding phase = 85 - 80 = 5 (6% of total defects)
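The worst-case breakdown can be reproduced with a few lines of arithmetic. The sketch below treats the DEFECT values in the table as cumulative totals at the end of each phase, which is the reading implied by the subtractions above; the function and variable names are illustrative.

```python
# Worst-case cumulative defect predictions at the end of each phase.
cumulative = {"requirements": 70, "design": 80, "coding": 85}


def per_phase(cumulative_totals):
    """Convert cumulative totals into the defects attributed to each phase."""
    result = {}
    previous = 0
    for phase, total in cumulative_totals.items():
        result[phase] = total - previous
        previous = total
    return result


introduced = per_phase(cumulative)
total = cumulative["coding"]
# Share of the final total, rounded to the nearest whole percent.
shares = {phase: round(100 * count / total) for phase, count in introduced.items()}

for phase in introduced:
    print(f"{phase}: {introduced[phase]} defects, {shares[phase]}% of total")
```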

6. Conclusion

The proposed early defect prediction model considers the metrics that are most vital from a defect-assessment point of view. It is useful for early software defect prediction, when sufficient failure data are not available. The top seven metrics are used to predict defects in the early phases. This gives the development team a quantitative prediction of the number of defects likely to exist in a given phase.