Comparative Study of Advanced Classification Methods

4404 words (18 pages) Essay

29th Mar 2018 Computer Science

CHAPTER 7

TESTING AND RESULTS

7.0 Introduction to Software Testing

Software testing is the process of executing a program or system with the intent of finding errors, commonly termed bugs. More broadly, it involves any activity aimed at evaluating an attribute or capability of a program or system and determining whether it meets its required results. Bugs will almost always exist in any software module of moderate size, not because programmers are careless or irresponsible, but because the complexity of software is generally intractable and humans have only a limited ability to manage complexity. For the same reason, design defects can never be completely ruled out in any complex system.

7.2 Testing Process

The basic goal of the software development process is to produce software that has no errors or very few errors. In an effort to detect errors soon after they are introduced, each phase ends with a verification activity such as a review. However, most of these verification activities in the early phases of software development are based on human evaluation and cannot detect all errors. The testing process starts with a test plan, which specifies all the test cases required. The unit under test is then executed with the test cases, and reports are produced and analyzed. When testing of a unit is complete, the tested unit can be combined with other untested modules to form new test units. Testing of any unit involves the following:

  • Plan the test cases
  • Execute the test cases
  • Evaluate the results of the testing

7.3 Development of Test Cases

A test case in software engineering is a set of conditions or variables under which a tester determines whether an application or software system is working correctly. The mechanism for determining whether a software program or system has passed or failed such a test is known as a test oracle.

Test cases follow a certain format, given as follows:

  1. Test case ID: Every test case has a unique identifier in a fixed format. This ID is used to track the test case in the system upon execution. The same test case ID is used in defining the test script.
  2. Test case description: Every test case has a description that states which functionality of the software is to be tested.
  3. Test category: The test category defines the business category of the test case, such as functional test, negative test or accessibility test. It is usually associated with the test case ID.
  4. Expected result and actual result: These are implemented within the respective API. As the testing is done for a web application, the actual result is available within the web page.
  5. Pass/fail: The result of a test case is either pass or fail. Validation is based on the expected and actual results: if they are the same the test case passes, otherwise it fails.
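The five fields listed above map naturally onto a small record type. The following is a minimal sketch only; the class name `CaseRecord`, its field names and the sample values are illustrative assumptions and are not taken from the project's test scripts.

```python
from dataclasses import dataclass


@dataclass
class CaseRecord:
    """One test case with the five fields of the format described above (names are hypothetical)."""
    case_id: str           # unique identifier, e.g. "TC_ID1_01"
    description: str       # which functionality is being tested
    category: str          # e.g. "functional", "negative", "accessibility"
    expected: object       # expected result
    actual: object = None  # actual result, filled in after execution

    @property
    def status(self) -> str:
        # A case passes only when the expected and actual results are the same.
        return "Pass" if self.actual == self.expected else "Fail"


case = CaseRecord(
    case_id="TC_ID1_01",
    description="One set of attributes on the positive side of the hyperplane",
    category="functional",
    expected=1,
)
case.actual = 1  # value observed when the test is executed
print(case.case_id, case.status)
```

Deriving the status from the two results, rather than storing it, keeps the record from ever claiming a pass that the stored values do not support.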

7.4 Testing of Application Software

The following testing was carried out on the application software.

  • Integration Testing

7.4.1 Integration Testing

In this phase of software testing, individual software modules are combined and tested as a group. The purpose of integration testing is to verify the functional, performance and reliability requirements placed on major design items. These “design items”, i.e. assemblages (or groups of units), are exercised through their interfaces using black box testing, with success and error cases simulated via appropriate parameter and data inputs. Simulated usage of shared data areas and inter-process communication is tested, and individual subsystems are exercised through their input interfaces. Test cases are constructed to test that all components within assemblages interact correctly, for example across procedure calls or process activations. This is done after testing individual modules, i.e. unit testing.

The overall idea is a “building block” approach, in which verified assemblages are added to a verified base, which is then used to support the integration testing of further assemblages. In this approach, all or most of the developed modules are coupled together to form a complete software system or a major part of the system, which is then used for integration testing. Integration testing is a systematic technique for constructing the program structure while at the same time conducting tests to uncover errors associated with interfacing. The objective is to take unit-tested modules and build a program structure that has been dictated by the design.

The top-down approach to integration testing requires the highest-level modules be tested and integrated first. This allows high-level logic and data flow to be tested early in the process and it tends to minimize the need for drivers. The bottom-up approach requires the lowest-level units be tested and integrated first. These units are frequently referred to as utility modules. By using this approach, utility modules are tested early in the development process and the need for stubs is minimized. The third approach, sometimes referred to as the umbrella approach, requires testing along functional data and control-flow paths. First, the inputs for functions are integrated in the bottom-up pattern.
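The difference between a stub (used in top-down integration) and a driver (used in bottom-up integration) can be illustrated with a short sketch. All function names and values here are hypothetical and are not part of the application under test.

```python
# Top-down: the high-level module is real, the lower-level module is replaced by a stub.
def classify_stub(features):
    """Stub standing in for a classifier module that is not integrated yet."""
    return 1  # canned answer, enough to exercise the caller's logic


def run_pipeline(rows, classify):
    """High-level module under test: applies a classifier to every row."""
    return [classify(row) for row in rows]


# Bottom-up: the low-level utility is real, and a throwaway driver calls it directly.
def scale(value, mean, std):
    """Low-level utility module: standardize one attribute value."""
    return (value - mean) / std


def driver_for_scale():
    """Driver: feeds a test input to the utility before any real caller exists."""
    return scale(60.0, mean=50.0, std=5.0)


top_down_result = run_pipeline([[38, 59, 2], [39, 63, 4]], classify_stub)
bottom_up_result = driver_for_scale()
print(top_down_result, bottom_up_result)
```

Top-down integration needs stubs like `classify_stub` but few drivers, whereas bottom-up integration needs drivers like `driver_for_scale` but few stubs, which is the trade-off the paragraph above describes.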

7.4.1.1 Test Cases for Support Vector Machine

The Support Vector Machine is tested with attribute sets that fall only on the positive side of the hyperplane, only on the negative side, on both sides, and on the hyperplane itself. In every case the expected results match the actual results.

| Test Case ID | Input | Expected Result | Actual Result | Status |
|---|---|---|---|---|
| TC_ID1_01 | One set of attributes that falls on the positive side of the hyperplane | The predicted class label is 1 | The actual class label of the test set is 1 | Pass |
| TC_ID1_02 | One set of attributes that falls on the negative side of the hyperplane | The predicted class label is -1 | The actual class label of the test set is -1 | Pass |
| TC_ID1_03 | Two sets of attributes that fall on the positive side of the hyperplane | The predicted class label for both sets of attributes is 1 | The actual class label of the test set is 1 | Pass |
| TC_ID1_04 | Two sets of attributes that fall on the negative side of the hyperplane | The predicted class label for both sets of attributes is -1 | The actual class label of the test set is -1 | Pass |
| TC_ID1_05 | Five sets of attributes that fall on the positive side of the hyperplane | The predicted class label for all five sets of attributes is 1 | The actual class label of the test set is 1 | Pass |
| TC_ID1_06 | Five sets of attributes that fall on the negative side of the hyperplane | The predicted class label for all five sets of attributes is -1 | The actual class label of the test set is -1 | Pass |
| TC_ID1_07 | One set of attributes that falls on the hyperplane | The predicted class label is 1 | The actual class label of the test set is 1 | Pass |
| TC_ID1_08 | Two sets of attributes, one on the positive side of the hyperplane and the other on the negative side | The predicted class labels are 1 and -1 respectively | The actual class label is the same as the predicted class label | Pass |
| TC_ID1_09 | Four sets of attributes, two on the positive side of the hyperplane and two on the negative side | Two sets predicted as 1 and the other two predicted as -1 | The actual class label is the same as the predicted class label | Pass |
| TC_ID1_10 | Ten sets of attributes, five on the positive side of the hyperplane and five on the negative side | Five sets predicted as 1 and the other five predicted as -1 | The actual class label is the same as the predicted class label | Pass |

Table 7.1: Test Cases for Support Vector Machine
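The input situations covered by Table 7.1 (positive side, negative side, and on the hyperplane) can be sketched with a linear decision rule. This is an illustration under assumptions: the decision value is taken to be w·x - gamma, a point lying exactly on the hyperplane is assigned label 1 because TC_ID1_07 expects that outcome, and the two-dimensional `w` and `gamma` below are hypothetical values chosen for readability.

```python
def svm_predict(w, gamma, x):
    """Label one attribute set by the side of the hyperplane w.x = gamma it falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - gamma
    # Points exactly on the hyperplane (score == 0) are mapped to 1, as in TC_ID1_07.
    return 1 if score >= 0 else -1


w, gamma = [1.0, 1.0], 1.0  # hypothetical hyperplane x1 + x2 = 1

positive = svm_predict(w, gamma, [2.0, 2.0])   # positive side
negative = svm_predict(w, gamma, [0.0, 0.0])   # negative side
boundary = svm_predict(w, gamma, [0.5, 0.5])   # exactly on the hyperplane
print(positive, negative, boundary)
```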

7.4.1.2 Test Cases for Naive Bayes Classifier

The Naive Bayes Classifier is tested with attribute sets that belong only to class ‘1’, only to class ‘-1’, and to both class ‘1’ and class ‘-1’. In every case the expected results match the actual results.

| Test Case ID | Input | Expected Result | Actual Result | Status |
|---|---|---|---|---|
| TC_ID2_01 | One set of attributes that belongs to class 1 | The posterior probability of class 1 is greater than that of class -1, so the predicted class label is 1 | The actual class label is the same as the predicted class label | Pass |
| TC_ID2_02 | One set of attributes that belongs to class -1 | The posterior probability of class -1 is greater than that of class 1, so the predicted class label is -1 | The actual class label is the same as the predicted class label | Pass |
| TC_ID2_03 | Two sets of attributes that belong to class 1 | The posterior probability of class 1 is greater than that of class -1, so the predicted class label is 1 for both sets | The actual class label is the same as the predicted class label | Pass |
| TC_ID2_04 | Two sets of attributes that belong to class -1 | The posterior probability of class -1 is greater than that of class 1, so the predicted class label is -1 for both sets | The actual class label is the same as the predicted class label | Pass |
| TC_ID2_05 | Five sets of attributes that belong to class 1 | The posterior probability of class 1 is greater than that of class -1, so the predicted class label is 1 for all five sets | The actual class label is the same as the predicted class label | Pass |
| TC_ID2_06 | Five sets of attributes that belong to class -1 | The posterior probability of class -1 is greater than that of class 1, so the predicted class label is -1 for all five sets | The actual class label is the same as the predicted class label | Pass |
| TC_ID2_07 | One set of attributes that belongs to class 1 | The posterior probability of class 1 is greater than that of class -1, so the predicted class label is 1 | The actual class label is the same as the predicted class label | Pass |
| TC_ID2_08 | Two sets of attributes, one belonging to class 1 and the other to class -1 | For one set the posterior probability of class 1 is greater, so its predicted class label is 1. For the other set the posterior probability of class -1 is greater, so its predicted class label is -1 | The actual class label is the same as the predicted class label | Pass |
| TC_ID2_09 | Four sets of attributes, two belonging to class 1 and two to class -1 | For two sets the posterior probability of class 1 is greater, so their predicted class label is 1. For the other two sets the posterior probability of class -1 is greater, so their predicted class label is -1 | The actual class label is the same as the predicted class label | Pass |
| TC_ID2_10 | Ten sets of attributes, five belonging to class 1 and five to class -1 | For five sets the posterior probability of class 1 is greater, so their predicted class label is 1. For the other five sets the posterior probability of class -1 is greater, so their predicted class label is -1 | The actual class label is the same as the predicted class label | Pass |

Table 7.2 Test Cases for Naive Bayes Classifier
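The rule exercised by these test cases, predicting the class with the larger posterior probability, can be sketched as follows. The source does not state which likelihood model the classifier uses, so the Gaussian per-attribute likelihood here is an assumption, and the function names and toy training rows are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def nb_fit(rows, labels):
    """Estimate, for each class, its prior and the mean and variance of every attribute."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[c] = (n / len(rows), means, variances)
    return model


def nb_predict(model, x):
    """Return the class whose unnormalized log posterior is largest."""
    def log_posterior(c):
        prior, means, variances = model[c]
        return math.log(prior) + sum(
            gaussian_log_pdf(xi, m, v) for xi, m, v in zip(x, means, variances))
    return max(model, key=log_posterior)


# Toy training data (hypothetical): three rows per class, two attributes each.
rows = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
labels = [1, 1, 1, -1, -1, -1]
model = nb_fit(rows, labels)

near_class_1 = nb_predict(model, [1.1, 2.1])
near_class_minus_1 = nb_predict(model, [5.1, 5.9])
print(near_class_1, near_class_minus_1)
```

Comparing log posteriors rather than raw probabilities gives the same winner while avoiding numerical underflow when many attributes are multiplied together.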

7.5 Testing Results of Case Studies

A case study is a particular example that is used or analyzed in order to illustrate a thesis or principle. It is a documented study of a real-life situation or of an imaginary scenario.

7.5.1 Problem Statement: Haberman Dataset

The Haberman data set contains cases from the University of Chicago’s Billings Hospital on the survival of patients who had undergone surgery for breast cancer. The task is to determine whether the patient survived 5 years or longer (positive) or died within 5 years (negative).

@relation haberman

@attribute Age integer [30, 83]

@attribute Year integer [58, 69]

@attribute Positive integer [0, 52]

@attribute Survival {positive, negative}

@inputs Age, Year, Positive

@outputs Survival

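Training Set

| survival | age | year | positive |
|---|---|---|---|
| -1 | 58 | 58 | 3 |
| -1 | 41 | 69 | 8 |
| 1 | 66 | 58 | 0 |
| -1 | 31 | 65 | 4 |
| -1 | 62 | 66 | 0 |
| ... | ... | ... | ... |

Test Set

| survival | age | year | positive |
|---|---|---|---|
| -1 | 38 | 59 | 2 |
| -1 | 39 | 63 | 4 |
| -1 | 49 | 62 | 1 |
| -1 | 53 | 60 | 2 |
| -1 | 47 | 68 | 4 |
| ... | ... | ... | ... |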

                 

Weight vector and gamma

w = [0.0991, 0.0775, 0.2813]

gamma = 0.3742

Predicted Class label of test set

-1.3679   -1.1651   -0.2434   -0.6684   -1
-1.2744    0.0353    0.1147   -0.4655   -1
-0.3392   -0.2648   -0.4224   -0.5472   -1
 0.0348   -0.865    -0.2434   -0.5062   -1
-0.5263    1.5358    0.1147   -0.2752   -1
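The source does not label the five values listed for each test sample. One reading that is consistent with the numbers, offered here as an interpretation rather than a statement of the author's method, is that the first three values are the scaled attributes Age, Year and Positive, the fourth is the decision value w·x - gamma, and the fifth is the predicted label (the sign of the decision value). For the first sample, using the weight vector and gamma given above:

```latex
\begin{aligned}
w^{\top}x - \gamma
  &= 0.0991(-1.3679) + 0.0775(-1.1651) + 0.2813(-0.2434) - 0.3742 \\
  &\approx -0.1356 - 0.0903 - 0.0685 - 0.3742 \\
  &\approx -0.6685
\end{aligned}
```

This agrees with the listed fourth value, -0.6684, up to rounding of the printed weights, and its negative sign corresponds to the predicted label -1.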

         
         

Confusion matrix of the classifier

True Positive (TP) = 8.000000, False Negative (FN) = 27.000000

False Positive (FP) = 8.000000, True Negative (TN) = 110.000000

AUC of classifier = 0.517792

Accuracy of classifier = 77.124183, Error rate of classifier = 22.875817

F_score = 31.372549, Precision = 50.0, Recall = 22.857143, Specificity = 93.220339

Confusion Matrix for SVM

| Actual Class Label | Predicted Class Label: 1 | Predicted Class Label: -1 |
|---|---|---|
| 1 | 8 (TP) | 27 (FN) |
| -1 | 8 (FP) | 110 (TN) |
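The accuracy, error rate, precision, recall, specificity and F-score reported for the SVM on the Haberman data follow from the four confusion-matrix counts. The source does not give its formulas, so the sketch below uses the standard definitions, which reproduce the reported values; the function name `metrics` is an assumption. AUC is left out because it cannot be derived from these four counts alone.

```python
def metrics(tp, fn, fp, tn):
    """Percentages derived from the four confusion-matrix counts (standard definitions)."""
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    return {
        "accuracy": 100 * (tp + tn) / total,
        "error_rate": 100 * (fp + fn) / total,
        "precision": 100 * precision,
        "recall": 100 * recall,
        "specificity": 100 * tn / (tn + fp),
        "f_score": 100 * 2 * precision * recall / (precision + recall),
    }


# Counts from the SVM confusion matrix for the Haberman data set.
svm_haberman = metrics(tp=8, fn=27, fp=8, tn=110)
for name, value in svm_haberman.items():
    print(f"{name}: {value:.6f}")
```

The same function applies unchanged to the other confusion matrices in this chapter by substituting their counts.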

 

Fig 7.1: Bar chart of SVM for various Performance Metric

Predicted Class Label of Naive Bayes Classifier

-1, -1, -1, -1, -1

True Positive (TP) = 10.000000, False Negative (FN) = 25.000000

False Positive (FP) = 11.000000, True Negative (TN) = 107.000000

AUC of Classifier = 0.5202

Accuracy of Classifier = 76.4706, Error Rate of Classifier = 23.5294

F_score = 35.7143, Precision = 47.6191, Recall = 28.5714, Specificity = 90.678

Confusion Matrix for NBC

| Actual Class Label | Predicted Class Label: 1 | Predicted Class Label: -1 |
|---|---|---|
| 1 | 10 (TP) | 25 (FN) |
| -1 | 11 (FP) | 107 (TN) |

       

Fig 7.2: Bar Chart of NBC for various Performance Metric

 

| Metric | SVM | NBC |
|---|---|---|
| AUC | 0.517792 | 0.5202 |
| Accuracy | 77.124183% | 76.4706% |
| Error Rate | 22.875817% | 23.5294% |
| F_Score | 31.372549% | 35.7143% |
| Precision | 50% | 47.6191% |
| Recall | 22.857143% | 28.5714% |
| Specificity | 93.220339% | 90.678% |

Tab 7.3: Comparison of SVM and NBC for various Performance Metric

Fig 7.3: Bar Chart for Comparison of SVM and NBC

7.5.2 Titanic Data set

The Titanic data set gives the values of four attributes: social class (first class, second class, third class or crew member), age (adult or child), sex, and whether or not the person survived.

@relation titanic

@attribute Class real[-1.87,0.965]

@attribute Age real[-0.228,4.38]

@attribute Sex real[-1.92,0.521]

@attribute Survived {-1.0,1.0}

@inputs Class, Age, Sex

@outputs Survived

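Training Set

| Survived | Class | Age | Sex |
|---|---|---|---|
| -1 | -1.87 | -0.228 | 0.521 |
| 1 | -0.923 | -0.228 | -1.92 |
| 1 | -0.923 | -0.228 | -1.92 |
| 1 | 0.965 | -0.228 | 0.521 |
| -1 | 0.0214 | -0.228 | 0.521 |

Test Set

| Survived | Class | Age | Sex |
|---|---|---|---|
| -1 | 0.965 | -0.228 | 0.521 |
| 1 | 0.965 | -0.228 | 0.521 |
| -1 | 0.965 | -0.228 | 0.521 |
| -1 | 0.965 | -0.228 | 0.521 |
| 1 | 0.965 | -0.228 | 0.521 |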

w = [-0.1025, 0.0431, -0.3983]

gamma = 0.3141

Predicted Class label of test set

0.9665   -0.2249   0.4969   -0.6209   -1
0.9665   -0.2249   0.4969   -0.6209   -1
0.9665   -0.2249   0.4969   -0.6209   -1
0.9665   -0.2249   0.4969   -0.6209   -1
0.9665   -0.2249   0.4969   -0.6209   -1

Confusion matrix of the classifier

True Positive (TP) = 154.000000, False Negative (FN) = 181.000000

False Positive (FP) = 64.000000, True Negative (TN) = 701.000000

AUC of Classifier = 0.426392

Accuracy of classifier in test set = 77.727273

Error rate of classifier in test set = 22.272727

F_score = 55.696203, Precision = 70.642202, Recall = 45.970149, Specificity = 91.633987

Confusion Matrix for SVM

| Actual Class Label | Predicted Class Label: 1 | Predicted Class Label: -1 |
|---|---|---|
| 1 | 154 (TP) | 181 (FN) |
| -1 | 64 (FP) | 701 (TN) |

Fig 7.4 Bar chart of SVM for various Performance Metric

Predicted Class Label of Naive Bayes Classifier

-1, -1, -1, -1, -1

True Positive (TP) = 197.000000, False Negative (FN) = 138.000000

False Positive (FP) = 148.000000, True Negative (TN) = 617.000000

AUC of Classifier = 0.4782

Accuracy of Classifier = 74, Error Rate of Classifier = 26

F_Score = 57.9412, Precision = 57.1015, Recall = 58.806, Specificity = 80.6536

Confusion Matrix for NBC

| Actual Class Label | Predicted Class Label: 1 | Predicted Class Label: -1 |
|---|---|---|
| 1 | 197 (TP) | 138 (FN) |
| -1 | 148 (FP) | 617 (TN) |

Fig 7.5 Bar chart of NBC for various Performance Metric

 

| Metric | SVM | NBC |
|---|---|---|
| AUC | 0.426392 | 0.4782 |
| Accuracy | 77.727273% | 74% |
| Error Rate | 22.272727% | 26% |
| F_Score | 55.696203% | 57.9412% |
| Precision | 70.642202% | 57.1015% |
| Recall | 45.970149% | 58.806% |
| Specificity | 91.633987% | 80.6536% |

Tab 7.4: Comparison of SVM and NBC for various Performance Metric

Fig 7.6 Bar Chart for Comparison of SVM and NBC

Department of CSE, RNSIT, 2014-15
