Efficiency Of Coding Techniques Using Svm Classifier Computer Science Essay


Classification is the problem of assigning a new observation to one of a set of categories. Here, classification techniques are applied to evaluate and improve code quality. A binary classifier can predict only a limited number of classes, and it does so with lower accuracy. To address this problem, a support vector machine (SVM) classifier is used, which helps in detecting false positives, improves code quality and increases accuracy.

INDEX TERMS: Binary classifier, support vector machine, kernel machine, machine learning.


Formal program specifications are difficult for humans to construct correctly, which motivates specification mining. A false positive is a candidate specification that does not describe correct program behaviour. In large software systems, the lack of specifications makes errors in the source code difficult to debug. Mined specifications are typically two-state temporal properties, which have limited expressive power, and mining temporal properties produces a large set of candidate specifications [1]. In the learning model, a set of features is associated with each potential pair (a, b). Recall and precision are the two measures used to evaluate the binary classifier in the learning model. Recall measures the probability that a given true specification is returned. Precision is the probability that a returned candidate specification is a true specification. Linear classifiers can be trained very efficiently [1].
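The two measures can be illustrated with a minimal Python sketch. It uses the standard definitions (precision is the fraction of returned candidates that are true specifications, recall is the fraction of true specifications that are returned). The function name and the example pairs (a, b) are hypothetical and are not taken from [1].

```python
def precision_recall(candidates, true_specs):
    """Return (precision, recall) for a set of mined candidate specifications."""
    candidates = set(candidates)
    true_specs = set(true_specs)
    true_positives = len(candidates & true_specs)
    # Precision: share of returned candidates that are true specifications.
    precision = true_positives / len(candidates) if candidates else 0.0
    # Recall: share of true specifications that were returned.
    recall = true_positives / len(true_specs) if true_specs else 0.0
    return precision, recall


# Hypothetical pairs (a, b), read as "a must eventually be followed by b".
mined = [("open", "close"), ("lock", "unlock"), ("read", "print"), ("malloc", "free")]
valid = [("open", "close"), ("lock", "unlock"), ("malloc", "free"),
         ("begin", "commit"), ("connect", "disconnect")]

precision, recall = precision_recall(mined, valid)
print(precision, recall)  # 0.75 0.6
```

In this made-up example one of the four mined candidates is a false positive, and two of the five valid specifications are missed.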

This paper develops a learning model based on the support vector machine (SVM). Support vector machines are a set of learning methods used for classification and regression. They use machine learning theory to maximize predictive accuracy while automatically avoiding over-fitting to the data. A support vector machine can be defined as a system that uses a hypothesis space of linear functions in a high-dimensional feature space.

Support vector machines are used in many applications, such as handwriting analysis, and especially in pattern classification and regression. In the statistical learning formulation of the support vector machine, we are given a training set {(x1,y1), ..., (xl,yl)} in R^n × R, sampled according to an unknown probability distribution P(x,y), and a loss function V(y,f(x)) that measures the error incurred when, for a given x, f(x) is "predicted" instead of the actual value y.

An automatic specification miner must balance true positives (required behaviours) against false positives (non-required behaviours). Previous miners have high false positive rates because they assume that all code is equally likely to be correct.
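In the usual statistical learning formulation, which is assumed here because the essay does not spell it out, the loss function V defines the expected risk of a function f under P(x,y). Because P is unknown, the expected risk is approximated by the empirical risk over the l training examples:

```latex
I[f] = \int V\bigl(y, f(x)\bigr)\, dP(x, y),
\qquad
I_{\mathrm{emp}}[f] = \frac{1}{l} \sum_{i=1}^{l} V\bigl(y_i, f(x_i)\bigr)
```

The symbols I[f] and I_emp[f] are notation introduced for this illustration.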


1. Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection:

This paper compares two accuracy estimation methods: cross-validation and bootstrap. Estimating the accuracy of a classifier induced by a supervised learning algorithm is important not only for predicting its future prediction accuracy, but also for choosing a classifier [3]. For estimating the final accuracy of a classifier, we would like an estimation method with low bias and low variance. The paper explains some of the assumptions made by the different estimation methods and presents concrete examples where each method fails.

The bias of a method that estimates a parameter θ is defined as the expected value of the estimate minus the true value of θ. An unbiased method is a method that has zero bias. A method with low bias may still perform poorly because of high variance [4]. The reported results indicate that stratified cross-validation is a better scheme, in terms of both bias and variance, than regular cross-validation. The bootstrap has low variance, but extremely large bias on some problems.
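The two estimation methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the synthetic one-dimensional data, the nearest-centroid classifier and all function names are hypothetical, and the bootstrap variant shown is the simple out-of-bag estimate rather than the .632 bootstrap.

```python
import random


def make_data(n, rng):
    """Hypothetical two-class 1-D data: class 0 near 0.0, class 1 near 2.0."""
    return [(rng.gauss(2.0 * (i % 2), 1.0), i % 2) for i in range(n)]


def train_centroid(train):
    """Fit a nearest-centroid classifier and return its predict function."""
    means = {}
    for label in {y for _, y in train}:
        values = [x for x, y in train if y == label]
        means[label] = sum(values) / len(values)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def accuracy(predict, test):
    return sum(predict(x) == y for x, y in test) / len(test)


def cross_validation(data, k, rng):
    """k-fold cross-validation: every instance is tested exactly once."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    correct = 0
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        predict = train_centroid(train)
        correct += sum(predict(x) == y for x, y in test)
    return correct / len(data)


def bootstrap(data, rounds, rng):
    """Out-of-bag bootstrap: train on a resample, test on the left-out instances."""
    scores = []
    for _ in range(rounds):
        indices = [rng.randrange(len(data)) for _ in data]
        chosen = set(indices)
        train = [data[i] for i in indices]
        test = [row for i, row in enumerate(data) if i not in chosen]
        if test:  # skip the rare resample that leaves nothing out
            scores.append(accuracy(train_centroid(train), test))
    return sum(scores) / len(scores)


rng = random.Random(0)
data = make_data(200, rng)
cv_estimate = cross_validation(data, 10, rng)
bootstrap_estimate = bootstrap(data, 50, rng)
print(cv_estimate, bootstrap_estimate)
```

Repeating both estimates with different seeds gives a rough feel for the variance of each method on a given problem.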

2. Mining Temporal Specifications for Error Detection:

Specifications are necessary in order to find software bugs using program verification tools. This paper presents a new automatic specification mining algorithm that uses information about error handling to learn temporal safety rules [2].

It is based on the observation that programs often make mistakes along exceptional control-flow paths, even when they behave correctly on normal execution paths. This focus can improve the effectiveness of the miner for discovering specifications that are useful for bug finding [1]. Finally, the authors give a quantitative comparison of the bug-finding power of their technique against common "library" policies. For their domain of interest, mining finds 250 more bugs. They also show the relative unimportance of ranking candidate policies. In all, they find 69 specifications that lead to the discovery of over 430 bugs in 1 million lines of code.

3. Specification Mining With Few False Positives:

This paper presents a novel technique that automatically infers partial correctness specifications with a very low false positive rate [13]. Existing specification miners have high false positive rates because they assign equal weight to all aspects of program behaviour.

The technique is evaluated in two ways: as a preprocessing step for an existing specification miner and as part of a novel pattern inference algorithm. The technique identifies which traces are most indicative of program behaviour, which allows off-the-shelf mining techniques to learn the same number of specifications using 60% of their original input. This results in far fewer false positives than state-of-the-art techniques, while still finding useful specifications on over 800,000 lines of code. When minimizing false alarms, the technique obtains a 5% false positive rate, an order of magnitude improvement over previous work [8]. When combined with bug-finding software, the mined specifications locate over 250 policy violations.

4. Naive Bayes Algorithm:

In the binary classifier, the naive Bayes algorithm is used to predict classes for lines of code. To evaluate whether a candidate specification represented a true or false positive, we used the source code of a and b, the surrounding comments, source code in which a and b were either adhered to or violated, and related documentation. The naive Bayes algorithm is based on conditional probabilities. The naive Bayes classifier predicts the class of an instance from a given set of attribute values as follows:

1) Calculate the probability of each attribute value for each class value.

2) Use the product rule to obtain the joint probability of the attributes.

3) Use Bayes' rule to derive the conditional probabilities for the class variable.

It uses Bayes' theorem, a formula that calculates a probability by counting the frequency of values and combinations of values in the historical data. Bayes' theorem finds the probability of an event occurring given the probability of another event that has already occurred. If B represents the dependent event and A represents the prior event, Bayes' theorem can be stated as follows.

Bayes' Theorem:

Prob(B given A) = Prob(A and B)/Prob(A)

To calculate the probability of B given A, the algorithm counts the number of cases where A and B occur together and divides it by the number of cases where A occurs (with or without B). Naive Bayes makes the assumption that each predictor is conditionally independent of the others: for a given target value, the distribution of each predictor is independent of the other predictors.
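The counting procedure described above can be sketched as a small categorical naive Bayes classifier in Python. The attribute names, class labels and training rows are hypothetical. Add-one (Laplace) smoothing is an assumption added here so that an unseen attribute value does not force a probability of zero.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    distinct_values = defaultdict(set)   # attribute index -> values seen in training
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            distinct_values[i].add(value)
    return class_counts, value_counts, distinct_values


def class_probabilities(model, row):
    """Return Prob(class | attributes) for every class."""
    class_counts, value_counts, distinct_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior Prob(class)
        for i, value in enumerate(row):
            # Step 1: Prob(attribute value | class), with add-one smoothing.
            seen = value_counts[(i, label)][value]
            likelihood = (seen + 1) / (count + len(distinct_values[i]))
            # Step 2: product rule under the independence assumption.
            score *= likelihood
        scores[label] = score
    # Step 3: Bayes' rule, normalizing over all classes.
    evidence = sum(scores.values())
    return {label: score / evidence for label, score in scores.items()}


# Hypothetical attributes of a candidate pair (a, b):
# (defined in the same package?, seen on an error-handling path?)
rows = [("yes", "yes"), ("yes", "no"), ("yes", "yes"),
        ("no", "no"), ("no", "yes"), ("no", "no")]
labels = ["true_spec", "true_spec", "true_spec",
          "false_positive", "false_positive", "false_positive"]

model = train_naive_bayes(rows, labels)
probabilities = class_probabilities(model, ("yes", "yes"))
predicted = max(probabilities, key=probabilities.get)
print(predicted, probabilities)
```

For the query ("yes", "yes") the unnormalized scores are 0.5 × 0.8 × 0.6 = 0.24 for true_spec and 0.5 × 0.2 × 0.4 = 0.04 for false_positive, so the predicted class is true_spec with probability 6/7.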


The binary naive Bayes classifier has the following drawbacks:

1. Only two classes can be predicted.

2. Execution time is longer.

3. The prediction of the classes is not efficient.


Support Vector Machine (SVM) is a classification and prediction tool that uses machine learning theory to maximize predictive accuracy. Support vector machines use a hypothesis space of linear functions in a high-dimensional feature space. SVMs are often used for tasks such as attack detection. Here the SVM is used to reduce execution time and to handle a larger number of classes when measuring code. The SVM classifier is combined with a k-nearest-neighbour classifier applied to the source code. A classification task usually involves training and test sets which consist of data instances. Each instance in the training set contains one target value (class label) and several attributes (features). The goal of a classifier is to produce a model able to predict target values of data instances in the testing set, for which only the attributes are known.

The SVM approach takes the following inputs:

1. Training data: a file containing a set of fixed-length, real-valued vectors that will serve as the training set.

2. Class labels: a file containing the training set classification labels.

3. Test data: a file containing the test set vectors.

The SVM server produces two output files:

a) Training set classification

b) Test set classification

The SVM classifier formula:

SVM decision function
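In its standard textbook form, given here as an assumption because the essay does not write the formula out, the SVM decision function for a binary problem with labels y_i in {-1, +1} is:

```latex
f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{l} \alpha_i \, y_i \, K(x_i, x) + b \right)
```

Here the α_i are the non-negative coefficients learned during training (non-zero only for the support vectors), K is the kernel function, b is the bias term, and l is the number of training examples.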

The classifier class is easy to use: it has two functions, Train and Classify. To train the classifier, a training data set is first created.
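A classifier with this two-function interface might look like the following Python sketch. It is not the implementation used in this work: the class name LinearSVM, its parameters and the toy vectors are hypothetical, and the training method shown (subgradient descent on the hinge loss with a linear kernel) is one simple choice among several.

```python
class LinearSVM:
    """Minimal linear SVM trained by subgradient descent on the hinge loss."""

    def __init__(self, learning_rate=0.01, reg=0.01, epochs=200):
        self.learning_rate = learning_rate
        self.reg = reg  # weight of the L2 penalty on w
        self.epochs = epochs
        self.w = []
        self.b = 0.0

    def _score(self, x):
        return sum(wj * xj for wj, xj in zip(self.w, x)) + self.b

    def train(self, vectors, labels):
        """Fit w and b; labels must be +1 or -1."""
        self.w = [0.0] * len(vectors[0])
        self.b = 0.0
        for _ in range(self.epochs):
            for x, y in zip(vectors, labels):
                violated = y * self._score(x) < 1
                # Shrink w (gradient of the L2 penalty).
                self.w = [wj * (1 - self.learning_rate * self.reg) for wj in self.w]
                if violated:
                    # Hinge-loss subgradient step for a margin violation.
                    self.w = [wj + self.learning_rate * y * xj
                              for wj, xj in zip(self.w, x)]
                    self.b += self.learning_rate * y

    def classify(self, vectors):
        """Return +1 or -1 for each input vector."""
        return [1 if self._score(x) >= 0 else -1 for x in vectors]


# Hypothetical fixed-length, real-valued vectors and their class labels.
training_data = [(2.0, 2.0), (3.0, 1.0), (1.0, 3.0), (2.0, 3.0),
                 (-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0), (-2.0, -3.0)]
class_labels = [1, 1, 1, 1, -1, -1, -1, -1]
test_data = [(2.5, 2.5), (-2.0, -2.5)]

svm = LinearSVM()
svm.train(training_data, class_labels)
training_set_classification = svm.classify(training_data)
test_set_classification = svm.classify(test_data)
print(training_set_classification, test_set_classification)
```

The two result lists correspond to the two outputs named in the text, the training set classification and the test set classification. The reg parameter controls the trade-off between margin size and training error.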

Fig. 1. SVM architecture. Training data, class labels and test data derived from the source code are passed to the SVM classifier, which outputs the training set classification and the test set classification.

SVM Classification:

SVM is a useful technique for data classification. A classification task usually involves training and testing data which consist of data instances [16]. Each instance in the training set contains one target value and several attributes. The goal of SVM is to produce a model which predicts the target values of the data instances in the testing set, for which only the attributes are given [17].

Classification in SVM is an example of supervised learning. One step in SVM classification is identifying the features that are closely connected to the known classes. This is called feature selection or feature extraction [17]. Feature selection and SVM classification together are useful even when prediction of unknown samples is not necessary.


1. A major strength of SVM is that training is relatively easy.

2. It scales relatively well to high-dimensional data, and the trade-off between classifier complexity and error can be controlled explicitly.


The binary naive Bayes classifier can predict only a small number of classes, which limits its accuracy. As a result, the space complexity of the specification increases and the execution time is high, so the specification results are less efficient. The proposed support vector machine classifier can easily predict a larger number of classes. SVM classification is used to learn from a larger training set with class labels and to test each class variable. With the SVM classifier, the accuracy of the class prediction increases and the execution time is reduced.