# Performance Analysis Of Support Vector Machine Biology Essay


Performance Analysis of Support Vector Machine and Bayesian Classifier for Crop and Weed Classification from Digital Images. In conventional cropping systems, removal of the weed population relies heavily on the application of chemical herbicides, which has succeeded in attaining higher profitability by increasing crop productivity and quality. However, concerns about the adverse effects of excessive herbicide application on the environment, wildlife, and the economy, as well as the emergence of herbicide-resistant weeds, have prompted increasing interest in alternative weed control approaches. Rather than applying herbicides uniformly across the field, an automated machine vision system that can distinguish crops from weeds in digital images and use the spatial variability of weeds in the field to control herbicide usage can be an economically feasible alternative. This paper investigates the use of the support vector machine (SVM) and the Bayesian classifier as machine learning algorithms for the effective classification of crops and weeds in digital images. The two techniques are compared to determine which provides better accuracy for crop and weed classification. Young plants with no overlap with other plants were considered in our study. In our experiments, a total of 22 features that characterize crops and weeds in images were tested to find, for each classifier, the combination of features that yields the highest classification rate. Analysis of the results reveals that SVM achieves above 98% accuracy over a set of 224 test images, whereas the Bayesian classifier achieves above 95% accuracy over the same set of images. Importantly, the best feature sets for both techniques contain only eleven features, which is computationally feasible for a real-time system.

Keywords: Support vector machine, Bayesian classifier, Herbicide, Weed control, Machine vision system.

## INTRODUCTION

Weeds can be defined as unwanted plants that survive and reproduce in agricultural fields. Weeds disturb farm production and quality by competing with crops for water, light, soil nutrients, and space. Crop yields may be reduced by 10 to 95 percent because of uncontrolled weeds [1]. As a result, better weed control strategies are crucial to sustain crop productivity and upgrade plantation systems. At present, several weed control strategies exist, including removing weeds manually by human labourers, mechanical cultivation, and applying agricultural chemicals known as herbicides. Applying herbicides is the most common method, but it has adverse impacts on the environment, human health, and other living organisms. It also raises a number of economic concerns. In the United States, the total cost of applying herbicides was estimated to be $16 billion in 2005 [2]. One of the main sources of cost inefficiency in herbicide-based systems is that herbicides are, in most cases, applied uniformly across a crop field. In reality, however, weeds are aggregated [3] and usually grow in clumps or patches [4] within the cultivated field. Many parts of the field may contain no weeds or an insignificant volume of them, yet herbicides are still applied there. On the other hand, applying herbicides manually is very time consuming and costly. If the same types of herbicides are applied in a field repeatedly to remove the weed population, weeds that have become tolerant to those herbicides often re-emerge. According to the International Survey of Herbicide Resistant Weeds [5], 346 herbicide-resistant biotypes belonging to 194 species (114 dicots and 80 monocots) are spread over 340,000 fields worldwide. Furthermore, pre-emergence herbicides like atrazine and alachlor are likely to contaminate ground and surface water supplies since they are soil-applied [4], which can pose a major problem for the safety of drinking water.

The performance of the agricultural sector has an overwhelming impact on food security, poverty alleviation, and the economic development of a country. In order to reduce the pressures on the agricultural sector, crop production and quality must be increased while the cost of weed control is reduced. Spraying herbicides with a knapsack sprayer is the most commonly used technique in agricultural fields; it is inefficient and time consuming, and recommended safety measures are rarely maintained. A machine vision system that can distinguish crops from weeds, so that herbicides can be applied selectively, can therefore enhance profitability and lessen environmental degradation. In this approach, images are taken by an automated system from different parts of a crop field so that weeds can be identified and sprayed accordingly. Two such approaches have been proposed for automated weed detection in agricultural fields [6]. The first classifies crops and weeds based on their geometric differences, such as leaf shape or plant structure, while the second uses spectral reflectance characteristics [7].

In addition, many researchers have investigated other approaches for automating the weed control process. In [8], a photo-sensor plant detection system was developed that detects and sprays only green plants. In [9], shape feature analyses were performed on binary images to differentiate between monocots and dicots. Colour, shape, and texture analyses were investigated in [10] for the classification of weeds and wheat crop. In [11], an image processing method for weed classification was proposed based on active shape models, which identified young weed seedlings with an accuracy ranging from 65% to above 90%. In [2], narrow and broad leaves were classified by measuring the Weed Coverage Rate (WCR) in a system that used a personal digital assistant (PDA) as the processing device. An algorithm was developed in [12] to categorize images into narrow and broad classes based on histogram maxima using a thresholding technique for selective herbicide application, achieving an accuracy of up to 95%. In [13], above 80% accuracy was obtained by combining statistical grey-level co-occurrence matrix (GLCM), Fast Fourier Transform (FFT), and scale-invariant feature transform (SIFT) features in a real-time weed control system for an oil palm plantation.

The objective of our paper is to present a new model for classifying crops and weeds in digital images using the support vector machine and the Bayesian classifier, and to evaluate their performance in an automated weed control system. The two classification models are compared to find which performs better. Both colour and shape features were included in this study. SVM was chosen because of its impressive generalization performance, the absence of local minima, and the sparse representation of its solution [14]. The advantages of the Bayesian classifier, on the other hand, are its ease of implementation and computational efficiency. Generation of weed maps that can be used for precision spraying is the primary objective of the approach presented here.

## MATERIALS AND METHODS

Image Acquisition: The images used in this study were taken from a chilli field. In addition, five weed species were included that are commonly found in chilli fields in Bangladesh. TABLE I lists both the English and the scientific names of chilli and the selected weed species.

## TABLE I

## Selected species

| Class label | English name | Scientific name |
|---|---|---|
| 1 | Chilli | Capsicum frutescens |
| 2 | Pigweed | Amaranthus viridis |
| 3 | Marsh herb | Enhydra fluctuans |
| 4 | Lamb's quarters | Chenopodium album |
| 5 | Cogongrass | Imperata cylindrica |
| 6 | Burcucumber | Sicyos angulatus |

The images were taken with a digital camera equipped with a 4.65 to 18.6 mm lens. The camera was pointed vertically towards the ground while taking the images. To ensure a fixed camera height above the ground, the camera was mounted on a tripod, with the lens 40 cm above ground level. With these settings, an image covers a 30 cm by 30 cm ground area. No flash was used, and the image scenes were protected against direct sunlight. The image resolution of the camera was set to 1200 × 768 pixels, and all images were colour images. Fig. 1 shows sample images of a chilli plant and the five weed species.

## Fig. 1: Sample images of different plants; (a) chilli (b) pigweed (c) marsh herb (d) lamb's quarters (e) cogongrass (f) burcucumber.

Pre-processing: Image segmentation was performed on these images to separate the plants from the soil. A binarization technique based on global thresholding was used for this purpose. First, all the plant images were converted to grey-scale images through a special contrast operation. The fact that plants look greener than soil was used to guide the segmentation. Let 'R', 'G' and 'B' denote the red, green and blue colour components of an RGB image, respectively. For each pixel in an image, firstly an indicator value 'I' was calculated for green vegetation using the colour components:

I = 2G − R − B (1)

This operation has the effect of enhancing the green vegetation greatly in contrast to the background, which has been shown by [15] and [16]. Then for each pixel, the indicator value was mapped to the grey-scale intensity 'g' ranging from '0' to '255' using linear mapping:

g = 255 × (I − Imin) / (Imax − Imin) (2)

Here, 'Imax' and 'Imin' are the maximum and minimum indicator value within an image, respectively. For further enhancement of the grey-scale image to ensure proper binarization, each image was sharpened using the composite Laplacian mask shown in Fig. 2.

−1 −1 −1
−1  9 −1
−1 −1 −1

## Fig. 2: Composite Laplacian mask used for image sharpening.
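As an illustration, the indicator computation of Eq. (1) and the linear grey-scale mapping of Eq. (2) can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' code; the sharpening and filtering described next would be applied to its output.

```python
import numpy as np

def to_grey(rgb):
    """Map an RGB image to grey-scale via the excess-green indicator.

    `rgb` is assumed to be an (H, W, 3) uint8 array.
    """
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    i = 2.0 * g - r - b                             # indicator value I, Eq. (1)
    i_min, i_max = i.min(), i.max()
    grey = 255.0 * (i - i_min) / (i_max - i_min)    # linear mapping, Eq. (2)
    return grey.astype(np.uint8)
```

Pixels that are much greener than their surroundings map towards 255, while soil-coloured pixels map towards 0, which is what makes the subsequent global thresholding possible.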

Although this sharpening technique is very effective at enhancing the fine details of an image, it is also sensitive to noise, so random noise may appear in the image. To reduce noise in the grey-scale images, a 3 × 3 median filter was applied. Median filtering is a non-linear smoothing method that reduces the blurring of edges and significantly eliminates impulse noise [17]. The median filter replaces the value of each pixel with the median of the grey levels in the neighbourhood of that pixel.

For each grey-scale image, a binarization threshold was selected using Otsu's method [18], a nonparametric and unsupervised automatic threshold selection technique that maximizes the between-class variance. Let 'T' denote this threshold value. Pixels with a grey value 'g' greater than 'T' were considered plant pixels, while pixels with a grey value smaller than or equal to 'T' were considered soil pixels. From each image, a binary version was thus obtained, where pixels with the value '0' represent soil and pixels with the value '1' represent plant.
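Otsu's method is straightforward to sketch from the intensity histogram. The following illustrative implementation (not the authors' code) scans all candidate thresholds and keeps the one maximising the between-class variance:

```python
import numpy as np

def otsu_threshold(grey):
    """Pick a binarization threshold T by Otsu's method.

    `grey` is an array of uint8 intensities; the threshold maximising
    the between-class variance is returned.
    """
    hist = np.bincount(grey.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                      # normalised histogram
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()      # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(grey):
    """1 = plant (above threshold), 0 = soil."""
    return (grey > otsu_threshold(grey)).astype(np.uint8)
```

A production implementation would vectorise the loop, but the exhaustive scan over 256 levels is already fast enough for single images.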

Next, to remove noise from the binary images, morphological opening was applied first. In morphological opening, a dilation operation is carried out after an erosion operation has been performed on the image. It has the effect of smoothing the contours of objects by breaking narrow isthmuses and eliminating thin protrusions [19]. Then, morphological closing was applied. In morphological closing, an erosion operation is performed after a dilation operation has been applied to the image. It has the effect of eliminating small holes and filling gaps inside the contours of objects [19]. Fig. 3 shows the result of applying these pre-processing steps to a sample image of pigweed.

## Fig. 3: Images of a pigweed; (a) original RGB image (b) grey-scale image (c) grey-scale image after sharpening and noise removal (d) segmented binary image after applying morphological opening and closing.
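The opening and closing operations can be sketched with plain NumPy. A 3 × 3 structuring element is assumed here, and the helper names are our own:

```python
import numpy as np

def dilate(img):
    """3x3 binary dilation: a pixel becomes 1 if any neighbour is 1."""
    p = np.pad(img, 1)
    out = np.zeros_like(img)
    h, w = img.shape
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return out

def erode(img):
    """Binary erosion by duality: erode = complement of dilated complement."""
    return 1 - dilate(1 - img)

def opening(img):
    """Erosion followed by dilation: removes thin protrusions and specks."""
    return dilate(erode(img))

def closing(img):
    """Dilation followed by erosion: fills small holes and gaps."""
    return erode(dilate(img))
```

Opening deletes isolated noise pixels that erosion wipes out entirely, while closing fills pinholes inside the plant silhouette; applying both yields the cleaned mask of Fig. 3(d).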

Feature Extraction: A total of 22 features were extracted from each image. These features can be divided into four categories: colour features, size dependent object descriptors, size independent shape features and moment invariants.

Colour Features: Let 'R', 'G' and 'B' denote the red, green and blue colour components of an RGB image, respectively. Every colour component was divided by the sum of all the three colour components. It has the effect of making the colour features consistent with different lighting conditions.

r = R / (R + G + B) (3)

g = G / (R + G + B) (4)

b = B / (R + G + B) (5)

Here, 'r', 'g' and 'b' are the processed colour components, which are independent of lighting conditions. While calculating the colour features, only plant pixels were considered, so the colour features are based solely on the plant colour and not on the soil (background) colour. The colour features used were: mean value of 'r', mean value of 'g', mean value of 'b', standard deviation of 'r', standard deviation of 'g', and standard deviation of 'b'.
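A minimal sketch of these six colour features, assuming an (H, W, 3) RGB array and a binary plant mask (the function name is ours):

```python
import numpy as np

def colour_features(rgb, mask):
    """Six colour features over plant pixels only.

    `rgb` is (H, W, 3) uint8; `mask` is the binary plant mask.
    Returns [mean r, mean g, mean b, std r, std g, std b].
    """
    plant = rgb[mask.astype(bool)].astype(float)   # (N, 3) plant pixels only
    total = plant.sum(axis=1, keepdims=True)
    total[total == 0] = 1.0                        # guard against black pixels
    chrom = plant / total                          # r, g, b of Eqs. (3)-(5)
    return np.concatenate([chrom.mean(axis=0), chrom.std(axis=0)])
```

Because each pixel is divided by its own brightness, a shaded leaf and a sunlit leaf of the same species produce similar chromaticity statistics.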

Size Dependent Object Descriptors: The size dependent descriptors were calculated on the segmented binary images. These features are dependent on plant shape and size. Selected size dependent object descriptors were: area, perimeter, convex area, and convex perimeter.

Area is defined as the number of pixels with the value '1' in the binary image. Perimeter is defined as the number of pixels with the value '1' for which at least one of the eight neighbouring pixels has the value '0'; in other words, the perimeter is the number of border pixels. Convex area is the area of the smallest convex hull that covers all the plant pixels in an image. Convex perimeter is the perimeter of that smallest convex hull.

Size Independent Shape Features: Size independent shape features are useful descriptors as they are dimensionless and independent of plant size, image rotation, and plant location within most images [9]. Four size independent shape features were selected for this study: form factor, elongatedness, convexity and solidity. For a circle, the value of form factor is '1', while for all other shapes it is less than '1'. Similarly, long narrow objects have a higher elongatedness value than short wide objects. For an object that is fairly convex, the value of convexity will be close to '1'; this value decreases as the shape of an object becomes more straggly. On the other hand, a low solidity value towards '0' indicates objects that have rough edges, and a high solidity value towards '1' indicates those that have smooth edges.

These shape features can be calculated using some size dependent object descriptors:

Form factor = (4π × Area) / Perimeter² (6)

Elongatedness = Area / Thickness² (7)

Convexity = Convex perimeter / Perimeter (8)

Solidity = Area / Convex area (9)

Here, thickness is twice the number of shrinking steps needed to make an object disappear within an image. The process is defined as the elimination of border pixels by one layer per shrinking step [20].
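Given the size dependent descriptors, the four shape features of Eqs. (6)–(9) reduce to a few arithmetic operations. The sketch below assumes the descriptors have already been measured:

```python
import math

def shape_features(area, perimeter, convex_area, convex_perimeter, thickness):
    """Size-independent shape features of Eqs. (6)-(9)."""
    form_factor = 4.0 * math.pi * area / perimeter ** 2   # 1 for a circle
    elongatedness = area / thickness ** 2                 # high for narrow shapes
    convexity = convex_perimeter / perimeter              # near 1 for convex shapes
    solidity = area / convex_area                         # near 1 for smooth edges
    return form_factor, elongatedness, convexity, solidity
```

For an ideal circle of radius R (area πR², perimeter 2πR, convex hull equal to the shape itself), the form factor, convexity, and solidity all evaluate to 1, matching the description above.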

Moment Invariant Features: Moment invariants refer to certain functions of moments that are invariant to geometric transformations such as translation, scaling, and rotation [21]. Only central moments are considered in our study.

Let f(x,y) denote a binary image of a plant. Then, f(x,y) is '1' for those (x,y) that correspond to plant pixels and '0' for those that correspond to soil pixels. Under a translation of co-ordinates, x′ = x + α, y′ = y + β, invariants of the (p+q)th order central moments are defined as:

μpq = Σx Σy (x − x̄)^p (y − ȳ)^q f(x, y),  p, q = 0, 1, 2, … (10)

Here, 'x̄' and 'ȳ' are the co-ordinates of the region's centre of gravity (i.e., the centroid). Normalized moments [21], which are invariant under a scale change x′ = αx and y′ = αy, can be defined as:

ηpq = μpq / (μ00)^γ (11)

where

γ = (p + q) / 2 + 1 (12)

These normalized moments are invariant to size change. The moment invariants selected for this study are listed below:

Φ1 = η20 + η02 (13)

Φ2 = (η20 − η02)² + 4η11² (14)

Φ3 = (η30 − 3η12)² + (3η21 − η03)² (15)

Φ4 = (η30 + η12)² + (η21 + η03)² (16)

Here, 'Φ1' and 'Φ2' are second-order moment invariants, and 'Φ3' and 'Φ4' are third-order moment invariants. These moment features are invariant to rotation and reflection. The moment invariants were calculated on both the object area and the perimeter. The natural logarithm was subsequently applied to make the moment invariants more linear.
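A compact sketch of Eqs. (10)–(16) for the area-based invariants, assuming a binary mask with '1' for plant pixels (the perimeter-based invariants would use the border-pixel image instead):

```python
import numpy as np

def moment_invariants(binary):
    """First four moment invariants of a binary region, Eqs. (10)-(16)."""
    ys, xs = np.nonzero(binary)
    x_bar, y_bar = xs.mean(), ys.mean()      # centroid

    def mu(p, q):                            # central moment, Eq. (10)
        return ((xs - x_bar) ** p * (ys - y_bar) ** q).sum()

    def eta(p, q):                           # normalised moment, Eqs. (11)-(12)
        gamma = (p + q) / 2.0 + 1.0
        return mu(p, q) / mu(0, 0) ** gamma

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4.0 * eta(1, 1) ** 2
    phi3 = (eta(3, 0) - 3 * eta(1, 2)) ** 2 + (3 * eta(2, 1) - eta(0, 3)) ** 2
    phi4 = (eta(3, 0) + eta(1, 2)) ** 2 + (eta(2, 1) + eta(0, 3)) ** 2
    return phi1, phi2, phi3, phi4
```

Summing over the set of foreground pixels (rather than the full grid) makes translation invariance explicit, and the normalisation by μ00^γ supplies the scale invariance.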

Classification Using Support Vector Machine: SVM [22, 23] is a machine learning approach based on modern statistical learning theory [24]; SVM learning originates from the principle of structural risk minimization [25]. The objective of SVM is to construct a hyperplane such that the separating margin between positive and negative examples is optimal [26]. This separating hyperplane works as the decision surface. Even with training examples of very high dimension, SVM is able to achieve high generalization. The kernel function enables SVM to handle combinations of more than one feature in non-linear feature spaces [27].

A classification task in SVM requires first separating the dataset into two different parts. One is used for training and the other for testing. Each instance in the training set contains a class label and the corresponding image features. Based on the training data, SVM generates a classification model which is then used to predict the class labels of the test data when only the feature values are provided. Each instance is represented by an n-dimensional feature vector,

X = (x1, x2, …, xn), where n = 22 for the full feature set

Here, 'X' depicts n measurements made on an instance with n features. There are six classes, labelled 1 to 6, as listed in TABLE I.

In the case of SVM, it is necessary to represent all the data instances as vectors of real numbers. As the feature values in the dataset can have ranges that vary in scale, the dataset is normalized before use, to avoid features with greater numeric ranges dominating features with smaller numeric ranges. The LIBSVM 2.91 [28] library was used to implement the support vector classification. Each feature value of the dataset was scaled to the range [0, 1]. The RBF (Radial-Basis Function) kernel was used for both SVM training and testing; it maps samples nonlinearly into a higher dimensional space and is therefore able to handle cases where a nonlinear relationship exists between class labels and features. A commonly used radial basis function is:

K(xi, xj) = exp(−γ ||xi − xj||²), γ > 0 (17)

where

||xi − xj||² = (xi − xj)^t (xi − xj) (18)

Here, 'xi' and 'xj' are n-dimensional feature vectors. Implementation of the RBF kernel in LIBSVM 2.91 requires two parameters: 'γ' and a penalization parameter 'C' [28]. Appropriate values of 'C' and 'γ' must be specified to achieve high classification accuracy. By repeated experiments, C = 1.00 and γ = 1/n were chosen.
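For illustration, the training setup can be reproduced with scikit-learn's `SVC` (which wraps LIBSVM) instead of calling LIBSVM 2.91 directly. The feature matrix below is synthetic; only the scaling and the parameter choices mirror those stated above:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for the extracted feature matrix: two well-separated
# "classes" of 20 samples each, 22 features per sample.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 22)), rng.normal(3, 1, (20, 22))])
y = np.array([1] * 20 + [2] * 20)

# Scale every feature to [0, 1], as described in the text.
scaler = MinMaxScaler(feature_range=(0, 1))
Xs = scaler.fit_transform(X)

# RBF kernel with C = 1 and gamma = 1/n, matching the stated parameters.
clf = SVC(kernel="rbf", C=1.0, gamma=1.0 / X.shape[1])
clf.fit(Xs, y)
```

In practice the same scaler fitted on the training data must be reused to transform the test data, otherwise the [0, 1] ranges of the two sets will not correspond.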

Classification using Bayesian classifier: The Bayesian classifier is a fundamental and computationally efficient statistical method. It can be represented in terms of a set of discriminant functions gi(x), i = 1, …, c. The classifier assigns a d-component column vector 'x' to class 'wi' if

gi(x) > gj(x) for all j ≠ i

Minimum-error-rate classification can be achieved by taking:

gi(x) = ln(P(x|wi)) + ln(P(wi)) (19)

Here, 'P(x|wi)' is the class-conditional probability density function for 'x' given class 'wi', and 'P(wi)' is the prior probability that nature is in class 'wi'. If the densities 'P(x|wi)' are normal, then 'gi(x)' can be calculated as:

gi(x) = −(1/2)(x − µi)^t ∑i^−1 (x − µi) − (d/2) ln 2π − (1/2) ln|∑i| + ln(P(wi)) (20)

Here, 'µi' is the d-component mean vector and '∑i' is the d-by-d covariance matrix of class 'wi'; '|∑i|' and '∑i^−1' are its determinant and inverse, respectively.
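Eq. (20) translates directly into code. The sketch below (helper names are ours) evaluates the Gaussian discriminant for each class and assigns the label with the largest value:

```python
import numpy as np

def discriminant(x, mu, sigma, prior):
    """g_i(x) for a Gaussian class density, Eq. (20)."""
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(sigma)
    return (-0.5 * diff @ inv @ diff
            - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * np.log(np.linalg.det(sigma))
            + np.log(prior))

def classify(x, params):
    """Assign x to the class with the largest discriminant.

    `params` is a list of (mu, sigma, prior) tuples, one per class;
    class labels start at 1, matching TABLE I.
    """
    scores = [discriminant(x, m, s, p) for m, s, p in params]
    return int(np.argmax(scores)) + 1
```

Training reduces to estimating each class's mean vector, covariance matrix, and prior from the labelled feature vectors, which is what makes the Bayesian classifier so cheap compared to SVM.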

## RESULTS AND DISCUSSION

Cross-validation is a common and effective testing procedure that helps guard against overfitting. Ten-fold cross-validation was applied in testing: the whole dataset is split into ten subsets, each having an equal number of instances, and each subset in turn is tested using the classifier trained on the remaining nine subsets. The cross-validation accuracy is the average percentage of correctly classified test data over all ten folds.
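The ten-fold procedure can be sketched as follows, where `fit_predict` is a hypothetical classifier wrapper used only for illustration (it trains on the training fold and returns predictions for the test fold):

```python
import numpy as np

def ten_fold_cv(X, y, fit_predict):
    """Ten-fold cross-validation accuracy.

    `fit_predict(Xtr, ytr, Xte)` returns predicted labels for Xte
    after training on (Xtr, ytr).
    """
    idx = np.arange(len(y))
    rng = np.random.default_rng(0)
    rng.shuffle(idx)                       # shuffle before splitting
    folds = np.array_split(idx, 10)
    accs = []
    for k in range(10):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(10) if j != k])
        pred = fit_predict(X[train], y[train], X[test])
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs))            # average over the ten folds
```

Each sample is used exactly once for testing, so the averaged accuracy reflects performance on data the classifier has not seen during training.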

The cross-validation accuracies of the support vector machine and the Bayesian classifier using all 22 features were 93.75% and 89.23% respectively over 224 samples. No crop image was misclassified as weed by the SVM classifier, but the Bayesian classifier misclassified two chilli plants as weed. The classification accuracy of the Bayesian classifier is thus somewhat lower than that of SVM in this case. The overall classification results are shown in TABLE II and TABLE III.

## TABLE II

## Classification result using all features

| English Name of Samples | Number of Samples | SVM: Misclassified | SVM: Accuracy Rate | Bayesian: Misclassified | Bayesian: Accuracy Rate |
|---|---|---|---|---|---|
| Chilli | 40 | 0 | 100% | 2 | 95% |
| Pigweed | 40 | 5 | 87.5% | 15 | 62.5% |
| Marsh herb | 31 | 2 | 93.5% | 0 | 100% |
| Lamb's quarters | 33 | 5 | 84.84% | 5 | 84.85% |
| Cogongrass | 45 | 0 | 100% | 2 | 95.56% |
| Burcucumber | 35 | 2 | 94.3% | 0 | 100% |

## TABLE III

## Success rate comparison using all features

| Method | Total Number of Samples | Total Number of Misclassifications | Average Success Rate |
|---|---|---|---|
| SVM Classifier | 224 | 14 | 93.75% |
| Bayesian Classifier | 224 | 24 | 89.23% |

Feature reduction is necessary to reduce computational complexity and to improve performance by eliminating noisy features. There may be cases where two features carry good classification information when treated separately, but combining them in a feature vector brings little gain because of high mutual correlation [29]; complexity then increases without much benefit. The main objective of feature reduction is to select features leading to large between-class distances and small within-class variance in the feature vector space [29].

To select the set of features that gives the optimal classification result, both forward-selection and backward-elimination were attempted. In forward-selection, the process starts with a set containing only one feature, and the remaining features are added one at a time. In each step, every feature that is not a current member of the set is tested to see whether it improves the classification result of the set. If no further improvement is detected, forward-selection stops; otherwise, it continues in search of a better classification rate. In backward-elimination, the process starts with a set that includes all the features; the feature with the least discriminating ability is chosen and removed from the set, and this continues until an optimal classification result is obtained. In this paper, forward-selection and backward-elimination are combined into a novel stepwise feature selection procedure to discover the optimal feature combination: first, features are added to the set one at a time as in forward-selection, and then backward-elimination is applied to the resulting set.
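The combined stepwise procedure might be sketched as follows; `score` stands for the cross-validation accuracy of a feature subset, and the greedy loops are an illustrative reading of the procedure rather than the authors' exact algorithm:

```python
def stepwise_select(features, score):
    """Forward-selection followed by backward-elimination.

    `features` is a list of feature indices; `score(subset)` returns the
    cross-validation accuracy achieved with that subset.
    """
    selected, best = [], -1.0

    # Forward pass: keep adding the single best feature while it improves.
    improved = True
    while improved:
        improved = False
        for f in features:
            if f in selected:
                continue
            s = score(selected + [f])
            if s > best:
                best, candidate = s, f
                improved = True
        if improved:
            selected.append(candidate)

    # Backward pass: drop any feature whose removal does not hurt accuracy.
    improved = True
    while improved and len(selected) > 1:
        improved = False
        for f in list(selected):
            trial = [g for g in selected if g != f]
            s = score(trial)
            if s >= best:
                best, selected = s, trial
                improved = True
                break
    return selected, best
```

With cross-validation accuracy as `score`, each call re-trains the classifier, so the search cost grows with both the number of features and the number of folds; the small final feature sets reported below keep this tractable.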

After feature reduction, a set of eleven features was found for the SVM classifier that provides the best classification rate. The best features were: convexity, solidity, mean value of 'r', mean value of 'b', standard deviation of 'r', standard deviation of 'b', ln(Φ1) of area, ln(Φ2) of area, ln(Φ3) of area, ln(Φ4) of area, and ln(Φ2) of perimeter.

The ten-fold cross-validation accuracy of the SVM classifier using these eleven features was 98.22%. The accuracy rate increases significantly with this combination of features; only four weed images were misclassified. The classification result using these eleven features is given in TABLE IV.

## TABLE IV

## Classification result of SVM using the set of best features

| English Name of Samples | Number of Samples | Number of Misclassified Samples | Success Rate |
|---|---|---|---|
| Chilli | 40 | 0 | 100% |
| Pigweed | 40 | 2 | 95% |
| Marsh herb | 31 | 0 | 100% |
| Lamb's quarters | 33 | 0 | 100% |
| Cogongrass | 45 | 0 | 100% |
| Burcucumber | 35 | 2 | 94.3% |
| Average Success Rate | | | 98.22% |

For the Bayesian classifier, a different set of eleven features was found to provide the best accuracy. The best features were: perimeter, convex perimeter, form factor, elongatedness, solidity, mean value of 'r', mean value of 'g', mean value of 'b', standard deviation of 'g', ln(Φ2) of area, and ln(Φ3) of perimeter.

The ten-fold cross-validation accuracy of the Bayesian classifier using these eleven features was 95.79%. The accuracy rate also increases significantly with this set of features, though it remains lower than the accuracy obtained with the SVM classifier. The classification result of the Bayesian classifier using these eleven features is given in TABLE V.

## TABLE V

## Classification result of Bayesian classifier using the set of best features

| English Name of Samples | Number of Samples | Number of Misclassified Samples | Success Rate |
|---|---|---|---|
| Chilli | 40 | 0 | 100% |
| Pigweed | 40 | 3 | 92.5% |
| Marsh herb | 31 | 2 | 93.55% |
| Lamb's quarters | 33 | 3 | 90.91% |
| Cogongrass | 45 | 1 | 97.78% |
| Burcucumber | 35 | 0 | 100% |
| Average Success Rate | | | 95.79% |

Fig. 4 plots accuracy against the number of features for the Bayesian classifier and the SVM classifier. For both classifiers, the set of best features of each size was used to determine the corresponding accuracy rate.

## Fig. 4: Accuracy Rate vs. Number of Features for the SVM and Bayesian classifiers.

It can be seen that the Bayesian classifier provides a higher accuracy rate than SVM when three or fewer features are used, but for every feature count from 4 to 22 the SVM accuracy rate is higher than that of the Bayesian classifier. From this analysis, it can be concluded that the SVM classifier performs better than the Bayesian classifier for crop and weed classification.

## CONCLUSION

In this paper, we have proposed two classification models, based on the support vector machine (SVM) and the Bayesian classifier respectively, and evaluated their ability to classify crops and weeds in digital images. When applied in an automated weed control system, both approaches have the potential to be a cost-effective alternative that reduces the excessive use of herbicides in agricultural systems. Analysis of the results reveals that SVM achieves above 98% accuracy over a set of 224 test images using ten-fold cross-validation, whereas the Bayesian classifier achieves above 95% accuracy over the same set of images. To increase the classification rate further, our future work will involve making the image pre-processing steps more robust to the noise that will inevitably be introduced by the operating environment.