Different Approaches For Efficient Mutation Testing Computer Science Essay

Published: Last Edited:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

In the recent years the use of computer aided software has become a common task. The people are becoming more dependent on the computer based software and computer application. In order to full fill the needs of people the software development companies are designing the various small sizes to big size and complex soft ware's. Since the use and demand of software is increasing day by day, this is the responsibility of the software development team to deliver efficient and right and testified software to their customer. Thus to achieve this testing of software is an important task. The testing of software is defined as the process of checking software that whether it is performing the function for which it has been designed, and as well as is it meeting the requirements of the customers, without testing the software cannot be delivered to the system. In short testing is a process of delivering error free software to the customer.

Now there are many software testing researchers who have proposed a number of different types of testing like white box testing, black box testing, and model testing. One of the proposed and most researched testing is Mutation Testing [2].

There are some terms in the testing that need to be defined before discussing any other thing about testing or mutation testing. These terms are test suite and coverage.

During the process of testing a number of test cases are designed to test the software, the collection of these test cases is known as Test Suite.

The part of whole code of the software that a test suite can test is known as coverage.

When the code become large and the results of test are not good then selecting the simple testing technique becomes invalid. In such case the tester cannot decide that whether there is any problem in code or the test suite selected is wrong. In such situation mutation testing can be used because mutation testing has been designed to overcome this problem only.

Mutation testing is a testing in which the original code is changed and the changed code is testified to check whether there is a problem in code or in test suites. This changed code is known as MUTANT. When mutant is tested under the selected test suite if the test suites are right then they will catch the error in the Mutant, it means that there is an error in the code and the test suites selected are fine but if the test suites cannot find the error in the mutant then it means that the selected test suites has some wrong test cases and those test cases need to be changed. When one of the test cases identifies the error in the mutant then it is said that mutant has been killed. This is the whole procedure of the Mutation Testing.

We can perform the mutation testing using various methods, the most common and easy way of performing mutation testing is the use of Directed Graph. In directed graph the whole program code is converted into the directed graph having a number of nodes and arcs but the Directed graph has some its own drawbacks during its use in the mutation testing.

Thus, in this paper we have discussed about the various proposed such techniques for performing the Mutation Testing and these proposed techniques are modified and extended form of Direct Graph.

Rest of the paper is organized as: description of Directed Graph Model, Regular Grammar Model and Semantic Mutation Testing in the section II, III and IV respectively.

Finally the future work and a conclusion of the discussion are provided in last section


A directed graph (DG) is a tuple D = (V, A, S, F) where V is a finite set of nodes, A is a finite set directed arcs which are ordered pairs of elements of V , S is a distinguished set of start nodes and F is a distinguished set of finish nodes. S and F belong to V [3].

This is the definition of directed graph when it is used with the mutation testing. When a program code is converted into a directed graph the one statement of code is represented by the one node of the graph and the paths following that statement is shown by the arcs of the graph. While constructing a directed graph some nodes in the graph should be distinguished as start and finish node to enable the usefulness of DG and to show the start and end of the code. Therefore S and F are the Start and Finish node in the Directed Graph. There are some definitions in context to directed graph in mutation testing.

In a directed graph a node v belonging to V is said to be useful if and only if it occurs in a path from a start to finish node [3].

For a given directed graph, a node v belonging to V, related to v strictly preceding nodes are the nodes such that v occurs in all the paths from these nodes to finish nodes [3].

For a given directed graph, a node v belonging to V, related to v strictly succeeding nodes are the nodes which occur only in the paths from v to finish nodes [3].

For implantation- oriented, white box testing nodes of the DG to be covered generally represent the statements of SUT (Software under Test) and arcs represent the sequence of those statements [5].

For specification oriented, black box testing, nodes of DG represent the behavioral events of SUT and arcs represent the sequences of those events [6]. This is how the directed graph is constructed for a given code.

But in mutation testing negative testing is also performed it means it is tested that the software is not doing anything it is not supposed to do. For this [3] have proposed specific manipulation operators for the graphs that models the SUT. These operators are discussed as below:

The directed graph consists of nodes and arcs so it is obvious that the manipulation can be done either in nodes or the arcs. Therefore there are two types of the elementary manipulation operator that can be applied on arcs and nodes. These are insertion and omission. These are defined as below:

Node Insertion -

This operator adds a new node v to the DG together with possibly nonzero numbers of arcs, and connecting this node to the remaining nodes. Both the sets A and V are updated after the node insertion.

Node Omission-

This operator deletes an existing node v from the DG together with the arcs, ingoing and outgoing from the deleted nodes. After node omission both the sets A and V are updated.

Arc Insertion-

This operator inserts or adds a new arc to the DG, after this a new set A of arcs is formed.

Arc Omission-

This operator deletes an existing arc 'a' from the DG and the set of arcs A is updated. It may result in some nodes with no ingoing and outgoing arcs.

This is how manipulation is performed in the directed graph.

In every testing task there is a main focus on the coverage. So in context of directed graph coverage is defined as three practical coverage criteria and these criteria are defined as below:

Node Coverage:

Given a graph DG and a set of strings B, is said to cover a node v, if v occurs at least in one of the strings in B. If the set of string B covers all nodes in V, then it is said to achieve node coverage.

Edge Coverage:

Given a graph DG, a set of strings B is said to cover an edge (u, v) of A, if the sequence uv occurs at least in one of the string in B. If the set of string B covers all edges in A, then it is said to achieve edge coverage.

Path Coverage:

Given a graph Dg and a set of string B, is said to cover a path of length k if sequence u1u2---- uk occurs at least in one of the string sin B.

While performing manipulation in the model here it is directed graph one should keep in mind that manipulation operators should not invalidate the model and should take necessary required steps to convert invalid model into valid model. For directed graph model while constructing directed graph model of a SUT, start and finish nodes should be determined with respect to the system semantics. For doing this [3] have proposed two main approaches:

Fix the set of start nodes and finish nodes and do not allow any sequence of manipulation operations which violate the usefulness of any node in DG.

Perform a sequence of manipulation operations. If the resulting DG is invalid, then select new start and finish nodes to satisfy usefulness of all nodes and transform it into valid one.

This is all about the Direct Graph model for performing mutation testing.


In this section we will discuss that how regular grammar can be used for performing mutation testing and how a directed graph can be converted into a Regular Grammar. So in this section we are discussing the regular Grammar Model of performing Mutation Testing.

The Directed Graphs can only be used to model regular systems. The test generation algorithms based on DGs can be viewed as still being in their starting point means the existing few are relatively slow and they are memory consuming. To overcome such issues of Directed Graphs, regular grammar can be used to model the application, so here is an introduction of a new approach for modeling the mutation testing applications.

The basic definition of the grammar is a tuple (N, E, P, S) where, N is a finite set of non terminal symbols, E is a finite set of terminal symbols, P is a finite set of production rules and the S belongs to N and is a distinguished nonterminal start symbols.

To construct the regular grammar model for mutation testing the Directed graph is converted to the Regular grammar. Since the FSA, DGs and RGs equivalently describe regular grammar, it is possible to construct regular grammars from the directed graphs. In the constructed regular grammar from a directed graph the production rule of grammar represents the arc of the graph and terminal and non terminal symbols represent the nodes of the graph. According to [3] the constructed grammar has the following properties that make it more efficient model than directed graph

G is a right RG (thus unambiguous).

nt(x) = Rx is a bijection from E to N\{S}.

nonterminal S appears only on the left side of the production rule

Each node in V corresponds directly a terminal in E

Each arc in V corresponds to a production rule in P (including the pseudo arcs used to mark start and finish nodes).

All terminal symbols in the grammar are useful if and only if all nodes in the DG are useful.

Similar to the directed graph, the regular grammar also have some special definitions and the manipulation operators. These definitions are discussed as below:

For a given grammar (N, E, P, S), a terminal symbol r and a non terminal symbol R. A terminal symbol r is said to be useful if it occurs in at least one string in L(G) and the non terminal symbol R is said to be useful if a rule of the form (N union E) * R(N union E) * …. Is used in a derivation S * … of at least one string in L (G).

For a given grammar (N, E, P, S) and a terminal symbol r, then the strictly preceding terminal symbols for r are all those terminal symbols such that r occurs in all the strings in L (G) in which these terminal symbols also occur and r occurs after these symbols.

For a given grammar (N, E, P, S) and a terminal symbol r, strictly succeeding terminal symbols for terminal symbol r are those terminal symbols which only occur in the either one or all the strings where r also occurs and they occur after r.

Above we have discussed the special definitions of the regular Grammar, now further we will discuss the manipulation operators of the Regular Grammar Model.

Arc Insertion-

The arc insertion operator adds a new arc in the directed graph and in the grammar arc is represented by the production rule. Therefore when an arc is inserted in the DG then RG should be updated with a new production rule to show the insertion of this arc. For the algorithm of this operation you can refer to the [3].

Arc Omission-

The arc omission operator deletes an existing arc from the directed graph and in the regular grammar an arc is represented by the production rule. Therefore when a arc is deleted from DG then a production rule must be removed from the RG. This operation may violate the usefulness of some terminal symbols so some extra care is needed while performing this operation. For the algorithm of this operation you can refer to the [3].

Node Insertion-

The node insertion operator adds a new node in the Directed Graph and when a new node is added in the DG then new arc ingoing and outgoing form that node are also inserted in the DG. In Regular Grammar a node of DG is represented by the terminal and non terminal symbols, thus when a node is inserted in DG a new non terminal symbol and a new non terminal symbol is added to the RG. And due to the addition of terminal symbols and non terminal symbols in the RG the production rules are also added in the RG to show the arcs ingoing and outgoing from the nodes.

Node Omission-

The operation of arc omission deletes a nodes from the graph due to which the arc ingoing to and outgoing from this nodes are also deleted. Therefore the node omission operation is based on arc deletion of arcs from the graph.

Thus when deleting a arc from DG, a production rule is deleted from the RG and proper steps should be taken to first the arc is deleted to perform removal of the production rules, and then the isolated terminal symbols and non terminal symbols are removes from the grammar.

To study the algorithms of the manipulation operation on arcs, you can refer to the [3]. Now we will discuss the coverage criteria included in the Regular Grammar. The coverage criteria are discussed as below:

Terminal Symbol Coverage:

For a given grammar G and a set of sting A is said to cover a terminal symbol e, if e is present in at least in one of the string in A. if the string A contains all the terminal symbols of E then it is known as Terminal Symbol Coverage.

Production Rule Coverage:

For a given grammar and a set of strings A, then A is said to cover a production rule p, if p is used at least in one of the derivation of A. if the set of string A contains all the production rules belonging to P ( set of production rules) then it is known as Production Rule Coverage.

This is all about the Regular Grammar model for Mutation Testing.


In this section we are going to introduce a new approach behind the mutation testing. The simple mutation testing about which we have given a small description works on the syntax of the program or code of the software, means it focus on the code not the semantic of the code it only check the syntax not the semantic of the language used. In such condition the number of mutants gets increased in number which becomes difficult to execute practically. There is a need of a change in the standard mutation testing [4]. For this they have proposed a new mutation testing that is Semantic Mutation Testing. This testing focus on the semantic of the language used means it performs the operations of mutation on the semantic of the language used. In this section we are going to describe in brief why we need semantic mutation testing over standard mutation testing and what the basic concept behind the Semantic Mutation Testing is.

Need of Semantic Mutation Testing

When software is developed then it passes through various phases of software development Life cycle model, in each step of life cycle model the team members give the different description to the software and uses different languages in this situation the language vary from step to step. When the transformation of prototype model of software to target model of software take place the description again changes and there come different descriptions of software from this different misunderstanding regarding the actual description of software arises, in such case our traditional mutation testing cannot work efficiently in detecting the error in code and misunderstanding regarding the description of the software. To overcome this problem of Mutation Testing, a modified version of mutation testing "Semantic Mutation Testing" was proposed by [4]. This works on the semantic of the language used for the program. The semantic is modified to generate the mutant means mutant operations are applied on semantic rather than description. The working and detailed concept of Semantic Mutation Testing is described in the next part of this section.

Concept Behind Semantic Mutation Testing

In Standard Mutation Testing the mutation operators are applied at syntactic level in does not focus on the semantic misunderstanding generated by semantic mistakes as explained above. For this semantic mutation testing was introduced. In this testing mutation operations are applied on semantic and it explores the variance in the semantics.

The code which is under test is represented by a description. Take an example, a description D is written in language with semantic S, the behavior of the description is defined by an ordered pair (D, S). The standard mutation testing will apply the mutation operations on the D that is the description under test. Thus the mutant for (D, S) can be represented as (D, S) (D', S) where D' is the change in the D.

On the other hand the Semantic Mutation Testing will apply the mutant operator in the semantic of the language that is S. thus the mutant obtained for the (D, S) will be represented as (D, S) (D, S') where S' is the semantic obtained after applying mutation operations.

Now how the Semantic Mutation Testing kills a mutant. For a given D of (D, S) after applying mutation operations on S there are two interpretation one its meaning under S and second its meaning under S' these will be called Ds, Ds' respectively. For a test case say t, Ds (t) will denote the behavior produced when applying t to D under semantics S and Ds' (t) will denote the behavior produced when applying t to D under semantics S'. Then a test case t kills the mutant (D, S') if and only if Ds (t) = Ds' (t).

Further, this mutant (D, S') is an equivalent mutant if for all t we have that Ds (t) = Ds' (t).

From here it is clear that, the notions of behavior, equality of behavior, and thus of killing a mutant will depend upon the language being considered by the testers. In addition, the property of a test case killing a semantic mutant depends both on the semantic mutation made and the description under test [4].

For detailed study you can refer to the [4].


In this papre we have discussed the various techniques for performing the mutation testing that focus on both the test suites used and codes of the application to satisfy the need of the customer and software requesters. All the discussed methods have their own field of use. In the future work we will make the use of these techniques and will make some results out of that. The use of these techniques in real scenario is a big research oriented work so in future we will focus on the implementation of these techniques.