# The Minimum Spanning K Core Problem Computer Science Essay

Published:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

In our daily life we can observe many physical systems as networks for example electrical and power networks, telecommunication networks, transportation networks, computer networks and many more. A network is described by a set of vertices and edges that connect pair of vertices.

In all these network structures we wish to move some entity like electricity, material, final product, information, vehicle, message etc. from one point to another. Users of network always desire highly effective (less cost of transmission) and well connected (less steps of transmission) network. This proposal is about a network problem (Minimum Spanning k-core Problem) that fulfills the above mentioned requirements of users of the networks.

This research is motivated by problems of hub network design where a set of designated hubs need to be connected in reliable manner. The hubs can represent airports, warehouses, distribution centers or nodes in a wireless network for example telecommunication networks. In traditional network design models of the type introduced by Minoux [19] the design element is edge weights but in our model design element is the edge set itself. Focus of our model will be on designing underlying network to meet structural specifications. To achieve structural specifications we are using properties of graph theoretic structure called as k-cores.

Our model primarily targets on two design goals, namely reliability i.e. high vertex connectivity and reachability i.e. navigation between vertices in fewer steps. In other words, reliability takes care of robustness of network in terms of connectivity i.e. network doesn't disrupt upon deletion of some hubs. Likewise, reachability ensures low pairwise distances between hubs (diameter) to facilitate faster transportation or communication. Considering robustness and reachability goals of our model, in our model hubs (vertices) are either connects with every possible edge resulting in a complete graph (refer Figure 1) or they are connected by using spanning tree.

## Figure 2: Spanning tree of network presented in Figure 1

A complete graph has the maximum possible vertex connectivity and minimum possible diameter. On the other hand spanning tree has the minimum possible vertex with no explicit control on the diameter. However, when the cost of creating a direct edge between vertices is taken into account, complete graph is the most expensive while a minimum spanning tree is the least expensive among all connected designs for the network. Our model aims to balance the network design objectives of robustness, reachability and cost effectiveness by using the notion of k-cores.

## 1.2. RESEARCH OBJECTIVES

The overall objective of this research is to develop a metaheuristics algorithm for solving the Minimum Spanning k-core problem on large instances of networks.

The particular tasks planned are as follows:

To perform through review of literature available on minimum spanning k-core problem.

To perform through literature review of metaheuristics approaches to combinatorial optimization problems.

To develop Greedy Randomized Adaptive Search Procedure (GRASP) algorithm for minimum spanning k-core problem.

To implement, and conduct a detailed computational study of GRASP for minimum spanning k-core problem.

## 1.3. PROPOSAL OVERVIEW BY CHAPTER

In Chapter 2, graph theory concepts necessary for describing the minimum spanning k-core problem are presented. Chapter 2 also includes motivation, problem statement, problem's complexity and applications. Chapter 3 presents a brief literature review of popular metaheuristics in combinatorial optimization. Detailed description of algorithmic approach is given in Chapter 4. Finally, Chapter 5 presents research activities carried out and proposed tasks.

## CHAPTER TWO

## INTRODUCTION TO PROBLEM

This chapter opens with graph theory concepts, motivation, and problem statement and evolves with problem reduction to a generalized matching problem that focuses on complexity of research problem. Finally, chapter concludes with describing applications of problem.

## 2.1. GRAPH THEORY

This section presents the definitions from graph theory that we will use in this proposal.

An undirected graph (refer Figure 1) G = (V, E) is defined by two sets, a set V of vertices and set E of unordered pairs [i.e. Edge (i,j) Edge (j,i)] of vertices called as edges and G is called a complete graph (refer Figure 1) if for all pairs of vertices i, j V there exists an edge (i, j) E. Real values such as costs and capacities can be assigned to the edges and vertices of a graph. In our case costs are assigned to the edges of network. An edge (i, j) E is incident to the vertices i, j V, or i, j V are two endpoints of edge (i, j) E. If (i, j) E then vertices i and j are called as adjacent to each other or neighbors. The input graph for minimum spanning k-core problem must be an undirected and complete graph with edge weights on edges.

The degree of a vertex (refer Figure 3) is equal to the number of adjacent vertices or number of edges incident at it. The diameter of graph G diam (G) is the maximum shortest distance in terms of number of edges between vertices i and j, where i, j V.

## Figure 3: Degree and diameter of graph

Minimum spanning k-core is a spanning subgraph G' = (V, E') of original graph G = (V, E) that consists two sets, a set V that spans (includes) all nodes of original graph and other set E' which is a subset of set E such that every vertex v V has at least k incident edges in E'.

## 2.2. MOTIVATION

Robust network design is an important problem in designing complex transportation and distribution systems capable of handling emergencies. Some important properties to consider while designing such networks that have a high impact on its efficiency and reliability are:

Reachability/ transportation between vertices in fewer steps

Cost of transportation between vertices

Reliability of network structure i.e. removing few nodes from the network should not disrupt the network by significantly increasing the diameter.

The k-core based network design model aims to design a robust network with specified structural properties such as vertex connectivity (a measure of reliability) and diameter (a measure of reachability).

Hence, Minimum spanning k-core based network model will ensure that above all mentioned properties will be satisfied by sub-graph G' in which every node v Ð„ V has at least k incident edges, while the total cost of all the edges is minimized.

## 2.3. PROBLEM STATEMENT

The problem under study is an extension of k-core based network model introduced by Seidman [1] as a measure of "network cohesion" in social network analysis [9]. This extension called as Minimum Spanning k-core Problem is introduced by Balasundaram in [2]. Focus of this problem is on designing a network to meet structural specifications that results in a network connected in reliable manner.

## 2.3.1. DEFINITION OF "K-CORE"

K-coresÂ inÂ graph theoryÂ were introduced by Seidman in 1983 [1] as a method of simplifying graph interconnections of the elements to aid in analysis.

Given a graphÂ GÂ = (V, E)Â with vertices setÂ VÂ and edges setÂ E, the k-core of the graph G is a sub-graph G' in which every vertex has at least k neighbors, that is the minimum degree of G' is at least k.

## 2.3.2. STATEMENT

Given an undirected and complete graph GÂ = (V, E)Â our goal is to design a network to be k-core for an appropriate k. Appropriate value selection for k and diameters of k-cores are outlined in the following proposition by Seidman [1]

Proposition 2 Let GÂ = (V, E)Â be a k-core on n vertices. If k > (n-2)/2 then diam (G) â‰¤ 2.

The above proposition will be use to chose an appropriate k to identify a sub-graph that is a k-core on n vertices with prescribed connectivity and diameter two. For example, if we desire a graph on 6 vertices to have diam (G) â‰¤ 2 we will set value of k > (6-2)/2 i.e. k = 3 and we will design graph to be 3-core. Hence, given n nodes an appropriate k is chosen, and given the cost of including an edge cij between any pair of nodes i, j Ð„ V, the k-core network design problem is to identify the subset of edges (E*) to be added so that resulting n- vertex undirected sub-graph G' is k-core and the total cost of edges added is minimum.

The following is a binary integer programming formulation for Minimum Spanning k-core problem.

## Data:-

Set of n vertices: V = {1, 2â€¦ n}

Cost of edge: Cij for i < j is the cost of creating edge (i, j)

## Decision Variables:-

Binary Variables: Xij for each i < j to be one if and only if edge (i, j) is selected to be in the sub-graph G'.

## Objective Function:-

Minimize total cost of edges added in k-core sub-graph G'

Minimize

## Constraints:-

Desirable vertex connectivity: This constraint will ensure every node i will have at least k incident edges

+ k i Ð„ V

Binary Constraint:

Xij Ð„ {0, 1}; i, j Ð„ V; i < j

This formulation ensures that every node has at least k incident edges that are created, while the cost of creating all edges is minimized.

## 2.3. RELATIONSHIP BETWEEN MINIMUM SPANNING K-CORE AND MAXIMUM WEIGHTED B-MATCHING PROBLEM

It was shown by Balasundaram in [2] that the minimum spanning k-core problem is polynomial time solvable by reduction to a special case of maximum weighted b-matching problem. This reduction follows from an observation that by solving maximum weighted b-matching on G for a particular vector b identifies those edges that must be excluded in the k-core network design problem. In other words, the edges which are excluded in maximum weighted b-matching problem will be the edges of minimum spanning k-core problem.

The reduction of minimum spanning k-core problem to generalized matching problem will be in effect when every node will have at most (n-1-k) incident edges. In other words, in integer programming formulation of the maximum weighted b-matching problem, b (v) = n-1-k. Maximum weighted b-matching problem's statement and formulation is discussed in following section.

## 2.3.1. MAXIMUM WEIGHTED b-MATCHING PROBLEM

Given a graph G = {V, E} (not necessarily complete) and a vector b, a b-matching is a subset of edges M such that every vertex v Ð„ V is incident with at most b (v) edges in M. The maximum weighted b-matching problem defined on G with edge weights Ce for all e Ð„ E is to find a b-matching M such that the sum of the edge weights of edges in M is maximum.

## Statement:-

Given an undirected and complete graph GÂ = (V, E)Â with edge weights Ce e Ð„ E, the goal of maximum weighted b-matching problem is to find maximum weight integral vector x for an appropriate b that satisfies âˆ‘ Xe â‰¤ b (v), v Ð„ V where, b (v) is node capacities at node v. For e.g. b (v) can be number of edges incident at node v.

## Formulation:-

The following is a binary integer programming formulation for the maximum weighted b-matching problem.

## Data:-

Set of n vertices: V = {1, 2â€¦ n}

Cost of edge: Ce e Ð„ E

## Decision Variables:-

Binary Variables: Xe e Ð„ E to be one if and only if edge (i, j) is selected to be in the sub-graph G'.

## Objective Function:-

Maximize total cost of edges added in sub-graph G':-

Maximize

## Constraints:-

Desirable vertex connectivity: This constraint will ensure every node v will have at most b (v) incident edges.

b (v), i Ð„ V

Where, E (v) is set of edges incident at node v

Binary Constraints:

Xe Ð„ {0, 1}, v Ð„ V

This formulation ensures that every node has at most b (v) incident edges that are included, while the cost of creating all edges is maximized.

## 2.3.2. METAHEURISTICS APPROACH TO POLYNOMIAL-TIME SOLVABLE PROBLEM

As we discussed earlier that minimum spanning k-core problem and maximum weighted b-matching problems are polynomial time solvable. Both of these problems can be solved by using exact algorithmic approach that guarantees an optimal solution in finite number of steps. Metaheuristics are preferably use for the combinatorial optimization problems that are in the category of NP-hard as polynomial time algorithm doesn't exist for NP-hard category problems. On the other hand there is an exception in our case as we are using metaheuristics for minimum spanning k-core which is a polynomial time solvable.

## Reasons for using metaheuristics for minimum spanning k-core problem:-

The complexity of minimum spanning k-core problem is O (n4) which is still time consuming for exact approach and algorithm might take high computational time in worst-cases. In real world applications minimum spanning k-core problem must deliver better quality solution in time efficient manner. Hence, to get better solution in reduced amount of time meta-heuristic can be a good alternative for exact algorithm.

Yet, we have discussed deterministic version of minimum spanning k-core problem for which there exist a polynomial time solvable algorithm. Conversely, until now there doesn't exists a polynomial time solvable algorithm for probabilistic version of problem and here lies the extension of current GRASP algorithm to probabilistic version of problem. In the probabilistic version of problem, each edge (i, j) has a probability of failure Pij in addition to edge weights.

Let, Yij be a random parameter such as;

Yij =

Recall from the section 3.2.2., the degree constraint of deterministic formulation. We required that the degree di of every vertex i must be di â‰¥ k,

di = + k i Ð„ V

In probabilistic version as each edge has a probability of failure in terms of random parameter Yij, the degree constraint of deterministic formulation changes to,

di = + k i Ð„ V

This change in degree constraint converts di into random variable and one cannot guarantee di > k, i Ð„ V that put robustness of network in question. To overcome risk of loss in reliability of network connections we can define a loss function which represents loss in desired structural properties of network. Once a loss function is defined, we can formulate a binary integer programming formulation for probabilistic version of problem. This formulation uses concept of CVaR (Conditional Value at Risk) to find a robust spanning k-core under uncertainty and minimize loss function instead of minimizing edge weights as we did in deterministic version. This exploration of probabilistic version of problem and extension of GRASP algorithm to probabilistic version is the future scope of research objectives.

## 2.4. APPLICATIONS

An undirected and complete graph G = (V, E) is an input to the minimum spanning k-core problem which returns spanning sub-graph G' = (V, E*), E* E. Hence before applying minimum spanning k-core problem to a real world problem it is essential to determine the initial structure of problem. For example, the problem instances must have pre-defined numbers of nodes and complete graph structure.

Recall, three properties reachability, cost minimization and reliability of minimum spanning k-core that we have discussed in section 3.1, based on these three properties our problem have numerous applications across many physical systems where appropriate network structure is present for example Transportation Networks, Telecommunication Networks, Computer Networks, Social Networks, Biological Networks, Electrical and Power Distribution Networks etc.

To get insight about applicability of minimum spanning k-core problem we will discuss an example of Transportation Network in next section.

## 2.4.1. APPLYING MINIMUM SPANNING K-CORE TO A GENERAL TRANSPORTATION NETWORK

Problem Input: An undirected and complete graph G = (V, E) with 6 vertices [ i.e. k = (6/2) = 3 ] and 15 edges with respective edge weights

Real world situation: Consider vertices as airports, warehouses, cities and edges as routes connecting to these locations. Edge weights will represent cost of transportation between vertices.

## Edge ID

## i Node

## j Node

## Edge Cost

1

1

2

2

2

1

3

4

3

1

4

6

4

1

5

8

5

1

6

10

6

2

3

12

7

2

4

14

8

2

5

16

9

2

6

18

10

3

4

20

11

3

5

22

12

3

6

24

13

4

5

26

14

4

6

28

15

5

6

30

## Figure 4: Input graph data

Total cost of input network i.e. summation of all edge weights = 240

Problem Output: A minimum spanning k-core problem returns spanning subgraph G' = (V, E*), E* E. G' network will have reachability, reliability and minimized overall cost.

## Figure 5: Minimum spanning k-core

Output is obtained by solving binary integer programming formulation on Xpress-MP

## Analysis of Minimum Spanning k-core Properties

Reachability between vertices in fewer steps: diam (G') 2 i.e. we can move from any vertex to any other vertex in less than or equal to 2 steps

Minimized Cost of Network: Original cost was 240 and cost of G' is 130

Reliability of network structure: Suppose there is breakdown of vertex number 2 due to some reason that doesn't disrupt network by significantly increasing the diameter of G'. In other words you can still travel between other vertices in less than or equal to 2 steps.

## Figure 6: Analysis of minimum spanning k-core

## CHAPTER THREE

## LITERATURE REVIEW

This chapter presents a review of the literature on popular metaheuristics in combinatorial optimization. We begin with basic introduction to the combinatorial optimization in sections 3.1 and then proceed to introduction to metaheuristics and some popular approaches in Sections 3.2 to 3.4.

## 3.1. COMBINATORIAL OPTIMIZATION

Network problems that involve finding an optimal solution for underlying network are studied under the combinatorial optimization which is a branch of optimization. The goal of combinatorial optimization problems is to find the optimum solution among the set of discrete feasible solutions. Examples of combinatorial optimization problems are Minimum Spanning Tree Problem, Traveling Salesman Problem, Cutting Stock Problem, Scheduling Problems and Knapsack Problem etc.

Combinatorial optimization problem selects an optimum configuration of a set of variables to achieve some objective for example minimizing cost or maximizing profit. In case of minimum spanning k-core problem objective will be to find desired network or graph structure in original network.

A Combinatorial Optimization problem P = (S, f) can be defined by:

An objective function f to be minimized or maximized [as research problem under discussion is a minimization problem, hence without loss of generality we will use minimized objective function throughout proposal]

S is called as a solution space or search space and each element of S can be considered as candidate solution to the problem under discussion. The combinatorial optimization problem is to select s* S with minimum objective function value. As f(s) s S are all feasible but f(s*) < f(s), s S is called a globally optimal solution of a combinatorial optimization problem.

## 3.2. METAHEURISTICS IN COMBINATORIAL OPTIMIZATION

To solve combinatorial optimization problems several approaches have been developed which can be either exact or heuristics. An exact algorithm guarantees an optimal solution to the problem in finite number of steps whereas a heuristic algorithm doesn't guarantees an optimal solution but tries to provide a good solution in reasonable amount of time.

Heuristic algorithm selects a solution from search space that is expected to be either optimal solution or close to the optimal solution. Heuristic algorithms generally classified as constructive algorithms and local search algorithms. Constructive algorithms build solutions from scratch depending upon properties of the problem. In case of minimum spanning k-core problem a constructive algorithm returns a graph structure with every vertex having degree at least k. Local search algorithms start with some initially available solution and iteratively improve current solution to the better one by exploring solutions in a properly defined neighborhood.

## Figure 7: Local search move

Neighborhood of a current solution is a function defined on search space S (recall that S is set of all candidate solutions) of current solution that assigns a set of neighbors N(s) S, s S. Set N(s) is called as neighborhood of s. In general case, (refer Figure 2) if S = {G1, G2â€¦..Gn} is search space of current solution which is G3 then neighborhood of G3 is a set N (G3) = {G1, G2, G4, G5}. Local search algorithm will replace G3 with some improving solution from N (G3) for instance G2 which will have cost less than G3. The solution G2 is called a locally minimal solution if and only if, f (G2) f (G) G N (G3).

Metaheuristics form [18] a class of algorithms that intelligently embeds basic heuristic algorithms in sophisticated algorithmic frameworks that explore and exploit search space more effectively. Two important aspects of metaheuristics algorithms are intensification and diversification which are related to exploitation and exploration of search space respectively. Exploiting the already explored search spaces is called as intensification in which metaheuristics algorithm explores unexplored solutions from one region of the search space. On the other hand, exploring new regions of the search space is called as diversification in which metaheuristics algorithm explores new regions of the search space to find improved quality solutions.

## 3.3. METAHEURISTICS PROCEDURES

This section is brief review of popular metaheuristics procedures such as iterative/basic local search, simulated annealing, TABU and greedy randomized adaptive search procedure (GRASP).

## 3.3.1. BASIC LOCAL SEARCH

The basic local search is called as iterative improvement in which current solution s is replaced with new solution where, is neighborhood of current solution s. Replacement of s to is called move which is executed only if is better than s. The algorithm terminates at a locally optimal solution.

If S is a search space and given an initial solution X0 S, a generic local search algorithm works as follows:

s = X0

while y N(s) : f (y) f (s)

s = y

## end-while

return s

## Figure 8: Pseudo- code for local search algorithm

Local search returns either first improving neighbor, best improving neighbor or intermediate neighbor between first and best improving neighbor. In other words, algorithm scans neighborhood of s and either return first solution which is better than s or return optimum solution among all solutions or return an intermediate solution between first improving and best improving solution. The implementation of basic local search algorithm on combinatorial optimization problems is inefficient as it terminates at locally optimum solution even in the presence of more improving solutions. Therefore, a number of mechanism are use to prevent algorithm to terminate at local optima. The combination of these mechanisms and basic local search give rise to important methods of meta-heuristics where local search is used as subroutine.

## 3.3.2. SIMULATED ANNEALING

Simulated Annealing is the oldest algorithm of meta-heuristic which prevents termination at local optima by accepting probabilistic uphill moves to worse solutions. At each iteration of algorithm a new solution s' is randomly selected from N(s) and replaces with s in two different ways:

if s' is better than s it's intuitive that s will replaced by s'

if s' is worse than s then s' replaces s with probability which is a function of temperature parameter T and difference between f (s') and f (s). The probability function refers to the Boltzmann Distribution and given by: exp ((f (s') - f (s))/T).

The origins of the SA algorithm are in Metropolis algorithm and it was presented as a search algorithm for CO problems by S. Kirkpatrick, C. D. Gelatt and M. P. Vecchi in 1983 [11] and by V. ÄŒerný in 1985 [10].

procedure SimulatedAnnealing

i = 0 % this is iterator

initialize temperature parameter T

s = InitialSolution ()

s-best = s

while termination condition do

s' = RandomPick(N(s))

if s' is better than s

s = s'

## else

s = AcceptanceCriterion(T,s',s)

## end-if

T(i+1) = Update(T)

i = i+1

## end-while

## Figure 9: Pseudo- code for simulated annealing algorithm

Simulated Annealing is inspired by physical annealing process of metals and glasses. Â Annealing is a heat treatment process that improves certain structural properties of metals or glasses by heating to above the re-crystallization temperature, maintaining a suitable temperature and then cooling. The suitable temperature is maintained and decreased according to a cooling schedule to achieve desired structure in metal under operation. At high temperature crystals of metals wander randomly and settle down to appropriate position during cooling process.

This process is analogous to simulated annealing where terms like structural property improvement, temperature and cooling of metals and glasses are analogues to objective improvement, temperature parameter and cooling schedule of combinatorial optimization problem respectively.Simulated annealing improves objective function value of combinatorial optimization problem by maintaining appropriate temperature parameter and cooling schedule. The cooling is analogues to iterative improvement of objective function value. At high temperature crystals are free to wander in structure of metal, similarly at high temperature control parameter probability of uphill moves is high. As temperature decreases during cooling process crystals of metal settle down at appropriate position, similarly as iterative improvement in s proceed towards termination condition control parameter temperature decreases which low down the probability of making uphill moves and converges simulated annealing algorithm into basic local search algorithm. Finally, annealing heat treatment process returns metal with improved properties an simulated annealing returns optimum objective function.

## 3.3.3. TABU SEARCH

Tabu search prevents termination at local optima by maintaining Tabu List which comprises of recently visited solutions. The word Tabu comes from area of social custom Taboo which is social ban or prohibition or forbiddance to specific human activity. Hence, algorithm mark recently visited solutions as tabu by maintaining their Tabu List and does not revisit those solutions. The improving solution s' is thus scans from neighborhoods of s that do not belong to Tabu List. Tabu search is credited toÂ Fred Glover [12] [13] [14].

s = InitialSolution()

TabuList = NULL

i = 0

while Termination Condition do

procedure AdmissibleSet(s)

s = Optimum from AdmissibleSet(s)

Update TabuList

i = i+1

## end-while

procedure AdmissibleSet(s)

if ( s' doesn't belongs to TabuList && s' doesn't belongs to AttributeList{} i.e s' doesn't violate tabu conditions OR s' satisfy Aspiration Criteria )

Admissible Set = {s' Ð„ N(s) | above conditions}

## end-if

## Figure 10: Pseudo- code for TABU search algorithm

We will refer to the neighborhoods of s that do not belong to Tabu List as Admissible Set. At each iteration best solution from Admissible Set is selected as new current solution and marked as tabu by adding to the TabuList. As new solution is added to TabuList one of the existing solutions from TabuList is removed by using FIFO order. TS stops when specified termination condition is met, one possible termination condition is Admissible Set reaches to empty status i.e. neighborhood of current solution s is empty. This condition is possible when all neighborhoods of s are in TabuList.

TS algorithm uses memory technique by maintaining TabuList to avoid trapping at local minima, but from practical prospective of implementation operating several complete solutions in a list is inefficient. Hence to overcome this drawback instead of maintaining a list of complete solutions several attribute lists are maintained in addition to the TabuList. These attributes are conditions to mark solutions as tabu and add them to TabuList. The AttributeList{} and associated TabuList{} defines conditions to tabu which used to generate Admissible Set.

Different Attribute Lists are maintained to improve operational efficiency of TS algorithm, although it raises a new problem of marking unvisited best quality solutions as tabu. This happens during construction of Attribute Lists on neighborhoods of s where some unvisited solutions which are better than current solution violates attribute condition and added to the TabuList. To unmark those solution Aspiration Criterion are used for solutions which are better than the currently-known best solution.

## 3.3.4. GREEDY RANDOMIZED ADAPTIVE SEARCH PROCEDURE

As we are using GRASP procedure for the minimum spanning k-core problem, to get deeper knowledge we will illustrate GRASP by using example of MAX-CUT problem.

## MAX-CUT problem:

Given an undirected graph G = (V, E) with edge weights Wij, (i, j) E, a MAX-CUT problem is to find a vertex partition S, T V, S T = V, S T =; such that sum of edge weights of cut (S, T) given by is maximized.

## Greedy Randomized Construction for MAX-CUT problem:

GRAPS is a multi-start procedure where each iteration consists of two phases, greedy randomized construction phase followed by local search phase. In greedy randomized construction phase, an initial feasible solution is built.

Let's build a greedy solution for MAX-CUT problem on an undirected graph G given below,

## Figure 11: MAX-CUT problem graph-1

## Step 1:- Building a candidate list

To be greedy, algorithm ranks candidates by using a greedy rule. In case of MAX-CUT as we have to maximize edge weights of cut (S, T), algorithm sorts all edge candidates in descending order of edge weights.

## Candidates

(1, 2)

(1, 3)

(1, 5)

(2, 3)

(5, 6)

(1, 6)

(3, 5)

## Weights

15

14

10

9

8

5

5

## Step 2:- Building a restricted candidate list (RCL)

In this stage algorithm becomes greedier by constructing RCL which consists of best quality candidates from candidate list. RCL is associated with a threshold parameter which is generated randomly in the range of 0 to 1.

A threshold value is calculated as min + (max - min), where min and max are the minimum and maximum edges weights respectively. Based on Threshold value a candidate is selected from RCL to put into cut (S, T). If alpha is equal to behavior of algorithm will be complete random & selects any candidate. On the other hand if alpha is equal to 1 then algorithm will greedily select candidate with edge weight greater than or equal to threshold value.

## Figure 12: MAX-CUT problem restricted candidate list

Now, let's consider edge (1, 2) is selected as initial choice and included into cut (S, T) as shown, next available candidate edges to add in cut are (2, 3) and (1, 3). Algorithm will make greedy choice and selects edge (1, 3) to maximize edge weight.

## Set S = {2, 3} and Set T = {1}

## Figure 13: MAX-CUT problem graph-2

Next available candidate edges are (3, 5) and (1, 5) again algorithm acts greedily and select edge (1, 5) to add in cut (S, T) as shown,

## Set S = {2, 3, 5} and Set T = {1}

## Figure 14: MAX-CUT problem graph-3

Next available candidate edges are (1, 6) and (5, 6) but threshold parameter = 0 that make algorithms behavior complete random and algorithm selects edge (1, 6) as shown,

## Set S = {2, 3, 5, 6} and Set T = {1}

## Figure 15: MAX-CUT problem graph-4

This is an initial feasible solution returned by greedy randomized construction algorithm of GRASP with cut value of 44.

## Local Search for MAX-CUT problem:

An initial feasible solution built in greedy randomized construction phase is further improved by local search algorithm based upon appropriately defined neighborhood and this completes an iteration of GRASP. Finally, the best solution among the iterations will be return by GRASP algorithm as optimum solution.

(1, 0) exchange can be an improving neighborhood for MAX-CUT problem. In other words, flipping one vertices either from S set to T set or from T set to S set that gives better quality solution i.e. maximize cut value.

## Figure 16: MAX-CUT problem neighborhood

In our example, the initial feasible solution can improve by flipping vertex 6 from S to T set as shown in Figure 17 that will improve cut value from 44 to 47 and which is optimal solution for given graph G.

## Set S = {2, 3, 5} and Set T = {1, 6}

## Figure 17: MAX-CUT problem graph-5

## 3.4. NEED FOR METAHEURISTICS

As discussed in section 3.2 an exact algorithm guarantees an optimal solution to the combinatorial optimization problem in finite number of steps whereas a metaheuristics algorithm doesn't guarantee an optimal solution but tries to provide a good solution in reasonable amount of time. Most of combinatorial optimization problems belong to NP-hard category and polynomial time algorithm doesn't exist for those problems under the assumption of P NP. In such case exact algorithm may take exponential computational time in worst-case. In practical conditions we may accept solutions that are close to optimal solution which can be achieved through metaheuristics algorithms.

## CHAPTER FOUR

## ALGORITHMIC APPROACH

As discussed in preceding chapter that due to O (n4) complexity of minimum spanning k-core problem which is still time consuming for exact algorithmic approach as well as considering complexity of probabilistic version of problem we will use metaheuristics approach to get better quality solution in reasonable amount of time. Among the available metaheuristics approaches we have used recently proposed Greedy Randomized Adaptive Search Procedures (GRASP) metaheuristics for minimum spanning k-core problem.

## 4.1. GRASP FOR MINIMUM SPANNING K-CORE

As we discussed in section 3.3.4 an iteration of GRASP consists of two phases, greedy randomized Construction Phase followed by Local Search Phase. In construction phase, an initial feasible solution will be built based upon structural properties of minimum spanning k-core problem. This initial feasible solution is further improved by investigating neighborhoods of the feasible solution in the local search phase and this completes an iteration of GRASP. Finally, the best solution among the iterations will be return by GRASP algorithm as an optimum solution. The stopping criterion for GRASP is depending upon MAXITER i.e. number of GRASP iterations. Larger number of GRASP iteration result in increased computation time of algorithm and increased probability of finding better quality solution.

procedure GRASP (MAXITER,alpha)

1 ReadInput ();

2 GRASP_Solution (int,int);

3 for k = 1...MAXITER do

4 startsol = Construct (alpha);

5 localopt = LocalSeach (startsol);

6 if (Best Solution Move Condition) then

bestsol = localopt

## end-if

7 end for;

8 return BestSolution;

end GRASP

Best Solution Move Condition:

If (G, C) is a best solution found then move towards another best solution (G', C') is possible, if and only if C' < C.

## Figure 18: Pseudo- code for GRASP algorithm

## 4.1.1. GREEDY RANDOMIZED CONSTRUCTION PHASE ALGORITHM

Greedy randomized construction phase is solely based upon properties of problem under consideration. This algorithm needs an input of undirected and complete graph G = (V, E) which then constructed as initial feasible solution G' = (V, E*), E* E that ensures degree of every vertex v V is at least k and an edge with both end points greater than k.

## Important steps of greedy randomized construction algorithm

As our problem is to find spanning k-core in a given graph that will minimize overall cost of the network algorithm greedily selects edges with lower weighs.

## Building candidate list

Initially construction algorithm builds a candidate list C that consist all edges of given graph. Then algorithm greedily arrange candidate list C in ascending order of edge weights.

## Building restricted candidate list (RCL)

In this stage construction algorithm becomes greedier by constructing RCL which consists of best quality (small edge weights) edges from candidate list C. RCL is associated with a threshold parameter alpha which is generated randomly in the range of 0 to 1. A threshold value is calculated as Cmin + alpha (Cmax - Cmin), where Cmin and Cmax are the minimum and maximum edges weights respectively. Based on Threshold value a candidate is selected from RCL to put into Solution Set E*. If alpha is equal to zero algorithm will select minimum weight edge and if alpha is equal to 1 then algorithm will randomly select any edge with edge weight less than or equal to Cmax.

procedure ConstructionPhase (alpha)

1 SolutionSet = Empty;

2 Initialize Candidate List C = E;

3 Rank all candidates using a greedy function;

4 while C is not empty do

5 Build the Restricted Candidate List;

6 Select element s from RCL at random;

7 Add s to SolutionSet;

8 Update Candidate List C;

9 Rank all candidates in C;

10 end-while;

11 return SolutionSet;

end ConstructionPhase

## Figure 19: Pseudo- code for greedy randomized construction phase algorithm

## 4.1.2. LOCAL SEARCH ALGORITHM

An initial feasible solution G' built in construction phase is further improved by local search algorithm based upon appropriately defined neighborhood. Initial feasible solution ensures that degree of every vertex v V is at least k and an edge with both end points greater than k. Solution returned by construction algorithm is generated by randomly selecting edges that fulfills structural properties of minimum spanning k-core problem. Hence, there may be possibility of existing edges in E that have edge weights lesser than the edges in E*. To find such edges and replacing them with higher cost edges in E* is the main aim of local search algorithm. These edge replacement steps of local search algorithm are discussed in detail in following neighborhood definition.

## Neighborhood Definition

Given a complete graph G (V, E) and an initial feasible solution from construction algorithm k (V, E*), improving neighborhood for k can be described as follows:-

N (E*) = { E" E | k (V, E") is a minimum spanning k-core if,

## (1,1) exchange neighborhood:

## |E1\E"| = 1,

i.e. Delete one edge from current solution with one end-point greater than k and other end-point is equal to k

## |E2 E"| = 1,

i.e. Add a new edge E2 to E" with cost lesser than cost of deleted edge E1 which is not in current solution

## (1,2) exchange neighborhood:

## |E1\E"| = 1,

i.e. Delete one edge E1 from current solution with both end-points equal to k

## |E2 E"| = 1 & |E3E"| = 1

i.e. Add two new edges E2 & E3 to E" with sum of costs lesser than cost of deleted edge E1 which is not in current solution

## }

## Pseudo Code:

Pseudo code will be same for both (1, 1) and (1, 2) exchange neighborhood. First (1, 1) exchange will execute followed by (1, 2) exchange. A new solution after (1, 2) exchange neighborhood is locally optimal in (1, 1) exchange neighborhood.Â After (1, 2) exchange, edges with one end point greater than k and other equal to k are present in current solution, but lesser cost edge candidates which are not in current solution E* doesn't exist in E for exchange. This is because in both (1, 1) and (1, 2) exchange we have already added lesser cost edges to the solution

procedure LocalSearch(SolutionSet)

1 while Termination Conditions do

2 Find S' in Neighborhood of SolutionSet;

3 Set SolutionSet = S';

## 4 end;

5 return SolutionSet;

end LocalSearch

Termination Conditions:

(1, 1) Exchange: while can't find lesser cost edge E2 to swap with higher cost edge E1

(1, 2) Exchange: while can't find lesser cost edges E2 and E3 to swap with higher cost edge E1

## Figure 20: Pseudo- code for local search algorithm

## CHAPTER FIVE

## RESEARCH TASKS

This chapter briefly describes research activities carried out and tasks that are planned as future work.

## 5.1. RESEARCH ACTIVITIES

## Literature Review

Reviewed literature on graph theory to understand concepts and notations obligatory for research tasks

Reviewed literature for degree constrained network problems to understand nature of underlying research problem

Reviewed literature for minimum spanning k-core problem

Reviewed literature on metaheuristics approaches to combinatorial optimization problems

Reviewed literature on degree constrained sub-graph problems essentially related to b-matching problem as minimum spanning k-core is reduction of a generalized matching problem

## Algorithmic Development

Developed Greedy Randomized Construction Phase of GRASP metaheuristics algorithm that required through understanding of structural properties of underlying research problem

Developed Local Search Phase of GRASP metaheuristics algorithm by designing appropriate neighborhood definition

## Implementation

Implemented binary integer programming model of minimum spanning k-core problem in Xpress-MP optimization tool

Implemented GRASP metaheuristics algorithm in Visual C++ programming language

Created large instance generators for both Xpress-MP and Visual C++ in MATLAB

Conducted preliminary experiments on C++ implementation to explore ways to speed up implementation

## Future Work

To perform computational study on algorithm to observe performance of metaheuristics

Literature review for network problems under uncertainty

Exploration of probabilistic version of minimum spanning k-core

Extension of current GRASP algorithm to probabilistic version of problem

## 5.2. TIME LINE OF RESEARCH TASKS

A snapshot of the time line for the research tasks mentioned above is included below.