# Airline Crew Pairing Problem Computer Science Essay

**Published:** **Last Edited:**

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

## Abstract

In the cost list of airlines, crew costs are in the second place after costs of fuel. So in this paper cockpit crew pairing problem has been considered. To model this problem, set partitioning problem has been used and because of classifying this problem in the large scale problems, column generation approach has been used to solve LP relaxation of set partitioning model. This paper focuses on solving sub-problem of column generation. For this reason an algorithms, named as SPRCF, based on shortest path with resource constraints algorithms, have been proposed. Then its efficiency in solving some problem instances has been tested and obtained results have been compared with an existing algorithm in the literature of crew pairing problem. These results show high efficiency of proposed algorithm in solving problem instances, so that larger size of problems, in less time have been solved in compare with existing method in the literature.

Keywords: Airline scheduling, crew pairing, set partitioning, column generation, shortest path with resource constraints

## Introduction

Air transport industry has been considered by researchers for many years because of great deal of complexity in its environment. So, many problems have presented in this field, that their objective is increasing the profitability and decreasing expenses of airlines. One of the most important of these problems is airline crew scheduling. This problem is important because crew costs are in the second order in the cost list of airlines after fuel costs [1]. In literature of airline scheduling, crew scheduling problem has been divided into crew pairing and crew rostering. In this paper crew pairing problem is considered. For more detail about crew rostering problem anyone can refer to Belobaba et al [2]. Airline crew divides into two groups: cockpit crew and cabin crew. But cockpit crew pairing problem is more important because of greater amount of their salary, possibility of modeling them as one group in one day and etc [3].

To describe the problem, first some definitions are needed. In the airline scheduling flight leg is a flight section from one airport to another one with certain departure and arrival time. A duty period is a sequence of some flight legs in one day with small rest time between them and a pairing is a sequence of some duty periods with overnight rest between them. Each pairing starts and ends with the same crew and in the same crew base. So, crew pairing problem is defined as finding some pairings such that all flight legs are covered in minimum cost.

In the literature, crew pairing problem is considered according to schedule horizon time and frequency of flight legs as three states: daily, weekly and dated. In the daily problem assumption is that all flight legs are repeated in each day of the week and in the weekly problem, repetition of flight schedule in each week is assumed. In the dated problem, usually schedule horizon time is month and flights frequency is irregular. For more detail about these cases anyone can refer to Gopalakrishnan and Johnson [4].

Chu et al [5] solved the daily crew pairing problem by considering linear cost structure for each pairing. Their main idea was proposing a heuristic method for branching based on graph theory to solve large problems. They applied this method on three problem sets in the size of about 1200 flight legs. Desaulniers et al [6] formulated crew pairing problem as an integer, nonlinear multi-commodity network flow problem with additional resource variables and then isolated nonlinear aspects of problem by using Dantzig-Wolf decomposition and solved the problem in the form of set partitioning problem in a branch and bound framework.

Makri and Klabjan [7] developed a new pricing scheme for column generation approach in crew pairing problem. They proposed two exact and two approximate procedures to stop column generation in solving airline crew pairing problem. Their approach is applicable in the daily, weekly and dated problem and it was developed base on nonlinear cost structure for each pairing. Vance et al [8] proposed a new formulation for daily crew pairing problem with nonlinear cost for each pairing. Main benefit of their formulation rather than set partitioning model is its tighter linear lower bound in compare with integral solution. But solving LP-relaxation of this formulation is more complex and time consuming rather than set partitioning model.

Zeghal and Minoux [9] proposed an integer programming model for crew pairing problem and they used CPLEX to solve this model. They didn't consider assumption of being the cockpit crew together in one day. AhmadBeygi et al [10] developed an integer programming to generate pairings in the case of nonlinear cost structure for pairings. Their objective to propose this model is producing a tool for testing new ideas in the related problems. They claimed that this model could be solved by commercial softwares.

In this paper, daily cockpit crew pairing problem is considered. To solve this problem, set partitioning model is used and LP-relaxation of this model is solved by column generation procedure. Here main focus is on solving sub-problem of column generation and developing two algorithms for it.

The rest of the paper is organized as follows. In section 2, definitions and assumptions of the problem is introduced. Solution approach of the problem is stated in section 3 and in section 4, an algorithms for solving the sub-problem of column generation is developed and some notes about this algorithm is presented. Computational results of implemented proposed algorithm on some problem instances of two major Iranian airlines and its analysis is mentioned in section 5. Finally, conclusion is presented in last section.

## Definitions and assumptions

As mentioned in section 1, in this paper cockpit crew pairing problem in the case of daily and not allowing deadheading is considered. However, the proposed approach in this paper has capability of adopting and implementing in the weekly and dated problem. In this paper, other concepts and assumptions are considered and mentioned follow.

## 2.1. Imposed rules

In the final solution of crew pairing problem, all regional work rules and regulations, unions requirements and other specific work rules of airlines must consider. These rules are restrictive and help to decrease the size of problem. However, they could make finding an optimal solution more difficult. In this research, 12 common rule according to the study of Ho et al [11] are considered. These rules are shown in Table 1. Each duty period and pairing that related work rules have considered in them, is named as legal duty period and legal pairing, respectively.

## Cost structure of each pairing

Based on a common method that has been used in the most recent research in the literature with small differences ([4], [10], [12], [13], [14], [15], [16]), cost of each pairing p is equal to maximum amount of three quantities and in this paper is calculated from following equation:

Where is cost of pairing p, is cost of duty period d, is time away from crew base and is a fraction between 0 and 1. Also, is the minimum guarantee payment for pairing p without any attention to its elapsed time. Cost of each duty period d, is calculated from following equation:

Where is elapsed time of duty period d, is a fraction between 0 and 1, is flying time of duty period d and is the minimum guarantee payment for duty period d.

The values of and could be different for various airlines. It is notable to mention that all of the above quantities are in time credit and the monetary amount of pairing cost could be easily calculated by multiplying a monetary factor in it.

## Revise flight network

To show feasible pairings in crew pairing problem, two types of network have been used in literature that has been named as flight network and duty period network. Barnhart et al [16] have described specifications of these classic networks. In this paper, a new type of network are designed based on the classic type of it and named as "revised flight network". The Advantage of this network rather than its classic type is the fewer number of nodes and arcs. So, proposed algorithm based on this network in this paper, require less memory and fewer calculations. The properties of this network is stated in continue.

In revised flight network, a flight node is considered for each flight leg and some arcs are drawn to show the legal connection between flight legs. Two virtual nodes are also exist in the network as origin and sink node that the start and end of pairings are connected to them respectively. In Table 2 four flight legs data is shown and Figure 1 illustrates the related revised flight network of these flight legs in two days.

In this model, each row or constraint represents a certain flight leg and each column or variable represents a pairing. The value of each variable will be equal to 1, if the related pairing will exist in the final solution and otherwise it will be equal to 0. Variable j in i th row has coefficient if flight leg i is covered by pairing j and otherwise.

The model (3) is a famous and powerful model in operations research, however in crew pairing problem the main difficulty is great deal of columns or legal pairings. So, practically obtaining all variables of the above model is impossible. Therefore in this paper column generation approach is used to solve LP relaxation of model (3). In the column generation, first the model is limited to some initial variables, that is named as restricted master problem (RMP) and this problem is solved. Then by considering all variables implicitly, in each iteration one or more new variables are added to RPM. Finding new suitable variable or column for adding to the RPM is the duty of a sub-problem, which usually is named as pricing problem. If the sub-problem can not generate a new column, solving process will stop. The condition of selecting a new column for adding to the RPM is that its reduced cost must be negative. In this paper to solve sub-problem in the column generation procedure, an algorithms based on shortest path algorithms with resource constraints is proposed. More description of this algorithm is stated in the next section.

## Proposed algorithm

In order to solve sub-problem of column generation and find pairings or paths which have negative reduced cost, based on nonlinear cost structure that mentioned in section 2.2, in this section a shortest path algorithm with resource constraints is proposed.

## SPRCF algorithm

Here, an algorithm with name of shortest path with resource constraint in revised flight network and shortly SPRCF is proposed to solve the problem. To consider the work rules of section 2.1 in this algorithm, a label set with 9 labels like Table 3, is considered for each node in the revised flight network. By these numbers of labels all mentioned work rules could be modeled. However, SPRCF has the ability of considering new rules by adding new labels to the label set if necessary. With each label set it can be considered a related vector as.

## Steps of SPRCF

In this algorithm, first the nodes are numbered and an initial label set is considered for them. Then by considering the first node, labels of the subsequent nodes that could be connected to this node are updated based on a certain pattern. Therefore this node is removed from the considered list and the next node in the sequence is considered. This procedure is done for all nodes in the network and finally according to these updated labels, reduced costs of paths are calculated. Steps of SPRCF to solve sub-problem of column generation is as follow.

## Step 1: Numbering the node

Sort the flight legs of each day first by their departure time and then by arrival time increasingly. Consider a node for each flight leg in the network. Consider the origin node number as 0 and start numbering the other nodes from 1 and number each node based on the above sorted sequence.

## Step 2: Producing the initial network

Step 2-1: Connect the origin node with a connection arc to each node that his related flight leg starts in the crew base.

Step 2-2: Checking all flight legs of one day for establishment of Minsit and Maxsit rules and repeat it for each day. If these rules are considered for each couple of flight legs, connect them with a connection arc.

Step 2-3: Checking each flight leg for connecting to flight legs of the other days according to the MinRest and MaxRest rules. If these rules are considered, connect them with a connection arc.

Step 2-4: Connect each node that his related flight leg ends in the crew base with a connection arc to the sink node.

## Step 3: Initialization

Step 3-1: Consider the vector as the first label set for origin node and for the other nodes consider the vector .

Step 3-2: Name the active list of pairings as ALP and let , , and .

Step 4: Let . If go to step 7, otherwise if node j and k are connected in the initial network then go to step 5 otherwise go to step 7.

## Step 5: Updating the label sets of nodes

Perform step 5-1 to 5-10.

Step 5-1: Let . If then go to step 7.

Step 5-2: For label set of node j and k, if then go to step 5-3 and otherwise go to step 5-4.

Step 5-3: If all the following conditions hold then go to step 5-5 and otherwise go to the step 5-1.

(7)

(8)

(9)

Step 5-4: If all the following conditions hold then go to step 5-5 and otherwise go to the step 5-1.

(10)

(11)

(12)

Step 5-5: For label set of node j, if nodes j and k are connected then let , and be the increasing in , and respectively.

Step 5-6: Let .

Step 5-7: (Dominance rule 1) If following conditions hold then go to the step 5-1.

(13)

(14)

Step 5-8: (Dominance rule 2) Consider the following conditions:

(15)

(16)

If these conditions hold, erase label set of node k and go to step 5-7.

Step 5-9: Let .

Step 5-10: If then go to step 5-7 and otherwise go to step 6.

Step 6: If then go to step 6-1 and otherwise go to step 6-2.

Step 6-1: Let , , , , and .

Step 6-2: Let , , , , and .

Step 6-3: Add the following label set to the node k.

Step 7: Let , and . If go to the step 4 and otherwise go to step 8.

## Step 8: Finding pairings with negative reduced cost

Step 8-1: Let .

Step 8-2: If destination of flight leg is not crew base then go to step 8-5 and otherwise let and go to step 8-3.

Step 8-3: if or then go to step 8-5. Otherwise by using label of node , find its predecessor node and continue until getting to a node with value of for (origin node). In this case, all considered node from origin to node illustrate a path. If a flight leg does not repeat more than one time in this path, put the related pairing of this path in the .

Step 8-4: Let and go to the step 8-3.

Step 8-5: Let and if then go to the step 8-2, otherwise go to step 9.

Step 9: Return as output.

Step 10: Finish.

As mentioned in the steps 5-7 and 5-8, in this algorithm two dominance rule 1 and 2 are developed for updating the label sets of node. These dominance rules are selected after several experiments and testing some ideas, and then have been applied in this algorithm.

## Complementary issues

In this section some notes and complimentary issues about SPRCF is stated.

Considered work rules in this paper are some of the most common rules in the airlines. However, SPRCF can consider every other rule. For this reason, if it will be not impossible to consider this/these new rules by using present labels; it is enough to add one or more labels to them.

The value error for acceptance of being negative reduced cost of one pairing, which is used in the step 8-3 of SPRCF, is equal to 0 theoretically. However, because of cumulative error in calculations, it is reasonable to consider a small value for it in practice. This value is one of the setting parameters of the algorithm. In next section the considered value of this parameter will be mentioned.

The condition of step 8-3 in the SPRCF, which states no flight leg must repeat in one pairing, is considered to avoid deadheading and by deleting this condition, solution of the problem will be different. For better understanding of this issue consider four flight legs data as in Table 2. It is clear from Figure 1 that pairing , which includes duty period in day one and duty period in day two, covers all flight leg of the problem. But in this pairing flight leg d repeats two times. One problem that this case makes is that for implementing this pairing two groups of crew is necessary and because in day one crew 1 must fly duty period and crew 2 must fly duty period , so one of these groups must fly as passenger to displace and receive at considered place for flying the next flight legs. This case is deadheading and as mentioned before it is not allowed here.

As stated in section 2.2, cost structure of each pairing in this paper is nonlinear and is equal to maximum of three quantities, which one of them is equal to the maximum of two other quantities. However, all these quantities are linear. In the SPRCF algorithm to trace each quantity one label is saved. The main reason is that until receiving to the end of one path, it is not obvious that the cost of path is obtained from which quantity. If dominance rules 1 and 2 are not applied in SPRCF then each time that one node is considered, based on mentioned linear expression one new label set is added to the connected nodes; so at the end of the algorithm all possible status were considered and best solution is obtained among them. This case needs a lot of computer memory, even for small problems. So, using this dominance rules to make the calculations practical is very effective. But, applying these dominance rules could result in losing the optimal solution [10]. In addition when cost structure of each pairing is linear, the problem is converted to find simple shortest path in the network, which is a famous problem and could be solved by existing powerful algorithms of the literature.

## Computational results

To implement the proposed algorithms of section 4, some problem instances were used. So, in this research flight schedules of two major Iranian airlines are applied. Table 5 shows the specifications of these problem instances. In this table problems are divided into two groups based on the number of duty periods that can produce. Problems of set 1 have fewer number of duty periods rather than set 2.

All the algorithms that considered in this paper, is coded in C++ and used BCP project of COIN-OR [17], in Linux environment. Then these algorithms are run on a Laptop with 2.5 GHz CPU Core 2 Duo T9300 and with 2 GB RAM.

The parameters value of pairing cost is considered similar to Gopalakrishnan and Johnson [4]. So, the values of , , and is set to , , and respectively.

In selecting the value of there should be more attention, because by selecting a big value for this parameters some pairings with small negative reduced cost might be not generated and then final solution quality will be lower. But, by decreasing this value to zero, reduced cost of pairings with small positive reduced cost might be obtained negative and then adding these pairings to the RMP will be useless. Therefore the value of in the SPRCF algorithm was considered as 0.001 after checking various numbers.

Considered value for work rules, which were used in this paper, are selected by inspiring from the research of Ho et al [11] and Ahmadbeygi et al [10] and these values were shown in Table 6. In this table three value sets are exist. The values of sets 2 and 3 in compare with set 1 are more rigorous. So, they will result in decreasing the number of legal pairings. These values are applied on the problems with large size and especially in methods that are implemented on duty period network to occupy less memory and make the problem tractable. However it should be mentioned that making rule values tighter and more rigorous has inverse relation to feasibility of the problem. So, it is possible to find a solution by values set 1 for a problem but by applying values set 2 and 3, this problem may be infeasible.

In this paper column generation procedure is started by considering some artificial variables in RMP. In this case, for each flight leg one pairing is considered to cover all flight legs. But, for the related columns of these pairings a big value is considered as coefficient of objective function in set partitioning model. By doing this work, the model attempts to send out these variables from the base when a feasible solution achieve. In other words, for each infeasible pairings that includes one flight legs, a big mount of cost is considered, since the model does not tend to hold it in base. In this section performance of the proposed algorithm was compared with a previous research. So, computational results of SPRCF were compared with the algorithm of Makri and Klabjan [7]. This algorithm will be shown by MKA.

In Table 7 the results of implementing SPRCF and MKA on problem set 1 and by setting rules values to values set 1, are illustrated. In this table column "Rules set" refers to the number of values sets of rules in from Table 6. In the column "Column generation iterations", the number of iterations that column generation procedure passed before the time that pricing problem could not find any column, is shown. Also, the number of generated columns in all iterations of column generation procedure is in column "Number of generated columns". If after stopping the column generation any artificial variables exist in basis, their number stated in column "Number of remained artificial variables in basis". If the value of this column is nonzero, it means that no feasible solution has been found. Column "Integrality of final LP solution" shows if the final LP solution of column generation procedure is integral or not. If the symbol "ƒ¼" exists in this column, it means that an integral solution were found for the LP relaxation problem in root node and it is not necessary to search for an integral solution in a tree search framework.

As shown, MKA not only have worse objective value, but also it could not solve problem P4 because of inadequate computer memory. Also, solving time of MKA in compare with the proposed algorithm in this paper is very greater.

According to Table 8, SPRCF could solve all problems as rule set 1 and 2. But, MKA did not solve any problem as rule set 2 which is easier than rule set 1 and only as very tight values of rule set 3 and could solve three problems among six problems of problem set 2. In addition this algorithm could not send out all initial artificial variables from basis in problem P6 and did not find any feasible solution to it. In this table, results of MKA as rule set 1 did not show, since these algorithms have very low efficiency in solving the problems in this case.

## Conclusions and future works

In this research daily cockpit crew pairing problem is considered with assumptions of considering important work rules, a nonlinear cost structure for each pairing and not allowing deadheading. This problem was formulated as set partitioning problem and then column generation procedure was used to solve its LP relaxation. To solve sub-problem of column generation an algorithm with type of shortest path algorithms with resource constraints named as SPRCF was proposed. SPRCF is implemented on revised flight network.

The proposed algorithm and an existing algorithm in the literature were implemented on some problem instances. The results showed that proposed algorithm has the ability of solving problems up to 632 flight legs size as different value for work rules. In addition solving ability of this algorithm in compare with existing algorithm in the literature has lower sensitivity to the values of work rules. Solving time of SPRCF is very smaller than the other algorithm and since this algorithm is used to solve the sub-problem of column generation and column generation procedure is called several times in the framework of a branch and price tree, so spending a lot of time in column generation procedure imposed the problem to a great challenge and even could make solving it impossible.