# Various Sorting Algorithms A Review Computer Science Essay

Published:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

An algorithm is any well-defined procedure or set of instructions, that takes some input in the form of some values, processes them and gives some values as output. Sorting involves rearranging information into either ascending or descending order. Sorting is considered as a fundamental operation in computer science as it is used as an intermediate step in many operations. The goal of this paper is too reviewed on various different sorting algorithms and compares the various performance factors among them.

Keywords: heap sort, merge sort, quick sort.

## INTRODUCTION

Algorithms have a vital and key role in solving the computational problems, informally an algorithm is a well-defined computational procedure that takes input and produces output.

Algorithm is a tool or a sequence of steps to solve the computational problems [9]. The existence of algorithms goes way back as they were in existence even before the existence of computers.

There are various methodologies and techniques based on which various different kinds of algorithms are designed. Out of all these problem solving algorithms, let us talk about the sorting algorithms. In case of sorting, it is required to arrange a sequence of numbers into a given order, generally non-decreasing. [10] In computer science, an algorithm that puts elements of a list into a certain order is known as sorting algorithm. The orders that are mainly used are numerical order and lexicographical order. For making use of other algorithms (like search and merge algorithms) sorted lists are required to work correctly; it is also often useful for conforming to well-established patterns or rules of data and for producing such output which is easy to read and understand. There are two conditions enlisted that output must satisfy. These conditions are:

1. The output is in non-decreasing order (each element is no smaller than the previous element according to the desired total order)

2. The output is a permutation or reordering of the input.

Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956 [1]. Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2004). Sorting algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case analysis, time-space tradeoffs, and lower bounds.

## SORTING ALGORITHMS USED IN COMPUTER SCIENCE ARE OFTEN CLASSIFIED BY:

Computational complexity (worst, average and best behavior) of element comparisons in terms of the size of the list. For typical sorting algorithms good behavior is O(n logn) and bad behavior is O(n2).

Memory usage (and use of other computer resources). In particular, some sorting algorithms are "in place". This means that they need only O(1) memory beyond the items being sorted and they don't need to create auxiliary locations for data to be temporarily stored, as in other sorting algorithms.

Recursion: Some algorithms are either recursive or non-recursive, while others may be both (e.g., merge sort).

Stability: stable sorting algorithms maintain the relative order of records with equal keys (i.e., values).

Whether or not they are a comparison sort. A comparison sort examines the data only by comparing two elements with a comparison operator.

Adaptability: Whether or not the pre-sortedness of the input affects the running time. Algorithms that take this into account are known to be adaptive.

## Aims of the Algorithms:

The algorithm had several aims:

Speed.

Good memory utilization. The number of elements that can be sorted should closely approach the physical limits of the machine.

In order for the algorithm to be truly general purpose the only operator that will be assumed is binary comparison. This rule out methods such as radix sort [2, 3].

To obtain good memory utilization when sorting small elements linked lists are avoided. Thus, the lists of elements referred to below are implemented using arrays, without any storage overhead for pointers.

## Summaries Of Popular Sorting Algorithms:

## Bubble Sort:

Bubble sort is a simple sorting algorithm. The algorithm starts at the beginning of the data set. It compares the first two elements, and if the first is greater than the second, then it swaps them. It continues doing this for each pair of adjacent elements to the end of the data set. It then starts again with the first two elements, repeating until no swaps have occurred on the last pass. This algorithm's average and worst case performance is O(n2), so it is rarely used to sort large, unordered, data sets. This causes larger values to "bubble" to the end of the list while smaller values "sink" towards the beginning of the list. Bubble sort can be used to sort a small number of items (where its inefficiency is not a high penalty). Bubble sort may also be efficiently used on a list that is already sorted except for a very small number of elements. For example, if only one element is not in order, bubble sort will take only 2n time. If two elements are not in order, bubble sort will take only at most 3n time. Bubble sort average case and worst case are both O(nÂÂ²).

## Pros:

1) Simplicity and ease of implementation.

2) Auxiliary Space used is O (1).

## Cons:

Very inefficient. General complexity is O (n2). Best case complexity is O(n).

## Insertion sort:

Insertion sort is a simple sorting algorithm that is relatively efficient for small lists and mostly sorted lists, and often is used as part of more sophisticated algorithms. It works by taking elements from the list one by one and inserting them in their correct position into a new sorted list. In arrays, the new list and the remaining elements can share the array's space, but insertion is expensive, requiring shifting all following elements over by one. Shell sort is a variant of insertion sort that is more efficient for larger lists.

## Pros:

Auxiliary space used is O (1).

## Cons:

General Complexity is O (n2). Best Case is O(n) when the list is already sorted.

## Heap sort:

Heap sort is a much more efficient version of selection sort. It also works by determining the largest (or smallest) element of the list, placing that at the end (or beginning) of the list, then continuing with the rest of the list, but accomplishes this task efficiently by using a data structure called a heap, a special type of binary tree. Once the data list has been made into a heap, the root node is guaranteed to be the largest (or smallest) element. When it is removed and placed at the end of the list, the heap is rearranged so the largest element remaining moves to the root.

Using the heap, finding the next largest element takes O(log n) time, instead of O(n) for a linear scan as in simple selection sort. This allows Heap sort to run in O(n log n) time, and this is also the worst case complexity.

## Pros:

1) Time complexity of the algorithm is O (n log n).

2) Auxiliary Space required for the algorithm is O (1).

3) In-space and non-recursive makes it a good choice for large data sets.

## Cons:

1) Works slow than other such DIVIDE-AND-CONQUER sorts that also have the same O (n log n) time complexity due to cache behavior and other factors.

2) Unable to work when dealing with linked lists due to non convertibility of linked lists to heap structure.

## Quick Sort

Quick Sort is a divide and conquer algorithm which relies on a partition operation: to partition an array an element called a pivot is selected. All elements smaller than the pivots are moved before it and all greater elements are moved after it. This can be done efficiently in linear time and in-place. The lesser and greater sub-lists are then recursively sorted. Efficient implementations of quick sort (with in-place partitioning) are typically unstable sorts and somewhat complex, but are among the fastest sorting algorithms in practice. Together with its modest O(log n) space usage, quick sort is one of the most popular sorting algorithms and is available in many standard programming libraries. The most complex issue in quick sort is choosing a good pivot element; consistently poor choices of pivots can result in drastically slower O(nÂÂ²) performance, if at each step the median is chosen as the pivot then the algorithm works in O(n log n). Finding the median however, is an O(n) operation on unsorted lists and therefore exacts its own penalty with sorting.

## Pros:

1) One advantage of parallel quick sort over other parallel sort algorithms is that no synchronization is required. A new thread is started as soon as a sub list is available for it to work on and it does not communicate with other threads. When all threads complete, the sort is done.

2) All comparisons are being done with a single pivot value, which can be stored in a register.

3) The list is being traversed sequentially, which produces very good locality of reference and cache behavior for arrays.

## Cons:

1) Auxiliary space used in the average case for implementing recursive function calls is O (log n) and hence proves to be a bit space costly, especially when it comes to large data sets.

2) Its worst case has a time complexity of O (n2) which can prove very fatal for large data sets. Competitive sorting algorithms.

## Merge Sort:

Merge sort takes advantage of the ease of merging already sorted lists into a new sorted list. It starts by comparing every two elements (i.e., 1 with 2, then 3 with 4...) and swapping them if the first should come after the second. It then merges each of the resulting lists of two into lists of four, then merges those lists of four, and so on; until at last two lists are merged into the final sorted list. Of the algorithms described here, this is the first that scales well to very large lists, because its worst-case running time is O(n log n). Merge sort has seen a relatively recent surge in popularity for practical implementations, being used for the standard sort routine in the programming languages Perl, [4] Python (as timsort[5] ), and Java (also uses timsort as of JDK7[6] ), among others. Merge sort has been used in Java at least since 2000 in JDK1.3. [7] [8]

## Pros:

1) Marginally faster than the heap sort for larger sets.

2) Merge sort is often the best choice for sorting a linked list because the slow random-access performance of a linked list makes some other algorithms (such as quick sort) perform poorly, and others (such as heap sort) completely impossible.

## Cons:

1) At least twice the memory requirements of the other sorts because it is recursive. This is the BIGGEST cause for concern as its space complexity is very high. It requires about a ÎËœ (n) auxiliary space for its working.

2) Function overhead calls (2n-1) are much more than those for quick sort (n). This causes it to take more time marginally to sort the input data.

## Counting Sort

Counting sort is applicable when each input is known to belong to a particular set, S, of possibilities. The algorithm runs in O(|S| + n) time and O(|S|) memory where n is the length of the input. It works by creating an integer array of size |S| and using the ith bin to count the occurrences of the ith member of S in the input. Each input is then counted by incrementing the value of its corresponding bin. Afterward, the counting array is looped through to arrange all of the inputs in order. This sorting algorithm cannot often be used because S needs to be reasonably small for it to be efficient, but the algorithm is extremely fast and demonstrates great asymptotic behavior as n increases. It also can be modified to provide stable behavior.

## Pros:

1) The algorithm has a time complexity of O (n+m), where n is the number of data while m is the range of the data, which implies that the most efficient use will be when m<<n. In that case the time complexity will turn out to be linear.

2) This sort works optimally in the case when the data is uniformly distributed.

## Cons:

If the range m>>n, the complexity will not be linear in n and thus this sort does not remain useful anymore. This is because chances of introduction of gaps, that is counters for those elements which do not exist in the list, will cause a higher space complexity.

Sort

Order

Worst Case

Memory

Stable

Data Type

Complexity

Bubble

n2

n2

nk+np

Yes

All

Very Low

Insertion

n2

n2

nk+np

Yes

All

Very Low

Quick

n log n

n2

nk+np+stack

No

All

High

Merge

n log n

n log n

nk+np+stack

Yes

All

Medium

Heap

n log n

n log n

nk+np

No

All

Very Low

Counting

n

n

nk+np

Yes

All

Very Low

Table1: Various Sorting Algorithm

## CONCLUSION

Every sorting algorithm has some advantages and disadvantages. In the following table we are tried to show the strengths and weakness of some sorting algorithms according to their order, memory used, stability, data type and complexity. To determine the good sorting algorithm, speed is the top consideration but other factor include handling various data type, consistency of performance, length and complexity of code, and the prosperity of stability .This paper discuss comparison based sorting algorithms . For future, we proposed design a more efficient algorithm.