
Complexity Analysis Using Some Examples

A. Algorithm in simplest terms

A.1. Definition and example (from practical life)

An algorithm may be defined in simple terms as a finite sequence of instructions that solves a problem. Example: Shampoo Algorithm (Dirty Hair Problem!)

1. Wet hair

2. Apply shampoo

3. Rinse hair

4. Repeat steps 1, 2, and 3 twice

5. Stop

A.2. What do we observe from this algorithm that we follow every day?

We have a finite number of instructions, and they are in a specific sequence, i.e. the order is important: you cannot apply the shampoo until the hair is wet.

The term algorithm is believed to derive from the name of the 9th-century Persian mathematician Al-Khwarizmi.

B. Data structures: the subject of algorithms

B.1. Relation between computer program and algorithm

A computer program is simply an implementation of an algorithm on a computer.

B.2. Arriving at analysis of algorithms

Frequently, there may be a number of algorithms to solve the same problem. The question arises as to which one should be used. Is one algorithm better than another? This is a non-trivial question and leads to the analysis of algorithms and the means by which they can be compared. This area of study is sometimes called Complexity Theory in Computer Science.

B.3. Deciding between algorithms

A) First way of choosing an appropriate algorithm

One way to compare algorithms is to compare their performance in terms of how quickly they solve the problem. Some algorithms arrive at a solution faster than others, and we often choose the fastest algorithm when solving a problem. Think of how you use a telephone book to look up a number.

Basic questions

What other way could you do it?

Which way is faster?

Which way is simpler?

B) Second way of choosing an appropriate algorithm

Another way of comparing algorithms is to look at the amount of space (memory) they require. Some algorithms require large amounts of space but arrive at a solution quickly; others require small amounts of space but arrive at a solution less quickly.

B.4. Parameters of efficiency of an algorithm

The efficiency of an algorithm depends on its use of resources, such as:

– The time it takes the algorithm to execute;

– The memory it uses for its variables;

– The network traffic it generates;

– The number of disk accesses it makes;

– Etc.

C. Space/time trade-off in algorithms

You are often presented with a space/time trade-off. You will find time and time again in comparing algorithms and computing systems (and in many aspects of life!) that you have a trade-off situation.

C.1. What does trade-off actually mean?

It simply means that improvement in one aspect leads to degradation in another.

Examples

– A fast algorithm may require a lot of space, while a slow algorithm may require only a small amount of space.

– An expensive car may consume less fuel, while a cheap car may consume more fuel.

C.2. Treatment of a trade-off

Each trade-off situation must be evaluated separately. You cannot tell in advance which solution is appropriate until you know the details of the situation; thus you cannot specify the 'best' solution in advance. However, you can specify in advance the relative merits of each solution, so that in any particular situation you can choose the one that best suits it.

A) What if there is not one but many algorithms to choose from?

If we have a number of algorithms, we can specify their relative performance and use this when choosing one for a particular problem in a particular environment. We will only look at one measure of performance here, the relative amount of time an algorithm takes to solve a problem.

This is called the time complexity of an algorithm.

Note: when we speak of the time complexity we are not interested in absolute times, i.e. how many seconds it takes to solve a particular problem.

B) What factors dominate the actual time taken to solve a problem?

The actual absolute time taken to solve a problem depends on a number of factors:

• How fast the computer is

• RAM capacity of the computer

• OS the computer uses

• Quality of code generated by the compiler

If you change any of these, then the absolute time changes. Thus absolute time is not useful as a measure of an algorithm’s performance.

D. Computation of complexity

D.1. Reason for complexity computation

One reason for computing complexity is to compare algorithms. We need a measure which will allow us to compare two algorithms. Each algorithm consists of a finite sequence of instructions; the more instructions in an algorithm, the longer it will take to execute.

D.2. Ways of computing complexity (detailed analysis of a very simple example)

One way to compare algorithms is to count the instructions that the algorithm requires to solve a problem. The number of instructions will vary depending on the input.

Example: A payroll program for 100 people will repeat more instructions than one for 10 people.

Thus we compute the number of instructions as a function of the input size.

But do we need to count all instructions?

When we look at many algorithms, we see that they consist of a basic loop which is executed for each item of the input. As the input size grows, the number of instructions carried out in the loop far exceeds those before and after the loop.

To simplify our calculations we do not count instructions outside the basic loop.

Do we need to count all instructions inside the loop?

Again when we study algorithms, we find that the number of instructions inside the loop for any two algorithms to solve a specific problem does not vary widely!

But what does differ is the number of times the loop is executed.

So, essentially what we need to count is the number of times the loop is executed.

D.3. Conclusion to this problem

In a searching algorithm, the loop compares the current element of a list with the item to be found. In this case it is the number of comparisons that we count in order to compare one searching algorithm with another. But obviously the number of comparisons depends on the size of the list.

Thus we express the number of comparisons in terms of the list length.

E. Understanding Big-O notation through the following example

Given a list of length n, for a sequential (linear) search algorithm we find that the number of comparisons is n in the case where the item to be found is not in the list.
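To make this concrete, here is a minimal Java sketch of a sequential search (the code is not part of the original essay; the method name and return convention are illustrative). In the worst case, when e is absent, the loop makes exactly n comparisons:

public static int linearSearch(int[] a, int e) {
    // one comparison per element, front to back
    for (int i = 0; i < a.length; i++) {
        if (a[i] == e) {
            return i;  // found after i + 1 comparisons
        }
    }
    return -1;  // e is not in the list: n comparisons were made
}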

E.1. Big-O notation method

We use the notation O(n) to describe this time complexity. It reads "order n". This is called Big-O notation.

For a list of length 1000, we would make 1000 comparisons. If we double the list length we double the number of comparisons. But sometimes (on average) we are likely to find the item in the list. Sometimes near the beginning, sometimes near the end.

E.2. Types of case complexity

On average, we will have to search half of the list, so on average we would make about n/2 comparisons. This is called the average case complexity. We have seen that if the item is not in the list we make n comparisons; this is called the worst case complexity.

Note: we still regard the complexity of a linear search algorithm as O(n), despite the fact that on average we find the item in n/2 comparisons.

But why?

This is because as n grows very large, the difference between n and n/2 becomes less significant: they are of the same order (e.g. 2 billion versus 1 billion).

When comparing algorithms it is important to know both the average and worst case complexity.

E.3. Faster Search Algorithm: Binary Search

But there is another algorithm for searching a list, called the binary search algorithm, where the number of comparisons is substantially less: it is O(log n), using base-2 logarithms. Thus if the list length is 1000, using binary search we will find the item in at worst 10 comparisons (as opposed to 1000 for linear search). Now consider a list length of 1 billion: with a binary search algorithm we will find an item with at worst 30 comparisons.

Note: it is obvious that the binary search algorithm is far superior, from a time complexity viewpoint, to a linear search algorithm.

We can see that an algorithm of O(log n) is far superior to one of O(n), as the following table illustrates:
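(The table itself did not survive in the original; the values below follow directly from log2 n and presumably convey the intended comparison.)

List length n        O(n) comparisons     O(log n) comparisons
1,000                1,000                10
1,000,000            1,000,000            20
1,000,000,000        1,000,000,000        30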

F. Binary Search Algorithm and its complexity

There is a major restriction with the binary search algorithm. It can only be used when the list is in a particular order i.e. when the list is sorted.

It is the method we use to search for a number in a phone directory (a real-life relation).

We open the directory in the middle, compare the entry there with what we are looking for, and depending on the outcome we know if the number is in the lower half of the directory, the upper half or on the page we have opened.

We thus eliminate half of the directory with 1 comparison!! We repeat this procedure until we find the number or know it is not present. With each comparison we reduce the list length (number of pages of directory) by a factor of 2.

Thus with 10,000 pages, after 1 comparison we have 5,000 left to search, then 2,500, then 1,250, then 625, etc. The maximum number of comparisons is log2(10,000), i.e. the number of times we can halve 10,000 (about 14).

We can only use this method, because the list is sorted.

Consider a list A of n integers, sorted in increasing order. The binary search algorithm to search A for an element e takes the form:
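(The code itself does not appear in the original; the following is a minimal Java sketch of the standard algorithm, with the method name and return convention as illustrative assumptions.)

public static int binarySearch(int[] A, int e) {
    int low = 0, high = A.length - 1;
    while (low <= high) {
        int mid = (low + high) / 2;  // compare e against the middle element
        if (A[mid] < e) {
            low = mid + 1;           // e can only be in the upper half
        } else if (A[mid] > e) {
            high = mid - 1;          // e can only be in the lower half
        } else {
            return mid;              // found e at index mid
        }
    }
    return -1;                       // e is not in the list
}

Each iteration halves the portion of the list still under consideration, which is exactly why the number of comparisons is O(log n).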

The binary search (chop) algorithm is a standard algorithm for searching sorted lists. It is described in most textbooks dealing with algorithms. There are numerous standard algorithms in Computer Science for performing a variety of tasks.

G. Footnotes on choosing an algorithm and problems faced

Before designing an algorithm from scratch, it is worth checking whether the problem (sub-problem) is a general one and if so there are standard algorithms available. This obviously saves you a lot of time and effort as there is little point in reinventing the wheel!

One problem for which there exists a host of standard algorithms is that of sorting. The sorting problem is: given a list of items in any order, design an algorithm to arrange the list in a particular order (ascending or descending) i.e. sort the list.

Before we can use the binary search algorithm, the list must be sorted. Occasionally, the list of interest in a problem may already be sorted, but usually you will have to sort the list yourself.

Searching and sorting algorithms are important because they arise so frequently in practical programming applications. A significant percentage of all computing time is spent searching and sorting lists.

H. Bubble Sort and its complexity

There are a surprising number of ways to sort a list. We will only consider one method called the bubble sort algorithm.

We will sort the list in ascending order, i.e. so that the largest values are at the end.

Consider the list L:

10, 34, 2, 16, 23, 8, 12

We wish to sort L so that it becomes:

2, 8, 10, 12, 16, 23, 34

In order to sort the list, we have to swap (exchange) values, so that the larger values are swapped with the smaller ones, the larger values being moved to the end of the list and the smaller values being moved to the start of the list.

We want to move the larger values to the end of the list. Remember we can only compare two values at any point.

Steps taken

In bubble sorting, we begin at the start of the list.

We compare the first two elements, L[0] and L[1]. If the lower-index element (L[0]) is bigger, we swap them, thus moving the larger element towards the end of the list. We then compare L[1] with L[2] and swap them if L[1] > L[2].

Next we compare L[2] with L[3] and swap if necessary. We continue this process of comparing pairs of elements and swapping if necessary until we come to the end of the list.

We can now be sure of one thing. The largest element of the list is at the end. The rest of the list is still unsorted. We have passed through the list once.

Initially list L had the form

10, 34, 2, 16, 23, 8, 12

After first pass of bubble sort:

10, 2, 16, 23, 8, 12, 34

We now begin all over again.

We compare L[0] with L[1], swapping if L[0] > L[1], and so on as before, but this time we have only to proceed to the second-last item in the list, since the last item is the largest. When we have finished, we have the second largest item in the second-last position. We have passed through the list twice.

List L now has the form:

2, 10, 16, 8, 12, 23, 34

We start all over again, this time stopping one item earlier still. On this pass through the list, we get the third largest item into position.

List L after third pass

2, 10, 8, 12, 16, 23, 34

The algorithm continues, passing through the list once for every item in the list, with the distance travelled getting smaller by one each time. The algorithm finishes when we have passed through the list for the nth time, for a list with n items.

After the final pass, L has the form:

2, 8, 10, 12, 16, 23, 34

L is now sorted. Bubble sort is so named because if you trace its action, the larger elements gradually move towards the end of the list, just like bubbles rising in a glass of water.
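A minimal Java sketch of the bubble sort just described (the code is not in the original essay; the method name is illustrative):

public static void bubbleSort(int[] L) {
    for (int pass = 0; pass < L.length - 1; pass++) {
        // each pass can stop one element earlier: the tail is already in place
        for (int i = 0; i < L.length - 1 - pass; i++) {
            if (L[i] > L[i + 1]) {
                // swap so the larger value moves towards the end
                int tmp = L[i];
                L[i] = L[i + 1];
                L[i + 1] = tmp;
            }
        }
    }
}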

Complexity of Bubble Sort

For a list of n items we pass through the list n times, and each pass makes up to n - 1 comparisons, giving roughly n^2/2 comparisons in total; thus the complexity is O(n^2).

We have now seen the complexity of three well-known algorithms:

• Linear search: O(n)

• Binary search: O(log n)

• Bubble sort: O(n^2)

The complexity of list-processing algorithms will typically be at least O(n), since we nearly always have to make at least one pass through the list.

Binary search is somewhat of an exception: it takes advantage of the fact that a list is sorted and thus does not have to make a full pass through the list. But what if the list is unsorted, as is usually the case? Then, for searching, we have two choices. We can use the linear search algorithm, which works with unsorted and sorted lists. Alternatively, we can first sort the list using a sorting algorithm and then use the binary search algorithm. However, we now have the overhead of sorting, and unless we intend to search the list a large number of times, the linear search algorithm would be superior to the combination of sorting followed by a binary search.

If we intend to search the list many times, then the overhead of sorting the list once would be repaid by the greater efficiency of the binary search algorithm.

It is important to note that algorithms of O(n) are in fact very efficient algorithms, especially compared to, say, a bubble sort with O(n^2) complexity.

I. Infeasible algorithms (complexity computation in real life)

There are many algorithms whose complexity is such that, even for small values of n, they cannot feasibly be used. Algorithms with complexity O(n!) (factorial) or O(2^n) (exponential) are infeasible.

One such algorithm arises in the travelling salesman problem.

PROBLEM 1: A salesman has to travel to say, 50 cities. He wishes to take the cheapest (perhaps shortest) route whereby he visits each city only once.

One simple algorithm is to compute all possible routes and pick the cheapest.

But how many routes are there? There are 50! possible routes. This is a 65-digit number, and computing them all would take billions of years on a computer. If there were 300 cities, we would have a number of over 600 digits.

Thus we can see that the algorithm is infeasible for even small values of n. It is important to note that this problem occurs frequently in different guises, e.g. laying telephone cables (or roads, or rail tracks) between a number of towns. A small saving in distance can result in huge financial savings. It is an example of a routing problem.

Another example of an infeasible problem might be a process control situation in a power plant.

PROBLEM 2: There are 50 sensors around the plant connected to a computer. For simplicity, we assume each sensor sends a binary signal to the computer, indicating normal/abnormal conditions. We are interested in testing that the computer program will behave correctly with every possible combination of inputs from the sensors.

Well, how many combinations are there? There are 50 sensors, each providing a yes/no response, so there are 2^50 possible combinations. If we carried out 1 instruction per microsecond, it would take over 35 years to compute all solutions. If we had 60 sensors, it would take 366 centuries.

This is another infeasible problem.

PROBLEM 3: This last example illustrates another important issue. We cannot test complex computer programs to check if they will work with all possible inputs. We can only partially test programs, on a very small subset of the possible inputs.

It is also very important to remember that testing in itself does not prove anything. Just because a program works for some test data in no way proves the correctness of the program.

The question then arises as to how we deal with infeasible problems. One approach is to develop algorithms that are not guaranteed to give the best answer, but will on average give a good answer.

J. Case study 1

We examine a problem with several different solutions. We will look at four algorithms:

– Some algorithms much easier to code than others.

– Some algorithms much easier to prove correct than others.

– Some algorithms much, much faster (or slower) than others.

J.1. The problem

Maximum Contiguous Subsequence Sum Problem

– Given (possibly negative) integers A1, A2, ..., AN, find (and identify the sequence corresponding to) the maximum value of Ai + Ai+1 + · · · + Aj.

• The maximum contiguous subsequence sum is zero if all the integers are negative.

• Examples:

– -2, 11, -4, 13, -4, 2 (the maximum is 11 - 4 + 13 = 20)

– 1, -3, 4, -2, -1, 6 (the maximum is 4 - 2 - 1 + 6 = 7)

Let us apply the brute-force algorithm.
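The brute-force code is not reproduced in the original. A minimal Java sketch under the stated convention (the sum is zero if all integers are negative; names are illustrative): try every start index i and end index j, and recompute the sum of a[i..j] from scratch each time.

public static int maxSubSum1(int[] a) {
    int maxSum = 0;  // zero if every subsequence sums to a negative value
    for (int i = 0; i < a.length; i++) {            // loop of size N
        for (int j = i; j < a.length; j++) {        // inside a loop of size N
            int thisSum = 0;
            for (int k = i; k <= j; k++) {          // inside a loop of size N
                thisSum += a[k];
            }
            if (thisSum > maxSum) {
                maxSum = thisSum;
            }
        }
    }
    return maxSum;
}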

J.2. Analysis

A loop of size N inside a loop of size N inside a loop of size N means O(N^3), i.e. a cubic algorithm. The slight over-estimate that results from some loops being of size less than N is not important.

J.3. Actual running time

• For N = 100, the actual time is 0.47 seconds on a particular computer.

• We can use this to estimate the time for larger inputs:

T(N) = cN^3

T(10N) = c(10N)^3 = 1000cN^3 = 1000T(N)

• An increase in input size by a factor of 10 means that the running time increases by a factor of 1,000.

• For N = 1000, we estimate an actual time of 470 seconds. (The actual time was 449 seconds.)

• For N = 10,000, we estimate 449,000 seconds (just over 5 days).

J.4. How to improve?

• Remove a loop; this is not always possible.

• Here it is possible: the innermost loop is unnecessary, because it throws away information.

• ThisSum for the next j is easily obtained from the old value of ThisSum:

– We need Ai + Ai+1 + · · · + Aj-1 + Aj.

– We have just computed Ai + Ai+1 + · · · + Aj-1.

– What we need is what we just computed, plus Aj.

J.5. The better algorithm
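Again, the code is missing from the original; here is a minimal Java sketch of the improvement just described, where the running sum for the next j is obtained from its old value instead of being recomputed:

public static int maxSubSum2(int[] a) {
    int maxSum = 0;
    for (int i = 0; i < a.length; i++) {
        int thisSum = 0;
        for (int j = i; j < a.length; j++) {
            thisSum += a[j];      // old sum + a[j]: the innermost loop is gone
            if (thisSum > maxSum) {
                maxSum = thisSum;
            }
        }
    }
    return maxSum;
}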

J.5.1 Analysis

• Same logic as before: now the running time is quadratic, or O(N^2).

• As we will see, this algorithm is still usable for inputs in the tens of thousands.

• Recall that the cubic algorithm was not practical for this amount of input.

J.5.2 Actual running time

• For N = 100, the actual time is 0.0111 seconds on the same computer.

• We can use this to estimate the time for larger inputs:

T(N) = cN^2

T(10N) = c(10N)^2 = 100cN^2 = 100T(N)

• An increase in input size by a factor of 10 means that the running time increases by a factor of 100.

• For N = 1000, we estimate a running time of 1.11 seconds. (The actual time was 1.12 seconds.)

• For N = 10,000, we estimate 111 seconds (which matched the actual time).

J.6. Use of a linear algorithm

• A linear algorithm would be best.

• Its running time is proportional to the amount of input; it is hard to do better for an algorithm that must examine all of its input.

• If the input increases by a factor of ten, then so does the running time.

J.7. Use of a recursive algorithm

• Use a divide-and-conquer approach.

• The maximum subsequence either

– lies entirely in the first half,

– lies entirely in the second half, or

– starts somewhere in the first half, goes to the last element in the first half, continues at the first element in the second half, and ends somewhere in the second half.

• Compute all three possibilities, and use the maximum.

• First two possibilities easily computed recursively.

J.8. Computing the third case

• Idea :

1. Find the largest sum in the first half, that includes the last element in the first half.

2. Find the largest sum in the second half that includes the first element in the second half.

3. Add the 2 sums together.

• Implementation:

– Easily done with two loops.

– For the maximum sum that starts in the first half and extends to the last element in the first half, use a right-to-left scan starting at the last element in the first half.

– For the other maximum sum, do a left-to-right scan, starting at the first element in the second half. A sketch of the full algorithm follows this list.

J.8.1 Analysis

• Let T(N) = the time for an algorithm to solve a problem of size N.

• Then T(1) = 1 (1 will be the quantum time unit; constants don’t matter).

• T(N) = 2T(N/2) + N

– Two recursive calls, each of size N/2. The time to solve each recursive call is T(N/2) by the above definition.

– Case three takes O(N) time; we use N, because we will throw out the constants eventually.

J.9. Bottom line
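The bottom-line figure is missing from the original, but the recurrence can be solved by repeated expansion (assuming N is a power of 2):

T(N) = 2T(N/2) + N
     = 4T(N/4) + 2N
     = 8T(N/8) + 3N
     = ...
     = 2^k T(N/2^k) + kN

Stopping when 2^k = N (i.e. k = log2 N) gives T(N) = N T(1) + N log2 N = O(N log N).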

J.10. The concept of N log N

• Any recursive algorithm that solves two half-sized problems and does linear non-recursive work to combine/split these solutions will always take O(N log N) time, because the above analysis will always hold.

• This is a very significant improvement over quadratic.

• It is still not as good as O(N), but is not that far away either.

There is a linear-time algorithm for this problem. The running time is clear, but the correctness is non-trivial.

J.11. The linear-time algorithm
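The code is not shown in the original; a minimal Java sketch of the standard linear-time scan (Kadane-style, with the stated convention that the answer is zero when all values are negative):

public static int maxSubSum4(int[] a) {
    int maxSum = 0, thisSum = 0;
    for (int j = 0; j < a.length; j++) {
        thisSum += a[j];          // extend the current subsequence to include a[j]
        if (thisSum > maxSum) {
            maxSum = thisSum;     // best subsequence seen so far ends at j
        } else if (thisSum < 0) {
            thisSum = 0;          // a negative prefix can never start an optimal subsequence
        }
    }
    return maxSum;
}

One pass, constant extra space: the running time is clearly O(N), and, as noted above, the correctness argument (why a negative running sum can be discarded) is the non-trivial part.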

K. Case study 2

K.1. The problem (Mr. Dupont has been asked to solve the problem in the given example)

T(1) = 3

T(2) = 10

T(n) = 2T(n - 1) - T(n - 2)

K.1.1. What do we have to find?

What is T(100)?

We decide to directly code a recursive function tfn, in Java.

K.2. How he solves the problem:

public static long tfn(int n) {
    if (n == 1) { return 3; }
    else if (n == 2) { return 10; }
    else { return 2 * tfn(n - 1) - tfn(n - 2); }
}

K.2.1. First mistake: no analysis of the problem (he risks being fired).

K.2.2. Second mistake: bad choice of programming language (increasing the risk of being fired).

K.3. How it actually works:

K.4. How to correct the mistake

Mr. Dupont then decides to use C (the same recursive function, ported to C).

K.5. Finally, analysing the problem

We time both programs and plot the results.

K.6. We conclude:

It seems that each time n increases by 1, the time increases by a factor of approximately 1.62. At n = 40 the C program took 13.79 seconds.

K.7. What we have to find

So for n = 100 we estimate:

13.79 × 1.62^60 seconds ≈ 1,627,995 years

K.8. One of the main problems

Mr. Dupont remembers: how is this problem treated in a Data Structures course? After consulting his course notes, Mr. Dupont decides to use Dynamic Programming.

K.8.1. Use of Dynamic Programming
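The dynamic programming code is missing from the original; a minimal Java sketch of what it presumably looked like, tabulating every value up to n:

public static long tfnDP(int n) {
    long[] t = new long[n + 1];  // O(n) space: this array is what fails for huge n
    t[1] = 3;
    if (n >= 2) { t[2] = 10; }
    for (int i = 3; i <= n; i++) {
        t[i] = 2 * t[i - 1] - t[i - 2];  // each value computed once, in O(1) time
    }
    return t[n];
}

Each T(i) is now computed exactly once, so the running time drops from exponential to O(n).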

K.8.2. Final analysis

This solution provides a much better time complexity, but at the cost of space complexity: n = 100 takes only a fraction of a second, but for n = 10,000,000 a segmentation fault occurs.

K.9. But one problem remains

Too much memory is required.
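The essay does not show the final program, but the remark in section M that it "has one multiplication and one subtraction" points at the closed form of the recurrence. Since T(n) = 2T(n - 1) - T(n - 2) is linear with characteristic polynomial (x - 1)^2, its solutions have the form a + bn; fitting T(1) = 3 and T(2) = 10 gives T(n) = 7n - 4. A minimal sketch:

public static long tfnClosedForm(long n) {
    return 7 * n - 4;  // T(n) = 7n - 4: one multiplication, one subtraction, O(1) time and space
}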

L. Representations

M. Summary & Conclusion

M.1. An algorithm analyst might ask:

1. What makes the first program so slow?

2. How fast are the 3 programs asymptotically?

3. Is the last version really the ultimate solution?

Let us look at the recursion tree for the first program at n = 4. In this tree, each circle represents one call to the routine tfn; so for n = 4 there are 5 such calls.

M.1.1. What does each call require?

A call to tfn(n) requires a recursive call to tfn(n-1) (represented by the shaded region on the left) and a call to tfn(n-2) (the shaded region on the right).

M.1.2. The famous Fibonacci recurrence

If we let f(n) represent the number of calls needed to compute T(n), then f(1) = f(2) = 1 and f(n) = f(n - 1) + f(n - 2) + 1: essentially the famous Fibonacci recurrence. (Check: f(3) = 3 and f(4) = 5, matching the 5 calls counted above.)

M.1.2.1. Analysis based on that

Answer to question (1):

It is known that f(n) ≈ 1.618^n. This agrees very well with the times we presented earlier, where each increase of n by 1 increases the time by a factor of a little under 1.62. We say such growth is exponential, with asymptotic growth rate O(1.618^n).

Answer to question (2):

In the dynamic programming version there was a loop. This loop contained two or three assignments, a multiplication and a subtraction. We say such a loop takes O(n) time, meaning that the running time is proportional to n. Recall that increasing n from 100 million to 300 million increased the time from approximately 3 to approximately 9 seconds.

The last program has one multiplication and one subtraction and takes O(1) or constant time.

One more aspect: answer to question (3)

The answer to the last question is also no. If the boss asked for a much larger n, we would get integer overflow on most computers.

Switching to a floating point representation would be of no value since we need to maintain all the significant digits in our results.

The only alternative is to use a method to represent and manipulate large integers.

A natural way to represent a large integer is to use an array of integers, where each array slot stores one digit.

The addition and subtraction require a linear-time algorithm.

A simple algorithm for multiplication requires quadratic time.

The third question, “is the last program the ultimate solution”, is more of a computer science question.

M.1.2.2. A computer scientist might ask:

• How do you justify counting function calls in the first case, counting array assignments in the second, counting variable assignments in the third, and counting arithmetic operations in the last?

• Is it really true that you can multiply two arbitrarily large numbers together in constant time?

• Is the last program really the ultimate one?

M.2. CS questions with an engineering orientation:

What general techniques can we use to solve computational problems?

What data structures are best, and in what situations?

Which models should we use to analyze algorithms in practice?

When trying to improve the efficiency of a given program, which aspects should we focus on first?

N. Work distribution and summary

N.1. Work distribution

Topics covered by roll number 9: B, D, F, H, J, L (6 topics)

Topics covered by roll number 10: A, C, E, G, I, K (6 topics)

Topic M is a team effort (as it is a summary).

N.2. References

Books:

Seymour Lipschutz, Data Structures.

David Harel, Algorithmics: The Spirit of Computing, Addison-Wesley.

Brian W. Kernighan and Rob Pike, The Practice of Programming, Addison-Wesley.

Internet sites:

http://ww3.algorithmdesign.net/handouts/Analysis.pdf

http://www.cse.iitd.ernet.in/~ssen/csl356/notes/root.pdf

http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-one
