An Overview of Algorithms and Data Structures



An algorithm consists of a set of rules for carrying out a calculation, either by hand or by machine. It can also be defined as an abstraction of a program executed on a machine (Drozdek 2004). Such a program carries out a sequence of operations on data organized in data structures. These data structures are generally categorized into:

Linear data structures, examples of which are arrays, matrices, hashed array trees and linked lists, among others.

Tree data structures, which include the binary tree, binary search tree, B-tree and heap, among others.

Hashes, which consist of the commonly used hash table.

Graphs.

Graph:

This is an abstract data structure that implements the concept of a graph. A graph consists of a set of nodes, or vertices, together with a set of arcs, or edges, each written as a pair (x, y) of nodes. The edges may carry a value or numeric attribute such as a cost, length or capacity. Some of the operations on a graph structure G include:

Adjacent (G, x, y): an operation that tests whether an edge exists between x and y.

Set_node_value (G, x, a): an operation that sets the value associated with node x to a.

Add (G, x, y): an operation that adds to the graph an arc from x to y if one does not already exist.
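The three operations listed above can be sketched in Python roughly as follows. This is only an illustration: the class name Graph, the adjacency-set storage and the method names are assumptions made for the example, not part of any particular library.

```python
class Graph:
    """Directed graph stored as adjacency sets, with an optional value per node.

    The class and method names are hypothetical, chosen to mirror the
    operations described in the text.
    """

    def __init__(self):
        self.arcs = {}    # node -> set of nodes reachable by a single arc
        self.values = {}  # node -> value associated with that node

    def adjacent(self, x, y):
        """Return True if there is an arc from x to y."""
        return y in self.arcs.get(x, set())

    def set_node_value(self, x, a):
        """Associate the value a with node x, creating x if necessary."""
        self.arcs.setdefault(x, set())
        self.values[x] = a

    def add(self, x, y):
        """Add an arc from x to y if it is not already present."""
        self.arcs.setdefault(x, set()).add(y)
        self.arcs.setdefault(y, set())


g = Graph()
g.add("x", "y")
g.set_node_value("x", 7)
print(g.adjacent("x", "y"), g.adjacent("y", "x"))  # True False
```

Storing the arcs as sets makes the "if it is not existent" check implicit, since adding an arc twice has no further effect.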

Graph algorithms are used within computer science to find the "paths between two nodes like the depth or breadth first search or the shortest path" (Sedgewick 2001 p 253). Dijkstra's algorithm finds the shortest paths from a single source node to the other nodes of a graph. The Floyd-Warshall algorithm derives the shortest paths between all pairs of nodes.

Linked lists are linear data structures consisting of a sequence of data records linked by references. Linked lists can be used to implement stacks, queues, skip lists and hash tables. Linked lists are sometimes preferred over arrays because the items of a list may be ordered differently from how they are stored in memory. Lists therefore allow the insertion or removal of nodes at any point. Each record, or node, contains a field holding the address of the next node, called the pointer or next link. The remaining fields are known as the payload, cargo, data or information. The first node of the list is the head and the last node is the tail. A linked list may be circular, where the last node references the first node of the same list, or linear, where the link field of the last node is left open.
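A minimal sketch of a linear, singly linked list in Python may make the head, payload and next link concrete. The names Node, LinkedList, push_front, insert_after and remove_after are assumptions chosen for this illustration.

```python
class Node:
    """One record of the list: a payload plus a link to the next node."""

    def __init__(self, payload, next_node=None):
        self.payload = payload
        self.next = next_node  # None marks the open link of the tail


class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, payload):
        """Insert a new node before the current head."""
        self.head = Node(payload, self.head)

    def insert_after(self, node, payload):
        """Insert a new node directly after the given node."""
        node.next = Node(payload, node.next)

    def remove_after(self, node):
        """Unlink the node that follows the given node, if there is one."""
        if node.next is not None:
            node.next = node.next.next

    def to_list(self):
        """Walk the links from head to tail and collect the payloads."""
        items, current = [], self.head
        while current is not None:
            items.append(current.payload)
            current = current.next
        return items


lst = LinkedList()
lst.push_front("c")
lst.push_front("a")
lst.insert_after(lst.head, "b")
print(lst.to_list())  # ['a', 'b', 'c']
lst.remove_after(lst.head)
print(lst.to_list())  # ['a', 'c']
```

Insertion and removal only rewrite one or two links, which is why they can happen at any point without moving the other records.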

B-Tree

This is a tree data structure that keeps data sorted and allows searches, insertions, deletions and sequential access. The operations of the B-tree are normally optimized for systems that read and write large blocks of data. The B-tree has several design variants. The basic B-tree stores keys in its internal nodes, but those keys need not be repeated in the records at the leaves. The common variants are the B+ tree and the B* tree (Comer 129).

Searching a B-tree is similar to searching a binary search tree. The search commences at the root and traverses the tree from top to bottom. At each node the search follows the child pointer whose subtree lies between the separator values on either side of the search value. Insertion starts at a leaf node. If the leaf contains fewer than the maximum permitted number of elements, the new element is added to it; otherwise the node is split evenly into two nodes. A median is chosen from the elements, with values less than the median going to the left node and values greater than the median going to the right node. The median acts as the separation value. Deletion follows one of two popular strategies. Either the element is located and deleted, followed by a restructuring of the tree, or a single pass is made down the tree, restructuring it along the way so that the element can be removed once the node holding it has been identified.
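The top to bottom search can be sketched as below. This covers searching only, on a hand-built tree; the names BTreeNode and btree_search are assumptions for the example, and insertion with node splitting is deliberately left out.

```python
from bisect import bisect_left


class BTreeNode:
    """A node holding sorted keys; leaves have an empty children list."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def btree_search(node, key):
    """Return True if key is stored somewhere in the tree rooted at node."""
    while True:
        # Position of the first key that is not smaller than the search key.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False  # reached a leaf without finding the key
        # Child i holds the values between keys[i - 1] and keys[i].
        node = node.children[i]


root = BTreeNode(
    [10, 20],
    [BTreeNode([2, 5]), BTreeNode([12, 17]), BTreeNode([25, 30])],
)
print(btree_search(root, 17), btree_search(root, 21))  # True False
```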

Hashes

This is a data structure that uses a hash function to map identifying keys to their associated values. The function transforms the key into an index of an array. Ideally the function maps every possible key to a unique slot index; in practice two keys may collide on the same slot, and the table must resolve the collision. In a well dimensioned hash table the average cost of a lookup is independent of the number of elements stored in the array. The efficiency of the hash table is exploited in database indexing and in the implementation of sets, caches and associative arrays. A simple array is central to the hash table algorithm. The algorithm derives an index from the element's key, and this index is then used to store the element in the array. The hash function f is the implementation of this calculation. Hash tables can be used to implement various types of in-memory tables. They are also used for persistent data structures and disk based database indices.
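As an illustration of deriving an array index from a key, here is a small hash table sketch in Python. The class name HashTable, the fixed number of slots and the use of chaining (a short list per slot) to handle keys that land on the same index are all assumptions of this example; the text does not prescribe a collision strategy.

```python
class HashTable:
    """Hash table with separate chaining: each slot holds a list of pairs."""

    def __init__(self, slots=8):
        self.buckets = [[] for _ in range(slots)]

    def _index(self, key):
        # The hash function f: transform the key into an array index.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        """Store value under key, replacing any earlier value for that key."""
        bucket = self.buckets[self._index(key)]
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        """Return the value stored under key, or default if it is absent."""
        for stored_key, value in self.buckets[self._index(key)]:
            if stored_key == key:
                return value
        return default


table = HashTable()
table.put("apple", 3)
table.put("pear", 5)
table.put("apple", 4)  # overwrites the earlier value
print(table.get("apple"), table.get("plum"))  # 4 None
```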

Greedy Algorithms.

These algorithms work by making the most promising decision at each step; the eventual outcome is not taken into consideration at that moment. They are considered straightforward, simple and short sighted (Chartrand 1984). The advantage of greedy algorithms is that they are easy to invent and implement and are often efficient.

Their disadvantage is that they are not always able to solve problems optimally because of their greedy approach. Greedy algorithms are applied when we try to solve optimization problems.

A typical application of these algorithms is the 'making change' problem, in which we are required to give change using the minimum number of notes or coins. We commence by giving the largest denomination first.

Informally, the greedy algorithm for this problem follows the steps below:

Begin with nothing.

At each stage, without exceeding the given amount,

add the largest available coin to the set.

A formal algorithm for the making change problem can be written as follows:

MkChange (n)

C ← {100, 25, 10, 5, 1} // C is the constant set of coin denominations

Sol ← { } // Sol is the solution set

Sum ← 0 // Sum is the sum of the items in Sol

WHILE Sum not equal n

L ← largest item of C such that Sum + L <= n

IF no such item THEN

RETURN "No solution"

Sol ← Sol U {L}

Sum ← Sum + L

RETURN Sol
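The making change procedure can be sketched in runnable form as below. The function name make_change and the choice to return None when no coin fits are assumptions made for this example.

```python
def make_change(n, coins=(100, 25, 10, 5, 1)):
    """Greedy change making: repeatedly add the largest coin that still fits.

    Returns the list of coins chosen, or None if the amount cannot be
    reached with the given denominations.
    """
    solution = []
    total = 0
    while total != n:
        candidates = [c for c in coins if total + c <= n]
        if not candidates:
            return None  # no coin fits, so there is no greedy solution
        largest = max(candidates)
        solution.append(largest)
        total += largest
    return solution


print(make_change(289))              # [100, 100, 25, 25, 25, 10, 1, 1, 1, 1]
print(make_change(6, coins=(4, 3, 1)))  # [4, 1, 1]
```

The second call shows the short sighted behaviour described earlier: with the hypothetical denominations 4, 3 and 1, the greedy choice uses three coins although two coins of 3 would do.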

To pursue an optimal solution, a greedy algorithm maintains two sets, one for the chosen items and the other for the rejected items. Based on the two sets the algorithm carries out four functions. The solution function checks whether the chosen set of items provides a solution. The feasibility function checks whether the set is feasible, that is, whether it can still be extended to a solution. The selection function identifies the most promising of the remaining candidates. The objective function gives the value of a solution.

The greedy approach applies to the shortest path problem. Dijkstra's algorithm aims at determining the lengths of the shortest paths. These paths run from the source node to the other nodes.

Typically Dijkstra's algorithm maintains two sets of nodes, S and C. S consists of the already selected nodes whereas C consists of the rest of the nodes within the graph (Papadimitriou & Steiglitz 1998). At the initialization of the algorithm, S holds only the source node. After execution, S includes all the nodes of the graph. During every step of the algorithm the node in C that is closest to the source is chosen and moved to S. Any node that never enters S is unreachable from the source, which happens when the graph is disconnected.

The diagrams below illustrate Dijkstra's algorithm.

Consider the graph G = (V, E). Initially every node of the graph has an infinite cost, apart from the source node, which has a cost of 0 (Design and Analysis of Computer Algorithms 2010).

Source: Design and Analysis of Computer Algorithms 2010

Initialize d[s] to zero for the source node s and choose the node closest to s. Add it to S while relaxing all the nodes adjacent to it.

Update every such node. The diagram below illustrates this process:

Source: Design and Analysis of Computer Algorithms 2010

Choose the closest node, x, and relax its adjacent nodes, updating u, v and y as indicated in the diagram below.


Next we consider y, which is now the closest node, add it to S and relax v as indicated in the diagram below.

Source: Design and Analysis of Computer Algorithms 2010

Consider u and adjust its neighbor v as indicated in the diagram below.

Source: Design and Analysis of Computer Algorithms 2010

Finally add v. The predecessor list now defines the shortest paths from the source node s. The diagram below illustrates the resulting shortest paths.

Source: Design and Analysis of Computer Algorithms 2010
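The walkthrough above can be sketched in Python with a priority queue. The function name dijkstra is an assumption, and so are the edge weights in the example graph: the original diagrams are not reproduced here, so the weights were chosen only so that the nodes are selected in the order s, x, y, u, v described in the text.

```python
import heapq


def dijkstra(graph, source):
    """Return (dist, pred) for shortest paths from source.

    graph maps each node to a dict {neighbor: non negative edge weight}.
    """
    dist = {node: float("inf") for node in graph}  # infinite cost initially
    pred = {node: None for node in graph}          # predecessor list
    dist[source] = 0
    selected = set()                               # the set S of the text
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)             # closest unselected node
        if node in selected:
            continue                               # stale queue entry
        selected.add(node)
        for neighbor, weight in graph[node].items():
            if d + weight < dist[neighbor]:        # relax the edge
                dist[neighbor] = d + weight
                pred[neighbor] = node
                heapq.heappush(queue, (dist[neighbor], neighbor))
    return dist, pred


# Hypothetical weights, not taken from the original diagrams.
example = {
    "s": {"u": 10, "x": 5},
    "u": {"v": 1, "x": 2},
    "x": {"u": 3, "v": 9, "y": 2},
    "y": {"s": 7, "v": 6},
    "v": {"y": 4},
}
dist, pred = dijkstra(example, "s")
print(dist)  # {'s': 0, 'u': 8, 'x': 5, 'y': 7, 'v': 9}
```

Following pred backwards from any node gives its shortest path to the source, for example v, u, x, s in this graph.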

Spanning trees

Typically a graph will have a number of paths between its nodes. A spanning tree of a graph contains all the nodes of the graph, with exactly one path between any two nodes. A graph may have several different spanning trees. A disconnected graph has a spanning forest instead. A breadth first spanning tree results from a breadth first search on the graph. A depth first spanning tree results from a depth first search on the graph. Spanning tree applications include, among others, the travelling salesman problem below:

Problem: Consider an undirected graph G = (V, E) in which every edge has an associated non negative integer cost representing a certain distance. We want to derive a tour of the graph G with the minimum cost. The salesman starts from city 1, visits each of the six cities (1 - 6) and returns to city 1.

The first approach would run in the following manner, from city 1 to 4 to 2 to 5 to 6 to 3 to 1, resulting in a total of 62 kilometers. The diagram below shows this approach.

Adding the edge weights we have 15 + 10 + 8 + 15 + 9 + 5 = 62 kilometers.

Source: Design and Analysis of Computer Algorithms 2010

The alternative approach, which is the optimal one, would run in the following manner, from city 1 to 2 to 5 to 4 to 6 to 3 to 1, resulting in a total of 48 kilometers. The diagram below shows this approach.

Adding the edge weights we have 10 + 8 + 8 + 8 + 9 + 5 = 48 kilometers.

Source: Design and Analysis of Computer Algorithms 2010
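The two tour totals can be checked with a short Python sketch. The assignment of each weight to a particular edge is an assumption: it is inferred by matching the order of the terms in each sum to the order of the cities in the corresponding tour, since the diagrams themselves are not reproduced here. The function name tour_cost is likewise hypothetical.

```python
# Edge weights in kilometers, keyed by (smaller city, larger city).
# Inferred from the two sums in the text; treat them as assumptions.
weights = {
    (1, 4): 15, (2, 4): 10, (2, 5): 8, (5, 6): 15, (3, 6): 9, (1, 3): 5,
    (1, 2): 10, (4, 5): 8, (4, 6): 8,
}


def tour_cost(tour, weights):
    """Sum the weights of the consecutive edges along a closed tour."""
    total = 0
    for a, b in zip(tour, tour[1:]):
        total += weights[(min(a, b), max(a, b))]
    return total


print(tour_cost([1, 4, 2, 5, 6, 3, 1], weights))  # 62
print(tour_cost([1, 2, 5, 4, 6, 3, 1], weights))  # 48
```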

Other applications of the spanning tree approach include airline route determination, the design of computer networks, the laying of oil pipelines to connect refineries and the construction of road links between cities (Goodrich & Tamassia 2010; Sedgewick 2002).

In a typical minimum spanning tree (MST) application, the cost of the MST can be used to determine the points of connection of a cable, for example a fiber optic cable being laid along a certain path. The edges with a larger weight, which corresponds to a higher cost, are those that require more attention and resources to lay the cable. An appropriate result is derived from the spanning tree with the minimum cost.

Prim's Algorithm.

This algorithm starts from an arbitrary root node, and at every stage a new edge is added to the tree. The addition process terminates when all the nodes in the graph have been reached. At each stage the algorithm concentrates on the shortest edge leaving the tree. The running time of the algorithm therefore depends on how this edge is searched for. The straightforward search method identifies the smallest edge by scanning the adjacency lists of all the nodes in the graph. Every such search, or iteration, costs O(m) time, where m is the number of edges. With n nodes, the total time to run the complete algorithm is O(mn).

The basic Prim algorithm takes the following steps:

Initialize the tree to consist of a start node

WHILE not all nodes are in the tree

Loop

Examine all edges in the graph with exactly one end point in the tree

Find the shortest such edge and add it to the tree

End.

After each iteration the nodes are split into two sets: A, the nodes of the partially completed spanning tree built from the shortest edges found so far, and B, the remaining nodes. The loop looks for the shortest edge between A and B.
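The basic version of Prim's algorithm, which rescans the edges on every iteration, might be sketched as below. The function name prim, the dictionary representation of the graph and the small example graph are assumptions made for illustration.

```python
def prim(graph, start):
    """Basic Prim: rescan all edges leaving the tree on every iteration.

    graph maps each node to a dict {neighbor: weight} and is undirected,
    so every edge appears in both directions.
    """
    in_tree = {start}   # set A: nodes already in the partial spanning tree
    tree_edges = []
    while len(in_tree) < len(graph):
        best = None
        for u in in_tree:
            for v, w in graph[u].items():
                # Only edges with exactly one end point in the tree qualify.
                if v not in in_tree and (best is None or w < best[0]):
                    best = (w, u, v)
        if best is None:
            raise ValueError("graph is not connected")
        w, u, v = best
        in_tree.add(v)
        tree_edges.append((u, v, w))
    return tree_edges


# Hypothetical example graph.
g = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 3},
    "d": {"b": 5, "c": 3},
}
print(prim(g, "a"))  # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]
```

Each pass of the while loop scans the edges once, which corresponds to the O(m) cost per iteration mentioned in the text; a priority queue would reduce this, at the price of a longer example.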

Kruskal's Algorithm.

This is an algorithm that computes the minimum spanning tree (MST). It does so by applying the generic MST algorithm to a growing forest. Kruskal's algorithm considers every edge in order of increasing weight. Consider an edge (u, v) that connects two different trees in the forest. It follows that (u, v) is added to the set of edges of the generic algorithm. The result is a single tree formed from the two trees connected by (u, v).

The algorithm can be outlined as follows:

Commence with an empty set A, selecting at each stage the shortest edge not yet chosen or discarded, regardless of its location on the graph.

MST-KRUSKAL (G, w)

A ← { } // the set containing the edges of the MST

for every node n in V[G]

do MAKE_SET (n)

sort the edges of E by increasing weight w

for each edge (u, n) in E, taken in sorted order

do if FIND_SET (u) not equal FIND_SET (n)

then A ← A U {(u, n)}

UNION (u, n)

Return A
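A runnable sketch of the disjoint set version of Kruskal's algorithm is given below. The function name kruskal, the (weight, u, v) edge tuples and the simple parent dictionary used for the disjoint sets are assumptions of this example.

```python
def kruskal(nodes, edges):
    """Return the MST edges of an undirected graph.

    edges is a list of (weight, u, v) tuples.
    """
    parent = {n: n for n in nodes}  # MAKE_SET for every node

    def find_set(n):
        """Return the representative of the set containing n."""
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # shorten the path as we go
            n = parent[n]
        return n

    mst = []
    for w, u, v in sorted(edges):          # edges by increasing weight
        root_u, root_v = find_set(u), find_set(v)
        if root_u != root_v:               # u and v lie in different trees
            mst.append((u, v, w))
            parent[root_u] = root_v        # UNION of the two sets
    return mst


# Hypothetical example graph.
edges = [(1, "a", "b"), (4, "a", "c"), (2, "b", "c"), (5, "b", "d"), (3, "c", "d")]
print(kruskal("abcd", edges))  # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]
```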

The algorithm above makes use of disjoint set data structures. Kruskal's algorithm can also be implemented with the priority queue data structure. The resulting algorithm is shown below:

MST-KRUSKAL (G)

for each node n in V[G] do

define S(n) ← {n}

initialize the priority queue Q consisting of all the edges of the graph G, with the weights used as keys

A ← { } // This set will contain the edges of the MST

WHILE A has fewer than |V| - 1 edges do

remove from Q the edge (u, n) with the smallest weight

let S(n) be the set with n ∈ S(n) and S(u) the set with u ∈ S(u)

IF S(n) != S(u) THEN add edge (u, n) to A and merge S(n) and S(u) into S(n) U S(u)

Return A

The Binary Search Tree.

A binary search tree is a binary tree in which every internal node X stores an element. The elements in the left sub tree of X are less than or equal to X whereas those in the right sub tree are greater than or equal to X. This is the binary search tree property. The height of a binary search tree is the number of links between the root and the deepest node (Skiena 2008). The binary search tree is implemented as a linked data structure in which each node is an object with three pointer fields, namely left, right and Parent. These point to the nodes corresponding to the left child, the right child and the parent. A NIL in any of these fields indicates that there is no such parent or child. The root node contains NIL in the Parent field.
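The linked representation with left, right and parent fields might look like this in Python, where None plays the role of NIL. The names BSTNode, bst_insert, bst_search and bst_height are assumptions for the example, and sending equal keys to the right sub tree is one of the choices the property allows.

```python
class BSTNode:
    """A node with left, right and parent links; None stands for NIL."""

    def __init__(self, key, parent=None):
        self.key = key
        self.left = None
        self.right = None
        self.parent = parent


def bst_insert(root, key):
    """Insert key and return the root of the tree."""
    if root is None:
        return BSTNode(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = BSTNode(key, parent=node)
                return root
            node = node.left
        else:  # equal keys go to the right sub tree
            if node.right is None:
                node.right = BSTNode(key, parent=node)
                return root
            node = node.right


def bst_search(root, key):
    """Return the node holding key, or None if the key is absent."""
    node = root
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node


def bst_height(node):
    """Number of links from node down to its deepest descendant."""
    if node is None:
        return -1
    return 1 + max(bst_height(node.left), bst_height(node.right))


root = None
for k in [8, 3, 10, 1, 6]:
    root = bst_insert(root, k)
print(bst_height(root), bst_search(root, 6) is not None)  # 2 True
```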

Dynamic programming algorithms

These typically explore ways of optimizing a sequence of decisions in determining a solution. The algorithms avoid the full enumeration of partial decision sequences that make a sub optimal contribution to the final solution. They instead concentrate only on optimal contributors (Aho & Hopcroft 1983). The optimal solution is derived from a polynomial number of decision steps. At other times a full enumeration is necessary; in most cases, however, only the optimal sub solutions are considered.

Dynamic programming algorithms exploit the duplication among sub problems: every sub solution is stored for later reference. These solutions to the sub problems are held in a table. The sub problems are then worked out using the bottom up technique. The steps in this bottom up technique are as follows:

Begin by addressing the smallest sub problems

Combine their solutions to solve sub problems of increasing scope and size

UNTIL arriving at the solution of the original problem

Dynamic programming relies on the principle of optimality. This principle states that within an optimal sequence of decisions or choices, every sub sequence must be optimal as well.
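As an illustration of the bottom up technique, the making change problem discussed earlier can be solved with a table of sub solutions. This is a sketch under stated assumptions: the function name min_coins and the example denominations are hypothetical, and the table stores only the minimum number of coins for each amount rather than the coins themselves.

```python
def min_coins(n, coins=(100, 25, 10, 5, 1)):
    """Minimum number of coins that sum to n, or None if n is unreachable."""
    unreachable = float("inf")
    # table[a] holds the solution of the sub problem "make the amount a".
    table = [0] + [unreachable] * n
    for amount in range(1, n + 1):       # smallest sub problems first
        for c in coins:
            # Combine a stored sub solution with one extra coin.
            if c <= amount and table[amount - c] + 1 < table[amount]:
                table[amount] = table[amount - c] + 1
    return table[n] if table[n] != unreachable else None


print(min_coins(6, coins=(4, 3, 1)))    # 2
print(min_coins(30, coins=(25, 10, 1)))  # 3
```

The table entry for each amount is computed once and then reused, which reflects the principle of optimality: an optimal way to make an amount contains optimal ways to make the smaller amounts inside it.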

Floyd-Warshall Algorithm.

The WFI algorithm, as it is also known, is a graph analysis algorithm used to determine the shortest paths between all pairs of nodes in a weighted graph (Chartrand 1984). The comparison it carries out covers all possible paths between each pair of nodes of the graph. Consider a graph G with nodes V numbered 1 to N. Let shortestPath(i, j, k) be the function that returns the length of the shortest path from i to j that uses only the nodes 1 to k as intermediate points. This gives the recursive formula shown below:

shortestPath(i, j, 0) = edgeCost(i, j)

shortestPath(i, j, k) = min(shortestPath(i, j, k - 1), shortestPath(i, k, k - 1) + shortestPath(k, j, k - 1))

This forms the heart of the WFI algorithm. The shortest path shortestPath(i, j, k) is first computed for all (i, j) pairs for k = 1, then for k = 2, and so on up to k = N.

The Floyd-Warshall algorithm iteratively determines the path lengths between all pairs of nodes (i, j), including the case where i = j.

The length of the path from a node to itself is initially taken as zero, and the algorithm provides the path lengths between the nodes.
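The iteration over k can be sketched in Python as follows. The function name floyd_warshall, the dictionary of edge costs keyed by (i, j) and the example graph are assumptions made for illustration, and the sketch assumes the graph has no negative cycles.

```python
def floyd_warshall(nodes, edge_cost):
    """Return dist, where dist[i][j] is the shortest path length from i to j.

    edge_cost maps (i, j) to the weight of the directed edge from i to j.
    """
    inf = float("inf")
    # k = 0: only direct edges are allowed; a node is at distance 0 from itself.
    dist = {
        i: {j: 0 if i == j else edge_cost.get((i, j), inf) for j in nodes}
        for i in nodes
    }
    for k in nodes:            # allow k as an intermediate node
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical example graph.
costs = {(1, 2): 3, (2, 3): 1, (1, 3): 7, (3, 4): 2}
dist = floyd_warshall([1, 2, 3, 4], costs)
print(dist[1][3], dist[1][4])  # 4 6
```

The table is updated in place, so after the pass for a given k it holds the path lengths that use only the nodes processed so far as intermediate points.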

Conclusion

Data structures and their associated algorithms remain fundamental in providing the means for data storage and manipulation (Sage 2006). Core and complex computer processing, such as the memory management functions of operating systems and the cache implementation of database management systems, relies on data structures and their associated algorithms to execute efficiently and effectively. It is therefore necessary that these data structures and algorithms are carefully studied and understood by system programmers to ensure the design of efficient and effective software.