In graph theory, the shortest path problem is the problem of finding a path between two vertices (or nodes) such that the sum of the weights of its constituent edges is minimized. An example is finding the quickest way to get from one location to another on a road map; in this case, the vertices represent locations and the edges represent segments of road and are weighted by the time needed to travel that segment.
Formally, given a weighted graph (that is, a set V of vertices, a set E of edges, and a real-valued weight function fÂ : EÂ â†’Â R), and one element v of V, find a path P from v to a v' of V so that..,
is minimal among all paths connecting v to v' .
The problem is also sometimes called the single-pair shortest path problem, to distinguish it from the following generalizations:
The single-source shortest path problem, in which we have to find shortest paths from a source vertex v to all other vertices in the graph.
The single-destination shortest path problem, in which we have to find shortest paths from all vertices in the graph to a single destination vertex v. This can be reduced to the single-source shortest path problem by reversing the edges in the graph.
The all-pairs shortest path problem, in which we have to find shortest paths between every pair of vertices v, v' in the graph.
These generalizations have significantly more efficient algorithms than the simplistic approach of running a single-pair shortest path algorithm
SHORTEST PATH BACKGROUND
An algorithm for finding a graph geodesic, i.e., the shortest path between two graph vertices in a graph. It functions by constructing a shortest-path tree from the initial vertex to every other vertex in the graph. The algorithm is implemented as Dijkstra in the Mathematic package.
The worst-case running time for the Dijkstra algorithm on a graph with n nodes and m edges is because it allows for directed cycles. It even finds the shortest paths from a source node to all other nodes in the graph. This is basically for node selection and for distance updates. While is the best possible complexity for dense graphs, the complexity can be improved significantly for sparse graphs.
With slight modifications, Dijkstra's algorithm can be used as a reverse algorithm that maintains minimum spanning trees for the sink node. With further modifications, it can be extended to become bidirectional. The bottleneck in Dijkstra's algorithm is node selection. However, using Dial's implementation, this can be significantly improved for sparse graphs.
Efficient management of networks requires that the shortest route from one point (node) to another is known; this is termed as the shortest path. It is often necessary to be able to determine alternative routes through the network, in case any part of the shortest path is damaged or busy. The analysis of transportation networks is one of many application areas in which the computation of shortest paths is one of the most fundamental problems. These have been the subject of extensive research for many years. The shortest path problem was one of the first network problems studied in terms of operations research. Fixed two specific nodes s and t in the network, the goal is to find a minimum cost way to go from s to t. Several algorithms for computing the shortest path between two nodes of a graph are known. This one is due to Dijkstra (1959). Each node is labeled with its distance from the source node along the best-known path. Initially, no paths are known, so all nodes are labeled with infinity. As the algorithm proceeds and paths are found, the labels may change, reflecting better paths. A label may be tentative or permanent. Initially, all labels are tentative. When it is discovered that a label represents the shortest possible path from the source to node, it is made permanent and never changed thereafter
Clearly, this approach is not feasible for dynamic networks, where the travel cost is time-dependent or randomly varying. However, the majority of published research on shortest paths algorithms has dealt with static networks that have fixed topology and fixed costs. A few early attempts on dynamic approaches, referenced by Chabini (1997), are Cooke and Halsey (1966) and Dreyfus (1979). Not more than a decade ago, Van Eck (1990) reports several hours as an average time for a computer to churn through an all-to-all calculation on a 250-nodes small-scale static network, and several days on a 16.000-nodes large-scale network.[2,33].One way of dealing with dynamic networks is splitting continuous time into discrete time intervals with fixed travel costs, as noted by Chabini (1997). Thus, understanding shortest path algorithms in static networks becomes fundamental to working with dynamic networks.
SHORTEST PATHS IN DIFFERENT NETWORKS
The shortest path is implemented both in static and dynamic networks with the help of Dijkstra algorithm. The implementation steps of the algorithm include:
Let the node at which we are starting be called the initial node. Let the distance of node Y be the distance from the initial node to Y. Dijkstra's algorithm will assign some initial distance values and will try to improve them step by step.
Assign to every node a distance value. Set it to zero for our initial node and to infinity for all other nodes.
Mark all nodes as unvisited. Set initial node as current.
For current node, consider all its unvisited neighbors and calculate their tentative distance (from the initial node). For example, if current node (A) has distance of 6, and an edge connecting it with another node (B) is 2, the distance to B through A will be 6+2=8. If this distance is less than the previously recorded distance (infinity in the beginning, zero for the initial node), overwrite the distance.
When we are done considering all neighbors of the current node, mark it as visited. A visited node will not be checked ever again; its distance recorded now is final and minimal.
If all nodes have been visited, finish. Otherwise, set the unvisited node with the smallest distance (from the initial node) as the next "current node" and continue from step 3.
SHORTEST PATH CONSTRAINTS
When too many packets are present in (a part of) the subnet, performance degrades. This situation is called congestion. When the number of packets dumped into the subnet by the hosts is within its carrying capacity, they are all delivered (except for a few that are afflicted with transmission errors), and the number delivered is proportional to the number sent. However, as traffic increases too far, the routers are no longer able to cope, and they begin losing packets. This tends to make matters worse. At very high traffic, performance collapses completely, and almost no packets are delivered.
Congestion can be brought about by several factors. If all of a sudden, streams of packets begin arriving of three of four input lines and all need the same output line, a queue will build up. If there is insufficient memory to hold all of them, packets will be lost. Adding more memory may help up to a point, but Nagle (1987) discovered that if routers have an infinite amount of memory, congestion gets worse, not better, because by the time packets get to the front of the queue, they have already timed out (repeatedly), and duplicates have been sent. All these packets will dutifully forward to the next router, increasing the load all the way to the destination.
Slow processors can also cause congestion. If the routers' CPUs are slow at performing the bookkeeping tasks required of them (queuing buffers, updating tables, etc.), queues can build up, even though there is excess line capacity. Similarly, low bandwidth lines can also cause congestion. Upgrading the lines but not changing the processors, or vice versa, often helps a little, but frequently just shifts the bottleneck. Also, upgrading part, but not all, of the system, often just moves the bottleneck somewhere else. The real problem is frequently a mismatch between parts of the system. This problem will persist until all the components are in balance.
Congestion tends to feed upon itself and become worse. If a router has no free buffers, it must ignore newly arriving packets. When a packet is discarded, the sending router (a neighbor) may time out and retransmits it, perhaps ultimately many times. Since it cannot discard the packet until it has been acknowledged, congestion at the receiver's end forces the sender to refrain from releasing a buffer it would have normally freed. In thin manner, congestion backs up, like cars approaching a toll booth.
General principles of congestion control:
Many problems in complex systems, such as computer networks, can be viewed from a control theory point of view. This approach leads to dividing all solutions into two groups: open loop and closed loop. Open loop solutions attempt to solve the problem by good design, in essence, to make sure it does not occur in the first place. Once the system is up and running, midcourse corrections are not made.
Tools for doing open-loop control include deciding when to accept new traffic, deciding when to discard packets and which ones, and making scheduling decisions at various points in the network. All of these have in common the fact that they make decisions without regard to current state of the network.
In contrast, closed loop solutions are based on the concept of a feedback loop. This approach has three parts when applied to congestion control:
1 monitors the system to detect when and where congestion occurs.
2 pass this information to places where action can be taken.
3 adjust system operation to correct the problem.
Various metrics can be used to monitor the subnet for congestion. Chief among these are the percentage of all packets discarded for lack of buffer space, the average queue lengths, the number of packets that time out and are retransmitted, the average packet delay, and the standard deviation of packet delay. In all cases, rising numbers indicate growing congestion.
The second step in the feedback loop is to transfer the information about the congestion from the point where it is detected to the point where something can be done about it. The obvious way is for the router detecting the congestion to send a packet to the traffic source of sources, announcing the problem. Of course, these extra packets increase the load at precisely the moment that more loads is not needed, namely, when the subnet is congested. When a router detects this congested state, it fills in the field in all outgoing packets, to warn the neighbors.Still another approach is to have hosts or routers send probe packets out periodically to explicitly ask about congestion. This information can then be used to route traffic around problem areas. Some radio stations have helicopters flying around their cities to report on road congestion in the hope that their listeners will route their packets (cars) around hot spots.
In all feedback schemes, the hope is that knowledge of congestion will cause the hosts to take appropriate action to reduce the congestion. To work correctly, the time scale must be adjusted carefully. If every time two packets arrive in a row, a router yells STOP, and every time a router is idle for 20 micro sec it yells GO, the system will oscillate wildly and never converge. On the other hand, if it waits 30 minutes to make sure before saying anything, the congestion control mechanism will react too sluggishly to be of any real use. To work well, some king of averaging is needed, but getting the time constant right is a nontrivial matter.
Many congestion control algorithms are known. To provide a way to organize them in a sensible way, Yang and Reddy (1995) have developed taxonomy for congestion control algorithms. They begin by dividing all algorithms into open loop or closed loop, as described above. They further divide the open loop algorithms into ones that act at the source versus ones that act at the destination. The closed loop algorithms are also divided into two subcategories: explicit feedback versus implicit feedback. In explicit algorithms, packets are sent back from the point of congestion to warn the source. In implicit algorithms, the source deduces the existence of congestion by making local observations, such as the time needed for acknowledgements to come back.
The presence of congestion means that the load is (temporarily) greater than the resources (in part of the system) can handle. Two solutions come to mind: increase the resources of decrease the load. For example, the subnet may start using dial-up telephone lines to temporarily increase the bandwidth between certain points. In systems like SMDS, it may ask the carrier for additional bandwidth for a while. On satellite systems, increasing transmission power often gives higher bandwidth. Splitting traffic over multiple routes instead of always using the best one may also effectively increase the bandwidth. Finally, spare routers that are normally used only as backups (to make the system fault tolerant) can be put on-line to give more capacity when serious congestion appears.