Dijkstra's Algorithm for Shortest Paths

Section 3.4 Dijkstra's Algorithm for Shortest Paths

Just as with graphs, it is useful to assign weights to the directed edges of a digraph. Specifically, in this section we consider a pair \((\bfG,w)\) where \(\GVE\) is a digraph and \(w\colon E\rightarrow\nonnegints\) is a function assigning to each directed edge \((x,y)\) a non-negative weight \(w(x,y)\text{.}\) However, in this section, we interpret weight as distance so that \(w(x,y)\) is now called the length of the edge \((x,y)\text{.}\) If \(P=(r=u_0,u_1,\dots,u_t=x)\) is a directed path from \(r\) to \(x\text{,}\) then the length of the path \(P\) is just the sum of the lengths of the edges in the path, \(\sum_{i=0}^{t-1} w(u_iu_{i+1})\text{.}\) The distance from \(r\) to \(x\) is then defined to be the minimum length of a directed path from \(r\) to \(x\text{.}\) Our goal in this section is to solve the following natural problem, which has many applications:

Problem 3.4.1.

For each vertex \(x\text{,}\) find the distance from \(r\) to \(x\text{.}\) Also, find a shortest path from \(r\) to \(x\text{.}\)

Subsection 3.4.1 Description of the Algorithm

To describe Dijkstra's algorithm in a compact manner, it is useful to extend the definition of the function \(w\text{.}\) We do this by setting \(w(x,y)=\infty\) when \(x\neq y\) and \((x,y)\) is not a directed edge of \(\bfG\text{.}\) In this way, we will treat \(\infty\) as if it were a number (although it is not!).¹

This is not an issue for computer implementation of the algorithm, as instead of using \(\infty\text{,}\) a value given by the product of the number of vertices and the maximum edge weight may be used to simulate infinity.

We are now prepared to describe Dijkstra's Algorithm.

Algorithm 3.4.2. Dijkstra's Algorithm.

Let \(n=|V|\text{.}\) At Step \(i\text{,}\) where \(1\le i\le n\text{,}\) we will have determined:

A sequence \(\sigma=(v_1,v_2,v_3,\dots,v_i)\) of distinct vertices from \(\bfG\) with \(r=v_1\text{.}\) These vertices are called permanent vertices, while the remaining vertices will be called temporary vertices.
For each vertex \(x\in V\text{,}\) we will have determined a number \(\delta(x)\) and a path \(P(x)\) from \(r\) to \(x\) of length \(\delta(x)\text{.}\)

Initialization (Step 1)

Set \(i=1\text{.}\) Set \(\delta(r)=0\) and let \(P(r)=(r)\) be the trivial one-point path. Also, set \(\sigma= (r)\text{.}\) For each \(x\neq r\text{,}\) set \(\delta(x)= w(r,x)\) and \(P(x)=(r,x)\text{.}\) Let \(x\) be a temporary vertex for which \(\delta(x)\) is minimum. Set \(v_2 = x\text{,}\) and update \(\sigma\) by appending \(v_2\) to the end of it. Increment \(i\text{.}\)

Inductive Step (Step \(i\text{,}\) \(i>1\))

If \(i\lt n\text{,}\) then for each temporary \(x\text{,}\) let

\begin{equation*} \delta(x) = \min\{\delta(x), \delta(v_i)+w(v_i,x)\}. \end{equation*}

If this assignment results in a reduction in the value of \(\delta(x)\text{,}\) let \(P(x)\) be the path obtained by adding \(x\) to the end of \(P(v_i)\text{.}\)

Let \(x\) be a temporary vertex for which \(\delta(x)\) is minimum. Set \(v_{i+1}=x\text{,}\) and update \(\sigma\) by appending \(v_{i+1}\) to it. Increment \(i\text{.}\)

Subsection 3.4.2 Example of Dijkstra's Algorithm

Before establishing why Dijkstra's algorithm works, it may be helpful to see an example of how it works. To do this, consider the digraph \(\bfG\) shown in Figure 3.4.3. For visual clarity, we have chosen a digraph which is an oriented graph, i.e., for each distinct pair \(x,y\) of vertices, the graph contains at most one of the two possible directed edges \((x,y)\) and \((y,x)\text{.}\)

Figure 3.4.3. A digraph with edge lengths

Suppose that the root vertex \(r\) is the vertex labeled \(a\text{.}\) The initialization step of Dijkstra's algorithm then results in the following values for \(\delta\) and \(P\text{:}\)

Step 1. Initialization.

\begin{align*} \sigma\amp=(a)\amp\amp\\ \delta(a)\amp=0; \amp P(a)\amp=(a)\\ \delta(b) \amp=\infty; \amp P(b)\amp=(a,b)\\ \delta(c) \amp=47; \amp P(c)\amp=(a,c)\\ \delta(d) \amp=\infty; \amp P(d)\amp=(a,d)\\ \delta(e) \amp=70; \amp P(e)\amp=(a,e)\\ \delta(f) \amp=24; \amp P(f)\amp=(a,f)\\ \delta(g) \amp=\infty; \amp P(g)\amp=(a,g)\\ \delta(h) \amp=\infty; \amp P(h)\amp=(a,h) \end{align*}

Before finishing Step 1, the algorithm identifies vertex \(f\) as closest to \(a\) and appends it to \(\sigma\text{,}\) making \(a\) permanent. When entering Step 2, Dijkstra's algorithm attempts to find shorter paths from \(a\) to each of the temporary vertices by going through \(f\text{.}\) We call this process “scanning from vertex \(f\text{.}\)” In this scan, the path to vertex \(d\) is updated, since \(\delta(f) + w(f,d)=24+120=144\lt \infty=w(a,d)\text{.}\)

Step 2. Scan from vertex \(f\).

\begin{align*} \sigma\amp=(a,f)\amp\amp\\ \delta(a)\amp=0; \amp P(a)\amp=(a)\\ \delta(b)\amp=\infty; \amp P(b)\amp=(a,b)\\ \delta(c)\amp=47; \amp P(c)\amp=(a,c)\\ \delta(d)\amp=144 = 24 + 120 = \delta(f)+w(f,d); \amp P(d)\amp=(a,f,d)\quad\text{updated} \\ \delta(e)\amp=70; \amp P(e)\amp=(a,e)\\ \delta(f)\amp=24; \amp P(f)\amp=(a,f)\\ \delta(g)\amp=\infty; \amp P(g)\amp=(a,f)\\ \delta(h)\amp=\infty; \amp P(h)\amp=(a,h) \end{align*}

Before proceeding to the next step, vertex \(c\) is made permanent by making it \(v_3\text{.}\) In Step 3, therefore, the scan is from vertex \(c\text{.}\) Vertices \(b\text{,}\) \(d\text{,}\) and \(g\) have their paths updated. However, although \(\delta(c) + w(c,e) = 47+23=70=\delta(e)\text{,}\) we do not change \(P(e)\) since \(\delta(e)\) is not decreased by routing \(P(e)\) through \(c\text{.}\)

Step 3. Scan from vertex \(c\).

\begin{align*} \sigma\amp=(a,f,c)\\ \delta(a)\amp=0; \amp P(a)\amp=(a)\\ \delta(b)\amp=102=47+55= \delta(c)+w(c,b); \amp P(b)\amp=(a,c,b)\quad\text{updated}\\ \delta(c)\amp=47; \amp P(c)\amp=(a,c)\\ \delta(d)\amp=135=47+88 = \delta(c)+w(c,d); \amp P(d)\amp=(a,c,d)\quad\text{updated} \\ \delta(e)\amp=70; \amp P(e)\amp=(a,e)\\ \delta(f)\amp=24; \amp P(f)\amp=(a,f)\\ \delta(g)\amp=113=47+66= \delta(c)+w(c,g); \amp P(g)\amp=(a,c,g)\quad\text{updated} \\ \delta(h)\amp=\infty; \amp P(h)\amp=(a,h) \end{align*}

Now vertex \(e\) is made permanent.

Step 4. Scan from vertex \(e\).

\begin{align*} \sigma\amp=(a,f,c,e)\\ \delta(a)\amp=0; \amp P(a)\amp=(a)\\ \delta(b)\amp=101=70+31= \delta(e)+w(e,b); \amp P(b)\amp=(a,e,b)\quad\text{updated}\\ \delta(c)\amp=47; \amp P(c)\amp=(a,c)\\ \delta(d)\amp=135; \amp P(d)\amp=(a,c,d)\\ \delta(e)\amp=70; \amp P(e)\amp=(a,e)\\ \delta(f)\amp=24; \amp P(f)\amp=(a,f)\\ \delta(g)\amp=112=70+42= \delta(e)+w(e,g); \amp P(g)\amp=(a,e,g)\quad\text{updated}\\ \delta(h)\amp=\infty; \amp P(h)\amp=(a,h) \end{align*}

Now vertex \(b\) is made permanent.

Step 5. Scan from vertex \(b\).

\begin{align*} \sigma\amp=(a,f,c,e,b)\\ \delta(a)\amp=0; \amp P(a)\amp=(a)\\ \delta(b)\amp=101; \amp P(b)\amp=(a,e,b)\\ \delta(c)\amp=47; \amp P(c)\amp=(a,c)\\ \delta(d)\amp= 132 = 101+ 31= \delta(b)+w(b,d); \amp P(d)\amp=(a,e,b,d)\quad\text{updated} \\ \delta(e)\amp= 70; \amp P(e)\amp=(a,e)\\ \delta(f)\amp= 24; \amp P(f)\amp=(a,f)\\ \delta(g)\amp=112; \amp P(g)\amp=(a,e,g)\\ \delta(h)\amp=180 = 101+79=\delta(b)+w(b,h); \amp P(h)\amp=(a,e,b,h)\quad\text{updated} \end{align*}

Now vertex \(g\) is made permanent.

Step 6. Scan from vertex \(g\).

\begin{align*} \sigma\amp=(a,f,c,e,b,g)\\ \delta(a)\amp=0; \amp P(a)\amp=(a)\\ \delta(b)\amp=101; \amp P(b)\amp=(a,e,b)\\ \delta(c)\amp=47; \amp P(c)\amp=(a,c)\\ \delta(d)\amp= 132; \amp P(d)\amp=(a,e,b,d)\\ \delta(e)\amp=70; \amp P(e)\amp=(a,e)\\ \delta(f)\amp=24; \amp P(f)\amp=(a,f)\\ \delta(g)\amp=112; \amp P(g)\amp=(a,e,g)\\ \delta(h)\amp=178 = 112+66=\delta(g)+w(g,h); \amp P(h)\amp=(a,e,g,h)\quad\text{updated} \end{align*}

Now vertex \(d\) is made permanent.

Step 7. Scan from vertex \(d\).

\begin{align*} \sigma\amp=(a,f,c,e,b,g,d)\\ \delta(a)\amp=0; \amp P(a)\amp=(a)\\ \delta(b)\amp=101; \amp P(b)\amp=(a,e,b)\\ \delta(c)\amp=47; \amp P(c)\amp=(a,c)\\ \delta(d)\amp= 132; \amp P(d)\amp=(a,e,b,d)\\ \delta(e)\amp=70; \amp P(e)\amp=(a,e)\\ \delta(f)\amp=24; \amp P(f)\amp=(a,f)\\ \delta(g)\amp=112; \amp P(g)\amp=(a,e,g)\\ \delta(h)\amp=161 = 132+29=\delta(d)+w(d,h); \amp P(h)\amp=(a,e,b,d,h)\quad\text{updated} \end{align*}

Now vertex \(h\) is made permanent. Since this is the last vertex, the algorithm halts and returns the following:

Final Results of Dijkstra's Algorithm.

\begin{align*} \sigma\amp=(a,f,c,e,b,g,d,h)\\ \delta(a)\amp=0; \amp P(a)\amp=(a)\\ \delta(b)\amp=101; \amp P(b)\amp=(a,e,b)\\ \delta(c)\amp=47; \amp P(c)\amp=(a,c)\\ \delta(d)\amp= 132; \amp P(d)\amp=(a,e,b,d)\\ \delta(e)\amp=70; \amp P(e)\amp=(a,e)\\ \delta(f)\amp=24; \amp P(f)\amp=(a,f)\\ \delta(g)\amp=112; \amp P(g)\amp=(a,e,g)\\ \delta(h)\amp=161; \amp P(h)\amp=(a,e,b,d,h) \end{align*}

Subsection 3.4.3 The Correctness of Dijkstra's Algorithm

Now that we've illustrated Dijkstra's algorithm, it's time to prove that it actually does what we claimed it does: find the distance from the root vertex to each of the other vertices and a path of that length. To do this, we first state two elementary propositions. The first is about shortest paths in general, while the second is specific to the sequence of permanent vertices produced by Dijkstra's algorithm.

Proposition 3.4.4.

Let \(x\) be a vertex and let \(P=(r=u_0,u_1,\dots,u_t=x)\) be a shortest path from \(r\) to \(x\text{.}\) Then for every integer \(j\) with \(0\lt j\lt t\text{,}\) \((u_0,u_1,\dots,u_j)\) is a shortest path from \(r\) to \(u_j\) and \((u_j,u_{j+1},\dots,u_t)\) is a shortest path from \(u_j\) to \(u_t\)

Proposition 3.4.5.

When the algorithm halts, let \(\sigma=(v_1,v_2,v_3,\dots,v_n)\text{.}\) Then

\begin{equation*} \delta(v_1)\le \delta(v_2)\le\dots \le \delta(v_n). \end{equation*}

We are now ready to prove the correctness of the algorithm. The proof we give will be inductive, but the induction will have nothing to do with the total number of vertices in the digraph or the step number the algorithm is in.

Theorem 3.4.6.

Dijkstra's algorithm yields shortest paths for every vertex \(x\) in \(\bfG\text{.}\) That is, when Dijkstra's algorithm terminates, for each \(x\in V\text{,}\) the value \(\delta(x)\) is the distance from \(r\) to \(x\) and \(P(x)\) is a shortest path from \(r\) to \(x\text{.}\)

Proof.

The theorem holds trivially when \(x=r\text{.}\) So we consider the case where \(x\neq r\text{.}\) We argue that \(\delta(x)\) is the distance from \(r\) to \(x\) and that \(P(x)\) is a shortest path from \(r\) to \(x\) by induction on the minimum number \(k\) of edges in a shortest path from \(r\) to \(x\text{.}\) When \(k=1\text{,}\) the edge \((r,x)\) is a shortest path from \(r\) to \(x\text{.}\) Since \(v_1=r\text{,}\) we will set \(\delta(x)=w(r,x)\) and \(P(x)=(r,x)\) at Step 1.

Now fix a positive integer \(k\text{.}\) Assume that if the minimum number of edges in a shortest path from \(r\) to \(x\) is at most \(k\text{,}\) then \(\delta(x)\) is the distance from \(r\) to \(x\) and \(P(x)\) is a shortest path from \(r\) to \(x\text{.}\) Let \(x\) be a vertex for which the minimum number of edges in a shortest path from \(r\) to \(x\) is \(k+1\text{.}\) Fix a shortest path \(P=(u_0,u_1,u_2,\dots,u_{k+1})\) from \(r=u_0\) to \(x=u_{k+1}\text{.}\) Then \(Q=(u_0,u_1,\dots,u_k)\) is a shortest path from \(r\) to \(u_k\text{.}\) (See Figure 3.4.7.)

Figure 3.4.7. Shortest paths

By the inductive hypothesis, \(\delta(u_k)\) is the distance from \(r\) to \(u_k\text{,}\) and \(P(u_k)\) is a shortest path from \(r\) to \(u_k\text{.}\) Note that \(P(u_k)\) need not be the same as path \(Q\text{,}\) as we suggest in Figure 3.4.7. However, if distinct, the two paths will have the same length, namely \(\delta(u_k)\text{.}\) Also, the distance from \(r\) to \(x\) is \(\delta(u_k)+w(u_k,x)\ge \delta(u_k)\) since \(P\) is a shortest path from \(r\) to \(x\) and \(w(u_k,x)\geq 0\text{.}\)

Let \(i\) and \(j\) be the unique integers for which \(u_k=v_i\) and \(x=v_j\text{.}\) If \(j \lt i\text{,}\) then

\begin{equation*} \delta(x)= \delta(v_j)\le \delta(v_i)= \delta(u_k)\le \delta(u_k)+w(u_k). \end{equation*}

Therefore the algorithm has found a path \(P(x)\) from \(r\) to \(x\) having length \(\delta(x)\) which is at most the distance from \(r\) to \(x\text{.}\) Clearly, this implies that \(\delta(x)\) is the distance from \(r\) to \(x\) and that \(P(x)\) is a shortest path.

On the other hand, if \(j>i\text{,}\) then the inductive step at Step \(i\) results in

\begin{equation*} \delta(x)\le \delta(v_i)+w(v_i,y)=\delta(u_k)+w(u_k,x). \end{equation*}

As before, this implies that \(\delta(x)\) is the distance from \(r\) to \(x\) and that \(P(x)\) is a shortest path.