Just Read The Directions
(Because sometimes that helps.)
Articles on programming by Phil Darnowsky.

Parallelized A* search (December 30, 2010)

Updated January 18, 2011 to fix inconsistent mileage figures in examples.

Sequential A*

Everyone knows and loves the A* search. If you're familiar with it and want to jump ahead to parallelizing it, please feel free.

For those who aren't familiar with A* and don't want to read the complete treatment above, A* is a type of best-first search. It uses a heuristic function to estimate the distance between any node under consideration and the goal. Essentially, A* uses extra information about the problem domain to make informed choices when it considers where to go next.

Take for example the problem of a programmer who lives in San Francisco who has accepted a job in Boston. She's packed all her stuff in a moving van which gets terrible gas mileage, so she's taken out the highway map to find the shortest route between the cities.

Since she's a programmer, her map shows an abstract version of (a country remarkably similar to) the United States: only cities and highways are portrayed, in the form of vertices and edges of a weighted graph.

She sees that, starting in San Francisco, she has four choices for the first leg of her trip: she can drive to Portland, Reno, Las Vegas, or Los Angeles. From Portland, she can continue to Reno (backtracking to SF being forbidden); from Reno, she can go to Portland, Denver or Las Vegas; from Las Vegas, to Reno, Denver, Phoenix or Los Angeles; and from Los Angeles to Las Vegas or Phoenix. This is starting to get complicated, so she draws a tree to help her evaluate her options:

Looking at this, she and we can see that going from San Francisco to Reno to Las Vegas would be a trip of 670 miles. She could keep systematically enumerating the possible next moves from each city in a breadth first search, and it certainly will find a route to Boston, but it's not guaranteed to be the best possible route, and it may take quite a while to find any kind of solution.

Then inspiration strikes. She takes out a ruler and measures the straight-line distance from each city to Boston, never mind if there's actually a direct road or not. The first few make the map look like:

Now she knows as above that a trip from San Francisco to Reno to Las Vegas will be 670 miles. But now she can also estimate that the balance of the trip, from Las Vegas to Boston by some route, will be another 2370 miles. Adding those together gives her an estimate of 3040 miles for the length of the shortest trip that starts by going from San Francisco to Reno to Las Vegas and ends up in Boston.

At every step in the A* search, she picks the next city with the smallest total estimate route length. So at the beginning her search tree is as simple as can be:

The only state she knows of is when she is in San Francisco. She has traveled 0 miles to get there and has an estimated 2700 miles left to Boston. Adding these together gives her a total estimated trip length from here of 2700 miles.

In A*, you always expand the unexpanded state with the lowest total estimate. Since there's only one possible state, it'll have to do for a start. Our heroine looks at the possibilities from this state and comes up with:

Looking at the unexpanded states, she notes that the route beginning by driving from San Francisco to Reno has the lowest estimated total cost. Being a methodical type, she asks herself "Is Reno Boston?" If you happen to have visited both cities, you already know the answer to this. So she repeats the process again, considering where she could go from Reno:

At this point she has six possible routes to consider, and she sees that the route from San Francisco to Reno to Denver has the lowest estimated total cost. Denver is not Boston either, so she repeats the process again, and again, until a route to Boston emerges.

Note that, while we're imagining examining the tree directly for our next move, in practice you use a priority queue instead. A priority queue is just a list of items, each of which has a numerical value called a "priority" associated with it. When you add new items to the list, instead of just sticking them on the end, you put each in the list so that the list stays in ascending order of priority. In A*, the priority value we use is the estimated total distance.

So at the start we'd have a priority queue with just one item in it:

NodeEstimated Total Distance
San Francisco2700

San Francisco is at the head of the queue, so we take that off, expand it, and add the results back into the queue:

NodeEstimated Total Distance
San Francisco to Reno2740
San Francisco to Las Vegas2930
San Francisco to Los Angeles2980
San Francisco to Portland3170

Now Reno is at the head, so we take that off, expand it, and add the results back in:

NodeEstimated Total Distance
San Francisco to Las Vegas2930
San Francisco to Los Angeles2980
San Francisco to Reno to Denver3020
San Francisco to Reno to Las Vegas3040
San Francisco to Portland3170
San Francisco to Reno to Portland3280

In the next iteration, we'd expand routes that start by going from San Francisco to Las Vegas, and so on until we find our way to Boston.

A proof is beyond the scope of this article, but as long as the estimating function is "optimistic" (to use the formal term, admissible), that route is guaranteed to be the best possible one. An optimistic estimating function can underestimate the remaining cost of a route, or get it exactly correct, but isn't allowed to ever overestimate that cost. The straight-line estimation we're using here is certainly optimistic: no matter what route you take from Point A to Point B, you're never going to beat the straight-line distance between them.

To recap, her strategy for picking the best route by A* is:

  1. Take the state at the head of the priority queue off of it. It's guaranteed, of all the states we know, to have the smallest estimated total distance.
  2. If that state corresponds to being in Boston, we've found our route and can stop.
  3. Otherwise, look at the map and find which cities we can go to without doubling back on our route. Calculate their estimated total distances by adding the known distance it takes to get there to the estimated remaining distance to Boston. Based on those estimated total distances, add them to the appropriate place in the priority queue.
  4. Go back to the top and repeat until we've gotten to Boston.

Parallelized A*

A lot of the work of A* can profitably be done in parallel if we have multiple CPUs to work with. Suppose that we have three CPUs available: instead of picking the lowest estimated total cost node and expanding just that, we pick the three lowest estimated total cost nodes and assign each to a CPU.

To return to the example above, the first step (expanding the "San Francisco" state) would still work the same, since there's only one node to look at. But on the second iteration, we can expand three nodes at once, going from:

to:

This lends itself to a manager/worker architecture. The manager keeps the priority queue of states and assigns work to the workers. The workers expand the states and pass results back to the manager. A diagram of messages passing between processes in our example might look like:

Notice that we're not assuming any relationship between the order in which we assign work to workers and the order in which they actually finish. We just pass out work in an arbitrary order and add results to the queue whenever they come in.

There's a trap here. Since you don't know the order that results will come back to the manager, it's theoretically possible for it to find a solution that's not the best possible solution. In fact, if you're really unlucky, it may find the worst possible solution.

We need to do a little more work to ensure that we get the best solution. Rather than stopping when we find a solution, we take the cost of that solution, and consider it as an upper bound on the cost of the best possible solution. This means we can throw out any unexpanded node with an estimated total cost greater than that bound. Recall that our estimation function is allowed to underestimate the cost of a path, but never to overestimate it. Hence if the estimate is greater than this upper bound, the actual cost is guaranteed to also be greater.

When we ask a worker to expand a node, we can also pass it this upper bound. The worker then expands the node as normal, but filters out any results with estimated total cost greater than the upper bound. The manager never sees these, and never adds them to the queue in the first place.

With these rules in place, we know we're done when the queue is empty and all workers are idle (and thus we're not expecting any new results). At this point, the best solution we know of is the best one there is.

To recap the parallelized algorithm:

  1. Assign nodes off the top of the queue to workers until all workers are busy or the queue is empty (assuming none of your starting nodes are solutions).
  2. When a worker responds, add its results to the queue in the proper order, removing any with an estimated cost greater than the upper bound on cost (originally infinity).
  3. Take the node off the top of the queue. If it's a solution, remember its cost as the upper bound on cost. If not, give it to the newly-freed worker to expand.
  4. Repeat until all workers are idle and the queue is empty. The best solution the manager knows of is now the best solution there is.

For those interested in playing with code to illustrate this algorithm, I've uploaded an example implementation in Erlang to Github, under an MIT license. It's very pre-alpha, just meant for a proof of concept. If I think of a practical use for it, I'll be cleaning it up. Of course, if you beat me to it, you're welcome to do the same, especially as the chances of me actually making a production-quality library out of this are slim.

blog comments powered by Disqus