Updated 2012-08-30 23:37:55 by LkpPo

NEM 23 May 2007: Many problems can be seen as a search for a solution within a space of possible states. Such a state space can be represented as a directed graph, where the nodes represent states and the edges represent possible transitions between states (often labeled with a cost associated with that transition). In order to find the solution to our problem we search the graph from our current state until we find a state that matches our goal (a goal state). A variety of search strategies have been developed to solve problems in this manner, which we can divide into two groups: uninformed searches attempt to exhaustively search the graph without any clue as to where solutions may lie, whereas heuristic (or informed) searches use some knowledge about the problem domain to guide the search towards likely solutions.

The five search strategies we illustrate are:

1. Depth-first search which follows a path blindly to its end before trying any alternatives (an uninformed search).
2. Breadth-first search which conservatively explores all alternatives at once, one node at a time (another informed search).
3. Uniform-cost search expands nodes which have the least cost-so-far first (uninformed).
4. Best-first search expands nodes which score best in some evaluation function. In this case we use a greedy search evaluation function which expands nodes that have the least estimated cost-to-distance, regardless of the cost so far.
5. Finally, we show the famous A*-search which combines a measure of the cost-so-far and the estimated cost-to-goal to get a very efficient search strategy.

The search strategies are implemented as part of a lightweight digraph package, and this page serves as an example of their use. This example we use is taken from BOOK Artificial Intelligence: A Modern Approach by Russell and Norvig, page 95 (1st ed), but adapted to Tcl. The example involves planning a route from Arad in Romania to Bucharest [1]. Firstly, we define our graph, where nodes are cities and the edges are roads between them, with the distance as the cost of the edge:
``` package require digraph     0.6

set map [digraph create]
foreach {source -> dests} {
Arad        -> {Zerind 75 Timisoara 118 Sibiu 140}
Bucharest   -> {Pitesti 101 Fagaras 211 Urziceni 85 Giurgiu 90}
Craiova     -> {Dobreta 120 RimnicuVilcea 146 Pitesti 138}
Dobreta     -> {Mehadia 75 Craiova 120}
Eforie      -> {Hirsova 86}
Fagaras     -> {Sibiu 99 Bucharest 211}
Giurgiu     -> {Bucharest 90}
Hirsova     -> {Urziceni 98 Eforie 86}
Iasi        -> {Neamt 87 Vaslui 92}
Lugoj       -> {Iasi 87}
Mehadia     -> {Lugoj 70 Dobreta 75}
Neamt       -> {Iasi 87}
Oradea      -> {Zerind 71 Sibiu 151}
Pitesti     -> {RimnicuVilcea 97 Craiova 138 Bucharest 101}
RimnicuVilcea -> {Sibiu 80 Pitesti 97 Craiova 146}
Timisoara   -> {Arad 118 Lugoj 111}
Urziceni    -> {Bucharest 85 Vaslui 142 Hirsova 98}
Vaslui      -> {Urziceni 142 Iasi 92}
} {
foreach {dest cost} \$dests {
digraph edge map \$source \$dest -cost \$cost
}
}```

For the informed search methods we also need to define a heuristic that is used to estimate the cost of reaching the goal (Bucharest) from the current node. It is important that this heuristic be admissible -- i.e., that it never overestimates the cost to the goal. For route-planning a reasonable heuristic is the straight-line distance to the goal (SLD):
``` # This is our estimation of the cost to the goal - the straight line
# distance to the goal (SLD). This is admissible, as it can never over-
# estimate the distance: SLD is the shortest possible distance.
array set SLD {
Bucharest       0
Craiova         160
Dobreta         242
Eforie          161
Fagaras         178
Giurgiu         77
Hirsova         151
Iasi            226
Lugoj           244
Neamt           234
Pitesti         98
RimnicuVilcea   193
Sibiu           253
Timisoara       329
Urziceni        80
Vaslui          199
Zerind          374
}
proc SLD node { return \$::SLD(\$node) }```

We can now try out our various search strategies using the digraph package:
``` foreach strategy {depth-first breadth-first uniform-cost
{best-first SLD} {a-star SLD}} {
set steps 0
puts "\nSearching \$strategy..."
digraph search \$map Arad [concat digraph \$strategy] {path cost} {
incr steps
if {[lindex \$path end] eq "Bucharest"} {
puts "Found result: [join \$path { -> }]"
puts "Iterations: \$steps Size: [llength \$path] Cost: \$cost"
break
}
}
}```

Remove the break statement to generate all solutions. The results are:
``` Searching depth-first...
Found result: Arad -> Sibiu -> RimnicuVilcea -> Craiova -> Pitesti -> Bucharest
Iterations: 6 Size: 6 Cost: 605

Found result: Arad -> Sibiu -> Fagaras -> Bucharest
Iterations: 13 Size: 4 Cost: 450

Searching uniform-cost...
Found result: Arad -> Sibiu -> RimnicuVilcea -> Pitesti -> Bucharest
Iterations: 19 Size: 5 Cost: 418

Searching best-first SLD...
Found result: Arad -> Sibiu -> Fagaras -> Bucharest
Iterations: 4 Size: 4 Cost: 450

Searching a-star SLD...
Found result: Arad -> Sibiu -> RimnicuVilcea -> Pitesti -> Bucharest
Iterations: 6 Size: 5 Cost: 418```

We can see from this some general properties of the search strategies. Depth-first search quickly found a solution, but it is of poor quality. Breadth-first search took a moderate amount of time to find a solution with a small number of steps, but not the shortest distance. Uniform-cost search finds the lowest-distance solution, but takes the longest amount of time doing it. Greedy search quickly finds a solution, but again it is of only decent quality. Finally, A* search quickly finds the best solution. Indeed, for any given heuristic function, A* is optimally efficient: that is, no other search algorithm will expand fewer nodes than A* before it finds an optimal solution.

See also RS's Searching A star in space, which greatly improved on a much older version of these pages. Also, see State Space Searching in Tcl.