Project 1: Navigating Memphis

In this project, you will use the A* algorithm to solve two related problems. In part A, you will write a program to find the fastest route between two locations in Memphis. In part B, you will write a program to find the fastest route that visits a set of locations in Memphis and returns to where you started (also known as the traveling salesperson problem).

Part A

Your program should ask the user for three inputs: the name of a map file (in the same format as used in Project 0), then the IDs for two different locations on that map. Your program should use the A* algorithm to find the quickest driving route in terms of time between the two locations. Your program should print the following pieces of information in the format specified in the examples below.

The total number of A* nodes expanded (the size of the explored list).
The total time the route takes to drive, in minutes.
The route itself, consisting of location IDs and the road segment names.

Your algorithm should use the speed limit for each road to determine the time it takes to travel that particular road segment. You may assume speed changes are instantaneous, and you don't have to take curves or stop lights (or other things which one would slow down for) into account.

To calculate the travel time for a road segment, you will need the length of the road in miles as well as the speed limit. To get the length from latitude and longitude coordinates, you can use code that calculates great-circle distances on a sphere. (No, the earth is not perfectly spherical, but it's an OK approximation.) Remember to multiply the return value by 3960, roughly the earth's radius in miles, to get the distance in miles! You can then combine the distance in miles with the speed limit of the road in miles per hour to compute the travel time in minutes.
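As a rough illustration of the calculation described above, here is a minimal Python sketch. It assumes the great-circle code returns the central angle in radians (which is why multiplying by 3960 gives miles) and uses the haversine formula for that angle. The function names are hypothetical and are not part of any provided code.

```python
import math

EARTH_RADIUS_MILES = 3960  # radius quoted in the assignment text


def great_circle_radians(lat1, lon1, lat2, lon2):
    """Central angle in radians between two points given in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against tiny floating-point overshoot before asin.
    return 2 * math.asin(min(1.0, math.sqrt(a)))


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles on a sphere of radius 3960 miles."""
    return EARTH_RADIUS_MILES * great_circle_radians(lat1, lon1, lat2, lon2)


def travel_time_minutes(miles, speed_limit_mph):
    """Minutes needed to cover `miles` at a constant `speed_limit_mph`."""
    return miles / speed_limit_mph * 60
```

For example, a half-mile segment with a 30 mph speed limit takes travel_time_minutes(0.5, 30), which is one minute.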

For g(n), use the travel time in minutes from the start state to the node n. For the heuristic value h(n), use the straight-line travel time from n to the goal state, assuming you could drive from node n to the goal at 65 miles per hour. We can use 65mph because that's the maximum speed limit of any Memphis road, so this h(n) will never overestimate the true travel time.
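The roles of g(n) and h(n) can be sketched with a generic A* loop. This is only one possible structure, not the required implementation: the names a_star, straight_line_minutes, and the toy road table are hypothetical, and the demo call uses h(n) = 0 so that it needs no coordinates. The sketch never re-expands an explored node, which is safe when h is consistent, as the straight-line 65 mph estimate should be. It does not count the goal itself as expanded; that convention is an assumption and may differ from the example runs.

```python
import heapq
import itertools

MAX_SPEED_MPH = 65  # fastest speed limit on the map, per the assignment text


def straight_line_minutes(miles_to_goal):
    """h(n): minutes to cover the straight-line distance at 65 mph."""
    return miles_to_goal / MAX_SPEED_MPH * 60


def a_star(start, goal, neighbors, h):
    """neighbors(n) yields (next_node, minutes); h(n) estimates minutes to the goal.

    Returns (total_minutes, path, nodes_expanded), or None if no route exists.
    """
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    frontier = [(h(start), next(tie), start)]
    best_g = {start: 0.0}  # g(n): best known minutes from start to n
    parent = {start: None}
    explored = set()
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if node in explored:
            continue  # stale entry left behind by a later, cheaper push
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return best_g[goal], path[::-1], len(explored)
        explored.add(node)
        for nxt, minutes in neighbors(node):
            g = best_g[node] + minutes
            if nxt not in best_g or g < best_g[nxt]:
                best_g[nxt] = g
                parent[nxt] = node
                heapq.heappush(frontier, (g + h(nxt), next(tie), nxt))
    return None


# Toy road table (hypothetical): node -> list of (neighbor, minutes).
roads = {
    "A": [("B", 1.0), ("C", 4.0)],
    "B": [("A", 1.0), ("C", 1.5)],
    "C": [("A", 4.0), ("B", 1.5)],
}
result = a_star("A", "C", lambda n: roads[n], lambda n: 0.0)
```

In the toy table the direct road from A to C takes 4 minutes, so the search should prefer the 2.5 minute route through B.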

Location IDs

To test your program, I have collected the IDs for some Memphis landmarks, commonly-visited places, and some intersections near Rhodes.

Example runs

Your program should print out the best route found, the total travel time for that route, and the total number of nodes expanded (the size of the explored list). These examples use the all-memphis.txt map.
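A small helper along these lines could produce output in the shape of the Part A examples. It is a sketch only: the function name format_route and the representation of the route as a list of (location ID, road name) pairs are assumptions, and exact whitespace should be checked against the examples.

```python
def format_route(total_minutes, nodes_expanded, steps):
    """Build the report text.

    steps is a list of (location_id, road_name) pairs; the road name of the
    first pair is ignored because the first line is the starting location.
    """
    lines = [
        f"Total travel time is {total_minutes} minutes.",
        f"Number of nodes expanded: {nodes_expanded}",
        "Path found is:",
    ]
    first_id, _ = steps[0]
    lines.append(f"{first_id} (starting location)")
    for location_id, road_name in steps[1:]:
        lines.append(f"{location_id} {road_name}")
    return "\n".join(lines)


# Usage with the IDs from the first example run (the time here is a placeholder).
report = format_route(
    0.5,
    4,
    [(2471207719, None), (203821515, "University Street"), (203785186, "Snowden Avenue")],
)
```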

Rhodes College to Barksdale & Snowden (2471207719 to 203785186) Details

Total travel time is 0.37643637480988684 minutes.
Number of nodes expanded: 4
Path found is: 
2471207719 (starting location)
203821515 University Street
203785186 Snowden Avenue

University & Tutwiler to McLean & Snowden (203874746 to 203744893) Details

Total travel time is 0.8577285013726111 minutes.
Number of nodes expanded: 13
Path found is: 
203874746 (starting location)
203785183 Tutwiler Avenue
203744887 Tutwiler Avenue
203744893 North McLean Boulevard

Overton Square to U of Memphis (203777568 to 203948127) Details

Total travel time is 5.2132217862798935 minutes.
Number of nodes expanded: 1860
Path found is: 
203777568 (starting location)
203996102 Madison Avenue
203787633 Madison Avenue
203739086 Madison Avenue
628986670 Madison Avenue
628986667 East Parkway North
628986666 East Parkway North
628986663 East Parkway North
628986662 East Parkway North
628986781 Poplar Avenue
203651910 Poplar Avenue
203630104 Poplar Avenue
203651913 Poplar Avenue
203651917 Poplar Avenue
203651919 Poplar Avenue
203651921 Poplar Avenue
203651925 Poplar Avenue
203651929 Poplar Avenue
1403569453 Poplar Avenue
203651937 Poplar Avenue
203651947 Poplar Avenue
1403569447 Poplar Avenue
203622188 Poplar Avenue
203651955 Poplar Avenue
203651957 Poplar Avenue
2868680611 Poplar Avenue
1431064839 Poplar Avenue
1403569457 Poplar Avenue
203651960 Poplar Avenue
203651964 Poplar Avenue
203837645 Poplar Avenue
203651973 Poplar Avenue
203651977 Poplar Avenue
203651981 Poplar Avenue
425410163 Poplar Avenue
203651986 Poplar Avenue
203651990 Poplar Avenue
203651994 Poplar Avenue
203651997 Poplar Avenue
203652000 Poplar Avenue
203652004 Poplar Avenue
203824471 South Highland Street
203824476 South Highland Street
203824478 South Highland Street
203753365 South Highland Street
2816874995 South Highland Street
203824485 South Highland Street
203872561 Central Avenue
203948127 Central Avenue

Graceland to Shelby Farms Park (480814962 to 1352161029) Details

Total travel time is 18.12600769455315 minutes.
Number of nodes expanded: 17494
(See details for directions.)

Part B

Your program should ask the user for these inputs: the name of a map file (in the same format as used in Project 0), the ID for a starting location, then the IDs for one or more other locations that need to be visited. Your program should use the A* algorithm to find the quickest driving route in terms of time that begins at the starting location, visits the other locations in the quickest order possible, and returns to the starting location.

The concept of a "state" in Part B will be very different from a "state" in Part A. In Part A, a state can simply be a location on the map, because that's all the information a node needs to keep track of to figure out the best location to visit next. In Part B, this is not enough information: in order to have a good heuristic estimate of the total travel time left on the route, a state will need to store not only the current location, but also information on where you have visited previously and the locations still left to visit.

For instance, imagine you start at location 1 and need to visit locations 2 and 3 before returning to 1. Consider these two "states:"

State X: Current location = 4, Visited = 1 and 2
State Y: Current location = 4, Visited = 1

Even though both states have the same current location, a good heuristic would have a larger h(n) estimate for state Y than for state X, because at Y, you still have to visit both 2 and 3 and then return to 1, whereas at X, you only have to visit 3 and then return to 1.
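One way to make such states distinguishable in code is to pair the current location with an immutable set of visited required locations, so that a state can be used as a dictionary key or stored in an explored set. The sketch below is an assumption about representation, not the required design. The names TourState, remaining, move, and is_goal are hypothetical, and it follows the example's convention of counting the start as already visited.

```python
from typing import FrozenSet, NamedTuple


class TourState(NamedTuple):
    """Hypothetical Part B state: current location plus required stops already visited."""
    location: int
    visited: FrozenSet[int]


def remaining(state, required):
    """Required locations that this state has not visited yet."""
    return frozenset(required) - state.visited


def move(state, next_location, required):
    """State after driving to next_location; visited grows only at required stops."""
    if next_location in required:
        return TourState(next_location, state.visited | {next_location})
    return TourState(next_location, state.visited)


def is_goal(state, start, required):
    """The tour is done when every required stop is visited and we are back at start."""
    return state.location == start and not remaining(state, required)


# The two states from the example: start at 1, must visit 2 and 3.
required = {1, 2, 3}
state_x = TourState(location=4, visited=frozenset({1, 2}))
state_y = TourState(location=4, visited=frozenset({1}))
```

Here state_x and state_y compare as different states even though both are at location 4: state_x has only location 3 left to visit, while state_y still has 2 and 3.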

Heuristic

There are a few ways to get a good heuristic estimate for a node, taking into account where you are, where you've been, and where you still have to go. A good place to start is to think about how to modify your heuristic from part A, the straight-line 65mph estimate from your current location to your goal. The problem is now we have multiple goals: all the locations to visit, plus returning home. Consider calculating all of the 65mph times to all of your goals. How can we use this set of times to estimate the time it will take to visit them all, without overestimating? Recall that A* is only optimal with an h(n) estimate that never overestimates the true cost to the goal (meaning that your h(n) can never overestimate the true driving time to visit all the remaining locations and return back to the start).

Note that a heuristic built only from these straight-line 65 mph times is not a very good one. It severely underestimates the true travel time to finish the tour of all the locations, especially in an area with no 65 mph roads. Can you make a better heuristic that is still consistent and admissible?
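For reference, the two properties named above have standard definitions. The symbols here are notation introduced for this note, not taken from the assignment: h*(n) is the true cheapest remaining time from n to a goal, and c(n, n') is the time of a single step from n to a successor n'.

```latex
h(n) \le h^{*}(n) \quad \text{(admissible)}
\qquad\qquad
h(n) \le c(n, n') + h(n') \quad \text{(consistent)}
```

A consistent heuristic that is zero at every goal state is also admissible, so checking consistency is enough to establish both.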

Example runs

Your program output should look similar to Part A. These examples use the west-of-rhodes.txt map.

Start at Rhodes College, visit University & Lyndale (2471207719 to 203874118)

Total travel time is 0.254302144139388 minutes.
Number of nodes expanded: 6
2471207719 (starting location)
203821515 University Street
203874118 University Street
203821515 University Street
2471207719 University Street

Start at Rhodes College, visit Barksdale/Crump, Barksdale/Lyndale, Barksdale/Tutwiler (2471207719 to 203785195, 203785189, and 203785183)

Total travel time is 1.8604921727811592 minutes.
Number of nodes expanded: 8100
2471207719 (starting location)
203821515 University Street
203874118 University Street
203785569 University Street
203785192 Mignon Avenue
203785195 North Barksdale Street
203785192 North Barksdale Street
203785189 North Barksdale Street
203785186 North Barksdale Street
203785183 North Barksdale Street
203874746 Tutwiler Avenue
2471207719 University Street

What to turn in

Through Moodle, turn in your code for parts A and B. In addition, turn in a text file which explains, for part B, how you chose to represent a "state," and what your heuristic function does.

Grading

You will be graded on correctness of your program (whether it gives the best path or tour), and also the quality of your heuristics (measured by how many states need to be expanded to find the best solution). For part B, a tour that visits the locations in the reverse order is equally acceptable, since all roads are assumed to be bidirectional.

Hints

This is a challenging project. Start early.
Use diagnostic messages to print the status of the frontier and explored list as you visit nodes in your search tree. I have included diagnostics for Part A that you can use to verify your program is working correctly.
You are expected to use the A* algorithm to solve Part B in its entirety, just by changing your representation of a state and finding a good heuristic. It will probably not be a particularly fast algorithm; that's OK. The idea of Part B is to use A* to solve a non-traditional problem.
If you are using C++, use the built-in container classes like vector (for lists that grow and shrink), map or unordered_map (for lookup tables; unordered_map is the hash table), and priority_queue. The goal of this project is not to build your own data structures; it is to implement the A* algorithm.

Fun ideas

Compare some of the routes your program finds against Google Maps or another mapping service. Why are they different sometimes?