In this lesson, students examine a classic problem in computer science, the Traveling Salesperson Problem (TSP). Students will solve small instances of the problem, try to formulate algorithms to solve it, and discuss why these algorithms take a long time for computers (and humans) to compute. Students will see how the TSP grows in size much faster than the problem of adding characters to a password. Even though we use encryption to motivate a desire to learn about computationally hard problems, they are valuable to know about, in and of themselves. This lesson covers some territory about how we reason formally and mathematically about algorithms and figuring out how "hard" something is for a computer to do.

**Computationally hard:**a problem for which there is no proven algorithm for solving in a reasonable time. The larger the problem set, the harder it is to solve by any means. The algorithm to solve must have exponential growth, one unit larger problem multiplies the number of answers to check by n. Sometimes these problems are called "intractable." The goal in many of these problems is to find the best -- most optimal -- solution.

Students will be able to:

- Describe the TSP
- Explain why TSP is computationally hard
- Solve the TSP on small graphs
- Connect the properties of computationally hard problems with desirable properties for encryption

In this final algorithm detour, students are introduced to the Traveling Salesperson Problem, a classic problem in computer science for which there is no known algorithm to solve it, other than brute force. The number of possible solutions grows extremely fast, even for small inputs, and quickly becomes "unreasonable" to solve, making it a computationally hard problem. The ideas of computationally hard problems are leveraged for encryption to make ciphers that take an unreasonable amount of time to crack (as in thousands of trillions of years), but computationally hard problems are also important in their own right. There are many problems for which we wish we had reasonable algorithmic solutions - especially in medical fields - and we're still on the hunt to find them. No one has yet mathematically proven whether or not the problems we currently think are "hard" actually are.

- Activity Guide: The Traveling Salesperson Problem (Google Drive)
- Video: Traveling Salesperson Problem (Optional, highly-visual)

Remind the students the following as a recall:

- In a previous lesson we looked at the problem of trying to crack passwords and learned it would take a computer a very long time to try every possible combination of letters. We learned that the longer a password is, the harder it is to crack. This is because every character you add multiplies the number of possible passwords by the number of possibilities for that character. That's pretty fast growth.

Now, introduce today's lesson to the students:

- Computationally Hard Problems are problems that force a computer to run through many possibilities to find the right answer. In cryptography, our desire is to have an encryption function that is easy to compute if you have the key, but really hard if you don't. Today we're going to look at one of the most famous problems that is thought to be computationally hard. We'll see if we can find an algorithm to solve it, and along the way, we'll get a better sense of what "computationally hard" really means.

After distributing The Traveling Salesperson Problem Activity Guide,

- Give individual students time to work on finding the shortest routes for the small examples in the worksheet.
- Make sure that students compare their answers. If two partners disagree, they should test one another's work to find the shortest tour.

If you ask students how they arrived at their solutions they might say, "I tried a bunch of different paths and kept track of the shortest." Or some form of greedy algorithm: "I want to avoid that large edge" or "I want to use that small edge." These are good strategies, but they can always be thwarted with the question, "How do you know for sure it's the shortest?"

- This problem at first might seem like the Shortest Path problem we looked at in a previous lesson, but it's actually a much, much harder problem to solve with an algorithm. In fact there is no known algorithm to solve the TSP, other than by "brute force" -- trying every single possible tour of the nodes and then picking the one that gives the shortest possible tour.

The goal of the discussion is to bring out the fact that this would be a much harder problem for computers to solve than the shortest path problem or minimum spanning tree problem. "Hard" means that there is no known way to find the correct answer without generating ALL possible answers. Connect this problem to our desire to understand problems that are computationally hard for computers for encryption.

Here are some of the questions you can ask the students:

- What makes this problem harder than the shortest path problem?

- What kinds of algorithms did you think about in trying to solve the problem? Students might suggest strategies that leverage what they know about solving the minimum spanning tree and shortest path problems from previous lessons, such as starting with the least-cost edge in the graph, and then "greedily" connecting nodes by picking subsequent least-cost edges. This is a good heuristic and might yield a tour that is better than the worst possible tour, but there is no guarantee it's the best. The only algorithm that is known to find the absolute shortest tour every time is brute force, generating every possible tour and comparing them.

This is a fairly extended Wrap Up that asks the teacher to lead students through some reasoning about the traveling salesperson problem. Some preparation is recommended.

Many people for many years have tried to design an algorithm that works in all cases to find the exact optimum route for the traveling salesperson problem, with no success. So there is currently no algorithm to find the exact answer besides "brute force," trying all the possibilities.

Since we have to calculate the distances of all the tours anyway, we might as well count how many tours there are in the first place, just so we know how many calculations we have to perform.

For this demonstration, let's also consider a version of the problem faced by, say, an airline. For an airline, there are cities all over the world, and you can fly from any one directly to any other. So for an airline, there is a possible path that connects every possible node; we say the graph is "fully connected."

Show the students how the problem grows very quickly, by constructing all the possible tours for larger and larger numbers of nodes.

Starting with **3 nodes**, there is only one possible tour as shown in the image below.

With **4 nodes**, there are 3 possible tours as shown in the image below.

With **5 nodes**, there are 12 possible tours as shown in the image below. The way shown in the image is one way to reason through the different tours that 5 nodes can take: a 5th node could be added between each pair of nodes for every possible 4-node tour. Here, the actual distances don't matter because we're just trying to count the number of tours.

A different way to reason through the possible tours of 5 nodes is to "break" one of the edges, connecting it to the new 5th node.

Without having to draw pictures, we can reason about **6 nodes**: add a node to every possible edge of the 12 solutions 5-node solutions. Since each of 12 solutions above has 5 edges, it means we have 12*5 = 60 tours!

We can see how fast this grows: between a graph with 3 nodes and a graph of 6 nodes, our total number of possible tours grew from 1 to 60. This problem grows very quickly!

With just 10 nodes, this grows to about 181,440 possible tours.

With just 26 nodes, this grows to a 25-digit number: 7,755,605,021,665,492,992,000,000. (By comparison, the width of the entire observable universe(!!!!) in miles is roughly a 25-digit number.)

With 100 nodes you're up to a roughly a 155-digit number.

The math: With n nodes there are (n-1)!/2 ("n-1 factorial divided by 2") possible tours. n! (or "n factorial") is n(n-1)(n-2)(n-3)...(2)(1). So for example 10! = 10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1. You can confirm the formula with the examples above: 5 nodes = (5-1)! / 2 = 12.

Discuss the following with the students.

Computers work fast, but they have limits. In computer science, we have an actual mathematical hard line between reasonable and unreasonable runtimes.

**Reasonable** means the number of things the computer has to do is proportional to the size of the input to the problem. For example, the Minimum Spanning Tree and Shortest Path Problems are "reasonable" because they had algorithms that solved them by considering every edge in the graph once. The amount of time it takes is proportional to the number of edges. If the number of edges is n, even if there was an algorithm that had to look at the edge n^2 times, or n^3 times, that's still reasonable.

**Unreasonable** means the number of things the computer has to do grows as an exponent of the size of the input. So if you discovered that an algorithm made the computer do 2^n things, that's not reasonable, because it means every time the size of the input (n) gets bigger, the solution gets massively further out of reach. n! is another running time that is considered unreasonable. In real life, "unreasonable" problems would take a modern computer trillions and trillions of years to churn through all the possibilities.

So the brute force solution to TSP is unreasonable -- at least as far as we know! In fact, it's an open question how to solve this problem efficiently. If anyone finds a solution that runs in a reasonable time, a lot of security and encryption algorithms are in trouble, because many are based on the fact that this (and related problems) are unreasonable to solve.

Hard problems like this one can be used to create public keys that are tough for computers to crack.

Ask the students to think about the following: If the TSP is unsolvable for finding an exact solution, how do you think package delivery companies optimize their delivery routes?

Do a quick Google search on the Traveling Salesperson Problem, and explore the many many interesting resources you can find.

**CSTA K-12 Computer Science Standards:**CT.L3B:1, CT.L3B:2, CT.L3B:3, CT.L3B:4**Computer Science Principles:**4.2.1 (A, B, C, D)**Computer Science Principles:**4.2.3 (A)**Computer Science Principles:**4.2.4 (A, B, C)