The Classes P and NP

We now shift gears slightly and move to the examination of two families of problems, which are very important to computer scientists. Two families which constitute the bulk of our practical computational problems and have been central to the theory of computation for many years.

The first is a class which contains all of the problems we solve using computers. If we think about the problems we actually present to the computer we note that not too many computations require more than O(n3) or O(n4) time. In fact, most of the important algorithms we compute are somewhere in the O(log n) to O(n3) range. Thus we shall state that practical computation resides within polynomial time bounds. There is a name for this class of problems.

Definition. The class of polynomially solvable problems, P contains all sets in which membership may be decided by an algorithm whose running time is bounded by a polynomial.

Besides containing all of what we have decided to consider practical computational tasks, the class P has another attractive attribute. Its use allows us to not worry about our machine model since all reasonable models of computation (including programs and Turing machines) have time complexities, which are polynomially related.

That was the class of problems we actually compute. But there is another important class. This one is the class of problems that we would love to solve but are unable to do so exactly. Since that sounds strange, let's look at an example. Consider final examination scheduling. A school has n courses and five days in which to schedule examinations. An optimal schedule would be one where no student has to take two examinations on the same day. This seems like an easy problem. But there are O(5n) possible different schedules. If we looked at them all with a computer which could check a million schedules every second, the time spent checking for a value of n = 50 would be about

200,000,000,000,000,000,000 years!

Yes, that's right. Obviously this will not be done between registration and the end of the semester.

One might wonder if the above analysis was needed, because after all, who would look at all of the schedules? You only need to check a few of the obvious ones. Or do you? Think back over all of the examination schedules you have seen. Were there any, which were optimal? No! So, there must be a small problem somewhere. We shall see more on this problem later.

Let us think a little more about examination schedules. While it might be very difficult to find a good one, it is easy to check a schedule to see how near perfect it is. This process is called verification and allows us to know quickly if we stumble upon a good schedule.

Consider another problem that of finding a minimal length tour of n cities where we begin and end at the same place. (This is called the closed tour problem.) Again, there are many solutions, in fact n factorial different tours are possible. And, once more, if we have a tour, we can easily check to see how long it is. Thus if we want a tour of less than some fixed length, we can quickly check candidates to see if they qualify.

This is interesting and provides some hope of solving problems of this kind. If we can determine the worth of an answer, then maybe we can investigate promising solutions and keep the best one.

Let us consider a class of problems, which all seem very complex, but have solutions, which are easily checked. Here is a class, which contains the problems for which solutions can be verified in polynomial time.

Definition. The class of nondeterministic polynomially acceptable problems, NP, contains all sets in which membership can be verified in polynomial time.

This may seem to be quite a bizarre collection of problems. But think for a moment. The examination scheduling problem does fit here. If we were to find a solution, it could be checked out very quickly. Lots of other problems fall into this category. Another instance is closed tours of groups of cities. Many graph problems used in CAD algorithms for computer chip design fit in here also. Also, most scheduling problems. This is a very interesting collection of problems.

One might wonder about the time actually involved in solving membership in this class. The only known relationship between NP and deterministic time is the following result.

Theorem 1. For every set A in NP there is a polynomial p(n) such that the problem of determining whether a data item of size n is a member of A can be solved in 2p(n) time.

A useful tool in studying the relationships between members of a class is the translation or mapping of one to another. If we can translate one set into another, we can often deduce properties of one by the properties that we know the other possesses. This is called reducibility, is pictured in Figure 1, and defined below.

Definition. The set A is many-one polynomial-time reducible to the set B (this is written as A p B) if and only if there is a recursive function g(x) which can be computed in polynomial time such that for all x: x Î A if and only if g(x) Î B.

Figure 1 - A many to one mapping between sets

Note that all of the members of A map into a portion of B and all elements not in A map into a part of B's complement. This gives us a way to solve membership in A if we know how to solve membership in B. If A is reducible to B via the function g(x), then all we need do to determine if x is in A is to check to see if g(x) is in B.

One of the properties preserved by reducibility is complexity. Recall that to decide whether x was in A, we had to:

  1. Compute g(x), and

  2. Check to see if g(x) was in B.

Thus the complexity of deciding membership in A is the sum of the complexities of computing g(x) and deciding membership in B. If computing g(x) does not take very long then we can say that B is no more complex than A. From this discussion we can state the following theorem.

Theorem 2. If A p B and B is in P, then A is in P also.

And of course if A pB and B is in NP, then A is in NP for exactly the same reasons. This brings up another concept.

Definition. The set A is hard for a class of sets if and only if every set in the class is many-one reducible to A.

If the reducibility function is not very complex, this means that the set A is at least as complex as any of the members of the class it is hard for. Thus an NP-hard set would be as difficult to decide membership in as any set in NP. If it were itself in NP, it would be the most complex set in the class. We have a name for this.

Definition. A set is complete for a class if and only if it is a member of the class and hard for the class.

Here is another fact about NP-complete sets and polynomial reducibilities, which will be our major tool in proving sets NP-complete.

Theorem 3. If A p B for a set B in NP, and A is NP-complete, then B is NP-complete also.

Polynomial reducibilities also may be used to place upper bounds upon sets in P. For example, the following result is based on this.

Theorem 4. If A is NP-complete then A is a member of P if and only if P = NP.

Proof. Almost obvious. If A is a member of P then every set polynomially reducible to A is also in P. Thus the NP-completeness of A forces every single one of the sets in NP to be members of P.

On the other hand, if P = NP then of course A is a member of P as well. 

This is very interesting. If we know that membership in one NP-complete set can be decided in polynomial time then we know that every set in NP can be decided using some polynomial algorithm! This means that we would get all of their recognition algorithms for the price of one. But, it is felt that this is highly unlikely since we know of no sub-exponential algorithms for membership in any of these sets and the problem has been around for a while.

In closing, here is a small list of some of the many problems that are members of NP, and are in fact, NP-complete.

0-1 Integer Programming (0-1 INT). Given a matrix A and a vector b, is there a vector x with values from {0, 1} such that Ax b?

CLIQUE. Given a graph and an integer k, are there k vertices in the graph which are all adjacent to each other?

Vertex Cover (VC). Given a graph and an integer k, is there a collection of k vertices such that each edge is connected to one of the vertices in the collection?

Chromatic Number (COLOR). Given a graph and an integer k, is there a way to color the vertices with k colors such that adjacent vertices are colored differently?

Examination Scheduling (EXAM). Given a list of courses, a list of conflicts between them, and an integer k; is there an exam schedule consisting of k dates such that there are no conflicts between courses which have examinations on the same date?

Closed Tour (TOUR). Given n cities and an integer k, is there a tour, of length less than k, of the cities which begins and ends at the same city?

Rectilinear Steiner Spanning Tree (STEINER). Given n points in Euclidean space and an integer k, is there a collection of vertical and horizontal lines of total length less than k, which spans the points?

Knapsack. Given n items, each with a weight and a value, and two integers k and m, is there a collection of items with total weight less than k, which has a total value greater than m?