An NP-complete Set

The definitions and discussion about P and NP were very interesting. But, of course, for any of this discussion to be worthwhile we need to see an NP-complete set, or at least prove that there is one. The following definitions from the propositional calculus lead to our first NP-complete problem.

Definition. A clause is a finite collection of literals, which in turn are Boolean variables or their complements.

Definition. A clause is satisfiable if and only if at least one literal in the clause is true.
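To make the definition concrete, here is a minimal Python sketch. It assumes a literal is encoded as a nonzero integer (+k for variable vk, -k for its complement) and a truth assignment as a dictionary; the encoding, the function names, and the sample clause are illustrative assumptions, not part of the definitions above.

```python
# Assumed encoding: +k stands for variable v_k, -k for its complement.

def literal_is_true(literal, assignment):
    """Return the truth value of a literal under `assignment` (dict: index -> bool)."""
    value = assignment[abs(literal)]
    return value if literal > 0 else not value


def clause_is_true(clause, assignment):
    """A clause is true when at least one of its literals is true."""
    return any(literal_is_true(lit, assignment) for lit in clause)


# Hypothetical clause (v1, not v2, v3).
clause = (1, -2, 3)
print(clause_is_true(clause, {1: False, 2: True, 3: False}))   # False
print(clause_is_true(clause, {1: False, 2: False, 3: False}))  # True
```

The second assignment makes the clause true only through the complemented literal, which is why complements are part of the definition.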
Suppose we examine the clauses below, which are made up of literals from the set of Boolean variables {v1, ..., vn}.
![]()
The first clause is satisfiable if either v1 or v3 is true or v2 is false. Now let us consider the entire collection of clauses. All three are true (at once) when all three variables are true. Thus we shall say that a collection of clauses is satisfiable if and only if there is some assignment of truth values to the variables which makes all of the clauses true simultaneously. The collection:
![]()
is not satisfiable because at least one of the three clauses will be false no matter how the truth values are assigned to the variables.

Now for the first decision problem which is NP-complete. It is central to theorem proving procedures and the propositional calculus.

The Satisfiability Problem (SAT). Given a set of clauses, is there an assignment of truth values to the variables which makes all of the clauses true simultaneously?

Since some collections are satisfiable and some are not, this is a nontrivial decision problem. And it just happens to be NP-complete! By the way, it is not the general satisfiability problem for propositional calculus, but the conjunctive normal form satisfiability problem. Here is the theorem and its proof.

Theorem 5. The satisfiability problem is NP-complete.

Proof Sketch.
The first part of the proof is to show that the satisfiability problem is in NP. This is simple. A machine which checks this merely jots down a truth value for each Boolean variable in a nondeterministic manner, plugs these into each clause, and then checks that at least one literal per clause is true. A Turing machine can do this as quickly as it can read the clauses.

The hard part is showing that every set in NP is reducible to the satisfiability problem. Let's start. First of all, if a set is in NP then there is some one tape Turing machine Mi with alphabet {0, 1, b} which recognizes members (i.e., verifies membership) of the set within time p(n) for a polynomial p(n). What we wish is to design a polynomial time computable recursive function gi(x) such that:

Mi recognizes x if and only if gi(x) ∈ SAT.

For gi(x) to be a member of SAT, it must be some collection of clauses which contain at least one true literal per clause under some assignment of truth values. This means that gi must produce a logical expression which states that Mi accepts x. Let us recall what we know about computations and arithmetization. Now examine the following collection of assertions.

a) When Mi begins computation, #x is on the tape, the tape head is on square one, and instruction 1 is about to be executed.
b) At each step of Mi's computation, exactly one instruction is being executed, the tape head is on exactly one tape square, and each tape square holds exactly one symbol.
c) At each computational step, the instruction being executed and the symbol on the square being scanned completely determine the symbol written, the movement of the tape head, and the next instruction.
d) Before p(n) steps, Mi must be in a halting configuration.

These assertions tell us about the computation of Mi(x). So, if we can determine how to transform x into a collection of clauses which mean exactly the same things as the assertions written above, we have indeed found our gi(x). And, if gi(x) is polynomially computable we are done.

First let us review our parameters for the Turing machine
Mi. It uses the alphabet {0, 1, b, #} (where # is used only as an endmarker) and has m instructions. Since the computation time is bounded by the polynomial p(n) we know that only p(n) squares of tape may be written upon.

Now let us examine the variables used in the clauses we are about to generate. There are three families of them. For all tape squares from 1 to p(n) and computational steps from time 0 to time p(n), we have the collection of Boolean variables of the form
HEAD[s, t]
which is true if Mi has its tape head positioned on tape square s at time t. (Note that there are p(n)^2 of these variables.) For the same time bounds and all instructions, we have the variables of the form
INSTR[i, t]
which is true if Mi is about to execute instruction number i at time t. There are only m*p(n) of these variables. The last family contains variables of the form
CHAR[c, s, t]
which is true if character c in {0, 1, b, #} is found upon tape square s at time t. So, we have O(p(n)^2) variables in all. This is still a polynomial.

Now let's build the clauses which mean the same as the above assertions. First, the machine must begin properly. At time 0 we have #x on the tape. If x = 0110 then the clauses which state this are:
(CHAR[#,1,0]), (CHAR[0,2,0]), (CHAR[1,3,0]),
(CHAR[1,4,0]), (CHAR[0,5,0])
and blanks are placed upon the remainder of the tape with:
(CHAR[b,6,0]), ... , (CHAR[b,p(n),0]).
Since the machine begins on square one with instruction 1, we also include:
(HEAD[1,0]), (INSTR[1,0]).
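The clauses for this first assertion can be written down mechanically. The sketch below is one way to do so in Python, assuming variables are represented as tuples such as ("CHAR", c, s, t) and that the value of p(n) is passed in as p; the function name and the choice p = 8 are hypothetical.

```python
def initial_clauses(x, p):
    """Unit clauses describing time 0.

    The tape holds '#' followed by x and then blanks up to square p,
    the head is on square 1, and instruction 1 is about to be executed.
    Each clause is a list containing a single variable (assumed encoding).
    """
    tape = "#" + x
    if len(tape) > p:
        raise ValueError("p must be at least len(x) + 1")
    clauses = []
    for s in range(1, p + 1):
        # Squares beyond the input hold the blank symbol b.
        c = tape[s - 1] if s <= len(tape) else "b"
        clauses.append([("CHAR", c, s, 0)])
    clauses.append([("HEAD", 1, 0)])
    clauses.append([("INSTR", 1, 0)])
    return clauses


clauses = initial_clauses("0110", p=8)
print(clauses[0], clauses[4], clauses[5])
print(len(clauses))  # 10, one clause per tape square plus two more
```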
That finishes our first assertion. Note that all of the variables in these clauses must be true for gi(x) to be satisfiable since each clause contains exactly one literal. This starts Mi(x) off properly. Also note that there are p(n)+2 of these particular one variable clauses.

(NB. We shall keep count of the total number of literals used so far as we go so that we will know |gi(x)|.)

During computation one instruction may be executed at each step. But, if the computation has halted then no more instructions can be executed. To remedy this we introduce a bogus instruction numbered 0 and make Mi switch to it whenever a halt instruction is encountered. Since Mi remains on instruction 0 from then on, at each step exactly one instruction is executed.

The family of clauses (one for each time t ≤ p(n)) of the form:
(INSTR[0,t], INSTR[1,t], ... , INSTR[m,t])
maintains that Mi is executing at least one instruction during each computational step. There are (m+1)*p(n) literals in these. We can outlaw pairs of instructions (or more) at each step by including a clause of the form:
(¬INSTR[i,t], ¬INSTR[j,t])
for each instruction pair i and j (where i < j) and each time t. These clauses state that no pair of instructions can be executed at once and there are about p(n)*m^2 literals in them.

Clauses which mandate the tape head to be on one and only one square at each step are very much the same. So are the clauses which state that exactly one symbol is written upon each tape square at each step of the computation. The number of literals in these clauses is on the order of p(n)^2. (So, we still have a polynomial number of literals in our clauses to date.)

Now we must describe the action of Mi when it changes from configuration to configuration during computation. Consider the Turing machine instruction:
Thus if Mi is to execute instruction 27 at step 239 and is reading a 0 on square 45, we would state the following implication:
if
Recalling that the phrase (if A then B) is equivalent to (not(A) or B), we now translate the above statement into the clauses:
![]()
![]()
![]()
Note that the second line of instruction 27 contains a halt. In this case we switch to instruction 0 and place the tape head on a bogus tape square (square number 0). This would be something like:
![]()
![]()
![]()
(These clauses are not very intuitive, but they do mean exactly the same as the if-then way of saying it. And besides, we've got it in clauses just like we needed to. This was quite convenient.)
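The translation from an if-then statement to a trio of clauses can be sketched in Python. The antecedents below are the ones named in the text (instruction 27, step 239, a 0 on square 45); the consequents (write a 1, move right to square 46, go to instruction 42) are a hypothetical effect chosen only for illustration, and the tuple encoding and function name are assumptions.

```python
def implication_to_clauses(antecedents, consequents):
    """Translate (a1 and ... and ak) -> (c1 and ... and cr) into r clauses.

    Since (if A then B) is (not(A) or B), each resulting clause is
    (not a1, ..., not ak, cj). A literal is a pair (variable, polarity),
    where polarity False means the variable is complemented.
    """
    negated = [(var, False) for var in antecedents]
    return [negated + [(c, True)] for c in consequents]


antecedents = [("INSTR", 27, 239), ("HEAD", 45, 239), ("CHAR", "0", 45, 239)]
# Hypothetical effect of the instruction line: write 1, move right, go to 42.
consequents = [("CHAR", "1", 45, 240), ("HEAD", 46, 240), ("INSTR", 42, 240)]

for clause in implication_to_clauses(antecedents, consequents):
    print(clause)
```

Each of the three clauses carries the same three complemented antecedents plus one consequent, so the number of literals per instruction line is a small constant.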
In general, we need trios of clauses like the above for every line of each instruction, at every time, for all of the tape squares. Again, O(p(n)^2) literals are involved in this.

To make sure that the rest of the symbols (those not changed by the instruction) remain on the tape for the next step, we need to state things like:
![]()
which become clauses such as:
![]()
These must be jotted down for each tape square and each symbol, for every single time unit. Again, we have O(p(n)^2) literals.

When Mi halts we pretend that it goes to instruction 0 and place the head on square 0. Since the machine should stay in that configuration for the rest of the computation, we need to state for all times t:
![]()
![]()
(this was another if-then statement) and note that there are O(p(n)) literals here.
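A rough Python sketch of these two persistence conditions follows. The exact clause shapes are an assumption: it reads "a symbol on a square not under the head is unchanged" as (not HEAD[s,t] and CHAR[c,s,t]) implies CHAR[c,s,t+1], and "a halted machine stays halted" as INSTR[0,t] implies INSTR[0,t+1] together with HEAD[0,t] implies HEAD[0,t+1]. The function names and the (variable, polarity) encoding are hypothetical.

```python
SYMBOLS = ("0", "1", "b", "#")


def unchanged_symbol_clauses(p):
    """Yield one clause per time, square, and symbol (assumed clause shape)."""
    for t in range(p):
        for s in range(1, p + 1):
            for c in SYMBOLS:
                # (not HEAD[s,t] and CHAR[c,s,t]) -> CHAR[c,s,t+1]
                yield [
                    (("HEAD", s, t), True),
                    (("CHAR", c, s, t), False),
                    (("CHAR", c, s, t + 1), True),
                ]


def stay_halted_clauses(p):
    """Yield clauses keeping the machine on instruction 0 and square 0."""
    for t in range(p):
        yield [(("INSTR", 0, t), False), (("INSTR", 0, t + 1), True)]
        yield [(("HEAD", 0, t), False), (("HEAD", 0, t + 1), True)]


p = 3  # hypothetical small value of p(n)
print(sum(len(c) for c in unchanged_symbol_clauses(p)))  # grows like p**2
print(sum(len(c) for c in stay_halted_clauses(p)))       # grows like p
```

Under these assumed shapes the literal counts are 12*p^2 and 4*p, which agrees with the O(p(n)^2) and O(p(n)) bounds quoted in the text.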
One more assertion and we are done. Before p(n) steps, Mi must halt if it is going to accept. This is an easy one since the machine goes to instruction 0 only if it halts. This is merely the clause
(INSTR[0, p(n)]).
Of course this one must be true if the entire collection of clauses is to be satisfiable.
That is the construction of gi(x). We need to show that it can be done in polynomial time. Let us think about it. Given the machine and the time bound p(n), it is easy (long and tedious, but easy) to read the description of the Turing machine and generate the above clauses. In fact, we could write them down in a steady stream as we counted to p(n) in nested loops over the time steps, tape squares, and instructions.
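As one illustration of such loops, the Python sketch below streams out the clauses that force exactly one instruction per step. It is a sketch under stated assumptions, not the original construction verbatim: literals are (variable, polarity) pairs, times run from 0 to p inclusive, and the function name and the small values of m and p are hypothetical.

```python
from itertools import combinations


def one_instruction_clauses(m, p):
    """Yield clauses saying exactly one of instructions 0..m is active at each time."""
    for t in range(p + 1):
        # At least one instruction is being executed at time t.
        yield [(("INSTR", i, t), True) for i in range(m + 1)]
        # No pair of instructions i < j is executed at time t.
        for i, j in combinations(range(m + 1), 2):
            yield [(("INSTR", i, t), False), (("INSTR", j, t), False)]


clauses = list(one_instruction_clauses(m=3, p=4))
literals = sum(len(c) for c in clauses)
print(len(clauses), literals)  # 35 80
```

Because each clause is produced directly inside the loops, the running time is proportional to the number of literals written, which is the point of the argument that follows.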
So, computing gi(x) takes about as much time as it does to write it down. Thus its complexity is O(|gi(x)|), the same as the total length of all of the literals in the clauses. Since there are O(p(n)^2) of these and the length of a literal is O(log p(n)), we arrive at polynomial time complexity for the computation of gi(x).

The remainder of the proof is to show that Mi accepts x if and only if gi(x) ∈ SAT. While not completely trivial, it does follow from an examination of the definitions of how Turing machines operate compared to the satisfiability of the clauses in the above construction. The first part of the proof is to argue that if Mi accepts x, then there is a sequence of configurations which Mi progresses through. Setting the HEAD, CHAR, and INSTR variables so that they describe these configurations makes the set of clauses computed by gi(x) satisfiable. The remainder of the proof is to argue that if gi(x) can be satisfied then there is an accepting computation for Mi(x).

That was our first
NP-complete problem. It may not be quite everyone's favorite, but at least we have shown that one does indeed exist. And now we are able to state a result having to do with the question about whether P = NP in very explicit terms. In fact the satisfiability problem has become central to that question. And by the second corollary, this problem can aid in proving NP-completeness.

Corollary. SAT is in P if and only if P = NP.

Corollary. If A ∈ NP and SAT ≤p A then A is NP-complete.

So, all we need to do is determine the complexity of the satisfiability problem and we have discovered whether P and NP are the same. Unfortunately this seems much easier said than done!
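The two sides of the question can be contrasted in a short Python sketch. It assumes literals are encoded as signed integers (+k for vk, -k for its complement); the function names and sample clauses are hypothetical. Checking a proposed assignment takes time linear in the size of the clauses, which is the NP membership argument from the proof sketch, while the obvious deterministic search tries all 2^n assignments. This is only an illustration of why the problem looks hard, not a statement about the best known algorithms.

```python
from itertools import product


def satisfies(clauses, assignment):
    """Fast check: every clause has at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def brute_force_sat(clauses, n):
    """Try all 2**n assignments to v1..vn; return a satisfying one or None."""
    for values in product([False, True], repeat=n):
        assignment = {k + 1: v for k, v in enumerate(values)}
        if satisfies(clauses, assignment):
            return assignment
    return None


# Hypothetical collections: the first is satisfiable, the second is not.
print(brute_force_sat([(1, -2, 3), (2, 3)], 3))
print(brute_force_sat([(1,), (-1,)], 1))  # None
```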