P versus NP for dummies

Okay, let's see if I can correctly interpret scripture.

Algorithmic complexity

Firstly, you probably already know what big O notation is. To put it in very simple terms, big O notation is a way to express the computational complexity of an algorithm. An algorithm is a straightforward method or sequence of instructions for solving a particular set of problems.

A great example of an algorithm is the pen-and-paper method of long multiplication. You learned this in primary school so I won't patronise you by rehearsing the steps, but here's a worked example:

     42
 x   64
--------
    168
 + 252
--------
   2688
--------

Notice how this involves a total of four separate multiplications and then three more addition steps to determine the final answer. Notice also how the total number of steps (multiplications and additions and so on) is dependent on the lengths of the numbers that you are multiplying. Multiplying two single-digit numbers takes just one step, because you have the result memorised. Multiplying a two-digit number with a one-digit number takes two multiplication steps (one for each digit) and one addition step to put those two results together. (Addition itself can be broken down into further steps as well.)

The point is that long multiplication is an algorithm which is equally useful for multiplying any pair of integers. The only difference is the number of steps the algorithm takes. In general, the number of steps scales with the number of digits. If the first integer has m digits and the second integer has n digits, the number of individual multiplication steps required is of order O(mn). Because we assume that each step takes the same amount of time to complete, this is the time complexity of the algorithm.

  • An algorithm of time complexity O(n) is one which increases in time linearly as the "size of the problem" (whatever n stands for) increases. Double n, and it will take twice as long to complete.

  • An algorithm of complexity O(n2) takes quadratic time, meaning that if you double n it will take four times as long to complete.

  • An algorithm of complexity O(nk) for any constant k is said to take polynomial time. k = 1 is linear time, k = 2 is quadratic time.

  • An algorithm of complexity O(2n) takes exponential time; simply increasing n by 1 will double calculation time.

Often, multiple algorithms may be applicable to the same problem or set of problems. Selecting the best algorithm for a situation can be difficult, especially if you're smart enough that devising an entirely new, more efficient algorithm might be an option.

(There are also some very important sets of problems for which it has been proven that no algorithm exists at all. These problem sets are called undecidable. The name is misleading: just because a set of problems has been deemed generally undecidable doesn't mean that any specific problem within that set is unsolvable.)

Automata

When I said "straightforward sequence of instructions" above, what I really meant was "computer program". And by "computer" what I mean is an idealised mathematical model of a computer called an automaton. Automata come in varying degrees of capability, starting from very simple machines which can only implement very simple algorithms, ranging up to full-blown Turing machines which can implement basically any algorithm.

This is where the notions of "step" and "time" become important. The computational complexity of an algorithm is based directly on the number of steps it would take a Turing machine to work the algorithm all the way through.

P

P is defined as the set of all decision problems for which an algorithm exists which can be carried out by a deterministic Turing machine in polynomial time.

Your computer is a deterministic Turing machine.

A sample set of problems falling in P is "Is n prime?"

NP

NP is defined as the set of all decision problems for which an algorithm exists which can be carried out by a nondeterministic Turing machine in polynomial time.

A nondeterministic Turing machine is like a conventional Turing machine, but at any given step during computation it can do more than one thing at once. At this point it effectively bifurcates (multifurcates?), becoming multiple Turing machines simultaneously computing multiple possibilities. As computation continues, a nondeterministic Turing machine can split many times, calculating a vast number of possibilities simultaneously.

There is no computer in the world which qualifies as a true nondeterministic Turing machine. With some minor ifs and buts, nondeterministic Turing machines are a purely abstract concept and can't exist in reality.

A sample set of problems falling in NP is "What integers between 1 and q are prime?"

A nondeterministic Turing machine can do multiple things at once. So, a nondeterministic Turing machine can check every n between 1 and q at once, in the same amount of time that a deterministic Turing machine would take to check just one value of n.

Does P equal NP?

Clearly, any nondeterministic Turing machine can masquerade as a deterministic Turing machine by simply not splitting at any step. So, any problem solvable by a deterministic Turing machine in polynomial time is also solvable by a nondeterministic Turing machine in polynomial time. Thus, PNP.

(It is also true, though harder to prove, that a deterministic Turing machine can be made which emulates the behaviour of any nondeterministic Turing machine. However, this requires the nondeterministic Turing machine's algorithm to be modified. This modification will almost certainly make the algorithm slower in absolute terms, and probably increase its algorithmic complexity from polynomial time to something larger.)

However, this doesn't tell us whether the two are equal or not. While nondeterministic Turing machines appear to be vastly more powerful than deterministic Turing machines, this is neither obvious nor proven. In order to prove that PNP, we would need to prove that there exists a set of problems X such that:

  1. X falls in NP. There exists an algorithm with which a nondeterministic Turing machine could solve problems in X in polynomial time.
  2. X does not fall in P. There exists no algorithm whatsoever with which a deterministic Turing machine could solve problems in X in polynomial time.

Plenty of Xes in NP can be found, which satisfies the first condition easily. But satisfying this second condition, proving no polynomial time algorithm can solve X, is incredibly difficult, and in fact has never yet been done.

So, while it is widely suspected that PNP, and seemingly extremely obvious, it is unproven. And until it's proven nobody can say for sure.

NOTE: actually, we just have to prove that X exists (a non-constructive proof). We don't have to actually find it (a constructive proof), though this would be nice. In theory, this could make the problem easier to solve. Despite this, the problem is still unsolved!

Why is this important?

The most important sample problem set in P is "Is X the correct decryption key for this encrypted file?" Obviously, checking to see whether a key is correct for an encrypted file is a very straightforward process, otherwise we would be sitting around all day waiting for things to fully decrypt.

However, this has a corresponding problem in NP which is "Which of all the possible decryption keys for this encrypted file is the correct one?" In the same amount of time that a deterministic Turing machine takes to check one key, a nondeterministic Turing machine running the exact same algorithm could check every single key simultaneously and tell you the correct one.

Now, as I mentioned, nondeterministic Turing machines are fictitious. Only deterministic Turing machines exist in reality. A deterministic Turing machine can always pretend to be an nondeterministic Turing machine (i.e. a brute-force systematic search of every possible key, one after the other), but only by modifying the algorithm to work in exponential time, which is almost always much less favourable than polynomial time. In other words, breaking open an encrypted file is extremely difficult.

But if P = NP, then for every polynomial-time nondeterministic Turing machine, such as our fictitious instantaneous code-breaker, we would know that there also exists a polynomial-time deterministic Turing machine, possibly with a less favourable polynomial but with the exact same functionality. We would know that checking every single key simultaneously can be done in polynomial time. In other words, breaking open an encrypted file would be very easy.

Now, most people are of the opinion that this is not the case. But again, until it's proven, nobody can say for sure.

Discussion (28)

2010-08-09 21:17:50 by Snowyowl:

For the record, in the official P=NP problem statement, the complexity of the problem is defined as the number of bits used to describe the input. In the case of your multiplication problem, this means you can solve it in O(n) time. (Well, bits... for P=NP you can write it in decimal digits if you like, it makes no difference.)

2010-08-11 16:02:11 by DaveA:

What is the reasoning behind allowing comments for some posts and not others? I wanted to make a suggestion on your FTL entry but couldn't. If you're interested in forward looking real engineering sci-fi, the coolest subject would be in the realm of the first real Machine Intelligence. Humans struggle to fathom the physics of this universe. Our attention span is low, we get tired, we have limited memory, we're slow. How to figure out the solution to what's really going on at the bottom level of this universe? Create a Machine Intelligence and let <i>it</i> figure it out. MI is the last problem humans will need to solve. And the only one worth solving. Because it in itself solves all the rest.

2010-08-11 16:24:45 by qntm:

"FTL entry"?

2010-08-11 17:15:09 by DaveA:

"FTL entry" http://qntm.org/ftl

2010-08-11 17:29:58 by qntm:

Okay. Well, the reason I disabled comments on that entry was that I wasn't particularly looking for feedback on it. You've kind of twice defeated the comments system, firstly by providing feedback on an entry where no feedback was solicited and secondly by putting feedback on an entry under a completely different and unrelated entry. So, well done, I guess?

2010-08-11 17:38:35 by DaveA:

Regarding feedback: How can you be certain you don't want feedback? The whole idea of feedback is to expose yourself to something unexpected. Lately as regards myself I've realized I'm thinking in ruts. The same thoughts over and over again. No breakthroughs. How to get out of the rut? The answer is, modify my life to allow for the Unexpected to occur. This is why old people never innovate. They surround themselves with a nice, soft, warm, cozy environment where nothing unexpected ever comes up. Then they themselves become zombies. Note I'm not trying to hack your comments machinery. I'm just interested in some interaction on some of your topics of interest. I'm amazed at the sheer volume of your creative output. How do you maintain your motivation?

2010-08-12 13:19:07 by Snowyowl:

Ahem, breaking news: P VERSUS NP HAS BEEN SOLVED! (The proof hasn't been peer-reviewed yet though. It might still be wrong.) Yesterday, a paper was published concerning the conjunctive Boolean satisfiability problem, which asks whether a given list of logical statements contradict each other or not. This is an NP-complete problem, but the paper at www.hpl.hp.com/personal/Vinay_Deolalikar/Papers/pnp12pt.pdf asserts that it cannot be solved in polynomial time. If, like me, you don't want to read all 100-plus pages of that PDF, New Scientist has a relevant article at http://www.newscientist.com/article/dn19287 I thought you should be the first to know. Facebook is second.

2010-08-12 15:38:52 by qntm:

I heard about that a week ago! Why do you think I wrote this essay in the first place?

2010-08-14 11:10:52 by Snowyowl:

I realised that a few minutes after posting my comment... Sorry for that.

2010-08-30 01:59:12 by ejl:

Could you, like, write more on this subject? 'Cos you do explain things pretty well :-D (I do enjoy reading your occasional writes-up on these sorts of things)

2011-09-18 22:23:55 by Justin:

I have a thought experiment. Suppose you leave Earth at .99c. You begin working on a problem while at the same time a person on Earth is working on the exact same problem. You stop your ship after you have done one step on the problem and find out that back on Earth the other person has all ready done multiple steps. Feel free to email me at midshipman01_2001@yahoo.com

2012-03-15 21:56:22 by Ryan:

Maybe I'm reading this wrong, but wouldn't P=NP make entropy false? Wouldn't that imply that we could predict anything and everything such as a calculating a electrons velocity and place in space at the same time?

2012-03-15 22:15:56 by qntm:

No, this has nothing to do with entropy.

2012-04-20 17:55:30 by StefanFroelich:

I am so sticking with your blogs. You have a way of explaining things, I think you should try your hands at teaching.

2012-09-07 08:32:34 by Mandy:

I know this is an old post, but I was attempting to explain P vs NP to my husband and came across this article. You have a wonderful ability to explain something so complex into a very understandable term. I am amazed at the flow of the article and your ability to go from one explanation to the next without losing the reader in the process. I agree with the above comment, you should try your hand at teaching! Thank you for taking the time to form this blog, I look forward to reading other areas of interest!

2012-11-01 13:51:14 by Frank:

Could I have your opinion of my paper published in IEEE about P versus NP? See the links here: http://the-point-of-view-of-frank.blogspot.com/2012/08/p-versus-up.html

2013-07-20 03:48:18 by Nicky:

Nice work. The consequences of sequential versus concurrent programming and the two achieving the same solution in deterministic and non-deterministic polynomial time (or not if proven) is quite a computer science problem (as I am a ComSci). I was initially reflecting on series and their calculations both ways (seq and con) but then turned my attention to calculations involving sequences of conditional probabilities and that problem domain set. However although slightly dismissed, a previous comment involved space-time/entropy, and I did wonder if a computer on board a spacecraft could be affected by time distortions/fluctuations therefore affecting calculations at the quantum level - compressions and decompressions of time similar to black hole time fluctuations and whether P=NP could actually occur for some calculations at that moment in time theoretically speaking of course. Even so thanks for the explanation, I think my mind is now toast!

2013-07-20 04:01:41 by Nicky:

Checked your CV. What a surprise. I did 10 years at NHbr and Hursley. Alex (ex Notts grad) started around 2008 at Hursley - Oracle apps, spends time in NY for them. If you come across him say Hi from me. Just say Jemima to him... He will know.

2013-10-14 21:38:14 by tonyv:

The human brain is a device which can formulate both questions AND solutions. Therefore, one could reasonably ask, does it not also require an algorithm (or group of algorithms) to formulate a question? If one thinks about this hard enough one might then ask: Could a question such as P vs. NP ever arise in a universe where P = NP? On which type of Turing machine would the question be formulated in the first place? Basically, it seems to amount to the absurdity of one Turing machine asking another Turing machine to validate a parking ticket that does not require validation. Pardon me Mr. Turing, can I trouble you for a moment to answer a question for which we all already know the answer and for which this question could not exist? If a non-deterministic Turing machine cannot exist in a truly relativistic world due to problems of absolute simultaneity, then doesn't that sort of offer a hint at the physical-theoretical unification problem? In an absolutely relativistic world, absolute simultaneity does not exist and fixed position cannot be determined - so one could never establish a notion of what a measurement would mean in the first place. The traveling salesman problem, for example, would not arise because the notion of a fixed distance between two points would be an absurdity. And for subset sum - where would the notion of a closed set come from? Could the human brain really even be said to exist if it where not capable of functioning as a brain?

2013-12-02 20:24:46 by John:

I have read all of the above and decided that I am thick as I still do not have a clue what P vs NP means. Just thought I would share that!

2014-04-17 14:53:12 by MAGIC:

very helpful to the layperson

2014-10-23 11:18:41 by MRY:

Indeed that the way of explaining things .. You should make your own blog and populate it regarding different topics .. Bless

2014-11-06 21:08:59 by wat:

I don't Get it

2015-06-18 17:41:13 by Alex A.:

You guys are silly. I'm not a math guy but essentially p v. Np is completely useless and has no merrit in the real world. For instance a single threaded computer vs. a computer with 1billion threads cannot in or of itself or on it's own determine a solution. Even if you created a true parallel computer, or serial computer even and no they aren't around yet. What we have today are boolean true/false , 1 and 0 computers. You could never solve a problem, of this type anyway. Primarily becaus3 the nature of P will change as modern tech progresses towards non-deterministic machines. So in essence in the eventuality that we can use np to solve p problems, the problems themselves will become np problems... So why would this matter in the least? I mean at least a single version of p will equal np at some instance. The nature of computers is such that rather then like the human brain excluding functionality and specialising they incorporate new functionality. Which means p might evolve to np but will never exist in the same time frame and for both to be valid. Hence p v. Np is not relavent. The very nature of capacitors limits our ability to use non-deterministic Turing machines. Once the hardware is developed for truely tri state capacitors, p becomes useless.

2015-06-18 17:44:29 by Alex A.:

Basically I guess I was trying to say that p = np but there is no equation which could determine which p would equal np without a non-deterministic computer at which point p becomes irrelevant.

2015-10-27 23:31:44 by Resuna:

As I understand it... one of the theoretical benefits of quantum computing is that, while nobody actually knows how to do it, it might be possible to build a quantum computer that can function as a nondeterministic turning machine. At the very least Greg Egan and other post-information-revolution SF writers have written some pretty good stories about the possibility. :)

2015-12-05 19:18:30 by Jay:

What about NP-complete? How would one go about explaining that?

2017-08-19 19:02:16 by Ix:

You say "Your computer is a deterministic Turing machine.". A Turing Machine is equivalente to an algorithm procesed by my computer, but not only the hardware. You must see the TM as a software compatible with your computer but you computer doing nothing are not a TM. A non-deterministic TM can be a software that not solve a problem allways but offten give a good solution.

New comment by :

Plain text only. Line breaks become <br/>
The square root of minus one: