2010-08-09 by
qntm

Okay, let's see if I can correctly interpret scripture.

Firstly, you probably already know what big O notation is. To put it in very simple terms, big O notation is a way to express the computational complexity of an algorithm. An algorithm is a straightforward method or sequence of instructions for solving a particular **set of problems**.

A great example of an algorithm is the pen-and-paper method of long multiplication. You learned this in primary school so I won't patronise you by rehearsing the steps, but here's a worked example:

42 x 64 -------- 168 + 252 -------- 2688 --------

Notice how this involves a total of four separate multiplications and then three more addition steps to determine the final answer. Notice also how the total number of steps (multiplications and additions and so on) is dependent on the lengths of the numbers that you are multiplying. Multiplying two single-digit numbers takes just one step, because you have the result memorised. Multiplying a two-digit number with a one-digit number takes two multiplication steps (one for each digit) and one addition step to put those two results together. (Addition itself can be broken down into further steps as well.)

The point is that long multiplication is an *algorithm* which is equally useful for multiplying any pair of integers. The only difference is the number of steps the algorithm takes. In general, the number of steps scales with the number of digits. If the first integer has `m` digits and the second integer has `n` digits, the number of individual multiplication steps required is of order O(`m``n`). Because we assume that each step takes the same amount of time to complete, this is the time complexity of the algorithm.

An algorithm of time complexity O(

`n`) is one which increases in time*linearly*as the "size of the problem" (whatever`n`stands for) increases. Double`n`, and it will take twice as long to complete.An algorithm of complexity O(

`n`^{2}) takes*quadratic*time, meaning that if you double`n`it will take four times as long to complete.An algorithm of complexity O(

`n`^{k}) for any constant`k`is said to take*polynomial*time.`k`= 1 is linear time,`k`= 2 is quadratic time.An algorithm of complexity O(2

^{n}) takes*exponential*time; simply increasing`n`by 1 will double calculation time.

Often, multiple algorithms may be applicable to the same problem or set of problems. Selecting the best algorithm for a situation can be difficult, especially if you're smart enough that devising an entirely new, more efficient algorithm might be an option.

(There are also some very important sets of problems for which it has been *proven* that *no algorithm exists at all*. These problem sets are called undecidable. The name is misleading: just because a *set* of problems has been deemed *generally* undecidable doesn't mean that any *specific* problem within that set is unsolvable.)

When I said "straightforward sequence of instructions" above, what I really meant was "computer program". And by "computer" what I mean is an idealised mathematical model of a computer called an *automaton*. Automata come in varying degrees of capability, starting from very simple machines which can only implement very simple algorithms, ranging up to full-blown Turing machines which can implement basically any algorithm.

This is where the notions of "step" and "time" become important. The computational complexity of an algorithm is based directly on the number of steps it would take a Turing machine to work the algorithm all the way through.

**P** is defined as *the set of all decision problems for which an algorithm exists which can be carried out by a deterministic Turing machine in polynomial time*.

Your computer is a deterministic Turing machine.

A sample set of problems falling in **P** is "Is `n` prime?"

**NP** is defined as *the set of all decision problems for which an algorithm exists which can be carried out by a nondeterministic Turing machine in polynomial time*.

A nondeterministic Turing machine is like a conventional Turing machine, but at any given step during computation it can do *more than one thing at once*. At this point it effectively bifurcates (multifurcates?), becoming multiple Turing machines simultaneously computing multiple possibilities. As computation continues, a nondeterministic Turing machine can split many times, calculating a vast number of possibilities simultaneously.

There is no computer in the world which qualifies as a true nondeterministic Turing machine. With some minor ifs and buts, nondeterministic Turing machines are a purely abstract concept and can't exist in reality.

A sample set of problems falling in **NP** is "What integers between 1 and `q` are prime?"

A nondeterministic Turing machine can do multiple things at once. So, a nondeterministic Turing machine can check every `n` between 1 and `q` at once, in the same amount of time that a deterministic Turing machine would take to check just one value of `n`.

Clearly, any nondeterministic Turing machine can masquerade as a deterministic Turing machine by simply not splitting at any step. So, any problem solvable by a deterministic Turing machine in polynomial time is also solvable by a nondeterministic Turing machine in polynomial time. Thus, **P** ⊆ **NP**.

(It is also true, though harder to prove, that a deterministic Turing machine can be made which emulates the behaviour of any nondeterministic Turing machine. However, this requires the nondeterministic Turing machine's algorithm to be modified. This modification will almost certainly make the algorithm slower in absolute terms, and probably increase its algorithmic complexity from polynomial time to something larger.)

However, this doesn't tell us whether the two are equal or not. While nondeterministic Turing machines *appear* to be vastly more powerful than deterministic Turing machines, this is neither obvious nor proven. In order to prove that **P** ≠ **NP**, we would need to prove that there exists a set of problems `X` such that:

`X`falls in**NP**. There exists an algorithm with which a nondeterministic Turing machine could solve problems in`X`in polynomial time.`X`does*not*fall in**P**. There exists*no algorithm whatsoever*with which a deterministic Turing machine could solve problems in`X`in polynomial time.

Plenty of `X`es in **NP** can be found, which satisfies the first condition easily. But satisfying this second condition, proving *no* polynomial time algorithm can solve `X`, is *incredibly* difficult, and in fact has never yet been done.

So, while it is widely suspected that **P** ≠ **NP**, and seemingly extremely obvious, it is unproven. And until it's proven nobody can say for sure.

NOTE: actually, we just have to prove that `X` *exists* (a non-constructive proof). We *don't* have to *actually find it* (a constructive proof), though this would be nice. In theory, this could make the problem easier to solve. Despite this, the problem is still unsolved!

The most important sample problem set in **P** is "Is `X` the correct decryption key for this encrypted file?" Obviously, checking to see whether a key is correct for an encrypted file is a very straightforward process, otherwise we would be sitting around all day waiting for things to fully decrypt.

However, this has a corresponding problem in **NP** which is "Which of *all the possible decryption keys* for this encrypted file is the correct one?" In the same amount of time that a deterministic Turing machine takes to check *one* key, a nondeterministic Turing machine running the exact same algorithm could check *every single key simultaneously and tell you the correct one*.

Now, as I mentioned, nondeterministic Turing machines are fictitious. Only deterministic Turing machines exist in reality. A deterministic Turing machine can always pretend to be an nondeterministic Turing machine (i.e. a brute-force systematic search of every possible key, one after the other), but only by modifying the algorithm to work in exponential time, which is almost always much less favourable than polynomial time. In other words, breaking open an encrypted file is extremely difficult.

But if **P** = **NP**, then for every polynomial-time nondeterministic Turing machine, such as our fictitious instantaneous code-breaker, we would know that there also exists a polynomial-time deterministic Turing machine, possibly with a less favourable polynomial but with the exact same functionality. We would know that *checking every single key simultaneously can be done in polynomial time*. In other words, breaking open an encrypted file would be very easy.

Now, most people are of the opinion that this is not the case. But again, until it's proven, nobody can say for sure.

## Discussion (28)

## 2010-08-09 21:17:50 by Snowyowl:

## 2010-08-11 16:02:11 by DaveA:

## 2010-08-11 16:24:45 by qntm:

## 2010-08-11 17:15:09 by DaveA:

## 2010-08-11 17:29:58 by qntm:

## 2010-08-11 17:38:35 by DaveA:

## 2010-08-12 13:19:07 by Snowyowl:

## 2010-08-12 15:38:52 by qntm:

## 2010-08-14 11:10:52 by Snowyowl:

## 2010-08-30 01:59:12 by ejl:

## 2011-09-18 22:23:55 by Justin:

## 2012-03-15 21:56:22 by Ryan:

## 2012-03-15 22:15:56 by qntm:

## 2012-04-20 17:55:30 by StefanFroelich:

## 2012-09-07 08:32:34 by Mandy:

## 2012-11-01 13:51:14 by Frank:

## 2013-07-20 03:48:18 by Nicky:

## 2013-07-20 04:01:41 by Nicky:

## 2013-10-14 21:38:14 by tonyv:

## 2013-12-02 20:24:46 by John:

## 2014-04-17 14:53:12 by MAGIC:

## 2014-10-23 11:18:41 by MRY:

## 2014-11-06 21:08:59 by wat:

## 2015-06-18 17:41:13 by Alex A.:

## 2015-06-18 17:44:29 by Alex A.:

## 2015-10-27 23:31:44 by Resuna:

## 2015-12-05 19:18:30 by Jay:

## 2017-08-19 19:02:16 by Ix: