Monday, December 26, 2011

13: Another Problem For The Millenium

    Some of you may have noticed that in last week's podcast, while I
discussed the politics of the Poincare' Conjecture, I never got around
to describing the actual problem.  There's a simple reason for that--
I just don't understand it.  It comes from a domain, topology, that I
have not studied in nearly enough depth.  But that got me thinking
that it might be fun to try to explain the one millenium problem that
I *do* understand.  This problem, from the domain of theoretical
computer science, is known as the "P=NP" question.  It also has the
distinction of being the only Millenium Problem that has been
mentioned on The Simpsons.  Obviously, I'll have to gloss over a bunch
of details, but I think I can give you the general flavor.
    To understand this problem, first we need the concept of the
"complexity" of a computer program.  Basically, this is a formula that
describes how much time a program takes, based on the size of its
input.  For example, suppose I give you a program to sort a list of
names, and tell you it has n-squared complexity.  Then you run the
program on your Outlook contact list, and it takes 1 second.  Now
suppose you have an unusually social weekend, and your number of
friends doubles, and you run it again.  Since the complexity is
n-squared, the increase in runtime will be by a factor of 2-squared,
or 4 seconds.  Now suppose people find out that you listen to Math
Mutation, and your number of friends multiplies by 100.  The program
will now take 100-squared, or 10000 seconds, a little under 3 hours.
    Now suppose I give you another sorting program written by a
slightly less talented programmer, and tell you it has a complexity of
2 to the nth power, and it also runs in 1 second on your original
contact list.  On the double-sized list, it will take 2 to the 2nd
power, or 4 seconds again.  But when the list multiples by 100, it
will take 2 to the 100th power seconds, which means the sun will have
burned out eons before the program finishes. 
    These examples illustrate two major complexity classes:
polynomial and exponential.  Polynomial time means formulas like
n-squared, where the exponents are constant.  Exponential time means
formulas like 2 to the nth power, where the exponent contains a
variable.  In general, if you have a polynomial time solution for a
problem, that is likely to mean that it is practical to fully solve it
on a modern computer.  If the best you can do is exponential time,
then a practical and fully general solution is not feasible, and we
usually have to use various forms of approximation or incomplete
heuristics to get real-life solutions.
    Now we are ready to talk about P and NP, the main terms of
this millenium problem.  "P" refers to the class of problems that have
a polynomial-time solution.  Sorting a list of names is a typical
example.  "NP" refers to a presumably wider class of problems where,
if someone gives you the answer, you can check it in polynomial time.
A classic example of this is the "travelling salesman problem":  given
a roadmap of a country, can you visit every city while driving less
than 10000 miles total?  Finding a solution is very hard, since there
are an  exponential number of possible routes.  But if someone gives
you a route, it's easy to add up the driving time and see if the
answer is correct.  Many problems known to be in NP currently only
have exponential-time solutions, which makes solving them with modern
computers very difficult. 
    Finally, we are at the point where we can describe the "P=NP"
problem.  Basically, this problem asks: does P=NP?  Is
it the case that every problem in NP, where we can check the answer in
polynomial time, is also in P, which means we can solve it in
polynomial time?  Or in other words, is it the case that for any
problem, if we can write an efficient computer program to check the
answer, we could have also written an efficient program to solve it
from scratch?
    Computer scientists generally believe that the answer has to be
"no"-- there are just too many practical problems in NP, like the
travelling salesman problem, that would make someone millions of
dollars if they could solve with an efficient, polynomial program.
Yet nobody has solved these.  And besides, it would defy common
sense-- isn't it obvious that it's much easier to check an answer
someone gives you, than to solve the problem in the first place?  But
mathematicians have not yet been able to prove it.  So while we have a
strong feeling that P does not equal NP, that it is truly harder to
solve these problems than to check the answer, nobody knows for sure.
    Proving that something can't be done, that nobody will ever write
a clever program to solve the travelling salesman problem efficiently,
is very hard.  It's much easier to prove that something *can* be done,
and as a result, every week or so some deluded contrarian publishes a
faulty proof on the net that P really does equal NP, by claiming some
horrendously complex algorithm they have designed can solve general NP
problems in polynomial time.  They always mess up some trivial detail,
or fail to understand the exact parameters of the problem. 

    Anyway, that is my 5-minute attempt at summarizing the P=NP
problem.  Send me an email to tell me how you think I did. 

    And this has been your Math Mutation for today.
  • Millenium Problems on Wikipedia
  • P=NP on Wikipedia
  • Travelling Salesman Problem on Wikipedia
  • Sorting Algorithms on Wikipedia
  • No comments:

    Post a Comment