Sunday, November 25, 2012

174: Too Much Math?

Audio Link

Before we start, I'd like to thank listeners JMS and Daniel54600, who posted nice reviews recently on iTunes. Remember, if you like the podcast, seeing good reviews does help motivate me to record the next episode!

Anyway, on to today's topic. Recently I read online about a research study that was published this past summer in the Proceedings of the US National Academy of Sciences, called "Heavy Use Of Equations Impedes Communication Among Biologists", by Tim Fawcett and Andrew Higginson. The authors analyzed a large number of recent biology papers, and tried to relate the number of equations in each paper to the citation count, or number of later papers that referenced it. They concluded that the more equations you have, the fewer people will later read and use your paper: each increase of 1 equation per page was associated with a 28% penalty in citations. So, does this really mean that scientists are afraid of math?
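As a back-of-the-envelope illustration of that 28% figure, here is a minimal Python sketch. It assumes the penalty compounds multiplicatively with each additional equation per page, which is an assumption made for illustration and not something stated above; the function name `relative_citations` is likewise hypothetical.

```python
def relative_citations(equations_per_page: float, penalty: float = 0.28) -> float:
    """Fraction of baseline citations a paper would keep.

    Assumption (not from the study as described here): the 28% penalty
    compounds multiplicatively for each extra equation per page.
    """
    return (1.0 - penalty) ** equations_per_page


# Compare a few equation densities against an equation-free baseline.
for density in (0, 1, 2, 3):
    share = relative_citations(density)
    print(f"{density} equations/page: {share:.0%} of baseline citations")
```

Under that assumption, a paper with two equations per page would keep only about half of the citations of a comparable paper with none.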

As you would expect, the Fawcett-Higginson paper led to quite a bit of discussion in the blogosphere. Do scientists really hate equations? One caveat that was often pointed out is the fact that theoretical papers with lots of equations tend to be more specialized, which inherently leads to lower citation rates. You can probably think of many other factors in the audiences and targets of papers which might have led to the reported results. On the other hand, there are also numerous articles supporting the implication that biologists don't like math in their papers, one pointing out that some Ph.D.s in the life sciences never had to take a math class beyond calculus. Going through the equations can be difficult-- one blogger linked in the show notes suggests that reading papers should be accompanied by "Eye tracking and mild electroshock therapy. If scientists skim over pages of equations or stare into space for too long while reading a technical paper, they get a gentle jolt of electricity to bring them back to the important equations at hand." A bit of hyperbole, but a good way to highlight the amount of mental discipline it takes to follow a detailed series of equations in a paper.

Reading about this paper brought to mind memories of my graduate studies in computer science, back in the early 90s. One day I had come to my advisor with a proposal for my dissertation topic, involving efficient checkpointing methods for parallel programming systems. After looking over my proposal, he looked up at me and said, "It's a good topic, but can you work in more equations?" I was a little confused, and asked where he thought I had skipped a needed equation. "Nowhere specific, it's just that more equations are expected." Needless to say, that was one of the more frustrating conversations of my aborted academic career.

I think what most discussions of the paper are missing is to ask the question: what is the role of an equation in a paper? Is it a quasi-mystical invocation, as my advisor seemed to think, adding credibility to the thesis regardless of its content? Is it an ego-boosting method for the author, allowing him to claim a deep understanding that is inaccessible to the casual reader? Obviously, neither of these is a very good answer. My answer would be that in a science or engineering paper equations perform a very important role: they allow you to rigorously prove that some result is a consequence of your assumptions, definitions, and experimental work. Mathematics is what you use to start with precise definitions and assumptions, and show their logical consequences. When you can derive a new equation to describe a concept in a rigorous, universally applicable way, that is one of the most powerful results possible in a piece of scientific work.

But do scientists and engineers usually work by spending their days deriving series of equations? Actually, before there is an equation, there is usually some kind of intuition about how the world works. And it is refining and clarifying these intuitions and experiments in a precise way that leads to the mathematical results. Often when you are writing a paper, proud of the difficult and rigorous work it took to derive the equations based on your original theories, it is easy to get caught in the trap of wanting to put those equations up front as the primary focus of your communication. It's not the fault of the authors that they make this mistake, as it has been programmed into us starting as early as high school geometry. Were you in a class where you were presented a series of proofs that seemingly sprang in ordered, fully understood form from Euclid's mind? I would bet that each of his results came from many hours of doodling and experimenting with different pictures and measurements. How many triangles do you think the ancient Greeks drew before they were ready to prove the Pythagorean Theorem?

When you're writing a paper in any scientific or engineering discipline, you need to keep in mind that your primary job is to communicate with your audience-- to enable the reader to understand the ideas in your paper. If you have performed some insightful math to verify the surprising and more general consequences of your initial intuitions, it is important to tell your readers about that, BUT also important to help them understand the original intuitions that led you to the equations. The first reading of a paper by an individual reader will nearly always be a casual attempt to get the basic idea; only after they have been convinced intuitively of the value of the new contribution will they take the time and mental energy to follow all the details. I've sat through way too many lectures at conferences where the speakers presented a dense series of derivations and equations one by one, rather than describing at a high level what they were talking about. The results were probably really good, but without any intuition to grab on to, it's nearly impossible to stare at a long series of equations on the screen or in a paper without zoning out.

So I wouldn't take the Fawcett-Higginson result as a statement against equations-- but as a call for theoreticians to keep their audience in mind when trying to describe their results. If describing the intuition first and then putting the detailed equations in an appendix makes the paper more understandable, they should not be afraid to do it. It doesn't detract from the math to provide the audience with an intuition about what inspired it, and it can often make it much more likely for the work to ultimately be understood and built upon. And isn't that the real point of a scientific or engineering paper?

And this has been your math mutation for today.

References: