[Originally published in 2013 as Can Random Processes Produce Biological Information?]
Anyone who follows my work is well aware that I think random processes cannot produce biological information. So it was unsurprising that a recent comment on my blog included a link to an article by Dr. Fazale Rana, in which he claims that a new study demonstrates that biological information can be produced by random processes. Obviously, the commenter wanted my take on the article.
Before I comment further, I want to make it clear that Dr. Rana has probably forgotten more biochemistry than I have ever learned. I have a lot of respect for him and am a big fan of his latest book. He and I disagree on some issues, but the issues on which we agree are far more numerous and far more important. This particular issue, however, represents one of the former. While I think the difference in our positions is largely semantic, it is important and worth defining.
In the article, Dr. Rana reports on a study [1] that was published in the Proceedings of the National Academy of Sciences of the United States of America. In the study, the authors compared the binding pockets of all known proteins in nature to a database of randomly generated peptides (molecules that are very much like proteins but not large enough to be considered proteins).
In order to understand the results of the study, you need to know what a binding pocket is.

A protein is a large molecule, but the workhorse of the protein is typically called its active site. When a protein needs to modify a molecule in some way, it attaches itself to the molecule at its active site. This active site is held in a region of the protein called the binding pocket. So the binding pocket is the area on the protein that contains the active site. The illustration here gives you a simplified view of a protein called phenylalanine racemase, a good example of a protein that is used in a wide variety of living organisms. The star points out the binding pocket.
In the study, the authors found that there were remarkably few varieties of binding pockets found in all the known proteins, and that all those pockets were able to bind (at least in some way) to something in the randomly-generated set of peptides.
The conclusion, then, is that random chance could, indeed, produce biologically active proteins. After all, if randomly-generated molecules could bind to the binding pockets of the known proteins of life, then those known proteins of life could also be randomly generated.
To me, the problem with such reasoning is rather obvious. Look at the illustration again. Notice how small the binding pocket is compared to the protein as a whole.
While the binding pocket is an important part of the protein, it is not the only important part. I could easily create a protein that has the right binding pocket, but all that means is that it will bind to the molecule in question.
But once it binds, what will it do? Will it bind to and destroy the molecule? Will it coagulate with other proteins that have bound to other molecules and make a useless mess? Will it bind to the molecule and activate a function that will cause a destructive reaction to occur? Will it promote the reaction that the cell needs the molecule to undergo? What happens once the protein binds to the molecule depends on the rest of the protein.
So we can’t just analyze protein binding pockets. We must analyze proteins as a whole. When we do that, we find that, unlike what Dr. Rana claims, it is very hard to produce a biologically functional protein. For example, when a researcher considered more than just the binding pockets of proteins, he found [2]:
…the overall prevalence of sequences performing a specific function by any domain-sized fold may be as low as 1 in 10^77, adding to the body of evidence that functional folds require highly extraordinary sequences.
In other words, the odds of producing a biologically functional protein by random processes may be as low as one in 10^77. That is clearly beyond the reach of chance!
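To get a feel for what a prevalence that low means, here is a minimal sketch in Python. It assumes every attempt is an independent random draw, and the trial count used (10^40) is a purely hypothetical figure chosen for illustration; it does not come from either study. The function name is mine as well.

```python
import math


def chance_of_at_least_one(p_single: float, trials: float) -> float:
    """Probability of at least one success in `trials` independent attempts.

    This is 1 - (1 - p)^n, computed with log1p/expm1 so that a very
    small p does not simply round away to zero.
    """
    return -math.expm1(trials * math.log1p(-p_single))


# Prevalence estimate quoted above (as low as 1 in 10^77).
P_FUNCTIONAL = 1e-77

# ASSUMPTION: an arbitrary, illustrative number of random attempts.
HYPOTHETICAL_TRIALS = 1e40

if __name__ == "__main__":
    p = chance_of_at_least_one(P_FUNCTIONAL, HYPOTHETICAL_TRIALS)
    print(f"Chance of at least one functional sequence: {p:.3g}")
```

Under those assumptions, the chance of even one success is on the order of 10^-37. You can plug in any other trial count you think is reasonable; the point is only that the number of attempts has to approach 10^77 before success becomes likely.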
Now I said that the difference between Dr. Rana’s position and mine is probably mostly semantic.
In his article, he goes on to say that while biological information can be produced by random processes, the algorithmic information necessary for the biological molecules to function in the context of the cell is beyond the reach of chance. Perhaps the rest of the protein is what provides the “algorithmic” information. If that’s the case, then we are essentially saying the same thing.
In my mind, however, the ability of a protein to bind to another molecule has little to do with biological information. It is necessary for the protein to do its job, of course, but it takes a lot more than binding to get the job done! Unless the protein can actually use its binding ability to do a biologically helpful task, it contains no biological information. Thus, I don’t see the relevance of this study to the issue of biological information.
However, the study does have relevance.
Many drugs work by binding to a specific protein in order to inhibit its ability to do its job. Since this study indicates that there are relatively few kinds of binding pockets in biological proteins, a drug that I design to bind to one protein will probably bind to many other proteins as well.
As I see it, then, the main conclusion we can draw from this study is that drugs that bind to specific proteins will probably bind to lots of other proteins, which means that such drugs may have several unintended side effects.
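The reasoning can be put in rough numbers. The sketch below is a toy model, not anything taken from the study: it assumes each protein's pocket type is drawn uniformly at random from a fixed set of types, and that a drug which fits one pocket type fits every protein with that type. The protein count and the pocket-type counts are hypothetical values picked only to show the trend.

```python
def expected_off_targets(n_proteins: int, n_pocket_types: int) -> float:
    """Expected number of *other* proteins sharing the target's pocket type.

    ASSUMPTION: pocket types are spread uniformly at random across
    proteins, so each of the other (n_proteins - 1) proteins matches
    the target's type with probability 1 / n_pocket_types.
    """
    if n_proteins < 1 or n_pocket_types < 1:
        raise ValueError("both counts must be at least 1")
    return (n_proteins - 1) / n_pocket_types


if __name__ == "__main__":
    N_PROTEINS = 20_000  # hypothetical number of proteins in an organism
    # Hypothetical pocket-type counts, from many varieties down to few.
    for n_types in (10_000, 1_000, 100):
        hits = expected_off_targets(N_PROTEINS, n_types)
        print(f"{n_types:>6} pocket types -> about {hits:.0f} other proteins")
```

The fewer pocket varieties there are, the more proteins a given drug is expected to fit. Real proteins are not distributed this evenly, and fitting a pocket is not the same as having a biological effect, so treat this only as an illustration of the trend.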
References
1. Jeffrey Skolnick and Mu Gao, “Interplay of physics and evolution in the likely origin of protein biochemical function,” Proceedings of the National Academy of Sciences of the United States of America, doi:10.1073/pnas.1300011110, 2013
2. Douglas D. Axe, “Estimating the prevalence of protein sequences adopting functional enzyme folds,” Journal of Molecular Biology 341:1295-1315, 2004