The structure and function of globular proteins

You don’t need to be an aeronautical engineer to know that a plane’s ability to fly depends on more than just the parts that it’s built from. A powerful engine, well-designed wings, and an aerodynamic body are all necessary for flight, but they aren’t sufficient. The right structure is needed, too. A plane with both wings on the left side of its body isn’t flying anywhere. In other words, you don’t just need the right parts; you need the right parts, put together in the right way.
This relationship between how a machine’s parts are arranged and how well the machine works applies not just to the machines that you use on a daily basis, but also to the ones at work inside your body. These molecular machines, called globular proteins, depend on finely tuned three-dimensional structures in order to function properly.

Proteins are linked-together amino acids

Unlike human-sized machines, which are often built from a bewildering variety of different parts, globular proteins are put together from one class of components, called amino acids. There are 20 different types of amino acids. A protein consists of a unique combination of amino acids drawn from this 20-member library. The amino acids that make up a protein are linked together into long linear chains, like a train made up of many individual boxcars connected one after another.
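To get a feel for how much variety a 20-member library allows, here is a minimal Python sketch. It treats a protein chain as a string of one-letter amino acid codes and counts how many different chains of a given length are possible. The function names and the example chain are made up for illustration; they don’t come from any real protein or library.

```python
# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def is_valid_chain(chain: str) -> bool:
    """Return True if the chain is non-empty and uses only the 20 standard amino acids."""
    return len(chain) > 0 and all(residue in AMINO_ACIDS for residue in chain)


def count_possible_chains(length: int) -> int:
    """Count the distinct linear chains of a given length (20 choices per position)."""
    return len(AMINO_ACIDS) ** length


# Hypothetical 8-residue chain, used only as an illustration.
example_chain = "MKTAYIAK"

print(is_valid_chain(example_chain))              # True
print(count_possible_chains(len(example_chain)))  # 25600000000
```

Even a chain of only eight amino acids can be built in more than 25 billion ways, and real proteins are usually far longer than that.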
Amino acids can be linked together into long linear chains because the right side of any one amino acid can bond to the left side of any other amino acid. When two amino acids are brought close together, with the left side of one lined up to the right side of the other, they can be joined, much like two magnets that have been aligned so that the north pole of one meets the south pole of another. Unlike the pull between two magnets, though, the link that forms is a strong covalent chemical bond. When two amino acids are joined in this way, chemists say that a peptide bond has formed.
The purpose of a gene is to tell a cell which amino acids, and in which order, make up a particular protein. Once the molecular machinery of a cell links the specified amino acids together into a linear chain, with each amino acid joined to its neighbors by peptide bonds, the protein folds up into a complex three-dimensional shape, called the native conformation. The native conformation is analogous to a plane with all its parts in the right place: when a protein is in its native conformation, it’s ready to work. When something happens to knock it out of its native conformation, its effectiveness decreases or is lost altogether.
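The idea that a gene spells out amino acid order can be sketched in a few lines of Python. The sketch relies on a detail this article doesn’t cover: cells read a gene three DNA letters at a time, and each three-letter “codon” stands for one amino acid (or for a stop signal). The table below is deliberately partial, listing only the codons the example needs, and the DNA string and function name are illustrative assumptions rather than a real tool.

```python
# Partial codon table: DNA codon -> one-letter amino acid code.
# Only the codons used in the example are listed; the full table has 64 entries.
CODON_TABLE = {
    "ATG": "M",  # methionine (also the usual start signal)
    "GTG": "V",  # valine
    "CAT": "H",  # histidine
    "CTG": "L",  # leucine
    "ACT": "T",  # threonine
    "CCT": "P",  # proline
    "GAG": "E",  # glutamate
    "AAG": "K",  # lysine
    "TAA": "*",  # stop signal
}


def translate(dna: str) -> str:
    """Read DNA three letters at a time and return the amino acid chain it encodes."""
    chain = []
    usable_length = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    for start in range(0, usable_length, 3):
        codon = dna[start:start + 3]
        residue = CODON_TABLE[codon]  # raises KeyError for codons missing from this partial table
        if residue == "*":
            break  # a stop signal ends the chain
        chain.append(residue)
    return "".join(chain)


# Short illustrative stretch of DNA (an assumption made for this example).
gene_fragment = "ATGGTGCATCTGACTCCTGAGGAGAAGTAA"
print(translate(gene_fragment))  # MVHLTPEEK
```

Change a single letter in the DNA and you may change an amino acid in the chain, which is how an altered gene can end up producing an altered protein.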

Proteins in their native conformations have multiple levels of structural organization

What does a protein in its native conformation look like? There are four levels of structural organization for proteins in their native conformations.
The primary structure of a protein refers to the specific amino acid sequence of the protein, plus the peptide bonds that join each of these amino acids together. In other words, the primary structure of a protein is fixed as soon as the amino acids are linked together. Primary structure is the one-dimensional, linear starting point for the eventual three-dimensional shape.
The secondary structure of a native conformation refers to the three-dimensional organization of the main chain atoms of a protein. The main chain atoms of a protein are the atoms that all amino acids in a protein have in common (shown in the picture below in black). Main chain atoms are named in contrast to side chain atoms (shown in the picture below in blue), which are the atoms that distinguish one amino acid, such as leucine, from another, such as isoleucine.
As it happens, the two most common types of secondary structures that occur in the main chain atoms of proteins resemble coils and zigzags. The coils are called alpha helices.
The zigzags are called beta sheets.
Different types of proteins have different distributions of alpha helices and beta sheets: some have lots of the former, and few of the latter; some have lots of the latter and few of the former; and others have a mix of both.
The tertiary structure of a native conformation refers to the three-dimensional organization of all the atoms, including side chain atoms, in a protein. Perhaps the best way to visualize what tertiary structure looks like is to imagine taking an amino acid sequence with primary and secondary structure and crumpling it up into a ball. Just as each type of protein has its own unique primary and secondary structure, it also has its own unique tertiary structure.
The quaternary structure of a native conformation refers to the three-dimensional organization of all the atoms in a multi-subunit protein. Multi-subunit proteins consist of two or more individual amino acid chains, each with its own primary, secondary, and tertiary structure. The way these individual chains fit together into an overall three-dimensional arrangement is called quaternary structure. Only multi-subunit proteins have quaternary structure.

All four levels of protein structure are determined by amino acids interacting with each other and their environment

Why do native conformations happen? Like anything else, the ultimate explanation involves the laws of physics. As we already said, amino acids first come together to form primary structures because of attractions between the left and right sides of neighboring amino acids. Similarly, secondary structures form primarily because of attractive and repulsive forces generated by interactions between the main chain atoms of neighboring amino acids. And, finally, tertiary structure mostly arises from interactions between side chain atoms of amino acids and the water molecules from the surrounding environment.
What does this mean? Well, the laws of thermodynamics favor arrangements that maximize the free movement of water molecules. When a protein is stretched out, rather than folded up into a secondary and tertiary structure, its water-repelling side chains are exposed, and the water molecules around them are forced into more ordered arrangements that limit their freedom of movement. Crumpling the protein up into a specific tertiary or quaternary structure tucks these side chains into the interior, away from the water, which maximizes the freedom of the water molecules to move.

Many diseases are caused by errors in protein structure

When it comes to people-sized machines, we know that changing the structure of the machine can alter its function. A plane can’t fly unless all its parts are put together in the right way. The same is true for proteins. So true, in fact, that you can think of many diseases as errors of protein structure: something happens in the body that causes a protein to lose an aspect of its native conformation, and this loss of structure causes a loss of function.
The best-known example of this is sickle cell anemia. The hemoglobin protein is responsible for transporting oxygen through your blood. People with sickle cell anemia have a genetic mutation that alters the shape of their hemoglobin molecules, and this alteration causes the proteins to aggregate into useless clumps.
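How big does a change in amino acid sequence have to be to cause this kind of trouble? In sickle cell anemia, a detail not spelled out above, the answer is a single amino acid: at position 6 of hemoglobin’s beta chain, a glutamate (E) is replaced by a valine (V). The Python sketch below compares the first eight amino acids of the normal and sickle versions, written in one-letter code, and reports where they differ. The function name is a hypothetical helper written for this illustration.

```python
def find_substitutions(normal: str, variant: str) -> list:
    """List (position, normal residue, variant residue) wherever two equal-length chains differ.

    Positions are counted from 1, the way amino acids in a protein are usually numbered.
    """
    if len(normal) != len(variant):
        raise ValueError("this simple comparison needs chains of equal length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(normal, variant), start=1)
        if a != b
    ]


# First eight amino acids of the hemoglobin beta chain, in one-letter code.
NORMAL_BETA_CHAIN_START = "VHLTPEEK"
SICKLE_BETA_CHAIN_START = "VHLTPVEK"

print(find_substitutions(NORMAL_BETA_CHAIN_START, SICKLE_BETA_CHAIN_START))
# [(6, 'E', 'V')]
```

One changed amino acid in a chain of well over a hundred is enough to alter how the finished protein behaves.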
Another example of a change in protein structure that leads to disease is fatal familial insomnia. The main symptom of this disease, as the name suggests, is permanent, incurable, and ultimately deadly insomnia. As in sickle cell anemia, the cause is a mutation that leads to the malformation of a protein, in this case one called major prion protein.
It’s not all doom and gloom, though. In some rare cases, a change in protein shape leads not to a broken protein, but rather to one that does its job better. For instance, the famous Finnish cross-country skier Eero Mäntyranta (1937-2013), who won multiple Olympic gold medals and set many world records, was found by anti-doping authorities to have surprisingly elevated red blood cell levels, considered a tell-tale sign of EPO abuse. (EPO is a hormone that increases red blood cell counts, and hence a person’s ability to transport oxygen to their muscles during exercise.)
However, it was soon discovered that Mäntyranta was not a cheater. His super-human red blood cell levels were caused by an inherited mutation that changed the native conformation of the receptors he produced for EPO. This change in the three-dimensional shape of the EPO receptors in Mäntyranta’s cells made them extra sensitive to the naturally occurring hormone. This is one example of an inherited error in protein structure that results in a happy ending (world records and gold medals) rather than a debilitating disease.

Consider the following: gene therapy seeks to fix errors in protein structure at the source

Given the central role of structure in the proper function of proteins, one might wonder if there have been any attempts to cure diseases of protein structure by prompting the body to produce properly shaped versions.
Something called “gene therapy” can be thought of as one such attempt. If proteins stop working correctly because of changes to shape, and they undergo these changes because of alterations to their amino acid sequence, then the obvious thing to do is fix the problem at its source, by changing the instructions that specify the amino acid sequence of the protein: the gene. The idea is that if you can tell your cells to start using the right amino acids to build a protein, you should be able to get rid of the disease.
Somewhat surprisingly, the obvious problem with this approach (how can you tell a cell to start using the right amino acids to build a protein?) has largely been solved, at least in theory. Gene editing techniques such as CRISPR-Cas9 allow scientists to cut and paste DNA sequences in the genomes of living organisms. Although this technology is still in its infancy, it should eventually allow doctors to identify and replace mutated portions of a gene coding for a misshapen protein. The remaining roadblocks largely have to do with unintended consequences of such techniques. Editing the DNA of one gene can introduce accidental changes in other parts of the genome, which often permanently alter the production of other proteins. And these unintended alterations in protein production can lead to cancer. In other words, we don’t yet know, in general, how to cure one disease without potentially causing another. Recent developments have shown promise in controlling the unintended consequences of gene therapy, and potential gene therapy cures for genetic diseases and certain types of cancer are in clinical trials now.