Chemistry of amino acids and protein structure

Proteins are large, complex molecules that are critical for the normal functioning of the human body. They are essential for the structure, function, and regulation of the body’s tissues and organs. Proteins are made up of hundreds of smaller units called amino acids that are attached to one another by peptide bonds, forming a long chain. You can think of a protein as a string of beads where each bead is an amino acid.
Image of beads on a string

Amino acid structure and its classification

An amino acid contains both a carboxylic group and an amino group. Amino acids that have an amino group bonded directly to the alpha-carbon are referred to as alpha amino acids. The simplest representation of an alpha amino acid is shown below.
Image of an amino acid structure
Every alpha amino acid has a carbon atom, called the alpha carbon (Cα), bonded to a carboxylic acid ($\text{–COOH}$) group, an amino ($\text{–NH}_{2}$) group, a hydrogen atom, and an R group that is unique for every amino acid. In the structure above, Cα is a chiral center; that is to say, this carbon atom is attached to four different groups. Chiral molecules are optically active, so amino acids are optically active molecules. The only exception is glycine, the simplest amino acid, in which R = H, so the alpha carbon carries two hydrogen atoms and is not a chiral center.
Image of glycine
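The "four different groups" rule can be sketched in a few lines of Python. This is only an illustration: the function name `is_chiral_center` and the idea of comparing groups by their text labels are assumptions made here, and a real stereocenter check would compare complete substituents rather than short labels.

```python
def is_chiral_center(substituents):
    """Return True if all four groups attached to the carbon are different."""
    if len(substituents) != 4:
        raise ValueError("a tetrahedral carbon carries exactly four groups")
    return len(set(substituents)) == 4


# Groups on the alpha carbon: carboxyl, amino, hydrogen, and the R group.
alanine = ["COOH", "NH2", "H", "CH3"]  # R = CH3
glycine = ["COOH", "NH2", "H", "H"]    # R = H

print(is_chiral_center(alanine))  # True
print(is_chiral_center(glycine))  # False
```

Glycine fails the test because its R group duplicates the hydrogen atom already present on the alpha carbon.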
Commonly, amino acids are represented as follows:
Image of amino acid structures as Fischer projections

L and D amino acids

Image showing right-handed and left-handed structures, indicated by R and L
As shown above, L and D amino acids are mirror images of each other and are non-superimposable on each other, just like our left and right hands. By non-superimposable, we mean that when the mirror image of the object is placed over the original object, they do not have a perfect overlap. Pairs of amino acids like these are called enantiomers.
Only L-amino acids are constituents of proteins. Our body synthesizes most of its own L-amino acids; these then get incorporated into proteins. Proteins catalyze most of the biochemical reactions that take place in our body. Along with DNA and RNA, proteins constitute the genetic machinery of living organisms. Proteins are often called the building blocks of life.

Isoelectric point (pI) of amino acids

Isoelectric point is the point along the pH scale where the amino acid has a net zero charge. Consider glycine. Look at the equilibrium below; as we add hydroxide ions—in other words, raise the pH—different charged forms of glycine exist. Form B has a net zero charge and is called a zwitterion. Form A has a net charge of +1, and form C has a net charge of -1.
Image illustrating the isoelectric point
The titration curve of glycine will look something like the one shown below. At $\text{pH} = 2.34$ ($\text{p}K_{a1}$), forms A and B will be in equilibrium, i.e., the concentration of A equals the concentration of B. At $\text{pH} = 9.6$ ($\text{p}K_{a2}$), forms B and C will be in equilibrium, i.e., the concentration of B equals the concentration of C.
Graph of the titration curve for glycine
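The equal-concentration points on the curve follow from the Henderson-Hasselbalch equation. The article does not state that equation explicitly, so it is added here as a supporting step, written for the first ionization (A to B) and the second (B to C):

```latex
\mathrm{pH} = \mathrm{p}K_{a1} + \log_{10}\frac{[\mathrm{B}]}{[\mathrm{A}]}
\qquad\text{so}\qquad
\mathrm{pH} = \mathrm{p}K_{a1} \iff [\mathrm{A}] = [\mathrm{B}]

\mathrm{pH} = \mathrm{p}K_{a2} + \log_{10}\frac{[\mathrm{C}]}{[\mathrm{B}]}
\qquad\text{so}\qquad
\mathrm{pH} = \mathrm{p}K_{a2} \iff [\mathrm{B}] = [\mathrm{C}]
```

In both cases the logarithm term vanishes exactly when the two concentrations are equal.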
Mathematically, isoelectric point, pI, for glycine is calculated using the formula below:
$$\text{pI} = \frac{\text{p}K_{a1} + \text{p}K_{a2}}{2} = \frac{2.34 + 9.6}{2} = 5.97$$
Every amino acid has a different pI, which largely depends on the nature of the side chain present. Acidic and basic side chains affect the $\text{p}K_{a1}$ and $\text{p}K_{a2}$ values, thus affecting the pI of the amino acid.
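As a rough numerical check of the glycine calculation, the sketch below computes the pI and estimates the average net charge at a given pH. The function names are hypothetical, and the charge model is an assumption made here: it treats the carboxyl and amino groups as two independent ionizable groups obeying the Henderson-Hasselbalch relation and ignores any ionizable side chain.

```python
def isoelectric_point(pka1, pka2):
    """pI of an amino acid with no ionizable side chain."""
    return (pka1 + pka2) / 2


def net_charge(ph, pka1, pka2):
    """Average net charge, treating the two groups as independent."""
    # Fraction of carboxyl groups that are deprotonated (each carries -1).
    carboxyl = -1 / (1 + 10 ** (pka1 - ph))
    # Fraction of amino groups that are protonated (each carries +1).
    amino = 1 / (1 + 10 ** (ph - pka2))
    return carboxyl + amino


# pKa values for glycine as given in the text.
GLYCINE_PKA1, GLYCINE_PKA2 = 2.34, 9.6

pi_glycine = isoelectric_point(GLYCINE_PKA1, GLYCINE_PKA2)
print(round(pi_glycine, 2))
for ph in (1.0, pi_glycine, 12.0):
    print(round(ph, 2), round(net_charge(ph, GLYCINE_PKA1, GLYCINE_PKA2), 3))
```

At low pH the estimate approaches +1 (form A), at the pI it is zero (form B), and at high pH it approaches -1 (form C).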

Classification of different amino acids

Image of an amino acid structure
There are 20 common amino acids. Based on the nature of the R group, they are classified as follows:
Flow chart illustrating the types of amino acids
Let's summarize the flowchart above:
  • Hydrophobic amino acids have nonpolar side chains, such as alkyl groups or aromatic groups.
  • Hydrophilic (neutral) amino acids contain polar side chains, such as hydroxyl ($\text{–OH}$) and sulfhydryl ($\text{–SH}$) groups.
  • Hydrophilic (acidic) amino acids have side chains that contain carboxylic acid ($\text{–COOH}$) groups.
  • Hydrophilic (basic) amino acids have side chains that contain amine ($\text{–NH}_{2}$) groups.
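The four classes can be written down as a small lookup table. The original flowchart is not reproduced here, so the assignment below is one common textbook grouping and should be read as an assumption. Borderline residues such as glycine, cysteine, tyrosine, tryptophan, and histidine are placed differently by different sources, and asparagine and glutamine are listed as neutral because of their polar amide side chains. The names `SIDE_CHAIN_CLASSES` and `classify` are illustrative.

```python
# One common grouping of the 20 amino acids by side chain (an assumption;
# textbooks differ on borderline residues such as Gly, Cys, Tyr, Trp, His).
SIDE_CHAIN_CLASSES = {
    "hydrophobic": ["Gly", "Ala", "Val", "Leu", "Ile", "Pro", "Phe", "Trp", "Met"],
    "hydrophilic, neutral": ["Ser", "Thr", "Tyr", "Cys", "Asn", "Gln"],
    "hydrophilic, acidic": ["Asp", "Glu"],
    "hydrophilic, basic": ["Lys", "Arg", "His"],
}


def classify(residue):
    """Return the side-chain class for a three-letter amino acid code."""
    for side_chain_class, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return side_chain_class
    raise KeyError(f"unknown amino acid code: {residue}")


print(classify("Ser"))  # hydrophilic, neutral
print(classify("Asp"))  # hydrophilic, acidic
print(sum(len(members) for members in SIDE_CHAIN_CLASSES.values()))  # 20
```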

How are amino acids joined together?

Simple image of peptide bonds
Amino acids are joined together through peptide bonds. A peptide bond is a covalent bond formed by a nucleophilic addition-elimination reaction between the carboxyl group of one amino acid and the amino group of another amino acid; this reaction releases a molecule of water as a by-product. A peptide bond is essentially an amide bond.
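Because each peptide bond releases one molecule of water, the molar mass of a peptide is the sum of its free amino acids minus one water per bond. The sketch below illustrates that bookkeeping for glycine and alanine only. The function name `peptide_mass` is hypothetical, and the molar masses are rounded values supplied here rather than taken from the article.

```python
WATER_MASS = 18.015  # g/mol, rounded

# Molar masses of the free amino acids in g/mol (rounded; only two are listed).
AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09}


def peptide_mass(residues):
    """Molar mass of a linear peptide built from the given amino acids."""
    if not residues:
        raise ValueError("a peptide needs at least one amino acid")
    peptide_bonds = len(residues) - 1  # one water is lost per peptide bond
    total = sum(AMINO_ACID_MASS[name] for name in residues)
    return total - peptide_bonds * WATER_MASS


# A dipeptide has one peptide bond, so exactly one water is subtracted.
print(round(peptide_mass(["Gly", "Ala"]), 3))
```

A chain of n amino acids contains n - 1 peptide bonds, which is why a single amino acid keeps its full mass.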

Mechanism of peptide bond formation

The simplest way to represent a peptide bond formation is as follows. Let’s consider two amino acids with side chains $R_{1}$ (red) and $R_{2}$ (blue), respectively.
Image illustrating the mechanism of peptide bond formation
Step 1: The nucleophilic amino group of the second amino acid attacks the electrophilic carbonyl group of the first amino acid.
Step 2: The carbonyl bond reforms with the elimination of hydroxide ion.
Step 3: The hydroxide ion abstracts a proton—elimination of water—and the positive charge on nitrogen is neutralized. This results in the formation of a new bond—a peptide bond between the two amino acids.
Please note that this is a very simplistic representation of the mechanism of a peptide bond formation. The mechanism gets complicated in the context of peptide-protein synthesis in biological systems where catalysts, cofactors, and enzymes are involved.
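The net result of the three steps can be summarized as a single overall equation. It is written here in uncharged form for simplicity, which is an assumption of this sketch; under physiological conditions the amino and carboxyl groups would be ionized.

```latex
\mathrm{H_2N{-}CHR_1{-}COOH} \;+\; \mathrm{H_2N{-}CHR_2{-}COOH}
\;\longrightarrow\;
\mathrm{H_2N{-}CHR_1{-}CO{-}NH{-}CHR_2{-}COOH} \;+\; \mathrm{H_2O}
```

The $\mathrm{CO{-}NH}$ linkage in the product is the new peptide (amide) bond, and the water molecule accounts for the hydroxide lost in step 2 and the proton removed in step 3.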

The double-bond character of the peptide bond

If you were asked to draw a peptide bond, you might draw a single bond between the nitrogen and the carbonyl carbon atoms. But in reality, this single bond is not a conventional single bond; in fact, it has a double-bond character. This double-bond character comes from the various resonance structures of a polypeptide. The next question would be: what are resonance structures?
Resonance structures are different representations of the same molecule; the arrangement of the atoms remains the same, but the electrons are distributed differently amongst the atoms. Resonance structures exist when there is a possibility of movement of electrons between neighboring functional groups, as in the case of polypeptides.
Let’s draw three amino acids connected to each other through peptide bonds.
Image of three amino acids connected to each other through peptide bonds
As illustrated above, electrons can move across the amide ($\text{O=C–N}$) bond, generating the two structures A and B. Structure C is the overall hybrid representation of the two resonance structures A and B, where the entire peptide bond ($\text{O=C–N}$) is shown to have partial double-bond character, represented by a solid line with a dotted line running parallel to it. So, as a result of resonance, the bond between the carbonyl carbon and nitrogen acquires partial double-bond character, and, just like any double bond, rotation around this peptide bond is now restricted. Also, as with all double bonds, the atoms of the peptide bond have planar geometries. This planar geometry causes the peptide bond to be either in the cis or the trans configuration. In the cis configuration, the two alpha carbon atoms fall on the same side of the peptide bond. In the trans configuration, the two alpha carbon atoms are on opposite sides of the peptide bond.
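One way to make the cis/trans distinction concrete is to compute the dihedral angle defined by the four atoms Cα, C, N, Cα around the peptide bond: a value near 0° means the two alpha carbons lie on the same side (cis), and a value near 180° means opposite sides (trans). The sketch below is a minimal illustration. The function names, the 90° cutoff, and the toy coordinates are assumptions made here and do not represent real bond lengths or angles.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle about the p1-p2 bond, in degrees (-180 to 180)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def peptide_bond_configuration(ca1, c, n, ca2):
    """Classify a planar peptide bond as 'cis' or 'trans' (90 degree cutoff)."""
    omega = dihedral_degrees(ca1, c, n, ca2)
    return "cis" if abs(omega) < 90 else "trans"


# Toy planar coordinates: the C-N bond lies along the x axis.
c, n = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
ca1 = (-0.5, 1.0, 0.0)
print(peptide_bond_configuration(ca1, c, n, (1.5, -1.0, 0.0)))  # trans
print(peptide_bond_configuration(ca1, c, n, (1.5, 1.0, 0.0)))   # cis
```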
The take home message:
Flowchart showing resonating structure moving into rigid peptide bonds

Different levels of protein structure

The four levels of protein structure are: primary structure, secondary structure, tertiary structure, and quaternary structure. This concept can get a bit confusing, so let’s try to understand it through a simple analogy.
Can you imagine being able to write a paragraph if the alphabet didn’t exist? In fact, we have to go through a hierarchy of complexity before we can even attempt to write a paragraph. The alphabet is needed to construct words; words are needed to construct sentences; and sentences are needed to construct a paragraph. Similarly, a fully functional protein is assembled through four levels of hierarchy as illustrated below.
Image showing different levels of protein structure related to the alphabet and sentence analogy described in the text
Primary structure simply refers to the linear sequence of amino acids joined to each other through peptide bonds. The sequence of amino acids determines the basic structure of the protein.
Image of primary protein structure
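Since the primary structure is just an ordered sequence, it is commonly written as a string of one-letter amino acid codes. The sketch below expands such a string into three-letter codes and counts how many distinct sequences of a given length are possible. The function names are hypothetical, and reading the string from the amino terminus (left) to the carboxyl terminus (right) is the usual convention rather than something stated in this article.

```python
# Standard one-letter to three-letter codes for the 20 common amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def spell_out(sequence):
    """Expand a one-letter sequence, written N-terminus to C-terminus."""
    return "-".join(ONE_TO_THREE[letter] for letter in sequence.upper())


def possible_sequences(length):
    """Number of distinct primary structures of the given length."""
    return len(ONE_TO_THREE) ** length


print(spell_out("GAS"))       # Gly-Ala-Ser
print(possible_sequences(3))  # 8000
```

Even a short chain of three amino acids can be written in 8000 different ways, which gives a sense of how much information the sequence carries.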
Unlike the rigid peptide bond, the bond linking the amino nitrogen to the alpha carbon atom and the bond linking the alpha carbon atom to the carbonyl carbon are single bonds, as shown in the image below. Rotation about these two bonds is free, allowing the amino acids in the polypeptide chain to take on a variety of orientations.
Image showing possible rotations and restricted rotations
The freedom of rotation around these two bonds allows proteins to fold into a variety of shapes. These folded structures are referred to as secondary protein structures and are essentially of two types: the alpha helix and the beta pleated sheet. These folded secondary structures are stabilized by the formation of hydrogen bonds between the amino acids.

α-helix

In an α helix, the amino acids are oriented in such a manner that the carbonyl ($\text{C=O}$) group of the $n^{\text{th}}$ amino acid can form a hydrogen bond with the amide ($\text{N–H}$) group of the $(n+4)^{\text{th}}$ amino acid. This results in a strong hydrogen bond that has an optimum nitrogen-to-oxygen (N···O) distance of about 2.8 Å. The hydrogen bonds between the amino acids stabilize the α-helix structure. The structure of an α helix is shown below:
Image of an alpha helix
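The n to n + 4 pattern can be enumerated directly. The sketch below lists which residues would be paired in an idealized helix. The function name `helix_hydrogen_bonds` is hypothetical, residues are numbered from 1, and the assumption that the helix spans the whole chain is made only for illustration.

```python
def helix_hydrogen_bonds(n_residues, offset=4):
    """List (C=O donor residue, N-H partner residue) pairs in an ideal helix.

    Residue i's carbonyl group pairs with the N-H group of residue i + offset.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


pairs = helix_hydrogen_bonds(10)
print(pairs)       # [(1, 5), (2, 6), (3, 7), (4, 8), (5, 9), (6, 10)]
print(len(pairs))  # 6
```

The last four carbonyl groups in the chain have no partner four residues ahead, so a helix of 10 residues forms 6 such hydrogen bonds in this idealized picture.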

β-pleated sheet

In β sheets, however, hydrogen bonding occurs between neighboring polypeptide chains (or between neighboring stretches of one chain that has folded back on itself) rather than between nearby residues of a single continuous stretch, as in the case of an α helix. β sheets exist in two forms. The first, the antiparallel β sheet, has neighboring hydrogen-bonded polypeptide chains running in opposite directions, i.e., one polypeptide chain starts from the terminal carboxylic group and ends at the terminal amino group, left to right, while the other polypeptide chain starts from the terminal amino group and ends at the terminal carboxylic group, left to right. The second form, the parallel β sheet, has hydrogen-bonded chains extending in the same direction. You can see the two forms in the cartoons below.
image of a beta sheet
Tertiary structure: When several secondary structures come together, tertiary structures are formed. In tertiary structures, in addition to hydrogen bonding, amino acid side chains of the various secondary structures start interacting with each other in a number of ways. These interactions include hydrophobic interactions, ionic interactions, and disulfide bonds as illustrated below.
Image of tertiary protein structure
Quaternary structure: When several tertiary structures come together, a quaternary protein structure is formed. For example, hemoglobin is a functional quaternary protein formed by the coming together of four tertiary structures, called globin proteins. The same forces of interactions operate in a quaternary structure as operate in a tertiary structure.

Forces that keep the different protein structures together

Level of protein structure | Interactions that stabilize the structure
Primary | Covalent bond (amide/peptide bond)
Secondary | Hydrogen bonds
Tertiary | Ionic bonds, disulfide bonds, hydrophobic interactions, hydrogen bonding
Quaternary | Ionic bonds, disulfide bonds, hydrophobic interactions, hydrogen bonding
In summary, the primary structure of a protein simply refers to the linear polypeptide with its amino acid sequence. The secondary structure is the folded version of the linear polypeptide stabilized by hydrogen bonding. The tertiary structure is formed by the coming together of several secondary structures that are held together by various types of interactions, and finally a quaternary structure is formed by the combination of several tertiary structures, again held together via different types of interactions.

Attribution:

This article is licensed under a CC-BY-NC-SA 4.0 license. https://creativecommons.org/licenses/by-nc-sa/4.0/

Additional references

"Quaternary structure" by Holger87, own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons, http://commons.wikimedia.org/wiki/File:Quaternary_structure.png#/media/File:Quaternary_structure.png