17 Gene Expression Overview
Andrea Bierema
Learning Objectives
Students will be able to:
- Describe the structure and purpose of DNA and RNA.
- Describe the general process of protein synthesis.
- Describe the molecular anatomy of genes and genomes.
- Identify DNA and mRNA bases and binding patterns.
- Interpret a codon-amino acid chart.
- Given a DNA sequence, determine the corresponding mRNA sequence and amino acid sequence.
What is a Gene?
The gene is the basic physical unit of inheritance. Genes are passed from parents to offspring and contain the information needed to specify traits. Genes are arranged, one after another, on structures called chromosomes. A chromosome contains a single, long DNA molecule, only a portion of which corresponds to a single gene, as well as the structural proteins (called histones) that the DNA molecule wraps around. Humans have approximately 20,000 genes arranged on their chromosomes. Watch the following brief video for an animated view of the relationship between chromosomes and genes.
Central Dogma
The central dogma of molecular biology is that DNA codes for RNA and RNA codes for protein. In addition to DNA coding for RNA, much of the DNA regulates the synthesis of RNA, which ultimately means that it regulates the synthesis of protein. We will learn about gene regulation in later chapters.
Because proteins are coded by genes, the term “gene expression” refers to protein synthesis (i.e., making proteins), including the regulation of that synthesis.
There are two main processes that must occur to synthesize proteins: transcription and translation. During the process of transcription—which occurs in the nucleus—an mRNA molecule is created by reading the DNA. Note that DNA never “becomes” RNA; rather, the DNA is “read” to make an RNA molecule. The mRNA leaves the nucleus and then, through the process of translation, the mRNA is read to create an amino acid sequence that folds into a protein.
Transcription occurs in the nucleus, and translation occurs outside of the nucleus at the ribosomes (which are either in the cytoplasm or attached to the rough endoplasmic reticulum). Below are two views of this area: one is a micrograph image and the other is a cartoon representation.
Consider what the terms “transcribe” and “translate” mean in relation to language. To “transcribe” something means to rewrite text in the same language, while to “translate” something means to rewrite the text in a different language. The meanings are similar in biology. DNA is transcribed into RNA: both DNA and RNA are made of nucleic acid (i.e., the same “language”). With the assistance of proteins, DNA is “read” and transcribed into an mRNA sequence. Reading RNA to create protein, though, is called translation: RNA is made of nucleic acid, and protein is made of amino acids (i.e., different “languages”). Therefore, DNA is transcribed to create an mRNA sequence, and then the mRNA sequence is translated to make a protein.
DNA and RNA
The two main types of nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). As described earlier in this chapter, DNA is the genetic material in all living organisms, ranging from single-celled bacteria to multicellular mammals. In eukaryotes, it is found in the nucleus and in two organelles: mitochondria and chloroplasts. In prokaryotes, the DNA is not enclosed in a membranous envelope.
The cell’s entire genetic content is its genome, and the study of genomes is genomics. A single DNA molecule may contain hundreds to thousands of genes. Many genes contain the information to make protein products, with mRNA as the intermediate. Other genes code for RNA products. DNA controls all of the cellular activities by turning the genes “on” or “off.”
The other type of nucleic acid, RNA, is mostly involved in protein synthesis. The DNA molecules never leave the nucleus but instead use an intermediary molecule to communicate with the rest of the cell. This intermediary is the messenger RNA (mRNA). Other types of RNA (such as rRNA, tRNA, and microRNA) are involved in protein synthesis and its regulation.
DNA and RNA are composed of monomers that scientists call nucleotides. The nucleotides combine with each other to form a polynucleotide: DNA or RNA. Each nucleotide has three components: a nitrogenous base, a pentose (five-carbon) sugar, and a phosphate group. Each nitrogenous base in a nucleotide is attached to a sugar molecule, which is attached to one or more phosphate groups. Therefore, although the terms “base” and “nucleotide” are sometimes used interchangeably, a nucleotide contains a base as well as part of the sugar-phosphate backbone.
Protein Synthesis Overview
The two main processes in protein synthesis are transcription and translation. The following is an overview of each of these processes. Each process will be described in more detail in future chapters. Note that the rest of this textbook will focus on what happens in eukaryotic cells. Please see this page by Lumen for details on prokaryotic gene expression.
Transcription
A gene is complex: it contains not only the code for the resulting protein but also several regulatory factors that determine if and when the region that codes for a protein is read to create protein. What follows is a diagram of the components of a gene that are used in transcription.
This textbook focuses on the DNA and the ending product of transcription: mRNA.
Exercise
Given a specific DNA strand, what is the sequence of the resulting mRNA molecule? We will learn about how mRNA is created in a later chapter.
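One way to practice this kind of question is with a short Python sketch. It is only an illustration, not a model of how the cell builds mRNA. It assumes the strand you are given is the template strand (the one that is read), and it pairs bases position by position: A in DNA pairs with U in mRNA, T with A, G with C, and C with G. The names `DNA_TO_MRNA` and `transcribe`, and the example sequence, are made up for this example.

```python
# Base-pairing rules for building mRNA from a template DNA strand.
# RNA uses uracil (U) where DNA would use thymine (T).
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_dna):
    """Return the mRNA bases that pair with each base of a template DNA strand.

    Assumption: the input is the template strand, written in the order
    in which it is read.
    """
    template_dna = template_dna.upper()
    for base in template_dna:
        if base not in DNA_TO_MRNA:
            raise ValueError("Not a DNA base: " + base)
    return "".join(DNA_TO_MRNA[base] for base in template_dna)


# Hypothetical template strand used only for illustration.
print(transcribe("TACAAACCTATT"))  # AUGUUUGGAUAA
```

Notice that the mRNA never contains T: every A in the template is paired with U.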
Translation
Translation involves different types of RNA (rRNA, tRNA, mRNA, and microRNA), which we will explain in more detail in later chapters.
After an mRNA is created, it leaves the nucleus and associates with a ribosome, which is a complex made of rRNA and proteins. Then, in the ribosome, and with the assistance of tRNAs, the mRNA is read and an amino acid sequence is created.
DNA and mRNA create sequences with just four types of bases; yet, these bases code for 20 unique amino acids (the makeup of protein). How is this possible? Watch the following video to find out!
The mRNA is read in sets of three bases known as codons. Each codon codes for a single amino acid. In this way, the mRNA is read and the protein product is made.
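The arithmetic behind three-base codons can be checked with a small Python sketch. With four bases, a one-base code could specify only 4 amino acids and a two-base code only 16, while a three-base code gives 64 combinations, which is more than enough for 20 amino acids. The function name `split_into_codons` and the example sequence are invented for this illustration, and the sketch simply assumes reading starts at the first base given.

```python
def split_into_codons(mrna):
    """Split an mRNA sequence into consecutive, non-overlapping three-base codons.

    Any one or two leftover bases at the end are ignored.
    """
    usable_length = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable_length, 3)]


# How many different "words" can be written with 4 bases?
for word_length in (1, 2, 3):
    print(word_length, "base(s):", 4 ** word_length, "combinations")

# Hypothetical mRNA sequence used only for illustration.
print(split_into_codons("AUGUUUGGAUAA"))  # ['AUG', 'UUU', 'GGA', 'UAA']
```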
Below is a table showing which codons code for which amino acids.
Codon Chart
Codon | Amino Acid |
---|---|
UUU | Phenylalanine (Phe) |
UUC | Phenylalanine (Phe) |
UUA | Leucine (Leu) |
UUG | Leucine (Leu) |
CUU | Leucine (Leu) |
CUC | Leucine (Leu) |
CUA | Leucine (Leu) |
CUG | Leucine (Leu) |
AUU | Isoleucine (Ile) |
AUC | Isoleucine (Ile) |
AUA | Isoleucine (Ile) |
AUG | Methionine (Met) |
GUU | Valine (Val) |
GUC | Valine (Val) |
GUA | Valine (Val) |
GUG | Valine (Val) |
UCU | Serine (Ser) |
UCC | Serine (Ser) |
UCA | Serine (Ser) |
UCG | Serine (Ser) |
CCU | Proline (Pro) |
CCC | Proline (Pro) |
CCA | Proline (Pro) |
CCG | Proline (Pro) |
ACU | Threonine (Thr) |
ACC | Threonine (Thr) |
ACA | Threonine (Thr) |
ACG | Threonine (Thr) |
GCU | Alanine (Ala) |
GCC | Alanine (Ala) |
GCA | Alanine (Ala) |
GCG | Alanine (Ala) |
UAA | Stop (not an amino acid) |
UAG | Stop (not an amino acid) |
CAU | Histidine (His) |
CAC | Histidine (His) |
CAA | Glutamine (Gln) |
CAG | Glutamine (Gln) |
AAU | Asparagine (Asn) |
AAC | Asparagine (Asn) |
AAA | Lysine (Lys) |
AAG | Lysine (Lys) |
GAU | Aspartic Acid (Asp) |
GAC | Aspartic Acid (Asp) |
GAA | Glutamic Acid (Glu) |
GAG | Glutamic Acid (Glu) |
UAU | Tyrosine (Tyr) |
UAC | Tyrosine (Tyr) |
UGU | Cysteine (Cys) |
UGC | Cysteine (Cys) |
UGA | Stop (not an amino acid) |
UGG | Tryptophan (Trp) |
CGU | Arginine (Arg) |
CGC | Arginine (Arg) |
CGA | Arginine (Arg) |
CGG | Arginine (Arg) |
AGU | Serine (Ser) |
AGC | Serine (Ser) |
AGA | Arginine (Arg) |
AGG | Arginine (Arg) |
GGU | Glycine (Gly) |
GGC | Glycine (Gly) |
GGA | Glycine (Gly) |
GGG | Glycine (Gly) |
Codon chart of triplet mRNA base codes and corresponding amino acids.
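Reading a codon chart is essentially a lookup, which the following Python sketch tries to show. It builds the standard 64-codon chart as a dictionary and then reads an mRNA three bases at a time until it reaches a stop codon. This is a simplified illustration under stated assumptions: reading starts at the first base given (the sketch does not search for a start signal), and the names `CODON_TABLE` and `translate`, as well as the example sequence, are hypothetical.

```python
BASES = "UCAG"

# Amino acids for all 64 codons, ordered by first, second, then third base
# (each in the order U, C, A, G), so the first entry is UUU and the last is GGG.
AMINO_ACIDS = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()

# Build the lookup table: codon -> amino acid abbreviation (or "Stop").
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(mrna):
    """Return the list of amino acids coded by an mRNA, stopping at a stop codon.

    Assumption: reading begins at the first base of the sequence.
    """
    amino_acids = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        amino_acids.append(amino_acid)
    return amino_acids


# Hypothetical mRNA sequence used only for illustration.
print(translate("AUGUUUGGAUAA"))  # ['Met', 'Phe', 'Gly']
```

Because several codons map to the same amino acid, the lookup works in one direction only: a codon identifies exactly one amino acid, but an amino acid does not identify a single codon.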
The following are two representations of the information in the above table; move to the next slide for the second representation. These representations are commonly used in biology textbooks.
These charts can be a little confusing at first. Watch the following video to learn how to interpret both chart formats.
Conclusion
This chapter focused on DNA, mRNA, and protein sequences. The next several chapters describe gene expression processes: both protein synthesis and the regulation of that synthesis. Master how sequences are read during protein synthesis (the focus of the current chapter) before moving on to the next chapter. Below are some sources to help further your understanding!
Example
Check out Learn.Genetics’ “How a Firefly’s Tail Makes Light” video for an overview of protein synthesis!
Exercises
Need a little more practice?
Try out Learn.Genetics’ “Transcribe and Translate a Gene” and The Concord Consortium’s “DNA to Protein” interactives for further practice!
Attributions
This chapter is a modified derivative of the following articles:
“Gene” by National Human Genome Research Institute, National Institutes of Health, Talking Glossary of Genetic Terms.
“Nucleic Acids” by OpenStax College, Biology 2e, CC BY 4.0. Download the original article at https://openstax.org/books/biology-2e/pages/3-5-nucleic-acids