Author: Brianna Nee
Editors: Liane Xu and Megan Liu
Artist: Tiffany Gao
The human genome consists of about 3.2 billion base pairs, which include roughly 20,000 to 25,000 genes. When a gene is expressed, its sequence of bases is transcribed in the nucleus from DNA into messenger RNA. Then, at a ribosome in the cytoplasm, the RNA is translated into a protein. These proteins carry out most of the functions of the human body.
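The two steps of gene expression can be sketched in a few lines of Python. This is only a toy illustration: the function names `transcribe` and `translate`, the example sequence, and the five-entry codon table are assumptions made for the example (the real genetic code has 64 codons), and the input is taken to be the coding strand of the DNA.

```python
# Partial codon table: only the codons used in this toy example (illustration only).
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "GCC": "Ala",
    "AAA": "Lys",
    "UGG": "Trp",
    "UAA": "Stop",
}

def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

mrna = transcribe("ATGGCCAAATGGTAA")  # hypothetical gene fragment
print(mrna)             # AUGGCCAAAUGGUAA
print(translate(mrna))  # ['Met', 'Ala', 'Lys', 'Trp']
```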
Surprisingly, most human DNA doesn’t code for proteins. Some of these non-coding regions are sites that control the expression of genes: transcription factors bind to them and activate or repress transcription, which lets cells adjust which proteins are present as conditions in the body change. Other non-coding DNA is transcribed into functional RNAs, such as transfer RNAs and ribosomal RNAs, that are never translated. Still other non-coding stretches, called introns, sit inside genes; they are transcribed and then spliced out, and splicing the same transcript in different ways can produce different RNA sequences from the same original strand. Thus, although non-coding regions do not code for proteins, they are vital parts of the human genome.
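How splicing can turn one transcript into different RNA sequences can be shown with a small sketch. The segment sequences, the `PRE_MRNA` layout, and the `splice` helper are all hypothetical and chosen only for illustration; real splicing depends on sequence signals and cellular machinery that are not modeled here.

```python
# Hypothetical pre-mRNA: (kind, sequence) segments in the order they were transcribed.
PRE_MRNA = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGUUUCAG"),
    ("exon", "GAACCG"),
    ("intron", "GUGAGUCCUAG"),
    ("exon", "UUCUAA"),
]

def splice(segments, skip_exons=()):
    """Remove the non-coding segments and join the exons.

    `skip_exons` holds positions (counted among the exons only) to leave out,
    which mimics one transcript being spliced in a different way.
    """
    exons = [seq for kind, seq in segments if kind == "exon"]
    return "".join(seq for i, seq in enumerate(exons) if i not in skip_exons)

print(splice(PRE_MRNA))                  # AUGGCUGAACCGUUCUAA
print(splice(PRE_MRNA, skip_exons={1}))  # AUGGCUUUCUAA
```

Both outputs come from the same original strand; only the choice of which pieces to keep differs.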
The Human Genome Project set out to sequence all the bases in the human genome, with the goals of increasing our understanding of human genetics and enabling gene-based treatments for disease. The human genome varies only slightly between individuals, by about 0.1%, so a reference genetic blueprint against which similarities and differences can be identified was expected to be revolutionary.
The Human Genome Project relied on DNA sequencing technologies, mainly Sanger sequencing, which uses fluorescent chain terminators and electrophoresis. Because DNA is double-stranded, it is first separated into two single strands, and a primer is attached to the single-stranded template. The template DNA with its attached primer, DNA polymerase, deoxynucleotides, and a low concentration of fluorescent terminators (dideoxynucleotides) are combined in a test tube. A deoxynucleotide consists of an organic base (adenine, thymine, guanine, or cytosine), a deoxyribose sugar, and phosphate. DNA polymerase extends the primer to make a complementary copy of the template strand, incorporating deoxynucleotides and, occasionally, a dideoxynucleotide. Because a dideoxynucleotide lacks a 3’ hydroxyl (OH) group, the chain terminates there: the phosphodiester bond needed to add the next nucleotide cannot form. The result is many DNA fragments of different lengths, each ending in a fluorescent dye that corresponds to one of the four dideoxynucleotides. Capillary gel electrophoresis then sorts the fragments by size, with shorter fragments moving toward the positive electrode faster. As the fragments pass the end of the capillary, a laser excites the dye on each terminal dideoxynucleotide, which emits light at a specific wavelength that is picked up by a light sensor. The DNA sequence is read from the resulting electropherogram as a series of colored peaks. The Human Genome Project in turn drove advances in DNA sequencing: Sanger sequencing was automated, and more samples could be sequenced simultaneously.
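The chain-termination idea can be sketched as a toy simulation. This is a minimal illustration, not how sequencing instruments or their software work: the function name `sanger_read`, the template sequence, the 20% chance of picking up a terminator at each step, and the number of copies are all assumptions chosen for the example. The template is written 3'→5' so that the new strand can be built from left to right.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sanger_read(template, n_copies=2000, ddntp_fraction=0.2, seed=0):
    """Simulate chain termination and read the new strand from fragment lengths.

    `template` is written 3'->5', so the complementary strand grows left to right.
    """
    rng = random.Random(seed)
    new_strand = "".join(COMPLEMENT[base] for base in template)

    # Each copy is extended until a dye-labeled terminator happens to be added.
    # Copies that never pick up a terminator carry no dye and are not detected.
    dye_by_length = {}
    for _ in range(n_copies):
        for length, base in enumerate(new_strand, start=1):
            if rng.random() < ddntp_fraction:
                dye_by_length[length] = base  # the dye color identifies the last base
                break

    # Electrophoresis: the shortest fragments reach the detector first.
    return "".join(dye_by_length[length] for length in sorted(dye_by_length))

template = "TACGGATCCATG"  # hypothetical template, written 3'->5'
# With these settings every fragment length is sampled, so the read is the
# strand complementary to the template: ATGCCTAGGTAC
print(sanger_read(template))
```

Reading the dye of each fragment in order of increasing length recovers the new strand one base at a time, which is what the colored peaks of an electropherogram represent.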
When the project ended in 2003, gaps remained in the DNA sequence, mainly in regions that were hard to sequence. More advanced technology was needed to fill them, and different DNA sequencing technologies have since been developed. Recently, according to a preprint that has not yet been peer-reviewed, PacBio and Oxford Nanopore technologies may have helped complete the human genome. The currently widely used Illumina platform sequences genomes from short DNA fragments, but long repetitive sequences make it challenging to assemble an accurate sequence in those regions. Nanopore technology, which sequences much longer DNA molecules, and PacBio, which reads the same DNA molecule repeatedly, seem to solve this problem. As DNA sequencing technologies advance, sequencing can become more affordable, accurate, and efficient, and may even detect epigenetic modifications.
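Why long repeats are a problem for short fragments can be seen in a small sketch. It assumes idealized, error-free reads that cover every position, and the two "genomes", the repeat unit, and the read lengths are hypothetical values picked for illustration; they are not the read lengths of any real platform.

```python
def reads(genome: str, read_length: int) -> set[str]:
    """Return every distinct stretch of `read_length` bases (idealized, error-free reads)."""
    return {genome[i:i + read_length] for i in range(len(genome) - read_length + 1)}

LEFT, RIGHT, REPEAT = "TTGACCATGA", "ACGTTAGCCA", "CAG"
genome_a = LEFT + REPEAT * 10 + RIGHT  # 10 copies of the repeat
genome_b = LEFT + REPEAT * 15 + RIGHT  # 15 copies of the repeat

# Short reads never span the whole repeat, so both genomes give identical reads
# and the number of repeat copies cannot be recovered from them.
print(reads(genome_a, 8) == reads(genome_b, 8))    # True

# Reads longer than the repeat region in genome_a reach into both flanks,
# so the two genomes can be told apart.
print(reads(genome_a, 40) == reads(genome_b, 40))  # False
```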
Citations:
Adams, Jill. “DNA Sequencing Technologies.” Scitable, 2008,
https://www.nature.com/scitable/topicpage/dna-sequencing-technologies-690/.
“Deoxynucleotide.” School of Biomedical Sciences Wiki,
https://teaching.ncl.ac.uk/bms/wiki/index.php/Deoxynucleotide. Accessed 2021.
“DNA Sequencing.” National Human Genome Research Institute,
https://www.genome.gov/genetics-glossary/DNA-sequencing#:~:text=DNA%20sequence%20a%20laboratory,investigating%20the%20functions%20of%20genes.
Fridovich-Keil, Judith L. “Human genome.” Britannica, britannica.com/science/human-genome.
Herper, Matthew. “Researchers may have sequenced the ‘final unknown’ of the human genome.”
PBS, 3 6 2021, https://www.pbs.org/newshour/amp/science/researchers-may-have-sequenced-the-final-unknown-of-the-human-genome.
“The Human Genome Project.” National Human Genome Research Institute,
https://www.genome.gov/human-genome-project.
Rettner, Rachael. “Epigenetics: Definition & Examples.” LiveScience, 2013,
https://www.livescience.com/37703-epigenetics.html.
Schoales, Jeremy. “How Does Sanger Sequencing Work?” ThermoFisher Scientific, 2015,
https://www.thermofisher.com/blog/behindthebench/how-does-sanger-sequencing-work/.
“Studying Genes.” National Institute of General Medical Sciences, 2017,
https://www.nigms.nih.gov/education/Documents/Studying_genes_final.pdf.
“What is noncoding DNA?” MedlinePlus,