In “Finding Hidden Messages in DNA”, we discussed how to separate some of the signal from the apparent noise of DNA sequences. But how do we know what the DNA sequence making up a genome is in the first place? After all, DNA nucleotides are far too small to view with a normal microscope, and biologists still do not possess technology that would read all the nucleotides of your genome from beginning to end.
In the first chapter of this course, we will see that entire genomes are assembled from millions of short overlapping pieces of DNA. The scale of this problem (the human genome is 3 billion nucleotides long!) implies that computers must be involved. Yet the problem is even more complex than it may appear: to solve it, we will need to travel back in time to meet three famous mathematicians and learn about algorithms based on graph theory.
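As a rough illustration of the graph-based idea (a minimal sketch, not the course's reference implementation), the Python below rebuilds a string from its overlapping k-mers by treating each k-mer as an edge in a de Bruijn graph and walking every edge exactly once. The function names de_bruijn, eulerian_path, and reconstruct are hypothetical, and the sketch assumes error-free k-mers for which such a walk exists.

```python
from collections import defaultdict


def de_bruijn(kmers):
    """Build a de Bruijn graph: each k-mer is an edge from its prefix to its suffix."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Return a walk that uses every edge exactly once (Hierholzer's algorithm).

    Assumption: the graph actually contains an Eulerian path or cycle.
    """
    in_degree = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1

    # Start at the node with one more outgoing than incoming edge, if any;
    # otherwise (Eulerian cycle) any node with outgoing edges will do.
    start = next(iter(graph))
    for node, targets in graph.items():
        if len(targets) - in_degree[node] == 1:
            start = node
            break

    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            # Follow an unused edge out of the current node.
            stack.append(remaining[node].pop())
        else:
            # Dead end: this node is final in the walk built so far.
            path.append(stack.pop())
    return path[::-1]


def reconstruct(kmers):
    """Spell the string given by an Eulerian walk through the de Bruijn graph."""
    path = eulerian_path(de_bruijn(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])


print(reconstruct(["ACG", "CGT", "GTT", "TTG"]))
```

When the k-mers contain repeats, several different walks (and therefore several different strings) can be consistent with the same k-mers, which is one reason real assembly is harder than this sketch suggests.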
In the second chapter of the course, we will consider antibiotics, which are mini-proteins engineered by bacteria to fight each other. Determining the sequence of amino acids making up an antibiotic is an important problem for medical research, but the practical barriers to sequencing an antibiotic containing just 10 amino acids are often more substantial than the limitations when assembling a genome with 10 million nucleotides! To decode the amino acid sequence of an antibiotic, biologists must blast many copies of this antibiotic into pieces and use an expensive molecular scale to weigh the resulting fragments. It may not seem possible that we can determine an antibiotic just from the masses of these fragments, but we will see that brute force algorithms will often succeed.
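To make the idea of recovering a peptide from fragment masses concrete, here is a minimal brute-force sketch in Python. It rests on several simplifying assumptions that are not stated above: peptides are cyclic, they are represented as tuples of integer amino acid masses, the spectrum is ideal (no missing or false masses), and the peptide length is known in advance. The names MASSES, cyclospectrum, and brute_force_sequencing are hypothetical.

```python
from itertools import product

# The 18 distinct integer masses of the 20 standard amino acids
# (I/L share mass 113 and K/Q share mass 128).
MASSES = [57, 71, 87, 97, 99, 101, 103, 113, 114,
          115, 128, 129, 131, 137, 147, 156, 163, 186]


def cyclospectrum(peptide):
    """Sorted masses of every fragment of a cyclic peptide, plus 0 and the full mass."""
    n = len(peptide)
    doubled = peptide + peptide  # lets fragments wrap around the cycle
    spectrum = [0, sum(peptide)]
    for length in range(1, n):
        for start in range(n):
            spectrum.append(sum(doubled[start:start + length]))
    return sorted(spectrum)


def brute_force_sequencing(spectrum, length):
    """Try every peptide of the given length and keep those matching the spectrum."""
    target = sorted(spectrum)
    return [peptide
            for peptide in product(MASSES, repeat=length)
            if cyclospectrum(peptide) == target]


observed = cyclospectrum((113, 128, 186))
matches = brute_force_sequencing(observed, 3)
print(len(matches), (113, 128, 186) in matches)
```

The number of candidates grows as 18 to the power of the peptide length, so exhaustive search of this kind is only practical for very short peptides.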
How Do We Assemble Genomes? (Graph Algorithms)
The String Reconstruction Problem
String reconstruction as a walk in the overlap graph
Another graph for string reconstruction
Walking in the de Bruijn graph
The seven bridges of Königsberg
From Euler’s Theorem to an Algorithm for Finding Eulerian Cycles
Assembling genomes from read-pairs
Epilogue: Genome assembly faces real sequencing data
How Do We Sequence Antibiotics? (Brute Force Algorithms)
The discovery of antibiotics
How do bacteria make antibiotics?
Dodging the Central Dogma
Sequencing antibiotics by shattering them into pieces
A brute force algorithm for cyclopeptide sequencing
A branch-and-bound algorithm for cyclopeptide sequencing
Just how fast is this algorithm?
Adapting cyclopeptide sequencing for spectra with errors
From 20 to more than 100 amino acids
The spectral convolution saves the day
Epilogue: From simulated to real spectra
“Finding Hidden Messages in DNA” is the suggested prerequisite for taking this course, but it is not a strict prerequisite, especially if you have some programming experience.
The programming assignments in this class can be solved using any programming language.
The printed course companion is Bioinformatics Algorithms: An Active Learning Approach, by Compeau & Pevzner.
The majority of assessments for the course will consist of exercises and programming assignments. This course covers two chapters taken from Bioinformatics Algorithms: An Active Learning Approach, by Compeau & Pevzner.
Each chapter is also accompanied by a summary quiz and lecture videos.
Q: Will I get a statement of accomplishment after completing this class?
Yes. Students who successfully complete the class will receive a statement of accomplishment signed by the instructor.
Q: Can I receive a verified certificate for this course?
Yes. Students who would like a verified certificate can sign up for the course’s Signature Track option.
Q: I remember this course used to be part of the larger “Bioinformatics Algorithms (Part 1)” course. Why was it split into three courses?
Based on survey feedback, completion data, and studies of other courses, we realized that having shorter courses gives our students more flexibility around their busy schedules. Even though the courses have been split, the overall content remains the same, so we feel confident that we’re maintaining the learning standards of our material.
Q: What if I earned a voucher for retaking the old course? Can I use it in this course?
Vouchers from the older course will be valid for the newer courses. If you took the original course and earned a voucher, you will be issued a voucher for this course as well as for “Finding Hidden Messages in DNA” and “Comparing Genes, Proteins, and Genomes” (three vouchers total).
Q: Does this mean that the overall cost for earning Verified Certificates in the course is greater now?
Yes. Since there are more courses now, the overall cost for Verified Certificates is greater than before. Coursera offers a Financial Aid program for learners who would face a serious hardship paying for our courses. Plus, if you just want to join and check out our course content, it’s still free and available to everyone.