Although biologists assembled a reference human genome in 2001, your own genome differs from this reference genome by millions of mutations. In this chapter, we will learn combinatorial pattern matching algorithms for finding these mutations, something that future doctors will be able to do within minutes in order to find disease genes.
We will then examine the problem of detecting subtle similarities that evade the traditional sequence comparison algorithms that we studied in “Comparing Genes, Genomes, and Proteins”. Efficient algorithms for comparing highly diverged sequences are important, since a single misaligned nucleotide can misclassify a protein and prevent biologists from inferring its function. We will consider a popular approach to sequence comparison based on an abstract machine called a Hidden Markov Model.
How Do We Locate Disease-Causing Mutations? (Combinatorial Pattern Matching)
What Causes Ohdo Syndrome?
Introduction to Multiple Pattern Matching
Herding Patterns into a Trie
Preprocessing the Genome Instead
The Burrows-Wheeler Transform
Inverting the Burrows-Wheeler Transform
Pattern Matching with the Burrows-Wheeler Transform
Speeding Up Burrows-Wheeler Pattern Matching
Where Are the Matched Patterns?
Burrows and Wheeler Set Up Checkpoints
Epilogue: Mismatch-Tolerant Read Mapping
Why Have Biologists Still Not Developed an HIV Vaccine? (Hidden Markov Models)
Classifying the HIV Phenotype
Gambling with Yakuza
Two Coins up the Dealer’s Sleeve
Hidden Markov Models
The Viterbi Algorithm
Finding the Most Likely Outcome of an HMM
Profile HMMs for Sequence Alignment
Classifying Proteins with Profile HMMs
Learning the Parameters of an HMM
Soft Decisions in Parameter Estimation
The Many Faces of HMMs
Epilogue: Nature Is a Tinkerer and Not an Inventor
“Clustering Biological Data” is the suggested prerequisite for taking this course, but it is not a strict prerequisite, especially if you have some programming experience.
The programming assignments in this class can be solved using any programming language.
The printed course companion is Bioinformatics Algorithms: An Active Learning Approach, by Compeau & Pevzner.
Most assessments in the course are exercises and programming assignments. The course covers two chapters of Bioinformatics Algorithms: An Active Learning Approach, by Compeau & Pevzner.
Each chapter is also accompanied by a summary quiz and lecture videos.
Q: Will I get a statement of accomplishment after completing this class?
Yes. Students who successfully complete the class will receive a statement of accomplishment signed by the instructor.
Q: Can I receive a verified certificate for this course?
Yes. Students who would like a verified certificate can sign up for the course’s Signature Track option.
Q: I remember this course used to be part of the larger “Bioinformatics Algorithms (Part 2)” course. Why was it split into three courses?
Based on survey feedback, completion data, and studies of other courses, we realized that shorter courses give our students more flexibility around their busy schedules. Even though the courses have been split, the overall content remains the same, so we feel confident that we are maintaining the learning standards of our material.
Q: What if I earned a voucher for retaking the old course? Can I use it in this course?
Vouchers from the older course will be valid for this course as well as “Deciphering Molecular Evolution”.
Q: Does this mean that the overall cost for earning Verified Certificates in the course is greater now?
Yes. Since there are more courses now, the overall cost for Verified Certificates is greater than before. Coursera offers a Financial Aid program for learners who would face a serious hardship paying for our courses. Plus, if you just want to join and check out our course content, it’s still free and available to everyone.