The capstone project will be an analysis using R that answers a specific scientific/business question provided by the course team. A large and complex dataset will be provided to learners and the analysis will require the application of a variety of methods and techniques introduced in the previous courses, including exploratory data analysis through data visualization and numerical summaries, statistical inference, and modeling as well as interpretations of these results in the context of the data and the research question. The analysis will implement both frequentist and Bayesian techniques and discuss in context of the data how these two approaches are similar and different, and what these differences mean for conclusions that can be drawn from the data.
A sampling of the final projects will be featured on the Duke Statistical Science department website.
Note: Only learners who have passed the four previous courses in the specialization are eligible to take the Capstone.
About the Capstone Project
Welcome to the capstone project! This week's content is an introduction to the project assignment and goals. The readings in this week will introduce the data set that you will be analyzing for your project and the specific questions you will answer using data analysis techniques we learned in the previous courses. It is important to understand what we will be doing in the course before jumping into the detailed analysis. So we encourage you to start with the first lecture to get the big picture, and then delve into the specifics of the analysis. Enjoy, and good luck! Remember, if you have questions, you can post them on the discussion forums.
EDA and Basic Model Selection - Submission
This week we will dig deeper into our exploratory data analysis of the data. We now have all the information and data necessary to perform a deep dive into the EDA and it is time start your initial analysis report! We encourage you to start your analysis report (presented in peer-review format next week) early so you will have enough time to complete it. You will conduct exploratory data analysis, model selection, and model evaluation, and then complete a written report which answers several questions which will guide you through the process. This report will be your first peer-review assignment in this course.
EDA and Basic Model Selection - Evaluation
Great work so far! We hope you will also learn as much from evaluating your peers' work as completing your own assignment. Happy learning!
Model Selection and Diagnostics
We are half way through the course! In this week, you will continue model selection and model diagnostics, which will serve a starting point for your final project. You will be assessed on your work through a quiz. If you have any questions so far, don't hesitate to post on the forum so that others can help and discuss the question together.
Out of Sample Prediction
In this week, you will gain experience using your model to perform out-of-sample prediction and validation. The skills honed this week will guide you through your final analysis in the weeks to come. Please feel free to go back to prior weeks and review the necessary background knowledge.
Final Data Analysis - Submission
In the next two weeks, you will complete your final data analysis project. You will submit your answers using the Final Data Analysis peer review assignment link in Week 8.
Final Data Analysis - Evaluation
Congratulations on making through to the final week of the course! In this week, we will finish this data analysis project by completing the evaluation of three of your peers' assignments.