Demonstrate use of Weka for key data mining tasks
Evaluate the performance of a classifier on new, unseen instances
Explain how data miners can unwittingly overestimate the performance of their system
Identify learning methods that are based on different flavors of simplicity
Apply many different learning methods to a dataset of your choice
Interpret the output produced by classification methods
Describe the principles behind many modern machine learning methods
Compare the decision boundaries produced by different classification algorithms
Debate ethical issues raised by mining personal data
Today’s world generates more data than ever before! Being able to turn it into useful information is a key skill. This course introduces you to practical data mining using the Weka workbench. We’ll dispel the mystery that surrounds the subject. We’ll explain the principles of popular algorithms. We’ll show you how to use them in practical applications. You’ll get plenty of experience actually mining data during the course, and afterwards you’ll be well equipped to mine your own. Weka originated at the University of Waikato in New Zealand, and Ian Witten has authored a leading book on data mining.
What is data mining?
Where can it be applied?
How do simple classification algorithms work?
What are their strengths and weaknesses?
In what ways are real-life classification methods more complex?
How should you evaluate a classifier’s performance?
What is “overfitting” and how can you combat it?
How can ensemble techniques combine the results of different algorithms?
What ethical considerations arise when mining data?
This course is aimed at anyone who deals in data. It involves no computer programming, although you need some experience with using computers for everyday tasks. High school maths should be more than enough, though you’ll also need an understanding of some elementary statistics concepts (means and variances).