# Syllabus

Statistical Learning is a field that ties together statistical theory and practice with the methods of machine learning that have emerged in the last several decades.

This course is geared towards students with the following experience:

- Exposure to and understanding of the fundamental concepts of linear regression: What is it used for? How does one interpret the parameter estimates? What does inference mean in this context?
- Experience with a programming language. We will use R in this course, but it will be easy enough to learn if you’ve worked before with Python, Java, Matlab, etc.

If you’ve taken Math 141, you have these bases covered. In terms of the mathematical notation, we will be using the vector/matrix formulation in places, though knowledge of linear algebra is not necessary.

### Contact

Andrew Bray

Office: Library 304

Office hours: Tuesday 4-5, Thursday 3-4

### Textbook

*An Introduction to Statistical Learning* (2013), by James, Witten, Hastie, and Tibshirani. The PDF is available for free and the printed book is available from the bookstore for around $60. The textbook is a key component of the course and I recommend having a hard copy on hand if possible.

### Class components

This course has three components: problem sets, labs, and exams/quizzes/project. For details on the first two, see the tabs at the top of the page.

#### Exams

We’ll have several examinations and quizzes throughout the semester in order to challenge your understanding and provide us with a sense of where you’re at. Some will be traditional pen-and-paper exams; others will be done on the computer using R.

**Midterm I**

Friday, February 26, 2016

**Midterm II**

Friday, April 8th (take-home)

**Final**

Take-home during finals week (link)

#### Project

Your goal is to find a data set of interest to you and develop insight into it by applying the principles and techniques of statistical learning. This is a group project (3 students in a group) that has two deliverables: a single Rmd research report that is submitted via GitHub and a 10-15 minute presentation.

**Research Report**

April 29th, 1 pm: Template (invite)

**Presentations**

April 27th and 29th, in class

### General Timeline

Week | Topic |
---|---|
1 | Foundations of Stat Learning |
2 | Simple Linear Regression |
3 | Multiple Linear Regression |
4 | Classification |
5 | Classification |
6 | Resampling Methods |
7 | Nonparametrics |
8 | SVM |
9 | Tree-based methods |
10 | Tree-based methods |
11 | Unsupervised Learning |
12 | Unsupervised Learning |
13 | Unsupervised Learning |