Washington University in St. Louis
Department of Computer Science and Engineering


CSE 417T: Introduction to Machine Learning

Fall 2017

ANNOUNCEMENTS

OVERVIEW

This course is an introduction to machine learning, focusing on supervised learning. We will cover the mathematical foundations of learning as well as a number of important techniques for classification and regression, including linear and logistic regression, neural networks, nearest neighbor techniques, kernel methods, decision trees, and ensemble methods. The material in this course is a prerequisite for CSE 517A, the graduate-level machine learning class. The overlap with CSE 511A (Artificial Intelligence) is minimal.

STAFF

Instructors:
Sanmay Das, Jolley 512, sanmay at wustl dot edu
Chien-Ju Ho, Jolley 532, chienju.ho at wustl dot edu

Graduate Assistants and TAs:
There are several graduate assistants and undergraduate TAs for the class. All assistants will hold regular office hours, answer questions on Piazza, and grade homework. The graduate assistants will also hold occasional recitation or review sessions.
Mingquan Yuan (Graduate Assistant to the Instructors), 7.yuan at wustl dot edu
Liang Zhou (Graduate Assistant to the Instructors), liang.zhou at wustl dot edu
David Flasterstein, davidflasterstein at wustl dot edu
Aaron Gordon, aarongordon at wustl dot edu
Trevor Larsen, trevorlarsen at wustl dot edu
Gwyneth Pearson, gpearson at wustl dot edu

Office Hours
Here is a grid with information on office hours organized by day of the week. The instructors will have office hours either in their offices or as otherwise mentioned in class.
Please check the updated locations.
Mondays: 3-4pm (Sanmay); 5:45-7:45pm (David), Eads 103
Tuesdays: 4-6pm (Gwyneth), Duncker 101; 6-8pm (Liang), Duncker 101
Wednesdays: 2-4pm (Aaron), January Hall 110
Thursdays: 12-2pm (Trevor), Rudolph 203; 4-5pm (Chien-Ju)
Fridays: 9:30-11:30am (Mingquan), Louderman 461

POLICIES

Detailed policies are in the official syllabus. A few points to highlight: please read and understand the collaboration policy and the late day policy. There will be two exams, each covering approximately half the course material, and no separate final exam.

TEXTBOOKS

The course textbooks are

PREREQUISITES

CSE 247, ESE 326 (or Math 320), Math 233, and Math 309 (can be taken concurrently) or equivalents. If you do not have a solid background in calculus, probability, and computer science (through a class in data structures and algorithms), you may have a hard time in this class. Matrix algebra is used throughout and is fundamental to modern machine learning, but it is OK to take Math 309 concurrently.

SCHEDULE, READING, AND ASSIGNMENTS

Date | Instructor | Topics | Readings | Assignments
Aug 28/29 | | Introduction. Course policies. Course overview. | Lecture notes (Ho). Lecture notes (Das). AML 1.1, 1.2. |
Aug 30/31 | | The perceptron learning algorithm. Generalizing outside the training set and Hoeffding's inequality. | AML 1.1.2, 1.3.1, 1.3.2, and Problem 1.3 |
Sep 5/6 | | Matlab session. | Accessing Matlab (by Marion Neumann) |
Sep 7 | Sanmay | Generalization, Hoeffding's inequality, multiple hypotheses, different costs, label noise. | AML: Rest of Chapter 1 | HW1 (Submission Instructions)
Sep 11 | Chien-Ju | Hoeffding's inequality, multiple hypotheses, different costs, label noise, growth function. | AML: Rest of Chapter 1. AML 2.1.1 |
Sep 12 | Sanmay | Infinite hypothesis spaces, growth function, and VC-dimension. | AML 2.1.1-2.1.3 |
Sep 13 | Chien-Ju | Infinite hypothesis spaces, growth function, and VC-dimension. | AML 2.1.1-2.1.3 |
Sep 14 | Sanmay | VC Generalization bound. Real-valued targets. | AML 2.1.4-2.2 |
Sep 18 | Chien-Ju | VC Generalization bound. Test set. The bias-variance trade-off. | AML 2.1.4-2.3 |
Sep 19 | Sanmay | The bias-variance trade-off; the pocket algorithm. | AML 2.3-3.1 | HW2
Sep 20 | Chien-Ju | The bias-variance trade-off, the pocket algorithm, and linear regression. | AML 2.3-3.2 |
Sep 21 | Sanmay | Linear regression. Intro to logistic regression. | AML 3.2-3.3.1 |
Sep 25 | Chien-Ju | Logistic regression, gradient descent, stochastic gradient descent. | AML 3.3 |
Sep 26 | Sanmay | Logistic regression, gradient descent. | AML 3.3 |
Sep 27 | Chien-Ju | Nonlinear transformations. Overfitting. | AML 3.4, 4.1. Malik Magdon-Ismail's slides on overfitting |
Sep 28 | Sanmay | Gradient descent practicalities. Stochastic gradient descent. Nonlinear transforms. | AML 3.3-3.4 |
Oct 2 | Chien-Ju | VC dimension of Perceptron. Regularization. | AML 4.2 |
Oct 3 and 4 | Both sections (Liang) | Midterm review | |
Oct 9 | Chien-Ju | Regularization. Validation. | AML 4.2-4.3.1 |
Oct 10 | Sanmay (with Chien-Ju subbing) | Overfitting. | AML 4.1; Malik Magdon-Ismail's slides on overfitting |
Oct 11 | Chien-Ju | Validation, contd. | AML: Rest of Chapter 4. | HW3
Oct 12 | Sanmay | Regularization. | AML 4.2 |
Oct 18 | Chien-Ju | Three learning principles. Human-in-the-loop machine learning. | AML Chapter 5. Course slides |
Oct 19 | Sanmay | Validation. | AML: Rest of Chapter 4. |
Oct 23 | Chien-Ju | Decision Trees. | Tom Mitchell, Machine Learning Ch3. CASI 8.4 |
Oct 24 | Sanmay | Three learning principles. | AML Chapter 5. |
Oct 25 | Chien-Ju | Bagging. Random Forest. | CASI 17.1 | HW4
Oct 26 | Sanmay | Decision trees and ID3. | Tom Mitchell, Machine Learning Ch3. CASI 8.4 |
Oct 30 | Chien-Ju | Boosting. AdaBoost. | Freund & Schapire's Tutorial. CASI 17.4 |
Nov 1 | Chien-Ju | Boosting contd. Nearest Neighbor. | AML eChapter 6.1-6.2.1 |
Nov 6 | Chien-Ju | K-Nearest Neighbors. Curse of Dimensionality. | AML eChapter 6.1-6.2 |
Nov 8 | Chien-Ju | Efficiency of Nearest Neighbor. Radial Basis Functions. | AML eChapter 6.2-6.3 | HW5
Nov 13 | Chien-Ju | Support Vector Machines (SVM). | AML eChapter 8.1 |
Nov 15 | Chien-Ju | Dual SVM. Kernel Trick. | AML eChapter 8.2-8.3 |
Nov 20 | Chien-Ju | Soft Margin SVM. Neural Networks. | AML eChapter 8.4, 7.1 | HW6