Deep Learning Systems

Course information, Fall 2024

First class: 8/27
Lectures: TR 11:00-12:20 GHC 4401

Office Hours (beginning 8/27):

Day	Time	Location	TA
Monday	2pm-3pm	GHC 8228	Sid Sapra
Monday	1pm-2pm	zoom	Kartik Khandelwal
Tuesday	4pm-5pm	GHC 8115	Yoyo Chung
Wednesday	3pm-4pm	zoom	Shrey Gupta
Wednesday	6:30pm-7:30pm	GHC 8115	Tiancheng Zhao
Thursday	3pm-4pm	GHC 5113	Yonghao Zhuang
Friday	6pm-7pm	GHC 8115	Utkarsh Priyam
Saturday	5pm-6pm	GHC 8102	Shubhika Garg
Saturday	6pm-7pm	GHC 8102	Bharathi

Online course information

Note that the online course will not be officially offered in 2024 (we are working on an autonomous offering, hopefully to be released later this year).

Introduction

Deep learning methods have revolutionized a number of fields in Artificial Intelligence and Machine Learning in recent years. Their widespread adoption has in no small part been driven by the availability of easy-to-use deep learning systems, such as PyTorch and TensorFlow. Yet despite how commonly these libraries are used, it is much less common for students to get involved with their internals and to understand how they function at a fundamental level. Understanding these libraries deeply will help you make better use of their functionality, and enable you to develop or extend them when needed to fit your own custom use cases in deep learning.

The goal of this course is to provide students an understanding and overview of the “full stack” of deep learning systems, ranging from the high-level modeling design of modern deep learning systems, to the basic implementation of automatic differentiation tools, to the underlying device-level implementation of efficient algorithms. Throughout the course, students will design and build from scratch a complete deep learning library, capable of efficient GPU-based operations, automatic differentiation of all implemented functions, and the necessary modules to support parameterized layers, loss functions, data loaders, and optimizers. Using these tools, students will then build several state-of-the-art modeling methods, including convolutional networks for image classification and segmentation, recurrent networks and self-attention models for sequential tasks such as language modeling, and generative models for image generation.
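To give a flavor of what "automatic differentiation" means here, the following is a minimal sketch of reverse-mode differentiation on scalars. It is only an illustration under stated assumptions: the `Value` class and its `backward` method are hypothetical names chosen for this example, not the API of the library built in the course, and a real library would operate on tensors and support many more operations.

```python
class Value:
    """Hypothetical scalar node that records how it was computed."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = float(data)
        self.grad = 0.0
        self.parents = parents    # inputs this value was computed from
        self.grad_fns = grad_fns  # one function per parent: output grad -> parent grad

    def __add__(self, other):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        # d(a * b)/da = b and d(a * b)/db = a
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Build a topological order so each node is processed only after
        # every node that depends on it has contributed its gradient.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, grad_fn in zip(node.parents, node.grad_fns):
                parent.grad += grad_fn(node.grad)


x = Value(3.0)
y = Value(4.0)
z = x * y + x   # dz/dx = y + 1, dz/dy = x
z.backward()
print(z.data, x.grad, y.grad)
```

Each operation stores its inputs together with the local derivative rule, and `backward` applies the chain rule in reverse topological order, accumulating gradients where a value is used more than once (as `x` is above).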

Prerequisites

The course is targeted at advanced undergraduate and PhD-level students. Prerequisites include courses in:

Systems programming (15-213)
Linear algebra (21-240 or 21-241)
Basic mathematical background (21-127 or 15-151).

Students are required to be familiar with both Python and C/C++ programming. Some degree of previous familiarity with machine learning is likely to be necessary as well, though we do not have a specific prerequisite course here. The first homework will cover background needed for the course.

Assignments and project

The coursework for the class will consist primarily of programming assignments done as homework, with four major homework assignments (plus an introductory homework), and a final project. Through these four assignments, students will build a basic deep learning library, comparable to a very minimal version of PyTorch or TensorFlow, scalable to a reasonably-sized system (e.g., with fast GPU implementations of operations). Programming assignments must be done individually: though students are allowed to discuss the assignments with others, they must submit individual code.

The final project, which will be done in groups of 2-3 students, will consist of an implementation of a substantial new feature within the developed library, plus an implementation of a model using this feature (one that runs under the developed library rather than, e.g., within PyTorch/TensorFlow). We will provide several candidates for such features and modeling projects, including methods for further hardware acceleration, adversarial training, advanced autodiff operators (e.g., linear algebra operators like system solves / SVDs), probabilistic modeling, etc. In addition to the code, you will also submit a report as part of the project.

Generative AI (“ChatGPT”) Course Policy

You are welcome to use Generative AI systems such as ChatGPT for assistance in any and all coding assignments. It does not need to be explicitly cited or mentioned in your submission. However, all content you submit is ultimately your responsibility, and you are responsible for any issues or flaws that may be introduced by the use of such software.

Grades will be assigned according to the following breakdown:

55% Homework
35% Final Project
10% Class Participation (via course Forum)