Course information, Fall 2024

  • First class: 8/27
  • Lectures: TR 11:00-12:20 GHC 4401
  • Office Hours (beginning 8/27):

    Day        Time             Location  TA
    Monday     2pm-3pm          GHC 8228  Sid Sapra
    Monday     11:30am-12:30pm  Zoom      Kartik Khandelwal
    Tuesday    4pm-5pm          GHC 8115  Yoyo Chung
    Wednesday  3pm-4pm          Zoom      Shrey Gupta
    Wednesday  6:30pm-7:30pm    GHC 8115  Tiancheng Zhao
    Thursday   3pm-4pm          GHC 5113  Yonghao Zhuang
    Friday     6pm-7pm          GHC 8115  Utkarsh Priyam
    Saturday   5pm-6pm          GHC 8102  Shubhika Garg
    Saturday   6pm-7pm          GHC 8102  Bharathi

Online course information

Note that the online course will not be officially offered in 2024 (we are working on an autonomous offering, hopefully to be released later this year).

Introduction

Deep learning methods have revolutionized a number of fields in Artificial Intelligence and Machine Learning in recent years. The widespread adoption of deep learning methods has in no small part been driven by the availability of easy-to-use deep learning systems, such as PyTorch and TensorFlow. Despite the widespread use of these libraries, however, it is much less common for students to get involved with their internals and to understand how they function at a fundamental level. Understanding these libraries deeply will help you make better use of their functionality, and enable you to develop or extend them when needed to fit your own custom use cases in deep learning.

The goal of this course is to provide students an understanding and overview of the “full stack” of deep learning systems, ranging from the high-level modeling design of modern deep learning systems, to the basic implementation of automatic differentiation tools, to the underlying device-level implementation of efficient algorithms. Throughout the course, students will design and build from scratch a complete deep learning library, capable of efficient GPU-based operations, automatic differentiation of all implemented functions, and the necessary modules to support parameterized layers, loss functions, data loaders, and optimizers. Using these tools, students will then build several state-of-the-art modeling methods, including convolutional networks for image classification and segmentation, recurrent networks and self-attention models for sequential tasks such as language modeling, and generative models for image generation.

Prerequisites

The course is aimed at advanced undergraduate and PhD-level students. Prerequisites include courses in:

  1. Systems programming (15-213)
  2. Linear algebra (21-240 or 21-241)
  3. Basic mathematical background (21-127 or 15-151).

Students are required to be familiar with both Python and C/C++ programming. Some degree of previous familiarity with machine learning is likely to be necessary as well, though we do not have a specific prerequisite course here. The first homework will cover background needed for the course.

Assignments and project

The coursework for the class will consist primarily of programming assignments done as homework, with four major homework assignments (plus an introductory homework), and a final project. Through these four assignments, students will build a basic deep learning library, comparable to a very minimal version of PyTorch or TensorFlow, scalable to a reasonably-sized system (e.g., with fast GPU implementations of operations). Programming assignments must be done individually: though students are allowed to discuss the assignments with others, they must submit individual code.

The final project, which will be done in groups of 2-3 students, will consist of an implementation of a substantial new feature within the developed library, plus an implementation of a model using this feature (one that runs under the developed library rather than, e.g., within PyTorch/TensorFlow). We will provide several candidates for such features and modeling projects, including methods for further hardware acceleration, adversarial training, advanced autodiff operators (e.g., linear algebra operators like system solves / SVDs), probabilistic modeling, etc. In addition to the code, you will also submit a report as part of your project.

Generative AI (“ChatGPT”) Course Policy

You are welcome to use Generative AI systems such as ChatGPT for assistance in any and all coding assignments. It does not need to be explicitly cited or mentioned in your submission. However, all content you submit is ultimately your responsibility, and you are responsible for any issues or flaws that may be introduced by the use of such software.

Grades will be assigned according to the following breakdown:

  • 55% Homework
  • 35% Final Project
  • 10% Class Participation (via course Forum)