Course information, Fall 2023

  • First class: 8/29
  • Lectures: TR 11:00-12:20 Tepper 1403
  • Office Hours (beginning 8/30):

    Day        Time               Location  TA
    Sunday     5:00pm - 6:00pm    GHC 4301  Mingjie
    Monday     11:00am - 12:00pm  GHC 8228  Junhong
    Monday     7:30pm - 8:30pm    Zoom      Zhiyi
    Tuesday    2:00pm - 3:00pm    GHC 8228  Brandon
    Tuesday    3:00pm - 4:00pm    Zoom      Zhiyi
    Tuesday    7:00pm - 8:00pm    GHC 6121  Mingjie
    Wednesday  3:30pm - 4:30pm    GHC 8228  Brandon
    Thursday   9:30am - 10:30am   GHC 8228  Junhong

Online course information

Note that the online course will not be officially offered in 2023 (we are working on an autonomous offering, hopefully to be released later this year).


Deep learning methods have revolutionized a number of fields in Artificial Intelligence and Machine Learning in recent years. Their widespread adoption has in no small part been driven by the availability of easy-to-use deep learning systems, such as PyTorch and TensorFlow. Despite the widespread availability and use of these libraries, it is much less common for students to get involved with their internals and understand how they function at a fundamental level. Understanding these libraries deeply will help you make better use of their functionality, and enable you to develop or extend them when needed to fit your own custom use cases in deep learning.

The goal of this course is to provide students an understanding and overview of the “full stack” of deep learning systems, ranging from the high-level modeling design of modern deep learning systems, to the basic implementation of automatic differentiation tools, to the underlying device-level implementation of efficient algorithms. Throughout the course, students will design and build from scratch a complete deep learning library, capable of efficient GPU-based operations, automatic differentiation of all implemented functions, and the necessary modules to support parameterized layers, loss functions, data loaders, and optimizers. Using these tools, students will then build several state-of-the-art modeling methods, including convolutional networks for image classification and segmentation, recurrent networks and self-attention models for sequential tasks such as language modeling, and generative models for image generation.
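To give a flavor of what "automatic differentiation of all implemented functions" means, here is a minimal sketch of reverse-mode differentiation on scalars. It is an illustration only: the `Value` class and `backward` function are hypothetical names invented for this example, not the API of the library built in the course, and a real library operates on tensors with many more operators.

```python
class Value:
    """Hypothetical scalar node in a computation graph (illustration only)."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents    # nodes this value was computed from
        self.grad_fns = grad_fns  # one function per parent: output grad -> parent grad

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))


def backward(out):
    """Accumulate d(out)/d(node) into node.grad for every node feeding into out."""
    order, seen = [], set()

    def visit(node):
        # Depth-first traversal yields a topological order of the graph.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent in node.parents:
            visit(parent)
        order.append(node)

    visit(out)
    out.grad = 1.0
    # Walk from the output back to the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, grad_fn in zip(node.parents, node.grad_fns):
            parent.grad += grad_fn(node.grad)


x, w, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b
backward(y)
print(y.data, x.grad, w.grad, b.grad)  # 7.0 2.0 3.0 1.0
```

Each operator records how to pass a gradient back to its inputs, so gradients of any composed expression come for free once the individual operators are defined.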


The course targets advanced undergraduate and PhD-level students. Prerequisites include courses in:

  1. Systems programming (15-213)
  2. Linear algebra (21-240 or 21-241)
  3. Basic mathematical background (21-127 or 15-151).

Students are required to be familiar with both Python and C/C++ programming. Some degree of previous familiarity with machine learning is likely to be necessary as well, though we do not have a specific prerequisite course here. The first homework will cover background needed for the course.

Assignments and project

The coursework for the class will consist primarily of programming assignments done as homework: four major homework assignments (plus an introductory homework) and a final project. Through these four assignments, students will build a basic deep learning library, comparable to a very minimal version of PyTorch or TensorFlow, scalable to a reasonably-sized system (e.g., with fast GPU implementations of operations). Programming assignments must be done individually: though students are allowed to discuss the assignments with others, they must submit individual code.

The final project, which will be done in groups of 2-3 students, will consist of an implementation of a substantial new feature within the developed library, plus an implementation of a model using this feature (one that runs under the developed library rather than, e.g., within PyTorch/TensorFlow). We will provide several candidates for such features and modeling projects, including methods for further hardware acceleration, adversarial training, advanced autodiff operators (e.g., linear algebra operators like system solves / SVDs), probabilistic modeling, etc. In addition to the code, you will also submit a report as part of your assignment.

Generative AI (“ChatGPT”) Course Policy

You are welcome to use Generative AI systems such as ChatGPT for assistance in any and all coding assignments. It does not need to be explicitly cited or mentioned in your submission. However, all content you submit is ultimately your responsibility, and you are responsible for any issues or flaws that may be introduced by the use of such software.

Grades will be assigned according to the following breakdown:

  • 55% Homework
  • 35% Final Project
  • 10% Class Participation (via course Forum)
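
As a rough illustration of how this breakdown combines, the sketch below computes a weighted total. It assumes each component is scored on a 0-100 scale and combined linearly; the page does not state how components are scored or how totals map to letter grades, and the sample scores are made up.

```python
# Weights taken from the breakdown above.
WEIGHTS = {"homework": 0.55, "final_project": 0.35, "participation": 0.10}


def course_score(scores):
    """Weighted sum of component scores (assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores for illustration only.
example = {"homework": 90.0, "final_project": 80.0, "participation": 100.0}
total = course_score(example)
print(round(total, 1))  # 87.5
```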