Key course information

  • Lectures: MW 11:50am-1:10pm ET in HOA 160
  • Lecture live stream: Zoom link (requires CMU Zoom account)
  • Office Hours:
    • Tuesdays 2-3pm on Zoom
    • Tuesdays 6-7pm on Zoom
    • Wednesdays 6-7pm in GHC 8102
  • Lecture recordings: Canvas page


Deep learning methods have revolutionized a number of fields in Artificial Intelligence and Machine Learning in recent years. Their widespread adoption has in no small part been driven by the availability of easy-to-use deep learning systems, such as PyTorch and TensorFlow. Despite the popularity of these libraries, it is much less common for students to get involved with their internals and to understand how they function at a fundamental level. Understanding these libraries deeply will help you make better use of their functionality, and enable you to develop or extend them when needed to fit your own custom use cases in deep learning.

The goal of this course is to give students an understanding and overview of the “full stack” of deep learning systems, ranging from the high-level modeling design of modern deep learning systems, to the basic implementation of automatic differentiation tools, to the underlying device-level implementation of efficient algorithms. Throughout the course, students will design and build from scratch a complete deep learning library, capable of efficient GPU-based operations, automatic differentiation of all implemented functions, and the necessary modules to support parameterized layers, loss functions, data loaders, and optimizers. Using these tools, students will then build several state-of-the-art modeling methods, including convolutional networks for image classification and segmentation, recurrent networks and self-attention models for sequential tasks such as language modeling, and generative models for image generation.


The course targets advanced undergraduate and PhD-level students. Prerequisites include courses in:

  1. Systems programming (15-213)
  2. Linear algebra (21-240 or 21-241)
  3. Basic mathematical background (21-127 or 15-151)

Students are required to be familiar with both Python and C/C++ programming. Some previous familiarity with machine learning is likely to be necessary as well, though we do not have a specific prerequisite course here. The first homework will cover background needed for the course.

Assignments and project

The coursework for the class will consist primarily of programming assignments done as homework: four major homework assignments (plus an introductory homework) and a final project. Through these four assignments, students will build a basic deep learning library, comparable to a very minimal version of PyTorch or TensorFlow, scalable to a reasonably-sized system (e.g., with fast GPU implementations of operations). Programming assignments must be done individually: though students are allowed to discuss the assignments with others, they must submit individual code.

The final project, which will be done in groups of 2-3 students, will consist of an implementation of a substantial new feature within the developed library, plus an implementation of a model using this feature (one that runs under the developed library, not, e.g., within PyTorch/TensorFlow). We will provide several candidates for such features and modeling projects, including methods for further hardware acceleration, adversarial training, advanced autodiff operators (e.g., linear algebra operators like system solves / SVDs), probabilistic modeling, etc. In addition to the code, you will also submit a report as part of the project.

Grades will be assigned according to the following breakdown:

  • 55% Homework
  • 35% Final Project
  • 10% Class Participation (via course Forum)
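
As a rough illustration of how the breakdown above combines into a single number, here is a minimal Python sketch. The weights come from the list above; everything else is an assumption made for illustration: the names `WEIGHTS` and `course_grade`, the 0-100 scale, and the idea that homework is supplied as one aggregate score (how individual homeworks are weighted within the 55% is not stated here).

```python
# Weights taken from the grade breakdown above.
# The component names are illustrative, not official.
WEIGHTS = {
    "homework": 0.55,
    "final_project": 0.35,
    "participation": 0.10,
}


def course_grade(scores):
    """Return the weighted average of component scores.

    Assumes each score is on a 0-100 scale and that `scores`
    has one entry per component in WEIGHTS.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90.0, "final_project": 80.0, "participation": 100.0}
print(round(course_grade(example), 2))
```

With these hypothetical scores the result is about 87.5, i.e. 0.55 × 90 + 0.35 × 80 + 0.10 × 100.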