Resources | AIRE Curriculum

📖 14 books 🎓 6 courses ▶ 3 videos ◎ 4 online resources

All Resources

Resource	Authors	Type	Role	Phase(s)	Progress
Axler Linear Algebra Done Right	Sheldon Axler	book	primary	1	—
Strang Introduction to Linear Algebra	Gilbert Strang	book	secondary	1	—
MIT 18.06 MIT 18.06 Linear Algebra	Gilbert Strang	video	secondary	1	—
Hubbard & Hubbard Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach	John H. Hubbard, Barbara Burke Hubbard	book	primary	1	—
Spivak Calculus on Manifolds	Michael Spivak	book	alternative	1	—
Blitzstein & Hwang Introduction to Probability	Joseph Blitzstein, Jessica Hwang	book	primary	1	—
Harvard Stat 110 Harvard Stat 110: Probability	Joseph Blitzstein	video	secondary	1	—
Wasserman All of Statistics: A Concise Course in Statistical Inference	Larry Wasserman	book	reference	1	—
Skiena The Algorithm Design Manual	Steven Skiena	book	primary	2	—
CS:APP Computer Systems: A Programmer's Perspective	Randal E. Bryant, David R. O'Hallaron	book	primary	2	—
PRML Pattern Recognition and Machine Learning	Christopher Bishop	book	primary	3	—
ESL The Elements of Statistical Learning	Trevor Hastie, Robert Tibshirani, Jerome Friedman	book	alternative	3	—
Shalev-Shwartz & Ben-David Understanding Machine Learning: From Theory to Algorithms	Shai Shalev-Shwartz, Shai Ben-David	book	reference	3	—
Prince Understanding Deep Learning	Simon Prince	book	primary	4	—
Nielsen Neural Networks and Deep Learning	Michael Nielsen	online	secondary	4	—
CS25 Stanford CS25: Transformers United	Div Garg, Steven Feng, et al.	course	primary	4	—
Illustrated Transformer The Illustrated Transformer	Jay Alammar	online	secondary	4	—
Let's build GPT Let's build GPT	Andrej Karpathy	online	secondary	4	—
CMU 10-714 CMU 10-714: Deep Learning Systems	J. Zico Kolter, Tianqi Chen	course	primary	5	—
GPU Mode GPU Mode	Mark Saroufim, Andreas Köpf	video	primary	5	—
Hwu & Kirk Programming Massively Parallel Processors	David B. Kirk, Wen-mei W. Hwu	book	secondary	5	—
CS324 Stanford CS324: Large Language Models	Tatsunori Hashimoto, Percy Liang	course	primary	6	—
COS 597G Princeton COS 597G: Understanding Large Language Models	Danqi Chen	course	alternative	6	—
Sutton & Barto Reinforcement Learning: An Introduction	Richard S. Sutton, Andrew G. Barto	book	primary	6	—
Spinning Up OpenAI Spinning Up in Deep RL	OpenAI	course	secondary	6	—
CS285 UC Berkeley CS285: Deep Reinforcement Learning	Sergey Levine	course	reference	6	—
CleanRL CleanRL	Costa Huang	online	primary	6	—
DPO Paper Direct Preference Optimization: Your Language Model is Secretly a Reward Model	Rafael Rafailov, Archit Sharma, Eric Mitchell, et al.	paper	primary	6	—

📖 Books

Primary textbooks and reference materials

primary

Axler

Linear Algebra Done Right

by Sheldon Axler

Axler's text is renowned for its decision to defer determinants to the end of the book. This forces the student to understand linear maps, eigenvalues, and inner product spaces through their geometric properties rather than through algebraic formulas. This "operator-centric" view aligns well with modern deep learning, where layers are viewed as operators acting on function spaces. It builds the mental models needed for concepts like Low-Rank Adaptation (LoRA) and the spectral properties of weight matrices, which matter for model stability and compression.

Phase 1

secondary

Strang

Introduction to Linear Algebra

by Gilbert Strang

While Axler provides rigor, Strang provides the connection to computation. His focus on the "Four Fundamental Subspaces" provides a concrete mental image of how matrices manipulate data.

Phase 1

primary

Hubbard & Hubbard

Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach

by John H. Hubbard, Barbara Burke Hubbard

This book is legendary among mathematics enthusiasts for treating the derivative not just as a number or a vector, but as a linear transformation (represented by the Jacobian matrix) that best approximates a function near a point. This viewpoint matches how automatic differentiation engines such as PyTorch's autograd operate: reverse mode computes vector-Jacobian products and forward mode computes Jacobian-vector products, without materializing the full Jacobian. The book integrates linear algebra and calculus seamlessly, which is how they appear in machine learning. Its proofs help a researcher understand *when* optimization might fail (e.g., at non-differentiable points such as ReLU at 0, or at saddle points).
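
As a rough illustration of the "derivative as a linear map" idea, the sketch below uses plain Python and a hand-written Jacobian for a small made-up function `f` (the function, the point `p`, and all helper names are illustrative assumptions, not taken from the book or from PyTorch). It checks that `f(p + h)` is close to `f(p) + J h` for a small step `h`, and contrasts a Jacobian-vector product (forward mode) with a vector-Jacobian product (reverse mode).

```python
# Hypothetical example function f: R^2 -> R^2, f(x, y) = (x*y, x + y^2).
def f(p):
    x, y = p
    return [x * y, x + y * y]


def jacobian(p):
    # Hand-derived Jacobian of f at p (rows = outputs, columns = inputs).
    x, y = p
    return [[y, x],
            [1.0, 2.0 * y]]


def jvp(J, v):
    # Jacobian-vector product J @ v: pushes an input perturbation forward.
    return [sum(J[i][j] * v[j] for j in range(len(v))) for i in range(len(J))]


def vjp(u, J):
    # Vector-Jacobian product u^T @ J: pulls an output sensitivity backward.
    return [sum(u[i] * J[i][j] for i in range(len(u))) for j in range(len(J[0]))]


p = [2.0, 3.0]
h = [1e-6, -2e-6]          # small step away from p
J = jacobian(p)

exact = f([p[0] + h[0], p[1] + h[1]])
linear = [a + b for a, b in zip(f(p), jvp(J, h))]

# The linear approximation error shrinks like |h|^2.
err = max(abs(a - b) for a, b in zip(exact, linear))
print("approximation error:", err)

# Pulling back the sensitivity [1, 0] recovers the gradient of the first output.
print("gradient of first output:", vjp([1.0, 0.0], J))
```

Real autodiff engines never build `J` explicitly; they compose these products operation by operation, which is why the linear-map view is the useful one.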

Phase 1

alternative

Spivak

Calculus on Manifolds

by Michael Spivak

A concise, dense classic. Although Spivak's text is elegant, Hubbard & Hubbard is generally preferred for self-study because it is more explanatory and takes a unified approach.

Phase 1

primary

Blitzstein & Hwang

Introduction to Probability

by Joseph Blitzstein, Jessica Hwang

Based on the famous Harvard Stat 110 course. This book is unrivaled in building *intuition*. It emphasizes "story proofs"—understanding *why* a formula works through narrative logic rather than algebraic manipulation.

Phase 1

reference

Wasserman

All of Statistics: A Concise Course in Statistical Inference

by Larry Wasserman

This book covers a massive amount of ground very quickly, from basic probability to VC dimension and bootstrapping. It is an excellent bridge to *The Elements of Statistical Learning*.

Phase 1

primary

Skiena

The Algorithm Design Manual

by Steven Skiena

Unlike the standard *Introduction to Algorithms* (CLRS), which is encyclopedic and theoretical, Skiena's book focuses on the *design* process and practical "war stories." It teaches you how to recognize a problem type and select the right tool, which is critical for research interviews and actual engineering work.

Phase 2

primary

CS:APP

Computer Systems: A Programmer's Perspective

by Randal E. Bryant, David R. O'Hallaron

This is the standard text for understanding how software interacts with hardware.

Phase 2

primary

PRML

Pattern Recognition and Machine Learning

by Christopher Bishop

This book is the gold standard for the **Bayesian** perspective. It explains regularization not just as a heuristic, but as a prior belief on the model parameters.
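
To make the "regularization as a prior" point concrete, here is a minimal sketch for a one-parameter linear model `y = w * x + noise`. It assumes Gaussian noise with variance `sigma2` and a zero-mean Gaussian prior on `w` with variance `tau2`; the data and function names are invented for illustration. Under those assumptions the MAP estimate coincides with ridge regression with penalty `lam = sigma2 / tau2`.

```python
def map_estimate(xs, ys, sigma2, tau2):
    # Closed-form minimizer of the negative log posterior below.
    # Equivalent to ridge regression with penalty lam = sigma2 / tau2.
    lam = sigma2 / tau2
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def neg_log_posterior(w, xs, ys, sigma2, tau2):
    # Gaussian likelihood term plus Gaussian prior term (constants dropped).
    likelihood = sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / (2.0 * sigma2)
    prior = w * w / (2.0 * tau2)
    return likelihood + prior


# Toy data, made up for this sketch.
xs = [1.0, 2.0, 3.0]
ys = [1.1, 1.9, 3.2]

w_map = map_estimate(xs, ys, sigma2=1.0, tau2=0.5)           # informative prior
w_mle = map_estimate(xs, ys, sigma2=1.0, tau2=float("inf"))  # flat prior, lam = 0

print("MAP estimate:", w_map)
print("MLE estimate:", w_mle)
```

A tighter prior (smaller `tau2`) means a larger penalty and a weight pulled closer to zero, which is exactly the shrinkage behaviour usually introduced as a heuristic.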

Phase 3

alternative

ESL

The Elements of Statistical Learning

by Trevor Hastie, Robert Tibshirani, Jerome Friedman

This text is more "frequentist" and statistical, excellent for understanding the bias-variance tradeoff and decision trees.

Phase 3

reference

Shalev-Shwartz & Ben-David

Understanding Machine Learning: From Theory to Algorithms

by Shai Shalev-Shwartz, Shai Ben-David

This book is mathematically dense and focuses on **PAC Learning** (Probably Approximately Correct). It answers the fundamental question: "Under what conditions is learning even possible?"
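
One concrete answer to that question is the textbook sample-complexity bound for a finite hypothesis class in the realizable setting: roughly `ln(|H| / delta) / epsilon` examples suffice for error at most `epsilon` with probability at least `1 - delta`. The helper below is only a sketch of that formula; the function name and the example numbers are assumptions chosen for illustration.

```python
import math


def pac_sample_bound(h_size, epsilon, delta):
    # Sufficient number of i.i.d. examples for a finite hypothesis class of
    # size h_size, assuming some hypothesis in the class has zero true error
    # (the realizable case): m >= ln(h_size / delta) / epsilon.
    return math.ceil(math.log(h_size / delta) / epsilon)


# Example: 1000 hypotheses, 10% error tolerance, 95% confidence.
m = pac_sample_bound(h_size=1000, epsilon=0.1, delta=0.05)
print("examples needed:", m)
```

The bound grows only logarithmically in the size of the class and in `1 / delta`, but linearly in `1 / epsilon`, so demanding higher accuracy is what drives the data requirement.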

Phase 3

primary

Prince

Understanding Deep Learning

by Simon Prince

While Goodfellow's *Deep Learning* (2016) is a classic, it predates the Transformer revolution. Prince's book is modern, visually intuitive, and covers Transformers, Diffusion, and Generative AI. It is the superior choice for a student starting in 2025.

Phase 4

secondary

Hwu & Kirk

Programming Massively Parallel Processors

by David B. Kirk, Wen-mei W. Hwu

Phase 5

primary

Sutton & Barto

Reinforcement Learning: An Introduction

by Richard S. Sutton, Andrew G. Barto

The foundational text of the field.

Phase 6

🎓 Courses

University courses and MOOCs

primary ↗

CS25

Stanford CS25: Transformers United

by Div Garg, Steven Feng, et al.

Phase 4

primary ↗

CMU 10-714

CMU 10-714: Deep Learning Systems

by J. Zico Kolter, Tianqi Chen

This is arguably the most valuable course for an aspiring research engineer (RE). You build a deep learning library (called "Needle") from scratch.

Phase 5

primary ↗

CS324

Stanford CS324: Large Language Models

by Tatsunori Hashimoto, Percy Liang

Phase 6

alternative ↗

COS 597G

Princeton COS 597G: Understanding Large Language Models

by Danqi Chen

Phase 6

secondary ↗

Spinning Up

OpenAI Spinning Up in Deep RL

by OpenAI

Although the original repo is dated, forks and modern implementations such as CleanRL are the best way to learn PPO, DQN, and SAC.

Phase 6

reference ↗

CS285

UC Berkeley CS285: Deep Reinforcement Learning

by Sergey Levine

Phase 6

▶ Videos

Video lectures and tutorials

secondary ↗

MIT 18.06

MIT 18.06 Linear Algebra

by Gilbert Strang

Phase 1

secondary ↗

Harvard Stat 110

Harvard Stat 110: Probability

by Joseph Blitzstein

Phase 1

primary ↗

GPU Mode

by Mark Saroufim, Andreas Köpf

Practical, modern GPU optimization. Community-driven resource with lectures, reading groups, and an extensive collection of CUDA/GPU programming materials.

Phase 5

◎ Online Resources

Online articles, tutorials, and interactive resources

secondary ↗

Nielsen

Neural Networks and Deep Learning

by Michael Nielsen

For a gentle introduction to backpropagation.

Phase 4

secondary ↗

Illustrated Transformer

The Illustrated Transformer

by Jay Alammar

Phase 4

secondary ↗

Let's build GPT

by Andrej Karpathy

Phase 4

primary ↗

CleanRL

by Costa Huang

Single-file implementations of Deep Reinforcement Learning algorithms. This is the modern standard for learning RL implementation details.

Phase 6

Resources by Phase

See which resources are used in each curriculum phase

1 The Mathematical Substrate 8 resources

primary Axler — Linear Algebra Done Right
secondary Strang — Introduction to Linear Algebra
secondary MIT 18.06 — MIT 18.06 Linear Algebra ↗
primary Hubbard & Hubbard — Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach
alternative Spivak — Calculus on Manifolds
primary Blitzstein & Hwang — Introduction to Probability
secondary Harvard Stat 110 — Harvard Stat 110: Probability ↗
reference Wasserman — All of Statistics: A Concise Course in Statistical Inference

2 CS Fundamentals & Systems 2 resources

primary Skiena — The Algorithm Design Manual
primary CS:APP — Computer Systems: A Programmer's Perspective

3 Classical ML Theory 3 resources

primary PRML — Pattern Recognition and Machine Learning
alternative ESL — The Elements of Statistical Learning
reference Shalev-Shwartz & Ben-David — Understanding Machine Learning: From Theory to Algorithms

4 Deep Learning 5 resources

primary Prince — Understanding Deep Learning
secondary Nielsen — Neural Networks and Deep Learning ↗
primary CS25 — Stanford CS25: Transformers United ↗
secondary Illustrated Transformer — The Illustrated Transformer ↗
secondary Let's build GPT — Let's build GPT ↗

5 Frontier Systems 3 resources

primary CMU 10-714 — CMU 10-714: Deep Learning Systems ↗
primary GPU Mode — GPU Mode ↗
secondary Hwu & Kirk — Programming Massively Parallel Processors

6 Frontier Research Topics 7 resources

primary CS324 — Stanford CS324: Large Language Models ↗
alternative COS 597G — Princeton COS 597G: Understanding Large Language Models ↗
primary Sutton & Barto — Reinforcement Learning: An Introduction
secondary Spinning Up — OpenAI Spinning Up in Deep RL ↗
reference CS285 — UC Berkeley CS285: Deep Reinforcement Learning ↗
primary CleanRL — CleanRL ↗
primary DPO Paper — Direct Preference Optimization: Your Language Model is Secretly a Reward Model ↗

7 Research & Portfolio 0 resources

Recommended Resource Summary

Primary resources for each phase (from the curriculum conclusion)

Topic	Recommended Resource	Reasoning
Linear Algebra	Linear Algebra Done Right (Axler)	Geometric intuition for latent spaces.
Calculus	Hubbard & Hubbard	Rigorous treatment of Jacobian/Hessian.
Probability	Introduction to Probability (Blitzstein)	Best intuition for random variables.
Algorithms	The Algorithm Design Manual (Skiena)	Practical design focus over theory.
ML Theory	Pattern Recognition & ML (Bishop)	Bayesian foundation is essential.
Deep Learning	Understanding Deep Learning (Prince)	Most up-to-date (Transformers/GenAI).
Systems	CMU 10-714 (Needle)	Build a framework from scratch.
RL	Sutton & Barto	The foundational text of the field.
GPU/CUDA	GPU Mode (formerly CUDA Mode, YouTube)	Practical, modern GPU optimization.