Quickstart Guide: Feature learning and the final network weights

Part 5 of A Quickstart Guide to Learning Mechanics

The Learning Mechanics Team
2025-09-01

[TO BE WRITTEN]

this one’s hard. it’s the subject of a lot of speculation, and a lot of people want to know the answer. it’s been the subject of a lot of mechinterp work, but it’s proven quite hard to get at analytically.

deep linear nets
single- / multi-index model theory
recursive feature machines (RFMs)
complicated calculational tools at large width: dynamical mean-field theory (DMFT), Tensor Programs (TP), etc.

A Quickstart Guide to Learning Mechanics

Introduction: asking a specific question
The average size of hidden representations
Hyperparameter selection (and why theorists should care)
The dynamics of optimization
🚧 Feature learning and the final network weights
🚧 Generalization
🚧 Neuron-level sparsity
🚧 The structure in the data
🚧 Places to make a difference
