Quickstart Guide: Feature learning and the final network weights


[TO BE WRITTEN]

This one’s hard. It’s the subject of a lot of speculation, and a lot of people want to know the answer. It’s been the focus of a lot of mechanistic interpretability work, but it’s proven quite hard to get at analytically.


A Quickstart Guide to Learning Mechanics

  1. Introduction: asking a specific question
  2. The average size of hidden representations
  3. Hyperparameter selection (and why theorists should care)
  4. The dynamics of optimization
  5. 🚧 Feature learning and the final network weights
  6. 🚧 Generalization
  7. 🚧 Neuron-level sparsity
  8. 🚧 The structure in the data
  9. 🚧 Places to make a difference
