Open Question 3.16
What's even going on with Adam?
What scaling relationships apply to Adam’s \(\beta\) or \(\epsilon\) hyperparameters?
This is a discussion page for the open question above. Feel free to share ideas, approaches, or relevant research in the comments below.
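As background for the question, here is a minimal sketch of one standard Adam update with bias correction, written in plain Python for a single scalar parameter. It is only meant to show where \(\beta\) (split here into `beta1` and `beta2`) and \(\epsilon\) enter the update. The function name `adam_step` and the default values are illustrative assumptions, not something taken from this page.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (illustrative sketch).

    m, v are the running first- and second-moment estimates;
    t is the 1-based step count used for bias correction.
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction compensates for m and v being initialised at zero.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # eps sits outside the square root and sets the scale below which
    # the update stops being normalised by the gradient magnitude.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# First step from zero state with gradients of very different magnitude.
for g in (1.0, 100.0, 1e-12):
    new_param, _, _ = adam_step(0.0, g, 0.0, 0.0, t=1)
    print(f"grad={g:g}  step={-new_param:.3e}")
```

On the first step from zero state, the update reduces to roughly `lr * grad / (abs(grad) + eps)`. Gradients of 1.0 and 100.0 therefore produce almost the same step, while a gradient far below `eps` produces a much smaller one. This is one concrete way in which \(\epsilon\) interacts with the gradient scale.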
Discussion