Open Question 3.16

What's even going on with Adam?


Open Question 3.16: What's even going on with Adam? What scaling relationships apply to Adam’s \(\beta\) or \(\epsilon\) hyperparameters?
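For readers who want the notation pinned down, here is a minimal sketch of one Adam update on a single scalar parameter, showing where \(\beta\) (split into the usual `beta1` and `beta2`) and \(\epsilon\) (`eps`) enter. The function names `adam_step` and `first_step_ratio` are hypothetical helpers for illustration, and the default values are the commonly used ones, not a recommendation. This is not an answer to the question, only a reference point for discussing it.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad          # EMA of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad   # EMA of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for m
    v_hat = v / (1.0 - beta2 ** t)                # bias correction for v
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def first_step_ratio(grad, eps=1e-8):
    """Size of the first step relative to lr, starting from zero moments."""
    new_param, _, _ = adam_step(0.0, grad, m=0.0, v=0.0, t=1, lr=1.0, eps=eps)
    return abs(new_param)


if __name__ == "__main__":
    # Assumed illustrative gradient magnitudes, not measured values.
    for g in (1.0, 1e-4, 1e-8, 1e-12):
        print(f"grad={g:.0e}  step/lr={first_step_ratio(g):.4f}")
```

On the first step the update reduces to roughly \(\mathrm{lr} \cdot g / (|g| + \epsilon)\), so the step size is close to `lr` when \(|g| \gg \epsilon\), about half of `lr` when \(|g| = \epsilon\), and shrinks toward zero below that. This is one concrete reason a fixed `eps` may interact with anything that changes gradient magnitudes.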

This is a discussion page for the open question above. Feel free to share ideas, approaches, or relevant research in the comments below.
