In this article, we go through the Stochastic Gradient Descent with Warm Restarts (SGDR) paper. We analyze how the SGDR technique helps deep neural networks train and converge much faster than other learning rate scheduling techniques. ...
Stochastic Gradient Descent with Warm Restarts: Paper Explanation
