#concept

What is gradient descent? The cycle of a forward pass, backward pass, then parameter update, repeated to minimize loss. It is a method for minimizing the loss of a neural network by iteratively adjusting its weights and biases: each parameter is updated in the direction that most decreases the loss.
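A minimal sketch of that loop in plain Python/NumPy, fitting a 1-D linear model. The toy data, learning rate `lr`, and parameter names `w`, `b` are illustrative assumptions, not taken from the references below.

```python
import numpy as np

# Toy data: y ≈ 2x + 1 (illustrative assumption)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

w, b = 0.0, 0.0   # parameters: weight and bias
lr = 0.1          # learning rate (step size)

for step in range(200):
    # Forward pass: predict, then measure mean squared error loss
    y_hat = w * x + b
    loss = np.mean((y_hat - y) ** 2)

    # Backward pass: gradient of the loss w.r.t. each parameter
    dw = np.mean(2 * (y_hat - y) * x)
    db = np.mean(2 * (y_hat - y))

    # Update: move each parameter opposite its gradient
    w -= lr * dw
    b -= lr * db

print(w, b)  # converges toward w ≈ 2, b ≈ 1
```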

References

  1. Build micrograd, a scalar-valued neural network, with Andrej Karpathy

Notes

Mathematically, gradient descent is an iterative first-order algorithm for finding a local minimum of a differentiable multivariate function. See the CS229 backpropagation notes: https://github.com/maxim5/cs229-2018-autumn/blob/main/notes/cs229-notes-backprop.pdf
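The standard update rule, written with generic symbols (θ for the parameters, α for the learning rate, J for the loss; this notation is mine, not the linked notes'):

$$\theta_{t+1} = \theta_t - \alpha \, \nabla_{\theta} J(\theta_t)$$

The gradient points in the direction of steepest increase of J, so stepping against it decreases the loss fastest, locally.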