What is gradient descent? A method for minimizing the loss of a neural network by iteratively adjusting its weights and biases. Each iteration runs a forward pass to compute the loss, a backward pass to compute the gradient of the loss with respect to every parameter, and then an update that moves each parameter a small step against its gradient, which is the direction in which the loss decreases fastest locally.
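Mathematically, it is an iterative first-order algorithm for finding a local minimum of a differentiable multivariate function f: starting from an initial point, it repeatedly applies theta <- theta - eta * grad f(theta), where eta > 0 is the learning rate (step size). Reference: https://github.com/maxim5/cs229-2018-autumn/blob/main/notes/cs229-notes-backprop.pdf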
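To make the iteration concrete, here is a minimal sketch in plain Python. It is not taken from the linked notes: the function f(x, y) = (x - 3)^2 + (y + 1)^2, the names gradient_descent, f, and grad_f, and the learning rate and step count are all assumptions chosen for illustration. The gradient is written out by hand here; in a neural network it would come from the backward pass.

```python
def gradient_descent(grad, theta, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient:
    theta <- theta - learning_rate * grad(theta)."""
    theta = list(theta)
    for _ in range(steps):
        g = grad(theta)  # gradient at the current point
        theta = [t - learning_rate * gi for t, gi in zip(theta, g)]
    return theta


# Hypothetical example function, minimized at (3, -1).
def f(theta):
    x, y = theta
    return (x - 3) ** 2 + (y + 1) ** 2


# Its gradient, derived by hand: (2(x - 3), 2(y + 1)).
def grad_f(theta):
    x, y = theta
    return [2 * (x - 3), 2 * (y + 1)]


start = [0.0, 0.0]
minimum = gradient_descent(grad_f, start)
print(minimum, f(minimum))
```

With these assumed settings each step shrinks the distance to (3, -1) by a constant factor, so the result ends up very close to the true minimum. A learning rate that is too large can overshoot and diverge, and on a non-convex function the same loop only finds a local minimum that depends on the starting point.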