Optimization - aims at reducing the loss and producing the most accurate results possible
- Weights - initialized with an initialization strategy and updated after each epoch/iteration according to the optimizer's update rule
https://cs231n.github.io/neural-networks-3/#hyper
https://dataaspirant.com/optimization-algorithms-deep-learning/
https://medium.com/@minions.k/optimization-techniques-popularly-used-in-deep-learning-3c219ec8e0cc
(Batch) Gradient Descent
- Aims to find the global minimum of the loss
- The entire dataset is used to compute each update
- May get stuck at a local minimum
- → computationally intensive (see the sketch below)
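A minimal NumPy sketch of batch gradient descent on a toy least-squares problem; the data, learning rate, and epoch count are illustrative assumptions, not from these notes.

```python
import numpy as np

# Toy linear-regression data (illustrative assumption)
X = np.random.randn(200, 3)
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * np.random.randn(200)

w = np.zeros(3)   # weight initialization
lr = 0.1          # learning rate (eta)

for epoch in range(100):
    # Gradient of the mean squared error computed on the ENTIRE dataset
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad   # w := w - eta * dL/dw

print(w)  # approaches true_w
```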
Stochastic Gradient Descent
- Computes the gradient on one sample at a time
- Updates are noisy → slow to converge to the minimum (sketch below)
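A sketch of stochastic gradient descent on the same kind of toy problem: the gradient comes from a single example per step, so updates are cheap but noisy. Data and hyperparameters are assumptions.

```python
import numpy as np

X = np.random.randn(200, 3)
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * np.random.randn(200)

w = np.zeros(3)
lr = 0.01

for epoch in range(20):
    for i in np.random.permutation(len(y)):
        xi, yi = X[i], y[i]
        # Gradient from a SINGLE example -> cheap but noisy update
        grad = 2 * xi * (xi @ w - yi)
        w -= lr * grad

print(w)
```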
Mini-Batch SGD
- Computes each update on a small batch of samples → compromise between batch GD and SGD: less noisy than SGD, cheaper per step than full-batch (sketch below)
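A mini-batch SGD sketch under the same assumed toy problem; the batch size of 32 and other hyperparameters are illustrative choices.

```python
import numpy as np

X = np.random.randn(200, 3)
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * np.random.randn(200)

w = np.zeros(3)
lr = 0.05
batch_size = 32

for epoch in range(50):
    idx = np.random.permutation(len(y))
    for start in range(0, len(y), batch_size):
        b = idx[start:start + batch_size]
        Xb, yb = X[b], y[b]
        # Gradient averaged over a small batch: less noisy than pure SGD,
        # cheaper per step than full-batch gradient descent
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(yb)
        w -= lr * grad

print(w)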
Adaptive Optimization techniques
Use statistics (e.g. accumulated gradients) from previous iterations to speed up convergence
Momentum-based optimizer
- Uses an exponentially weighted average of gradients over previous iterations to stabilize convergence → faster optimization
- Update rule: v_t = γ · v_{t-1} + η · ∇_θ J(θ), then θ = θ - v_t, where γ is the fraction of the previous iteration's update that is carried over
- The momentum term grows when successive gradients point in the same direction and shrinks when gradients fluctuate in direction (sketch below)
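A sketch of the momentum update on the assumed toy problem; gamma = 0.9 and the other hyperparameters are illustrative, not from these notes.

```python
import numpy as np

X = np.random.randn(200, 3)
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * np.random.randn(200)

w = np.zeros(3)
v = np.zeros(3)   # velocity: exponentially weighted gradient history
lr = 0.05         # eta
gamma = 0.9       # momentum coefficient

for epoch in range(100):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    # v_t = gamma * v_{t-1} + eta * grad ; w := w - v_t
    v = gamma * v + lr * grad
    w -= v

print(w)
```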