Gradient Descent & Learning Rate
IME 775 — Chapter 8: Visualizing the optimization process
Loss Function (select one; the quadratic is the active default):
  L(w) = (w - 2)² + 1
  L(w) = sin(w) + w²/10 + 1
  L(w) = tanh²(w - 2) + 0.5
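Each surface has a closed-form derivative that drives the updates. A minimal sketch in Python; the function names are illustrative, and the derivatives follow from standard calculus rather than from anything shown in the demo:

import math

def quadratic(w):       # L(w) = (w - 2)² + 1
    return (w - 2)**2 + 1

def quadratic_grad(w):  # ∂L/∂w = 2(w - 2)
    return 2 * (w - 2)

def sine_bowl(w):       # L(w) = sin(w) + w²/10 + 1
    return math.sin(w) + w**2 / 10 + 1

def sine_bowl_grad(w):  # ∂L/∂w = cos(w) + w/5
    return math.cos(w) + w / 5

def tanh_well(w):       # L(w) = tanh²(w - 2) + 0.5
    return math.tanh(w - 2)**2 + 0.5

def tanh_well_grad(w):  # ∂L/∂w = 2·tanh(w - 2)·(1 - tanh²(w - 2))
    t = math.tanh(w - 2)
    return 2 * t * (1 - t**2)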
Controls: a Learning Rate slider (r = 0.100), an Initial w field (w₀ = -2.0), and Run Gradient Descent / Step / Reset buttons.
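Step presumably applies one update of the standard rule w ← w - r·∂L/∂w, and Run Gradient Descent repeats it. A self-contained sketch under that assumption, wired to the demo's defaults (quadratic loss, r = 0.1, w₀ = -2.0); the function name and step count are hypothetical:

def gradient_descent(loss, grad, w0=-2.0, r=0.1, steps=10):
    w = w0
    for i in range(steps):
        g = grad(w)
        print(f"iter {i:2d}: w = {w:7.3f}  L(w) = {loss(w):7.3f}  dL/dw = {g:7.3f}")
        w -= r * g  # one gradient-descent step: move against the gradient
    return w

# Defaults: L(w) = (w - 2)² + 1, so ∂L/∂w = 2(w - 2).
gradient_descent(lambda w: (w - 2)**2 + 1, lambda w: 2 * (w - 2))

Its first printed line reproduces the readout below: at iteration 0, w = -2.000, L(w) = 17.000, dL/dw = -8.000.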
[Figure: two panels, Loss Surface L(w) and Loss vs Iteration.]
Readout at the initial state:

Iteration | w      | L(w)   | ∂L/∂w  | Status
0         | -2.000 | 17.000 | -8.000 | Ready
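From this state, one Step gives w₁ = w₀ - r·(∂L/∂w) = -2.000 - 0.100·(-8.000) = -1.200, and the loss falls from 17.000 to L(-1.2) = (-3.2)² + 1 = 11.240. For the quadratic surface each step scales the error by 1 - 2r (since w - 2 → (1 - 2r)(w - 2)), so the iterates converge for 0 < r < 1 and diverge once r exceeds 1, which is exactly the behavior the learning-rate slider lets you explore.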