Before the lecture
If you like videos, watch as many as you can of the 3Blue1Brown videos on neural networks.
- If you need to brush up on your linear algebra: Hui’s Medium blog post
The lecture
After the lecture
- Ruder’s overview of gradient-descent-based optimizers (SGD, momentum, Adam, etc.)
- Try re-deriving by yourself, on a sheet of paper, all the maths covered in the lecture
- Overview of optimization from the famous “deeplearningbook”
- In-depth explanation of momentum on distill.pub