I am going to be controversial here, because my advice runs contrary to most people's, but it may be useful to some. The usual advice is to code up an MNIST CNN and watch it work its magic; the trouble is that it may well remain magic to you, which is why taking a step back first is a useful piece of advice.
There are three critical cornerstones to deep learning:
Linear Algebra (up to about Eigenvalues)
Multivariate Calculus (up to chain rule, partials)
Probability (Some understanding of Bayes and Distribution functions)
If you can compute eigenvalues (or even understand them), do a little differentiation and know a little Bayesian statistics you’ll go a long way, fast.
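To make the linear algebra item concrete, here is a minimal NumPy sketch (NumPy comes up again at stage II anyway); the matrix is my own toy example, chosen so the answer is easy to verify by hand:

```python
import numpy as np

# A small symmetric matrix whose eigenvalues you can check by hand:
# the trace is 4 and the determinant is 3, so the eigenvalues are 3 and 1.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # 3 and 1, in some order
```

If you can predict that output before running it, you already have the level of eigenvalue fluency I am talking about.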
This premise is based on the idea that the theory gives great insight into how things actually work.
Start with linear and logistic regression implemented as a single neuron. Given that logistic regression uses a differentiable function (hence enabling backprop), once you understand how these simple “architectures” (for want of a better word) work, you have really covered half of deep learning.
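To make that concrete, here is a minimal sketch (my own toy example, in plain NumPy rather than any framework) of logistic regression as a single neuron trained by gradient descent. Notice how the chain rule does all the work: because the sigmoid is differentiable, the gradient of the cross-entropy loss collapses to a one-liner.

```python
import numpy as np

# Toy, linearly separable data: the label is simply "is x positive?".
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y = (x[:, 0] > 0).astype(float)

w, b = 0.0, 0.0   # the weight and bias of our single neuron
lr = 0.5          # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(500):
    p = sigmoid(x[:, 0] * w + b)   # forward pass
    # Backward pass: for sigmoid + cross-entropy, the chain rule
    # reduces d(loss)/dz to simply (p - y).
    dz = p - y
    w -= lr * np.mean(dz * x[:, 0])
    b -= lr * np.mean(dz)

accuracy = np.mean((sigmoid(x[:, 0] * w + b) > 0.5) == (y > 0.5))
print(w, b, accuracy)
```

This really is half of deep learning in miniature: a forward pass, a loss, a gradient from the chain rule, and a parameter update.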
So at stage II you should be able to write some TensorFlow (you will also need to pick up some pandas and NumPy) to solve the above problems. In fact, by grabbing the weights and bias (y = mx + b) you can solve simple regression problems in a few lines of code; it is not rocket science. Once you move on to a neuron with a non-linear (but still differentiable) activation function, however, it becomes a little more challenging to engineer by hand.
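Here is a hedged sketch of what that might look like, using the Keras API that ships with TensorFlow (the data and hyperparameters are made up for illustration): fit a single linear neuron, then grab its m and b directly.

```python
import numpy as np
import tensorflow as tf

# Toy data drawn from y = 3x + 2 with a little noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=(256, 1)).astype("float32")
y = 3.0 * x + 2.0 + rng.normal(scale=0.05, size=(256, 1)).astype("float32")

# A single linear neuron: Dense(1) with no activation is exactly y = mx + b.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.SGD(0.1), loss="mse")
model.fit(x, y, epochs=200, verbose=0)

# Grab the learned weight matrix and bias directly.
m, b = model.layers[0].get_weights()
print(m[0, 0], b[0])  # should land close to 3 and 2
```

The point of printing m and b is the insight: the "model" is nothing more mysterious than a fitted line, until an activation function makes it non-linear.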
I will update this article soon!