This is a rough list of my favorite deep learning resources. It has
been useful to me for learning deep learning, and I use it for
revisiting topics or as a reference. I (Guillaume Chevalier) built
this list and carefully went through all of the content listed here.
Contents
Trends
Here are the all-time Google
Trends, from 2004 up to now, September 2017:
You might also want to look at Andrej Karpathy’s new
post about trends in Machine Learning research.
I believe that deep learning is the key to making computers think more
like humans, and that it has a lot of potential. Some hard automation
tasks that were impossible to achieve earlier with classical
algorithms can now be solved easily with it.
Moore's Law of exponential progress in computing hardware now affects
GPUs more than CPUs because of physical limits on how small a
transistor can be. We are shifting toward parallel architectures [read
more]. Deep learning exploits such parallel architectures under the
hood by using GPUs. On top of that, deep learning algorithms may use
quantum computing and apply to brain-machine interfaces in the
future.
I find that the key to intelligence and cognition is a very
interesting subject to explore and is not yet well understood. These
technologies are promising.
Online Classes
- DL&RNN
Course - I created this richly dense course on Deep Learning and
Recurrent Neural Networks.
- Machine
Learning by Andrew Ng on Coursera - Renowned entry-level online class
with a certificate.
Taught by: Andrew Ng, Associate Professor, Stanford University; Chief
Scientist, Baidu; Chairman and Co-founder, Coursera.
- Deep
Learning Specialization by Andrew Ng on Coursera - New series of 5
Deep Learning courses by Andrew Ng, now with Python rather than
Matlab/Octave, and which leads to a specialization
certificate.
- Deep
Learning by Google - Good intermediate to advanced-level course
covering high-level deep learning concepts; I found it helps to get
creative once the basics are acquired.
- Machine
Learning for Trading by Georgia Tech - Interesting class for
acquiring basic knowledge of machine learning applied to trading and
some AI and finance concepts. I especially liked the section on
Q-Learning (its update rule is recalled right after this list).
- Neural
networks class by Hugo Larochelle, Université de Sherbrooke -
Interesting class about neural networks available online for free by
Hugo Larochelle, though I have only watched a few of those videos.
- GLO-4030/7030
Apprentissage par réseaux de neurones profonds - This is a class
given by Philippe Giguère, Professor at Université Laval. I especially
liked its rare visualization of the multi-head attention mechanism,
which can be seen on slide 28 of week 13's class.
- Deep
Learning & Recurrent Neural Networks (DL&RNN) - The most
richly dense, accelerated course on the topic of Deep Learning &
Recurrent Neural Networks (scroll to the end).
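As a reference for the Q-Learning section mentioned in the Georgia
Tech class above, the standard tabular Q-Learning update rule is the
following, where alpha is the learning rate, gamma the discount
factor, r the reward, and s' the next state reached after taking
action a in state s:

```latex
Q(s, a) \leftarrow Q(s, a)
  + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```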
Books
- Clean
Code - Get back to the basics, you fool! Learn how to write Clean
Code for your career. This is by far the best book I've read, even
though this list is about Deep Learning.
- Clean
Coder - Learn how to be a professional coder and how to interact
with your manager. This is important for any coding career.
- How
to Create a Mind - The audio version is nice to listen to while
commuting. This book is motivating for reverse-engineering the mind
and thinking about how to code AI.
- Neural
Networks and Deep Learning - This book covers many of the core
concepts behind neural networks and deep learning.
- Deep Learning - An MIT
Press book - I am only halfway through this book, yet it contains
satisfying math content on how to think about actual deep learning.
- Some
other books I have read - Some books listed here are less related to
deep learning but are still somehow relevant to this list.
Posts and Articles
- Predictions
made by Ray Kurzweil - List of mid- to long-term futuristic
predictions made by Ray Kurzweil.
- The
Unreasonable Effectiveness of Recurrent Neural Networks - MUST-READ
post by Andrej Karpathy - this is what motivated me to learn RNNs; it
demonstrates what they can achieve in the most basic form of NLP.
- Neural
Networks, Manifolds, and Topology - A fresh look at how neurons map
information.
- Understanding
LSTM Networks - Explains the LSTM cell's inner workings; it also has
interesting links in its conclusion (the cell equations are recalled
after this list).
- Attention and
Augmented Recurrent Neural Networks - Interesting for its visual
animations; a nice introduction to attention mechanisms, with
examples.
- Recommending
music on Spotify with deep learning - Awesome for doing clustering
on audio - post by an intern at Spotify.
- Announcing
SyntaxNet: The World's Most Accurate Parser Goes Open Source - The
birth of Parsey McParseface, a neural syntax-tree parser.
- Improving
Inception and Image Classification in TensorFlow - Very interesting
CNN architecture (e.g., the inception-style convolutional layers are
promising and efficient at reducing the number of
parameters).
- WaveNet:
A Generative Model for Raw Audio - Realistic talking machines:
perfect voice generation.
- François Chollet’s
Twitter - Author of Keras - has interesting Twitter posts and
innovative ideas.
- Neuralink and
the Brain's Magical Future - Thought-provoking article about the
future of the brain and brain-computer interfaces.
- Migrating
to Git LFS for Developing Deep Learning Applications with Large
Files - Easily manage huge files in your private Git projects.
- The
future of deep learning - François Chollet’s thoughts on the future
of deep learning.
- Discover
structure behind data with decision trees - Grow decision trees and
visualize them to infer the hidden logic behind data (a minimal
scikit-learn example follows this list).
- Hyperopt
tutorial for Optimizing Neural Networks' Hyperparameters - Learn to
slay hyperparameter spaces automatically rather than tuning by hand
(a usage sketch follows this list).
- Estimating
an Optimal Learning Rate For a Deep Neural Network - Clever trick to
estimate an optimal learning rate prior to any full training run (a
sketch of the idea follows this list).
- The
Annotated Transformer - Good for understanding the “Attention Is All
You Need” (AIAYN) paper.
- The
Illustrated Transformer - Also good for understanding the “Attention
Is All You Need” (AIAYN) paper (the paper's core attention formula is
recalled after this list).
- Improving
Language Understanding with Unsupervised Learning - SOTA across many
NLP tasks from unsupervised pretraining on a huge corpus.
- NLP’s ImageNet
moment has arrived - All hail NLP’s ImageNet moment.
- The
Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)
- Understand the different approaches used for NLP’s ImageNet
moment.
- Uncle
Bob's Principles Of OOD - Not only are the SOLID principles needed
for writing clean code, but the lesser-known REP, CCP, CRP, ADP, SDP
and SAP principles are very important for developing large software
that must be bundled into separate packages.
- Why
do 87% of data science projects never make it into production? -
Data is not to be overlooked, and communication between teams and data
scientists is important to integrate solutions properly.
- The
real reason most ML projects fail - Focus on clear business
objectives, avoid algorithm pivots unless you have really clean
code, and know when what you coded is “good enough”.
- SOLID
Machine Learning - The SOLID principles applied to Machine
Learning.
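As a companion to the “Understanding LSTM Networks” post listed above,
here are the standard LSTM cell equations that the post walks through,
where sigma is the sigmoid, the dot-in-circle is the element-wise
product, x_t is the input and h_t the hidden state:

```latex
f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)           % forget gate
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)           % input gate
\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)    % candidate cell state
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t  % new cell state
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)           % output gate
h_t = o_t \odot \tanh(C_t)                       % new hidden state
```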
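For the decision trees article above, a minimal scikit-learn example
that grows a small tree and prints the rules it inferred (the dataset
here is just a stand-in):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Fit a shallow tree and print the hidden logic it learned from data.
X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(tree))
```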
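For the Hyperopt tutorial above, here is a rough sketch of how
Hyperopt's fmin is typically called; the search space and objective
below are hypothetical placeholders for a real training run:

```python
from hyperopt import STATUS_OK, fmin, hp, tpe

# Hypothetical search space: a log-uniform learning rate and a dropout rate.
space = {
    'learning_rate': hp.loguniform('learning_rate', -10, -2),
    'dropout': hp.uniform('dropout', 0.0, 0.5),
}

def objective(params):
    # In a real setup, train a model here and return its validation loss.
    loss = (params['learning_rate'] - 0.001) ** 2 + params['dropout'] ** 2
    return {'loss': loss, 'status': STATUS_OK}

best = fmin(objective, space, algo=tpe.suggest, max_evals=50)
print(best)
```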
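The learning-rate estimation trick above boils down to a “range test”:
sweep the learning rate exponentially over a short run and watch where
the loss decreases the fastest. A framework-agnostic sketch of the
idea, where train_step is a hypothetical callback performing one
mini-batch update at a given learning rate and returning the loss:

```python
def lr_range_test(train_step, lr_min=1e-7, lr_max=1.0, steps=100):
    """Exponentially sweep the learning rate, recording the loss each step.

    A good starting learning rate is typically a bit below the value
    at which the recorded loss decreases the fastest.
    """
    losses = []
    for i in range(steps):
        lr = lr_min * (lr_max / lr_min) ** (i / (steps - 1))
        losses.append((lr, train_step(lr)))
    return losses
```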
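For reference while reading the two Transformer posts above, the
scaled dot-product attention at the heart of the AIAYN paper is the
following, where Q, K and V are the query, key and value matrices, and
d_k is the key dimension:

```latex
\mathrm{Attention}(Q, K, V)
  = \mathrm{softmax}\!\left( \frac{Q K^\top}{\sqrt{d_k}} \right) V
```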
Practical Resources
Libraries and
Implementations
- Neuraxle, a
framework for machine learning pipelines - The best framework for
structuring and deploying your machine learning projects, and which is
also compatible with most frameworks (e.g., Scikit-Learn, TensorFlow,
PyTorch, Keras, and so forth).
- TensorFlow’s
GitHub repository - The best-known deep learning framework, both
high-level and low-level while staying flexible.
- skflow -
TensorFlow wrapper à la scikit-learn.
- Keras - Keras is another interesting
deep learning framework like TensorFlow; it is mostly high-level.
- carpedm20’s repositories
- Many interesting neural network architectures are implemented by
Taehoon Kim, a.k.a. carpedm20.
- carpedm20/NTM-tensorflow
- Neural Turing Machine TensorFlow implementation.
- Deep
learning for lazybones - Transfer learning tutorial in TensorFlow
for vision from high-level embeddings of a pretrained CNN, AlexNet
2012.
- LSTM
for Human Activity Recognition (HAR) - Tutorial of mine on using
LSTMs on time series for classification.
- Deep
stacked residual bidirectional LSTMs for HAR - Improvements on the
previous project.
- Sequence
to Sequence (seq2seq) Recurrent Neural Network (RNN) for Time Series
Prediction - Tutorial of mine on how to predict temporal sequences
of numbers, which may be multichannel.
- Hyperopt
for a Keras CNN on CIFAR-100 - Auto (meta) optimizing a neural net
(and its architecture) on the CIFAR-100 dataset.
- ML
/ DL repositories I starred - GitHub is full of nice code samples
& projects.
- Smoothly
Blend Image Patches - Smooth patch merger for semantic
segmentation with a U-Net.
- Self
Governing Neural Networks (SGNN): the Projection Layer - With this,
you can use words in your deep learning models without training or
loading embeddings (a rough sketch of the idea follows this list).
- Neuraxle -
Neuraxle is a Machine Learning (ML) library for building neat
pipelines, providing the right abstractions to ease research,
development, and deployment of your ML applications.
- Clean
Machine Learning, a Coding Kata - Learn the right design patterns
for doing Machine Learning the proper way, by practicing.
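As a rough illustration of the SGNN projection-layer idea above: words
are decomposed into character n-grams that are hashed and passed
through fixed random signed projections, yielding a binary-like
feature vector with no embedding matrix to train or load. This is a
simplified sketch of the concept, not the paper's exact algorithm (the
bucket count, output dimension and hashing below are arbitrary):

```python
import hashlib

import numpy as np

N_BUCKETS = 4096  # arbitrary number of hash buckets for n-gram features


def char_ngrams(word, n_min=1, n_max=3):
    """All character n-grams of a word, with boundary markers."""
    w = f"#{word}#"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]


def project_word(word, dim=64, seed=42):
    """Hash n-grams into a sparse count vector, then apply a fixed
    (never-trained) random projection and binarize with the sign."""
    feats = np.zeros(N_BUCKETS)
    for g in char_ngrams(word):
        h = int(hashlib.md5(g.encode()).hexdigest(), 16) % N_BUCKETS
        feats[h] += 1.0
    planes = np.random.RandomState(seed).randn(N_BUCKETS, dim)
    return np.sign(feats @ planes)


print(project_word("hello"))  # 64-dimensional +/-1 vector, no embeddings
```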
Some Datasets
These are resources I have found that seem interesting for developing
models on.
Other Math Theory
Gradient
Descent Algorithms & Optimization Theory
Complex Numbers &
Digital Signal Processing
Okay, signal processing might not be directly related to deep
learning, but studying it is interesting for building intuition when
developing neural architectures based on signals.
Papers
Recurrent Neural Networks
Convolutional Neural
Networks
- What is
the Best Multi-Stage Architecture for Object Recognition? - Awesome
for the use of “local contrast normalization”.
- ImageNet
Classification with Deep Convolutional Neural Networks - AlexNet,
2012 ILSVRC, breakthrough of the ReLU activation function.
- Visualizing and
Understanding Convolutional Networks - For the “deconvnet
layer”.
- Fast and Accurate
Deep Network Learning by Exponential Linear Units - ELU activation
function for CIFAR vision tasks.
- Very Deep
Convolutional Networks for Large-Scale Image Recognition -
Interesting idea of stacking multiple 3x3 conv+ReLU layers before
pooling to get a bigger effective filter size with fewer parameters
(see the arithmetic after this list). There is also a nice table for
“ConvNet Configuration”.
- Going
Deeper with Convolutions - GoogLeNet: appearance of “Inception”
layers/modules; the idea is to parallelize conv layers into many
mini-convs of different sizes with “same” padding, concatenated on
depth (a sketch follows this list).
- Highway
Networks - Highway networks: gated skip connections, a precursor to
residual connections.
- Batch
Normalization: Accelerating Deep Network Training by Reducing Internal
Covariate Shift - Batch normalization (BN): normalize a layer’s
output using statistics computed over the entire mini-batch, then
apply a trainable linear rescaling and shifting (the formula is
recalled after this list).
- U-Net: Convolutional
Networks for Biomedical Image Segmentation - The U-Net is an
encoder-decoder CNN that also has skip-connections, good for image
segmentation at a per-pixel level.
- Deep Residual
Learning for Image Recognition - Very deep residual layers with
batch normalization layers - a.k.a. “how to overfit any vision dataset
with too many layers and make any vision model work properly at
recognition given enough data”.
- Inception-v4,
Inception-ResNet and the Impact of Residual Connections on Learning
- For improving GoogLeNet with residual connections.
- WaveNet: a
Generative Model for Raw Audio - Epic raw voice/music generation
with new architectures based on dilated causal convolutions to capture
more audio length.
- Learning a
Probabilistic Latent Space of Object Shapes via 3D
Generative-Adversarial Modeling - 3D-GANs for 3D model generation
and fun 3D furniture arithmetics from embeddings (think like word2vec
word arithmetics with 3D furniture representations).
- Accurate,
Large Minibatch SGD: Training ImageNet in 1 Hour - Incredibly fast
distributed training of a CNN.
- Densely Connected
Convolutional Networks - Best Paper Award at CVPR 2017, yielding
improvements on state-of-the-art performances on CIFAR-10, CIFAR-100 and
SVHN datasets, this new neural network architecture is named
DenseNet.
- The One Hundred
Layers Tiramisu: Fully Convolutional DenseNets for Semantic
Segmentation - Merging the ideas of the U-Net and the DenseNet, this
new neural network is especially good for huge datasets in image
segmentation.
- Prototypical Networks
for Few-shot Learning - Use a distance metric in the loss to
determine which class an object belongs to from just a few examples
(a sketch follows this list).
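A quick parameter count behind the VGG stacking trick mentioned above:
two stacked 3x3 convolutions cover the same 5x5 receptive field as a
single 5x5 convolution, with fewer weights and an extra non-linearity
in between:

```python
# Weights for C input and C output channels (biases ignored):
C = 64
two_3x3 = 2 * (3 * 3 * C * C)  # two stacked 3x3 conv layers: 73,728
one_5x5 = 5 * 5 * C * C        # one 5x5 conv layer:          102,400
print(two_3x3, one_5x5)        # the 3x3 stack is ~28% smaller
```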
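As a sketch of the GoogLeNet “Inception” module described above
(parallel convolutions of different sizes with “same” padding,
concatenated on depth), here is a naive Keras version; the filter
counts are arbitrary, and the real modules also add 1x1 “bottleneck”
convolutions before the larger branches:

```python
from tensorflow.keras import layers

def naive_inception_module(x):
    # Parallel branches over the same input; "same" padding keeps the
    # spatial dimensions equal so outputs can be concatenated on depth.
    b1 = layers.Conv2D(64, 1, padding='same', activation='relu')(x)
    b2 = layers.Conv2D(64, 3, padding='same', activation='relu')(x)
    b3 = layers.Conv2D(32, 5, padding='same', activation='relu')(x)
    b4 = layers.MaxPooling2D(3, strides=1, padding='same')(x)
    return layers.Concatenate(axis=-1)([b1, b2, b3, b4])
```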
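For reference, the batch normalization described above computes the
following for each activation, where mu_B and sigma_B^2 are the
mini-batch mean and variance, and gamma and beta are the trainable
rescaling and shifting parameters:

```latex
\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}},
\qquad
y_i = \gamma \hat{x}_i + \beta
```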
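A minimal numpy sketch of the prototypical-networks idea above: each
class prototype is the mean of its support embeddings, and a query is
assigned to the class of the nearest prototype (the paper uses squared
Euclidean distance in a learned embedding space; the embeddings below
are toy placeholders):

```python
import numpy as np

def classify_by_prototype(support, labels, query):
    """support: (n, d) embeddings; labels: (n,) ints; query: (d,)."""
    classes = np.unique(labels)
    # Each prototype is the mean embedding of that class's support set.
    prototypes = np.stack([support[labels == c].mean(axis=0)
                           for c in classes])
    dists = ((prototypes - query) ** 2).sum(axis=1)  # squared Euclidean
    return classes[dists.argmin()]

# Toy 2-way, 2-shot example with hypothetical 3-d embeddings:
support = np.array([[0.0, 0, 0], [0.1, 0, 0], [1.0, 1, 1], [0.9, 1, 1]])
labels = np.array([0, 0, 1, 1])
print(classify_by_prototype(support, labels, np.array([0.05, 0.0, 0.0])))
```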
Attention Mechanisms
Other
YouTube and Videos
Misc. Hubs & Links
- Hacker News - Maybe
how I discovered ML - Interesting trends appear on that site way before
they become a big deal.
- DataTau - This is a hub
similar to Hacker News, but specific to data science.
- Naver - This is a Korean search
engine - best used with Google Translate, ironically. Surprisingly,
deep learning search results and comprehensible advanced math content
sometimes show up more easily there than on Google search.
- Arxiv Sanity Preserver -
arXiv browser with TF/IDF features.
- Awesome
Neuraxle - An awesome list for Neuraxle, an ML framework for coding
clean production-level ML pipelines.
License

To the extent possible under law, Guillaume Chevalier
has waived all copyright and related or neighboring rights to this
work.