Research "Records" That Shaped and Transformed the Field
Gradient Descent
Stochastic Approximation
Automatic Differentiation
- A simple automatic derivative evaluation program, Wengert, 1964
- Introduction to Automatic Differentiation, Griewank and Walther, 2003
Backpropagation
Neural Network Architectures
Activation Functions
Optimization Techniques
Normalization Techniques
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe and Szegedy, 2015
- Layer Normalization, Ba et al., 2016
Initialization Techniques
- Understanding the difficulty of training deep feedforward neural networks, Glorot and Bengio, 2010
- Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, He et al., 2015
Regularization Techniques
Deep Supervised Learning
- ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky et al., 2012
- Deep Neural Networks for Acoustic Modeling in Speech Recognition, Hinton et al., 2012
Deep Unsupervised Learning
- Auto-Encoding Variational Bayes, Kingma and Welling, 2013
- Generative Adversarial Networks, Goodfellow et al., 2014
Deep Reinforcement Learning
- Playing Atari with Deep Reinforcement Learning, Mnih et al., 2013
- Human-level control through deep reinforcement learning, Mnih et al., 2015
- Mastering the game of Go with deep neural networks and tree search, Silver et al., 2016
Pre-Transformer Era: Building Blocks of Modern NLP
- A Neural Probabilistic Language Model, Bengio et al., 2003
- Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, Cho et al., 2014
- Sequence to Sequence Learning with Neural Networks, Sutskever et al., 2014
- Neural Machine Translation of Rare Words with Subword Units, Sennrich et al., 2015
- Distributed Representations of Words and Phrases and their Compositionality, Mikolov et al., 2013
Transformer Revolution: Attention Is All You Need
Language Models: The Rise of Pre-training
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al., 2018
- Improving Language Understanding by Generative Pre-Training, Radford et al., 2018
Large Language Models: Scaling to New Heights
- Language Models are Few-Shot Learners, Brown et al., 2020
- Training language models to follow instructions with human feedback, Ouyang et al., 2022
Image Classification
- Visualizing and Understanding Convolutional Networks, Zeiler and Fergus, 2013
- Very Deep Convolutional Networks for Large-Scale Image Recognition, Simonyan and Zisserman, 2014
- Going Deeper with Convolutions, Szegedy et al., 2014
- Deep Residual Learning for Image Recognition, He et al., 2015
Object Detection and Segmentation
- Rich feature hierarchies for accurate object detection and semantic segmentation, Girshick et al., 2014
- Fast R-CNN, Girshick, 2015
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, Ren et al., 2015
- Mask R-CNN, He et al., 2017
- You Only Look Once: Unified, Real-Time Object Detection, Redmon et al., 2016
- YOLO9000: Better, Faster, Stronger, Redmon and Farhadi, 2017
- YOLOv3: An Incremental Improvement, Redmon and Farhadi, 2018
- U-Net: Convolutional Networks for Biomedical Image Segmentation, Ronneberger et al., 2015
Generative Models
Transformers - Veni, Vidi, Vici!!!
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, Dosovitskiy et al., 2020
- End-to-End Object Detection with Transformers, Carion et al., 2020
- Segment Anything, Kirillov et al., 2023
- Scalable Diffusion Models with Transformers, Peebles and Xie, 2023
Vision-Language Models
- Learning Transferable Visual Models From Natural Language Supervision, Radford et al., 2021
- Flamingo: a Visual Language Model for Few-Shot Learning, Alayrac et al., 2022
Audio-Language Models