Updating conversion, creating readmes

This commit is contained in:
Jonas Zeunert
2024-04-19 23:37:46 +02:00
parent 3619ac710a
commit 08e75b0f0a
635 changed files with 30878 additions and 37344 deletions

@@ -1,4 +1,4 @@
Awesome Recurrent Neural Networks
A curated list of resources dedicated to recurrent neural networks (closely related to deep learning).
@@ -11,8 +11,7 @@
The project is not actively maintained.
!Join the chat at https://gitter.im/kjw0612/awesome-rnn (https://badges.gitter.im/Join%20Chat.svg) (https://gitter.im/kjw0612/awesome-rnn?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
Sharing
+ Share on Twitter (http://twitter.com/home?status=http://jiwonkim.org/awesome-rnn%0AResources%20for%20Recurrent%20Neural%20Networks)
@@ -80,8 +79,7 @@
  ⟡ torchnet (https://github.com/torchnet/torchnet) : modular framework that enables building neural network models
  ⟡ char-rnn (https://github.com/karpathy/char-rnn) by Andrej Karpathy : multi-layer RNN/LSTM/GRU for training/sampling from character-level language models
  ⟡ torch-rnn (https://github.com/jcjohnson/torch-rnn) by Justin Johnson : reusable RNN/LSTM modules for torch7 - much faster and memory efficient reimplementation of char-rnn
  ⟡ neuraltalk2 (https://github.com/karpathy/neuraltalk2) by Andrej Karpathy : recurrent neural network for image captioning; a much faster and better version of the original neuraltalk (https://github.com/karpathy/neuraltalk)
  ⟡ LSTM (https://github.com/wojzaremba/lstm) by Wojciech Zaremba : Long Short Term Memory Units to train a language model on word level Penn Tree Bank dataset
  ⟡ Oxford (https://github.com/oxford-cs-ml-2015) by Nando de Freitas : Oxford Computer Science - Machine Learning 2015 Practicals
  ⟡ rnn (https://github.com/Element-Research/rnn) by Nicholas Leonard : general library for implementing RNN, LSTM, BRNN and BLSTM (highly unit tested).
@@ -90,8 +88,8 @@
  ⟡ Practical PyTorch tutorials (https://github.com/spro/practical-pytorch) by Sean Robertson : focuses on using RNNs for Natural Language Processing
  ⟡ Deep Learning For NLP In PyTorch (https://github.com/rguthrie3/DeepLearningForNLPInPytorch) by Robert Guthrie : written for a Natural Language Processing class at Georgia Tech
⟡ DL4J (http://deeplearning4j.org/) by Skymind (http://www.skymind.io/) : Deep Learning library for Java, Scala & Clojure on Hadoop, Spark & GPUs
  ⟡ Documentation (http://deeplearning4j.org/) (Also in Chinese (http://deeplearning4j.org/zh-index.html), Japanese (http://deeplearning4j.org/ja-index.html), Korean (http://deeplearning4j.org/kr-index.html)) : RNN (http://deeplearning4j.org/usingrnns.html), LSTM (http://deeplearning4j.org/lstm.html)
  ⟡ rnn examples (https://github.com/deeplearning4j/dl4j-examples/tree/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/recurrent)
⟡ Etc.
  ⟡ Neon (http://neon.nervanasys.com/docs/latest/index.html): new deep learning library in Python, with support for RNN/LSTM, and a fast image captioning model
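Several of the code entries above (char-rnn, torch-rnn, the Practical PyTorch tutorials) center on character-level language modelling with recurrent networks. As a rough illustration of what those projects implement, here is a minimal PyTorch sketch; the toy corpus, layer sizes, and training loop are illustrative assumptions, not code taken from any listed repository.

```python
# Minimal character-level RNN language model (illustrative sketch only).
import torch
import torch.nn as nn

text = "hello recurrent world. " * 50                 # toy corpus (assumption)
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text], dtype=torch.long)

class CharRNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, state=None):
        out, state = self.rnn(self.embed(x), state)    # (batch, seq, hidden)
        return self.head(out), state                   # logits over the next character

model = CharRNN(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=3e-3)
loss_fn = nn.CrossEntropyLoss()
seq_len, batch = 32, 16

for step in range(200):
    # Random subsequences; the target is the input shifted by one character.
    starts = torch.randint(0, len(data) - seq_len - 1, (batch,)).tolist()
    x = torch.stack([data[s:s + seq_len] for s in starts])
    y = torch.stack([data[s + 1:s + seq_len + 1] for s in starts])
    logits, _ = model(x)
    loss = loss_fn(logits.reshape(-1, len(chars)), y.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Sampling then amounts to feeding the model one character at a time and drawing the next character from the softmax over the returned logits, which is the kind of loop the sampling scripts in those repositories provide.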
@@ -141,9 +139,8 @@
⟡ GFRNN Paper-arXiv (http://arxiv.org/pdf/1502.02367) Paper-ICML (http://jmlr.org/proceedings/papers/v37/chung15.pdf) Supplementary (http://jmlr.org/proceedings/papers/v37/chung15-supp.pdf) 
  ⟡ Junyoung Chung, Caglar Gulcehre, Kyunghyun Cho, Yoshua Bengio, Gated Feedback Recurrent Neural Networks, arXiv:1502.02367 / ICML 2015
⟡ Tree-Structured RNNs
  ⟡ Kai Sheng Tai, Richard Socher, and Christopher D. Manning, Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, arXiv:1503.00075 / ACL 2015 Paper (http://arxiv.org/pdf/1503.00075)
  ⟡ Samuel R. Bowman, Christopher D. Manning, and Christopher Potts, Tree-structured composition in neural networks without tree-structured architectures, arXiv:1506.04834 Paper (http://arxiv.org/pdf/1506.04834)
⟡ Grid LSTM Paper (http://arxiv.org/pdf/1507.01526) Code (https://github.com/coreylynch/grid-lstm) 
  ⟡ Nal Kalchbrenner, Ivo Danihelka, and Alex Graves, Grid Long Short-Term Memory, arXiv:1507.01526
⟡ Segmental RNN Paper (http://arxiv.org/pdf/1511.06018v2.pdf) 
@@ -158,8 +155,8 @@
⟡ LSTM Paper (http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf) 
  ⟡ Sepp Hochreiter and Jurgen Schmidhuber, Long Short-Term Memory, Neural Computation 1997
⟡ GRU (Gated Recurrent Unit) Paper (http://arxiv.org/pdf/1406.1078.pdf) 
  ⟡ Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio, Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, arXiv:1406.1078 / EMNLP 2014 (the GRU equations are summarized after this list)
⟡ NTM Paper (http://arxiv.org/pdf/1410.5401) 
  ⟡ A. Graves, G. Wayne, and I. Danihelka, Neural Turing Machines, arXiv preprint arXiv:1410.5401
⟡ Neural GPU Paper (http://arxiv.org/pdf/1511.08228.pdf) 
@@ -171,8 +168,7 @@
⟡ Deep Attention Recurrent Q-Network Paper (http://arxiv.org/abs/1512.01693) 
  ⟡ Ivan Sorokin, Alexey Seleznev, Mikhail Pavlov, Aleksandr Fedorov, Anastasiia Ignateva, Deep Attention Recurrent Q-Network, arXiv:1512.01693
⟡ Dynamic Memory Networks Paper (http://arxiv.org/abs/1506.07285) 
  ⟡ Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, Richard Socher, "Ask Me Anything: Dynamic Memory Networks for Natural Language Processing", arXiv:1506.07285
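For quick reference alongside the LSTM and GRU entries above, one common statement of the GRU cell is given below. Biases are omitted, and papers differ on whether z_t gates the previous state or the candidate state, so read the last line as a convention rather than the definitive form.

```latex
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1}) && \text{update gate}\\
r_t &= \sigma(W_r x_t + U_r h_{t-1}) && \text{reset gate}\\
\tilde{h}_t &= \tanh\big(W x_t + U (r_t \odot h_{t-1})\big) && \text{candidate state}\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{new hidden state}
\end{aligned}
```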
Surveys
⟡ Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, Deep Learning (http://www.nature.com/nature/journal/v521/n7553/pdf/nature14539.pdf), Nature 2015
@@ -192,8 +188,7 @@
(http://www.fit.vutbr.cz/research/groups/speech/publi/2011/mikolov_icassp2011_5528.pdf) 
⟡ Stefan Kombrink, Tomas Mikolov, Martin Karafiat, Lukas Burget, Recurrent Neural Network based Language Modeling in Meeting Recognition, Interspeech 2011 Paper (http://www.fit.vutbr.cz/~imikolov/rnnlm/ApplicationOfRNNinMeetingRecognition_IS2011.pdf)
⟡ Jiwei Li, Minh-Thang Luong, and Dan Jurafsky, A Hierarchical Neural Autoencoder for Paragraphs and Documents, ACL 2015 Paper (http://arxiv.org/pdf/1506.01057), Code (https://github.com/jiweil/Hierarchical-Neural-Autoencoder)
⟡ Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, and Richard S. Zemel, Skip-Thought Vectors, arXiv:1506.06726 / NIPS 2015 Paper (http://arxiv.org/pdf/1506.06726.pdf) 
⟡ Yoon Kim, Yacine Jernite, David Sontag, and Alexander M. Rush, Character-Aware Neural Language Models, arXiv:1508.06615 Paper (http://arxiv.org/pdf/1508.06615) 
⟡ Xingxing Zhang, Liang Lu, and Mirella Lapata, Tree Recurrent Neural Networks with Application to Language Modeling, arXiv:1511.00060 Paper (http://arxiv.org/pdf/1511.00060.pdf) 
@@ -201,27 +196,24 @@
Speech Recognition
⟡ Geoffrey Hinton, Li Deng, Dong Yu, George E. Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N. Sainath, and Brian Kingsbury, Deep Neural Networks for Acoustic Modeling in Speech Recognition, IEEE Signal Processing Magazine 2012 Paper (http://cs224d.stanford.edu/papers/maas_paper.pdf)
⟡ Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton, Speech Recognition with Deep Recurrent Neural Networks, arXiv:1303.5778 / ICASSP 2013 Paper (http://www.cs.toronto.edu/~fritz/absps/RNN13.pdf) 
⟡ Jan Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio, Attention-Based Models for Speech Recognition, arXiv:1506.07503 / NIPS 2015 Paper (http://arxiv.org/pdf/1506.07503) 
⟡ Haşim Sak, Andrew Senior, Kanishka Rao, and Françoise Beaufays, Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition, arXiv:1507.06947 2015 Paper (http://arxiv.org/pdf/1507.06947v1.pdf)
Machine Translation
⟡ Oxford Paper (http://www.nal.ai/papers/kalchbrennerblunsom_emnlp13) 
  ⟡ Nal Kalchbrenner and Phil Blunsom, Recurrent Continuous Translation Models, EMNLP 2013
⟡ Univ. Montreal
  ⟡ Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio, Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, arXiv:1406.1078 / EMNLP 2014 Paper (http://arxiv.org/pdf/1406.1078) (a minimal encoder-decoder sketch follows this section)
  ⟡ Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio, On the Properties of Neural Machine Translation: Encoder-Decoder Approaches, SSST-8 2014 Paper (http://www.aclweb.org/anthology/W14-4012)
  ⟡ Jean Pouget-Abadie, Dzmitry Bahdanau, Bart van Merrienboer, Kyunghyun Cho, and Yoshua Bengio, Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation, SSST-8 2014
  ⟡ Dzmitry Bahdanau, KyungHyun Cho, and Yoshua Bengio, Neural Machine Translation by Jointly Learning to Align and Translate, arXiv:1409.0473 / ICLR 2015 Paper (http://arxiv.org/pdf/1409.0473) 
  ⟡ Sebastian Jean, Kyunghyun Cho, Roland Memisevic, and Yoshua Bengio, On using very large target vocabulary for neural machine translation, arXiv:1412.2007 / ACL 2015 Paper (http://arxiv.org/pdf/1412.2007.pdf)
⟡ Univ. Montreal + Middle East Tech. Univ. + Univ. Maine Paper (http://arxiv.org/pdf/1503.03535.pdf) 
  ⟡ Caglar Gulcehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loic Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, and Yoshua Bengio, On Using Monolingual Corpora in Neural Machine Translation, arXiv:1503.03535
⟡ Google Paper (http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf) 
  ⟡ Ilya Sutskever, Oriol Vinyals, and Quoc V. Le, Sequence to Sequence Learning with Neural Networks, arXiv:1409.3215 / NIPS 2014
⟡ Google + NYU Paper (http://arxiv.org/pdf/1410.8206) 
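The Univ. Montreal and Google entries above all build on the same RNN encoder-decoder idea: compress the source sentence into a hidden state, then generate the target sequence conditioned on it. Below is a minimal PyTorch sketch under that reading; the vocabulary sizes, dimensions, and random batch are illustrative assumptions, and it omits attention (the addition made by Bahdanau et al. above).

```python
# Minimal RNN encoder-decoder for sequence-to-sequence learning (illustrative sketch).
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, embed_dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src, tgt_in):
        # Encode the source sequence into a fixed-size summary (final hidden state).
        _, h = self.encoder(self.src_embed(src))
        # Decode with teacher forcing: tgt_in is the target shifted right by one token.
        out, _ = self.decoder(self.tgt_embed(tgt_in), h)
        return self.head(out)                      # per-step logits over the target vocabulary

model = EncoderDecoder(src_vocab=1000, tgt_vocab=1200)
src = torch.randint(0, 1000, (8, 15))              # 8 source sequences of length 15
tgt_in = torch.randint(0, 1200, (8, 12))           # shifted target sequences of length 12
logits = model(src, tgt_in)                        # shape (8, 12, 1200)
```

Attention-based entries in the list replace the single fixed-size summary with a weighted mix of all encoder states, recomputed at every decoding step.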
@@ -236,10 +228,9 @@
Conversation Modeling
⟡ Lifeng Shang, Zhengdong Lu, and Hang Li, Neural Responding Machine for Short-Text Conversation, arXiv:1503.02364 / ACL 2015 Paper (http://arxiv.org/pdf/1503.02364) 
⟡ Oriol Vinyals and Quoc V. Le, A Neural Conversational Model, arXiv:1506.05869 Paper (http://arxiv.org/pdf/1506.05869) 
⟡ Ryan Lowe, Nissan Pow, Iulian V. Serban, and Joelle Pineau, The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems, arXiv:1506.08909 Paper (http://arxiv.org/pdf/1506.08909)
⟡ Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, and Jason Weston, Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems, arXiv:1511.06931 Paper (http://arxiv.org/pdf/1511.06931)
⟡ Jason Weston, Dialog-based Language Learning, arXiv:1604.06045, Paper (http://arxiv.org/pdf/1604.06045) 
⟡ Antoine Bordes and Jason Weston, Learning End-to-End Goal-Oriented Dialog, arXiv:1605.07683 Paper (http://arxiv.org/pdf/1605.07683) 
@@ -252,23 +243,20 @@
⟡ DeepMind + Oxford Paper (http://arxiv.org/pdf/1506.03340.pdf) 
  ⟡ Karl M. Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom, Teaching Machines to Read and Comprehend, arXiv:1506.03340 / NIPS 2015
⟡ MetaMind Paper (http://arxiv.org/pdf/1506.07285.pdf) 
  ⟡ Ankit Kumar, Ozan Irsoy, Jonathan Su, James Bradbury, Robert English, Brian Pierce, Peter Ondruska, Mohit Iyyer, Ishaan Gulrajani, and Richard Socher, Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, arXiv:1506.07285
Computer Vision
Object Recognition
⟡ Pedro Pinheiro and Ronan Collobert, Recurrent Convolutional Neural Networks for Scene Labeling, ICML 2014 Paper (http://jmlr.org/proceedings/papers/v32/pinheiro14.pdf) 
⟡ Ming Liang and Xiaolin Hu, Recurrent Convolutional Neural Network for Object Recognition, CVPR 2015 Paper (http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Liang_Recurrent_Convolutional_Neural_2015_CVPR_paper.pdf)
⟡ Wonmin Byeon, Thomas Breuel, Federico Raue, and Marcus Liwicki, Scene Labeling with LSTM Recurrent Neural Networks, CVPR 2015 Paper (http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Byeon_Scene_Labeling_With_2015_CVPR_paper.pdf)
⟡ Mircea Serban Pavel, Hannes Schulz, and Sven Behnke, Recurrent Convolutional Neural Networks for Object-Class Segmentation of RGB-D Video, IJCNN 2015 Paper (http://www.ais.uni-bonn.de/papers/IJCNN_2015_Pavel.pdf)
⟡ Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, and Philip H. S. Torr, Conditional Random Fields as Recurrent Neural Networks, arXiv:1502.03240 Paper (http://arxiv.org/pdf/1502.03240)
⟡ Xiaodan Liang, Xiaohui Shen, Donglai Xiang, Jiashi Feng, Liang Lin, and Shuicheng Yan, Semantic Object Parsing with Local-Global Long Short-Term Memory, arXiv:1511.04510 Paper (http://arxiv.org/pdf/1511.04510.pdf)
⟡ Sean Bell, C. Lawrence Zitnick, Kavita Bala, and Ross Girshick, Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks, arXiv:1512.04143 / ICCV 2015 workshop Paper (http://arxiv.org/pdf/1512.04143)
@@ -300,21 +288,21 @@
⟡ Univ. Toronto Paper (http://arxiv.org/pdf/1411.2539) Web demo (http://deeplearning.cs.toronto.edu/i2t) 
  ⟡ Ryan Kiros, Ruslan Salakhutdinov, and Richard S. Zemel, Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, arXiv:1411.2539 / TACL 2015
⟡ Berkeley Web (http://jeffdonahue.com/lrcn/) Paper (http://arxiv.org/pdf/1411.4389) 
  ⟡ Jeff Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, and Trevor Darrell, Long-term Recurrent Convolutional Networks for Visual Recognition and Description, arXiv:1411.4389 / CVPR 2015
⟡ Google Paper (http://arxiv.org/pdf/1411.4555) 
  ⟡ Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan, Show and Tell: A Neural Image Caption Generator, arXiv:1411.4555 / CVPR 2015
⟡ Stanford Web  (http://cs.stanford.edu/people/karpathy/deepimagesent/) Paper  (http://cs.stanford.edu/people/karpathy/cvpr2015.pdf)
  ⟡ Andrej Karpathy and Li Fei-Fei, Deep Visual-Semantic Alignments for Generating Image Description, CVPR 2015
⟡ Microsoft Paper (http://arxiv.org/pdf/1411.4952) 
  ⟡ Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollar, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C. Platt, Lawrence Zitnick, and Geoffrey Zweig, From Captions to Visual Concepts and Back, arXiv:1411.4952 / CVPR 2015
⟡ CMU + Microsoft Paper-arXiv (http://arxiv.org/pdf/1411.5654) , Paper-CVPR (http://www.cs.cmu.edu/~xinleic/papers/cvpr15_rnn.pdf) 
  ⟡ Xinlei Chen and C. Lawrence Zitnick, Learning a Recurrent Visual Representation for Image Caption Generation
  ⟡ Xinlei Chen and C. Lawrence Zitnick, Mind's Eye: A Recurrent Visual Representation for Image Caption Generation, CVPR 2015
⟡ Univ. Montreal + Univ. Toronto Web (http://kelvinxu.github.io/projects/capgen.html) Paper (http://www.cs.toronto.edu/~zemel/documents/captionAttn.pdf) 
  ⟡ Kelvin Xu, Jimmy Lei Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio, Show, Attend, and Tell: Neural Image Caption Generation with Visual Attention, arXiv:1502.03044 / ICML 2015
⟡ Idiap + EPFL + Facebook Paper (http://arxiv.org/pdf/1502.03671) 
  ⟡ Remi Lebret, Pedro O. Pinheiro, and Ronan Collobert, Phrase-based Image Captioning, arXiv:1502.03671 / ICML 2015
⟡ UCLA + Baidu Paper (http://arxiv.org/pdf/1504.06692) 
@@ -336,8 +324,8 @@
Video Captioning
⟡ Berkeley Web (http://jeffdonahue.com/lrcn/) Paper (http://arxiv.org/pdf/1411.4389) 
  ⟡ Jeff Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, and Trevor Darrell, Long-term Recurrent Convolutional Networks for Visual Recognition and Description, arXiv:1411.4389 / CVPR 2015
⟡ UT Austin + UML + Berkeley Paper (http://arxiv.org/pdf/1412.4729) 
  ⟡ Subhashini Venugopalan, Huijuan Xu, Jeff Donahue, Marcus Rohrbach, Raymond Mooney, and Kate Saenko, Translating Videos to Natural Language Using Deep Recurrent Neural Networks, arXiv:1412.4729
⟡ Microsoft Paper (http://arxiv.org/pdf/1505.01861) 
@@ -349,8 +337,7 @@
⟡ MPI + Berkeley Paper (http://arxiv.org/pdf/1506.01698.pdf) 
  ⟡ Anna Rohrbach, Marcus Rohrbach, and Bernt Schiele, The Long-Short Story of Movie Description, arXiv:1506.01698
⟡ Univ. Toronto + MIT Paper (http://arxiv.org/pdf/1506.06724.pdf) 
  ⟡ Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler, Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books, arXiv:1506.06724
⟡ Univ. Montreal Paper (http://arxiv.org/pdf/1507.01053.pdf) 
  ⟡ Kyunghyun Cho, Aaron Courville, and Yoshua Bengio, Describing Multimedia Content using Attention-based Encoder-Decoder Networks, arXiv:1507.01053
⟡ Zhejiang Univ. + UTS Paper (http://arxiv.org/abs/1511.03476) 
@@ -362,8 +349,7 @@
Visual Question Answering
⟡ Virginia Tech. + MSR Web (http://www.visualqa.org/) Paper (http://arxiv.org/pdf/1505.00468) 
  ⟡ Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh, VQA: Visual Question Answering, arXiv:1505.00468 / CVPR 2015 SUNw: Scene Understanding workshop
⟡ MPI + Berkeley Web (https://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/vision-and-language/visual-turing-challenge/) Paper (http://arxiv.org/pdf/1505.01121) 
  ⟡ Mateusz Malinowski, Marcus Rohrbach, and Mario Fritz, Ask Your Neurons: A Neural-based Approach to Answering Questions about Images, arXiv:1505.01121
⟡ Univ. Toronto Paper (http://arxiv.org/pdf/1505.02074) Dataset (http://www.cs.toronto.edu/~mren/imageqa/data/cocoqa/) 
@@ -403,8 +389,7 @@
Robotics
⟡ Hongyuan Mei, Mohit Bansal, and Matthew R. Walter, Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, arXiv:1506.04089 Paper (http://arxiv.org/pdf/1506.04089.pdf) 
⟡ Marvin Zhang, Sergey Levine, Zoe McCarthy, Chelsea Finn, and Pieter Abbeel, Policy Learning with Continuous Memory States for Partially Observed Robotic Control, arXiv:1507.01273 Paper (http://arxiv.org/pdf/1507.01273)
Other
⟡ Alex Graves, Generating Sequences With Recurrent Neural Networks, arXiv:1308.0850 Paper  (http://arxiv.org/abs/1308.0850)
@@ -417,10 +402,8 @@
⟡ Cesar Laurent, Gabriel Pereyra, Philemon Brakel, Ying Zhang, and Yoshua Bengio, Batch Normalized Recurrent Neural Networks, arXiv:1510.01378 Paper (http://arxiv.org/pdf/1510.01378) 
⟡ Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee, Deeply-Recursive Convolutional Network for Image Super-Resolution, arXiv:1511.04491 Paper  (http://arxiv.org/abs/1511.04491)
⟡ Quan Gan, Qipeng Guo, Zheng Zhang, and Kyunghyun Cho, First Step toward Model-Free, Anonymous Object Tracking with Recurrent Neural Networks, arXiv:1511.06425 Paper (http://arxiv.org/pdf/1511.06425.pdf) 
⟡ Francesco Visin, Kyle Kastner, Aaron Courville, Yoshua Bengio, Matteo Matteucci, and Kyunghyun Cho, ReSeg: A Recurrent Neural Network for Object Segmentation, arXiv:1511.07053 Paper (http://arxiv.org/pdf/1511.07053.pdf)
⟡ Juergen Schmidhuber, On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models, arXiv:1511.09249 Paper (http://arxiv.org/pdf/1511.09249)
Datasets
⟡ Speech Recognition
@@ -442,8 +425,8 @@
  ⟡ The SimpleQuestions dataset - Paper (http://arxiv.org/abs/1506.02075)
  ⟡ SQuAD (https://stanford-qa.com/) - Stanford Question Answering Dataset : Paper (http://arxiv.org/pdf/1606.05250) 
⟡ Image Question Answering
  ⟡ DAQUAR (https://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/vision-and-language/visual-turing-challenge/) - built upon NYU Depth v2 (http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html) by N. Silberman et al.
  ⟡ VQA (http://www.visualqa.org/) - based on MSCOCO (http://mscoco.org/) images
  ⟡ Image QA (http://www.cs.toronto.edu/~mren/imageqa/data/cocoqa/) - based on MSCOCO images
  ⟡ Multilingual Image QA (http://idl.baidu.com/FM-IQA.html) - built from scratch by Baidu - in Chinese, with English translation