
A Brief Summary of Speech Recognition Research Methods, 2011-2016

Note: I searched Google Scholar with keywords such as "SPEECH RECOGNITION" for related papers published internationally over the past five years, and compiled a very brief summary of the models/methods adopted in a subset of those papers.

2011

Models/methods used:
1. The context-independent Deep Belief Network
2. Deep Belief Networks (DBNs)

Corresponding papers:
1. Dahl G E, Yu D, Deng L, et al. Large vocabulary continuous speech recognition with context-dependent DBN-HMMs[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2011: 4688-4691.
2. Sainath T N, Kingsbury B, Ramabhadran B, et al. Making Deep Belief Networks effective for large vocabulary continuous speech recognition[C]// IEEE Workshop on Automatic Speech Recognition and Understanding. IEEE, 2011: 30-35.
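Both papers build the acoustic model by stacking Restricted Boltzmann Machines (RBMs) that are pre-trained layer by layer with contrastive divergence before discriminative fine-tuning. The following is a minimal NumPy sketch of a single CD-1 update for one binary RBM layer; the layer sizes, learning rate, and function names are illustrative assumptions, not code from the papers.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_update(v0, W, b_vis, b_hid, lr=0.01):
        """One contrastive-divergence (CD-1) step for a binary RBM.

        v0: (batch, n_vis) visible data (e.g. binarized spectral features)
        W:  (n_vis, n_hid) weights; b_vis, b_hid: bias vectors.
        """
        # Positive phase: hidden activations driven by the data.
        p_h0 = sigmoid(v0 @ W + b_hid)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        # Negative phase: one Gibbs step (reconstruct, then re-infer).
        p_v1 = sigmoid(h0 @ W.T + b_vis)
        p_h1 = sigmoid(p_v1 @ W + b_hid)
        # Gradient estimate: <v h>_data - <v h>_reconstruction.
        batch = v0.shape[0]
        W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / batch
        b_vis += lr * (v0 - p_v1).mean(axis=0)
        b_hid += lr * (p_h0 - p_h1).mean(axis=0)
        return W, b_vis, b_hid

    # Toy usage: pre-train one layer on random binary "feature" vectors.
    n_vis, n_hid = 39, 128   # e.g. 39-dim MFCC-like inputs (assumed)
    W = 0.01 * rng.standard_normal((n_vis, n_hid))
    b_vis, b_hid = np.zeros(n_vis), np.zeros(n_hid)
    batch = (rng.random((32, n_vis)) < 0.5).astype(float)
    W, b_vis, b_hid = cd1_update(batch, W, b_vis, b_hid)

To form a DBN, the trained layer's hidden activations become the visible data for the next RBM, and the whole stack is finally fine-tuned with backpropagation against frame-level targets.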
2012

Models/methods used:
1. Deep Neural Networks (DNNs)
2. Deep Belief Networks (DBNs), neural networks
3. Kernel deep convex networks

Corresponding papers:
1. A) Hinton G, Deng L, Yu D, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition[J]. IEEE Signal Processing Magazine, 2012, 29(6): 82-97.
1. B) Dahl G E, Yu D, Deng L, et al. Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition[J]. IEEE Transactions on Audio, Speech & Language Processing, 2012, 20(1): 30-42.
2. Mohamed A, Dahl G E, Hinton G. Acoustic Modeling Using Deep Belief Networks[J]. IEEE Transactions on Audio, Speech & Language Processing, 2012, 20(1): 14-22.
3. Deng L, Tur G, He X, et al. Use of kernel deep convex networks and end-to-end learning for spoken language understanding[C]// IEEE Spoken Language Technology Workshop. IEEE, 2012: 210-215.
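In the hybrid systems described by Hinton et al. and Dahl et al., the DNN replaces the GMM: it outputs senone posteriors P(s|x), which are divided by senone priors P(s) to give scaled likelihoods for HMM Viterbi decoding. Below is a minimal NumPy sketch of that scoring step; the layer sizes and the flat prior are made-up stand-ins, not values from the papers.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def dnn_scaled_likelihoods(frames, weights, biases, log_priors):
        """Score acoustic frames for hybrid DNN-HMM decoding.

        frames:     (T, n_in) stacked spectral feature vectors
        weights, biases: per-layer parameters; the last layer feeds softmax
        log_priors: (n_senones,) log P(s), estimated from forced alignments
        Returns (T, n_senones) log pseudo-likelihoods for the HMM decoder.
        """
        h = frames
        for W, b in zip(weights[:-1], biases[:-1]):
            h = sigmoid(h @ W + b)            # logistic hidden units
        posteriors = softmax(h @ weights[-1] + biases[-1])  # P(senone | frame)
        # Bayes rule up to a constant: log P(x|s) = log P(s|x) - log P(s) + c
        return np.log(posteriors + 1e-10) - log_priors

    # Toy usage with assumed sizes (e.g. 11 stacked 40-dim frames -> 440 inputs).
    rng = np.random.default_rng(0)
    sizes = [440, 512, 512, 1000]   # input, two hidden layers, senones (assumed)
    Ws = [0.05 * rng.standard_normal((a, b)) for a, b in zip(sizes, sizes[1:])]
    bs = [np.zeros(n) for n in sizes[1:]]
    log_priors = np.log(np.full(sizes[-1], 1.0 / sizes[-1]))
    scores = dnn_scaled_likelihoods(rng.standard_normal((7, 440)), Ws, bs, log_priors)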
2013

Models/methods used:
1. Deep Recurrent Neural Networks (RNNs)
2. Deep Convolutional Neural Networks (CNNs)
3. Deep Bidirectional LSTM (DBLSTM) Recurrent Neural Networks (RNNs)
4. Deep Neural Networks (DNNs) (by substituting the logistic units with rectified linear units)

Corresponding papers:
1. Graves A, Mohamed A-R, Hinton G. Speech recognition with deep recurrent neural networks[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2013: 6645-6649.
2. Sainath T N, Mohamed A-R, Kingsbury B, et al. Deep convolutional neural networks for LVCSR[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2013: 8614-8618.
4. Zeiler M D, Ranzato M, Monga R, et al. On rectified linear units for speech processing[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2013: 3517-3521.
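The deep (bidirectional) recurrent models of Graves et al. (methods 1 and 3 above) run one recurrence forward in time and one backward over each utterance, concatenating the two hidden sequences at every frame before the output layer. Here is a minimal single-layer NumPy sketch; the packed gate layout, sizes, and initialization are illustrative assumptions.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_layer(xs, Wx, Wh, b, reverse=False):
        """Run a single LSTM over one utterance.

        xs: (T, n_in); Wx: (n_in, 4*n_h); Wh: (n_h, 4*n_h); b: (4*n_h,)
        Packed gate order: input, forget, cell candidate, output.
        Returns (T, n_h) hidden states.
        """
        T, n_h = xs.shape[0], Wh.shape[0]
        h, c = np.zeros(n_h), np.zeros(n_h)
        out = np.zeros((T, n_h))
        steps = range(T - 1, -1, -1) if reverse else range(T)
        for t in steps:
            z = xs[t] @ Wx + h @ Wh + b
            i, f, g, o = np.split(z, 4)
            i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
            g = np.tanh(g)
            c = f * c + i * g        # cell state carries long-term memory
            h = o * np.tanh(c)
            out[t] = h
        return out

    def bidirectional_lstm(xs, params_fwd, params_bwd):
        """Concatenate a forward and a backward LSTM pass per frame."""
        h_fwd = lstm_layer(xs, *params_fwd)
        h_bwd = lstm_layer(xs, *params_bwd, reverse=True)
        return np.concatenate([h_fwd, h_bwd], axis=1)  # (T, 2*n_h)

    # Toy usage with assumed sizes: 40-dim features, 64 hidden units.
    rng = np.random.default_rng(0)
    n_in, n_h = 40, 64
    def init():
        return (0.1 * rng.standard_normal((n_in, 4 * n_h)),
                0.1 * rng.standard_normal((n_h, 4 * n_h)),
                np.zeros(4 * n_h))
    feats = rng.standard_normal((100, n_in))   # one utterance, 100 frames
    hidden = bidirectional_lstm(feats, init(), init())  # (100, 128)

Stacking several such layers, with each layer consuming the previous layer's concatenated outputs, gives the "deep" bidirectional architecture.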
2014

Models/methods used:
1. LSTM Recurrent Neural Networks (RNNs)
2. A well-optimized RNN training system
3. Attention-based RNNs
4. Combining time- and frequency-domain convolution in Convolutional Neural Networks (CNNs)
5. Deep Convolutional Neural Networks (CNNs)
6. Bi-Directional Recurrent DNNs

Corresponding papers:
1. Graves A, Jaitly N. Towards end-to-end speech recognition with recurrent neural networks[C]// International Conference on Machine Learning. 2014: 1764-1772.
2. Hannun A, Case C, Casper J, et al. Deep Speech: Scaling up end-to-end speech recognition[J]. arXiv preprint arXiv:1412.5567, 2014.
3. Chorowski J, Bahdanau D, Cho K, et al. End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results[J]. arXiv preprint arXiv:1412.1602, 2014.
4. Tóth L. Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2014: 190-194.
5. Sainath T N, Kingsbury B, Saon G, et al. Deep Convolutional Neural Networks for Large-scale Speech Tasks[J]. Neural Networks, 2015, 64: 39-48.
6. Hannun A Y, Maas A L, Jurafsky D, et al. First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs[J]. arXiv preprint arXiv:1408.2873, 2014.
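Papers 1, 2, and 6 above all train end-to-end with the Connectionist Temporal Classification (CTC) loss, which sums the probability of every frame-level alignment of the label sequence, with an optional blank symbol between labels. Below is a minimal NumPy sketch of the CTC forward (alpha) recursion in log space; the toy inputs and symbol inventory are assumptions.

    import numpy as np

    NEG_INF = -np.inf

    def logsumexp(*xs):
        m = max(xs)
        if m == NEG_INF:
            return NEG_INF
        return m + np.log(sum(np.exp(x - m) for x in xs))

    def ctc_loss(log_probs, labels, blank=0):
        """Negative log-likelihood of `labels` under the CTC model.

        log_probs: (T, n_symbols) per-frame log posteriors from the network
        labels:    target label ids, without blanks
        """
        # Extend the target with blanks: [-, l1, -, l2, ..., lU, -].
        ext = [blank]
        for l in labels:
            ext += [l, blank]
        S, T = len(ext), log_probs.shape[0]
        alpha = np.full((T, S), NEG_INF)
        alpha[0, 0] = log_probs[0, ext[0]]
        alpha[0, 1] = log_probs[0, ext[1]]
        for t in range(1, T):
            for s in range(S):
                a = alpha[t - 1, s]                 # stay on the same symbol
                if s > 0:
                    a = logsumexp(a, alpha[t - 1, s - 1])   # advance by one
                # Skipping a blank is allowed unless it would merge repeats.
                if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                    a = logsumexp(a, alpha[t - 1, s - 2])
                alpha[t, s] = a + log_probs[t, ext[s]]
        # A valid path ends on the last label or the trailing blank.
        return -logsumexp(alpha[T - 1, S - 1], alpha[T - 1, S - 2])

    # Toy usage: 6 frames, 5 symbols (id 0 is blank), target sequence "2 3".
    rng = np.random.default_rng(0)
    logits = rng.standard_normal((6, 5))
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    print(ctc_loss(log_probs, [2, 3]))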
2015

Models/methods used:
1. Sequence-to-Sequence Neural Net Models
2. Attention-based Recurrent Networks
3. Long Short-Term Memory (LSTM) recurrent neural networks (RNNs)

Corresponding papers:
1. Yao K, Zweig G. Sequence-to-Sequence Neural Net Models for Grapheme-to-Phoneme Conversion[C]// INTERSPEECH. 2015.
2. Chorowski J, Bahdanau D, Serdyuk D, et al. Attention-Based Models for Speech Recognition[C]// Advances in Neural Information Processing Systems. 2015.
3. A) Rao K, Peng F, Sak H, et al. Grapheme-to-phoneme conversion using Long Short-Term Memory recurrent neural networks[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2015.
3. B) Sak H, Senior A, Rao K, et al. Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition[C]// INTERSPEECH. 2015.
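The attention-based models of Chorowski et al. (paper 2) let the decoder compute, at each output step, a weight over all encoder states and read a soft context vector from them. Here is a minimal NumPy sketch of one step of additive content-based attention; the matrix names and sizes are illustrative assumptions.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def attend(enc_states, dec_state, W_enc, W_dec, v):
        """One step of additive content-based attention.

        enc_states: (T, n_enc) encoder states for the whole utterance
        dec_state:  (n_dec,) current decoder state
        Scores e_t = v . tanh(W_enc h_t + W_dec s); weights = softmax(e).
        Returns (context_vector, attention_weights).
        """
        scores = np.tanh(enc_states @ W_enc + dec_state @ W_dec) @ v  # (T,)
        weights = softmax(scores)
        context = weights @ enc_states    # (n_enc,) weighted sum of states
        return context, weights

    # Toy usage with assumed sizes.
    rng = np.random.default_rng(0)
    T, n_enc, n_dec, n_att = 50, 128, 64, 32
    enc = rng.standard_normal((T, n_enc))
    s = rng.standard_normal(n_dec)
    ctx, w = attend(enc, s,
                    0.1 * rng.standard_normal((n_enc, n_att)),
                    0.1 * rng.standard_normal((n_dec, n_att)),
                    0.1 * rng.standard_normal(n_att))

The context vector is then fed, together with the previous output symbol, into the decoder RNN that predicts the next character or phoneme.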
2016

Models/methods used:
1. Kernel acoustic models (evaluated against deep neural nets)

Corresponding papers:
1. Lu Z, Guo D, Bagheri Garakani A, et al. A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2016.
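Kernel acoustic models in this line of work avoid storing a full kernel matrix by using an explicit randomized feature map followed by a linear classifier; a common construction is random Fourier features for a Gaussian (RBF) kernel. The sketch below shows that feature map only; the bandwidth, sizes, and function name are assumptions rather than the paper's setup.

    import numpy as np

    def random_fourier_features(X, n_features=2048, gamma=0.1, seed=0):
        """Map inputs so that z(x) . z(y) ~ exp(-gamma * ||x - y||^2).

        X: (n, d) acoustic feature vectors.
        Returns (n, n_features) randomized features for a linear model.
        """
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        # Frequencies sampled from the RBF kernel's spectral density.
        W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
        b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

    # Toy check: compare the approximation against the exact RBF kernel.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((5, 40))          # 5 frames, 40-dim features
    Z = random_fourier_features(X, n_features=4096)
    approx = Z @ Z.T
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    exact = np.exp(-0.1 * sq_dists)
    print(np.abs(approx - exact).max())       # small approximation error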