Deep Belief Network & Dialog Systems


Deep Belief Network

Notes:

  • Deep neural networks

See also:

100 Best Deep Learning Videos


Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition GE Dahl, D Yu, L Deng, A Acero – Audio, Speech, and …, 2012 – ieeexplore.ieee.org … Abstract—We propose a novel context-dependent (CD) model for large vocabulary speech recognition (LVSR) that leverages recent advances in using deep belief networks for phone recognition. … II. DEEP BELIEF NETWORKS … Cited by 565 Related articles All 13 versions

Towards deeper understanding: deep convex networks for semantic utterance classification G Tur, L Deng, D Hakkani-Tur… – Acoustics, Speech and …, 2012 – ieeexplore.ieee.org … In the last decade, a variety of practical goal-oriented spoken dialog systems have been built for limited domains. … machines (SVMs) [4], or maximum entropy models [5]. It is only very recently that researchers have started experimenting with deep belief networks (DBNs) for … Cited by 33 Related articles All 11 versions

Recent advances in deep learning for speech research at Microsoft L Deng, J Li, JT Huang, K Yao, D Yu… – … , Speech and Signal …, 2013 – ieeexplore.ieee.org … model,” Interspeech, 2010. [39] A. Mohamed, D. Yu, L. Deng, “Investigation of full-sequence training of deep belief networks for speech … and R. De-Mori, “Cache neural network language models based on long distance dependencies for a spoken dialog system,” ICASSP, 2012. … Cited by 61 Related articles All 10 versions

Use of kernel deep convex networks and end-to-end learning for spoken language understanding L Deng, G Tur, X He… – … Workshop (SLT), 2012 …, 2012 – ieeexplore.ieee.org … 6.1 Experimental Setup In order to perform experiments with the DCNs, we compile a dataset of utterances from the users of a spoken dialog system. … 17, 2009 [20] A. Mohamed, GE Dahl, and GE Hinton, “Acoustic modeling using deep belief networks,” IEEE Trans. … Cited by 28 Related articles All 12 versions

Exploiting deep neural networks for detection-based speech recognition SM Siniscalchi, D Yu, L Deng, CH Lee – Neurocomputing, 2013 – Elsevier … For example, high-accuracy phoneme recognition results have been reported by using several MLPs arranged in a hierarchical structure (eg, [11], [52] and [53]). A remarkable performance has been achieved on the TIMIT task [54] using deep belief networks [31]. … Cited by 30 Related articles All 11 versions

The deep tensor neural network with applications to large vocabulary speech recognition D Yu, L Deng, F Seide – Audio, Speech, and Language …, 2013 – ieeexplore.ieee.org Cited by 36 Related articles All 11 versions

Calibration of confidence measures in speech recognition D Yu, J Li, L Deng – Audio, Speech, and Language Processing, …, 2011 – ieeexplore.ieee.org … maximum entropy model with distribution constraints, the artificial neural network, and the deep belief network. We compare these approaches and demonstrate the importance of key features exploited: the generic … distribution, deep belief network I. INTRODUCTION … Cited by 13 Related articles All 14 versions

Application of deep belief networks for natural language understanding R Sarikaya, GE Hinton, A Deoras – IEEE/ACM Transactions on Audio, …, 2014 – dl.acm.org … Deep belief networks (DBNs) have yielded impressive classification performance on several benchmark classification tasks, beating the state-of-the-art in … Ruhi Sarikaya is a principal scientist and the manager of language understanding and dialog systems group at Microsoft. … Cited by 5 Related articles All 7 versions

Tensor deep stacking networks B Hutchinson, L Deng, D Yu – Pattern Analysis and Machine …, 2013 – ieeexplore.ieee.org … on the training set than the previous block. In contrast to other deep architectures (eg, the deep belief network [9]), the DSN does not aim to discover transformed feature representations. Due to this restrictive nature of building … Cited by 31 Related articles All 21 versions

Distant speech recognition in reverberant noisy conditions employing a microphone array JA Morales-Cordovilla, M Hagmuller… – … 2013 Proceedings of …, 2014 – ieeexplore.ieee.org … The distant interaction of a speaker with a dialogue system, which controls a home automation system, is a difficult challenge because of many … array and segment it in multichannel utterances by means of a voice activity detector block based on deep belief networks (VAD-DBN … Cited by 4 Related articles All 10 versions

Feature enhancement by deep LSTM networks for ASR in reverberant multisource environments F Weninger, J Geiger, M Wöllmer, B Schuller… – Computer Speech & …, 2014 – Elsevier This article investigates speech feature enhancement based on deep bidirectional recurrent neural networks. The Long Short-Term Memory (LSTM) architecture is us. Cited by 9 Related articles All 4 versions

Phone recognition on the TIMIT database C Lopes, F Perdigão – Speech Technologies/Book, 2011 – cdn.intechweb.org … Speech, as “the” communication mode, has seen the successful development of quite a number of applications using automatic speech recognition (ASR), including command and control, dictation, dialog systems for people with impairments, translation, etc. … Cited by 11 Related articles All 6 versions

Ten recent trends in computational paralinguistics B Schuller, F Weninger – Cognitive Behavioural Systems, 2012 – Springer … One main application is to increase efficiency and hence, user satisfaction in task oriented dialogue systems by enabling … is supervised generation of features through evolutionary algorithms [74] or unsupervised learning of features, eg, through deep belief networks or sparse … Cited by 5 Related articles All 9 versions

Deep architectures for automatic emotion recognition based on lip shape B Popović, S Ostrogonac, V Delić, M Janev… – … , Jahorina, Bosnia and …, 2013 – infoteh.rs.ba … This research work has been supported by the Serbian Ministry of Education, Science and Technological Development, and it has been realized as a part of “Development of Dialogue Systems for Serbian and Other South … [4] GE Hinton, “Deep belief networks”, Scholarpedia, vol … Cited by 2 Related articles All 4 versions

A state-clustering based multiple deep neural networks modelling approach for speech recognition P Zhou, H Jiang, LR Dai, Y Hu, Q Liu – 2013 – ieeexplore.ieee.org Cited by 4 Related articles

Noise-robust whispered speech recognition using a non-audible-murmur microphone with VTS compensation. CY Yang, G Brown, L Lu, J Yamagishi, S King – ISCSLP, 2012 – cstr.ed.ac.uk … title = {Evaluating language understanding accuracy with respect to objective outcomes in a dialogue system}, url = {http … Portland, Oregon, USA}, month = {September}, year = {2012}, keywords = {Articulatory inversion, deep neural network, deep belief network, deep regression … Cited by 4 Related articles All 2 versions

Room localization for distant speech recognition JA Morales-Cordovilla, H Pessentheiner… – … Annual Conference of …, 2014 – 193.6.4.39 … not only interesting because of the enhancement of the signal (by means of beamforming, etc.) but also because it can help the dialog system to distinguish … Figure 2: VAD estimations by the deep belief network (DBN) for the five rooms of the apartment for the signal sim2 of Dev1 … Cited by 2 Related articles All 14 versions

An artificial neural network approach to automatic speech processing SM Siniscalchi, T Svendsen, CH Lee – Neurocomputing, 2014 – Elsevier An artificial neural network (ANN) is a powerful mathematical framework used to either model complex relationships between inputs and outputs or find patterns i. Cited by 3 Related articles All 4 versions

Large vocabulary continuous speech recognition based on WFST structured classifiers and deep bottleneck features Y Kubo, T Hori, A Nakamura – Acoustics, Speech and Signal …, 2013 – ieeexplore.ieee.org … is also promising since a WFST constitutes a common framework for several application fields such as speech summarization, speech translation, and dialogue systems. … [6] A. Mohamed, G. Dahl, and G. Hinton, “Acoustic modeling using deep belief networks,” Audio, Speech … Cited by 3 Related articles All 2 versions

Word-level acoustic modeling with convolutional vector regression AL Maas, SD Miller, TM O’neil, AY Ng… – ICML Workshop on …, 2012 – stanford.edu … Our model attempts to project an acoustic input directly into such word vector spaces. Mapping acoustics to semantics has potential applications not only in LVCSR, but also for dialogue systems such as voice search, and recognizing speaker characteristics like emotive state. … Cited by 2 Related articles All 7 versions

Speaker adaptation of deep neural network based on discriminant codes S Xue, O Abdel-Hamid, H Jiang, L Dai, Q Liu – 2014 – ieeexplore.ieee.org Cited by 2 Related articles

Challenges Of Natural Language Communication With Machines. V DELIC, M SECUJSKI, N JAKOVLJEVIC… – DAAAM International …, 2013 – daaam.info … Key words: automatic speech recognition, text-to-speech synthesis, emotions in human-machine interaction, human-machine dialogue systems, challenges for the future … as deep belief networks, graphical models and sparse representation. … Cited by 1 Related articles All 2 versions

Deep Neural Networks For Spoken Dialog Systems C MAIN – 2014 – macs.hw.ac.uk … One of the major issues in Spoken Dialog Systems (SDS) is that the automatic speech … The architecture described here is known as a Deep Belief Network which are generative architectures composed of multiple … Related articles

Deep Generative and Discriminative Models for Speech Recognition L Deng – wissap.iiit.ac.in … Deep Auto-encoder, in Interspeech, Sept. 2010. • A. Mohamed, Dong Yu, and Li Deng, Investigation of Full-Sequence Training of Deep Belief Networks for Speech Recognition, in Interspeech, Sept. 2010. • D. Yu, Li Deng, and … Related articles

Generating Questions from Web Community Contents. B Wang, B Liu, C Sun, X Wang, D Zhang – COLING (Demos), 2012 – Citeseer … in the interaction oriented systems (Rus et al., 2007; Harabagiu et al., 2005)(eg, computer aided education, help desk, dialog systems, etc … A deep belief network (DBN) is proposed to generate the essential elements of the questions according to the answers, based on the joint … All 6 versions

24th International Conference on Computational Linguistics M Kay, C Boitet – Proceedings of COLING, 2012 – aclweb.org … 569 Learning Semantics with Deep Belief Network for Cross-Language Information Retrieval Jungi Kim, Jinseok Nam and Iryna Gurevych … 1039 A Hierarchical Domain Model-Based Multi-Domain Selection Framework for Multi-Domain Dialog Systems Seonghan Ryu … All 7 versions

NIPS 2009: The 23rd Annual Conference on Neural Information Processing Systems O Obst – 2010 – Springer … Geoff Hinton’s presentation started off with a nice introduction to deep belief networks, and in the course of his one … Several approaches have been made concerning emotion recognition, emotion modelling, generation of emotional user interfaces and dialogue systems. … All 3 versions

Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding G Mesnil, Y Dauphin, K Yao, Y Bengio, L Deng… – 2013 – ieeexplore.ieee.org … One way of building a deep model for slot filling is to stack several neural network layers on top of each other. This approach was taken in [27], which used deep belief networks (DBNs), and showed superior results to a CRF baseline on ATIS. … Cited by 1 Related articles All 3 versions

Phone classification using HMM/SVM system and normalization technique MS Yakoub, R Nkambou… – Signal Processing and …, 2013 – ieeexplore.ieee.org … of spoken dialogue system. Table 2. Reported results on TIMIT phone classification. System Accuracy (%) HMM [9] 66.08 CDHMM [10] 72.90 TRAPs, temporal context division + lattice rescoring [11] 79.04 GMMs trained as SVMs [12] 69.90 Deep Belief Networks [13] 79.30 … Related articles

[BOOK] Speech and Computer: 15th International Conference, SPECOM 2013, Pilsen, Czech Republic, September 1-5, 2013. Proceedings M Železný, I Habernal, A Ronzhin – 2013 – library.wur.nl … The papers are organized in topical sections on speech recognition and understanding, spoken language processing, spoken dialogue systems, speaker identification and diarization, speech forensics and security, language identification, text-to-speech systems, speech …

Automatic language recognition using deep neural networks AL Díez – 2013 – atvs.ii.uam.es … 17 2.6. Training algorithm for a deep belief network [extracted from Hinton et al … these systems can be used for filtering telephone calls and retaining only those in the language of interest, or for preprocessing the input speech signal in multilingual dialog systems [Ambikairajah et … Related articles All 4 versions

Intelligent Systems’ Holistic Evolving Analysis of Real-Life Universal Speaker Characteristics B Schuller, Y Zhang, F Eyben, F Weninger – mmk.e-technik.tu-muenchen.de … As a more recent approach to machine learning from unsupervisedly generated features, Deep Belief Networks (DBNs) have been applied to affect … spite them being crucial for real-life applications such as retrieval, dialogue systems and computer-mediated human-to-human … Cited by 1 Related articles All 2 versions

Data collection and language understanding of food descriptions M Korpusik, N Schmidt, J Drexler, S Cyphers… – Proc. SLT, 2014 – groups.csail.mit.edu … 5, pp. 79–86. [5] Y. Chen, W. Wang, and A. Rudnicky, “Unsupervised induction and filling of semantic slots for spoken dialogue systems using frame-semantic parsing,” in Proc. ASRU, 2013, pp. 120–125. … 3771–3775. [8] A. Deoras and R. Sarikaya, “Deep belief network based se … Cited by 1 Related articles All 4 versions

Improving Domain-independent Cloud-based Speech Recognition with Domain-dependent Phonetic Post-processing J Twiefel, T Baumann… – Twenty-Eighth …, 2014 – nats-www.informatik.uni-hamburg.de … systems (including cloud-based recognizers) are compared in terms of their applicability for developing domain-restricted dialogue systems. … provide better trained acoustic models and possibly more advanced frontend processing, eg based on deep belief networks (Jaitly et … Cited by 1 Related articles All 5 versions

Automatic speech recognition for under-resourced languages: A survey L Besacier, E Barnard, A Karpov, T Schultz – Speech Communication, 2014 – Elsevier … Artificial Neural Networks (ANN) including single hidden layer NN and multiple hidden layers NN (Deep Neural Networks DNN or Deep Belief Networks DBN) are also used for ASR subtasks such as acoustic modeling (Mohamed et al., 2012 and Seide et al., 2011) and … Cited by 21 Related articles All 6 versions

Exploiting Vocal-Source Features to Improve ASR Accuracy for Low-Resource Languages R Fernandez, J Cui, A Rosenberg… – … Conference of the …, 2014 – mazsola.iit.uni-miskolc.hu … J. Edlund, and M. Heldner, “An instantaneous vector representation of delta pitch for speaker-change prediction in conversational dialogue systems,” in ICASSP … [26] TN Sainath, B. Kingsbury, and B. Ramabhadran, “Improving training time of deep belief networks through hybrid … Related articles All 4 versions

Inaugural Editorial: Riding the Tidal Wave of Human-Centric Information Processing … PM SPM – 2012 – research.microsoft.com … as I advocated some time ago while managing SPM [3]. Further, modern research and commercial systems developed in our field (eg, robust speech translation, spoken dialogue systems, etc.) have … [5] G. Hinton et al., “Acoustic modeling using deep belief networks (Special … Cited by 2 Related articles All 7 versions

Correcting phoneme recognition errors in learning word pronunciation through speech interaction X Zuo, T Sumii, N Iwahashi, M Nakano… – Speech …, 2013 – Elsevier … Learning the pronunciation (phoneme sequence) of out-of-vocabulary (OOV) words is a serious problem for speech recognition systems and spoken dialogue systems in practical use because the developer cannot prepare all the words beforehand that would be used in the … Cited by 1 Related articles All 5 versions

An attribute detection based approach to automatic speech processing SM Siniscalchi, CH Lee – Loquens, 2014 – loquens.revistas.csic.es … This number, often referred to as a CM, serves as a reference guide for the dialogue system to provide an appropriate response to its users, just as an intelligent human being is expected to do when interacting with other people. … Related articles

Speak Correct: Phonetic Editor Approach H Al-Barhamtoshy, K Jambi, W Al-Jedaibi… – Life Science …, 2014 – lifesciencesite.com Page 1. Life Science Journal 2014;11(8) http://www.lifesciencesite.com 626 Speak Correct: Phonetic Editor Approach Hassanin Al-Barhamtoshy1, Kamal Jambi1, Wajdi Al-Jedaibi1, Diaa Motaweh2, Sherif Abdou3, Mohsen Rashwan4 … Related articles

Convolutional neural networks for speech recognition O Abdel-Hamid, AR Mohamed, H Jiang… – IEEE/ACM Transactions …, 2014 – dl.acm.org Page 1. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 22, NO. 10, OCTOBER 2014 1533 Convolutional Neural Networks for Speech Recognition Ossama Abdel-Hamid, Abdel-rahman … Cited by 5 Related articles All 7 versions

Factorial lda: Sparse multi-dimensional text models M Paul, M Dredze – Advances in Neural Information Processing …, 2012 – papers.nips.cc … 0, 1). This is a common approximation used in other models, such as artificial neural networks and deep belief networks. … side of speech processing (phonological, prosodic, etc.) while (SPEECH,APPLICATIONS,EMPIRICAL) is predominantly about dialogue systems and speech … Cited by 14 Related articles All 9 versions

Multi-band long-term signal variability features for robust voice activity detection. A Tsiartas, T Chaspari, N Katsamanis, PK Ghosh… – …, 2013 – cvsp.cs.ntua.gr … tool to a wide range of speech applications, including automatic speech recognition, language identification, spoken dialog systems and emotion … be interesting to explore the potential of these features with various machine learning algorithms including deep belief networks. … Cited by 7 Related articles All 7 versions

Fast adaptation of deep neural network based on discriminant codes for speech recognition S Xue, O Abdel-Hamid, H Jiang, L Dai… – IEEE/ACM Transactions on …, 2014 – dl.acm.org Page 1. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 22, NO. 12, DECEMBER 2014 1713 Fast Adaptation of Deep Neural Network Based on Discriminant Codes for Speech Recognition … Cited by 2 Related articles

Neural network based feature extraction for speech and image recognition C Plahl – 2014 – darwin.bth.rwth-aachen.de … speech) to written text (recognized words). The recognized word sequence can be further processed by a machine translation system, a dialog system or any other text based system. Depending on the given task, automatic … Related articles All 6 versions

Machine learning methods for articulatory data JJ Berry – 2012 – arizona.openrepository.com … 2005). The development of the translational Deep Belief Network in Chapter 3 will be of interest to researchers in machine learning. Standard DBNs are trained in … dialogue systems). One approach to dealing with context effects is to explicitly … Cited by 2 Related articles All 5 versions

Spoofing and countermeasures for speaker verification: a survey Z Wu, N Evans, T Kinnunen, J Yamagishi, F Alegre… – Speech …, 2015 – Elsevier While biometric authentication has advanced significantly in recent years, evidence shows the technology can be susceptible to malicious spoofing attacks. The r. Cited by 4 Related articles All 4 versions

Project Periodic Report J Shawe-Taylor – 2013 – complacs.org … Strategic Impact The modern machine learning methods are sufficiently mature to address challenging control problems that arise in the context of intelligent cognitive systems, such as home robotics, swarm intelligence, smart human-machine interfaces and dialogue systems. … Related articles

2013 Index IEEE Transactions on Audio, Speech, and Language Processing Vol. 21 TD Abhayapala, C Agon, A Ahlen, S Ahmed, MT Akhtar… – ieeexplore.ieee.org … Jeong, M., Kim, K., Ryu, S., and Lee, GG, Unsupervised Spoken Language Understanding for a Multi-Domain Dialog System; TASL Nov … Ling, Z.-H., Deng, L., and Yu, D., Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical …

A bottom-up modular search approach to large vocabulary continuous speech recognition SM Siniscalchi, T Svendsen… – Audio, Speech, and …, 2013 – ieeexplore.ieee.org Cited by 13 Related articles All 3 versions

Machine learning paradigms for speech recognition: An overview L Deng, X Li – IEEE Transactions on Audio, Speech and Language …, 2013 – 131.107.65.14 Page 1. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 21, NO. 5, MAY 2013 1 Machine Learning Paradigms for Speech Recognition: An Overview Li Deng, Fellow, IEEE, and Xiao Li, Member, IEEE … Cited by 34 Related articles All 10 versions

Joint uncertainty decoding for noise robust subspace Gaussian mixture models L Lu, KK Chin, A Ghoshal, S Renals… – IEEE Transactions on …, 2012 – cstr.ed.ac.uk … 8.4\% less time steps and 7.7\% higher reward).}, categories = {reinforcement learning, spoken dialogue systems} } @inproceedings{hochberg … ed.ac.uk/downloads/publications/2011/ articulatory_inversion.pdf}, abstract = {In this work, we implement a deep belief network for the … Cited by 2 Related articles All 2 versions

Development and Evaluation of Semantically Constrained Speech Recognition Architectures S Wermter, J Twiefel, T Baumann – 2014 – informatik.uni-hamburg.de … approach. ASR system are able to transform acoustic data to text and can be used for dialogues between humans and machines. In combination with a text-to-speech system, useful dialogue systems can be created. Speech … Related articles

Deep Learning: Methods and Applications L Deng, D Yu – Foundations and Trends in Signal Processing 7:3-4, 2014 – research.microsoft.com … Cited by 1 Related articles All 8 versions

[BOOK] Automatic speech signal analysis for clinical diagnosis and assessment of speech disorders L Baghai-Ravary, SW Beet – 2012 – books.google.com … Some of the topics covered in this series include the presentation of real life commercial deployment of spoken dialog systems, contemporary methods of speech parameterization, developments in information security for automated speech, forensic speaker recognition, use … Cited by 6 Related articles All 6 versions

A Survey on perceived speaker traits: Personality, likability, pathology, and the first challenge B Schuller, S Steidl, A Batliner, E Nöth… – Computer Speech & …, 2015 – Elsevier The INTERSPEECH 2012 Speaker Trait Challenge aimed at a unified test-bed for perceived speaker traits – the first challenge of this kind: personality in the f. Cited by 2 Related articles All 6 versions

An Information-Extraction Approach to Speech Processing: Analysis, Detection, Verification, and Recognition CH Lee, SM Siniscalchi – Proceedings of the IEEE, 2013 – ieeexplore.ieee.org INVITED PAPER … Related articles All 2 versions

SpringerBriefs in Electrical and Computer Engineering Speech Technology A Neustein – Springer … Some of the topics covered in this series include the presentation of real life commercial deployment of spoken dialog systems, contemporary methods of speech parameterization, developments in information security for automated speech, forensic speaker recognition, use … Related articles