HTK (Hidden Markov Model Toolkit) & Dialog Systems

Notes:

HTK, the Hidden Markov Model Toolkit, is a set of software tools for building and manipulating hidden Markov models (HMMs), distributed by the Cambridge University Engineering Department. HMMs are statistical models of sequential data in which the underlying state is not observed directly and must be inferred from partial or noisy observations. HTK provides tools for training HMMs (estimating their parameters from data) and for testing them, that is, applying trained models to new data and scoring the results.

HMMs are used in a wide range of applications, including speech recognition, natural language processing, and bioinformatics. HTK is used primarily for speech recognition research, but it is flexible enough to be applied to other sequence-modeling problems. It is used mostly in research and development projects, and also in some commercial software applications.

HTK can be used to build dialog systems in which HMMs model the temporal structure of a dialogue. In such a system, an HMM can represent the states of the dialogue, the actions or responses available to the system, and the probabilities of moving between them. In the literature listed below, HTK most often supplies the speech recognition component of a spoken dialogue system.

For example, an HMM-based dialog system might represent a conversation with states such as opening, greeting, asking questions, providing information, and closing. The model assigns a probability to each transition between these states; given the user's input, the system infers the most likely current state and generates a response appropriate to it.
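
As a rough illustration of this example, the following Python sketch defines a small dialogue HMM and uses the Viterbi algorithm to recover the most likely sequence of dialogue states from a sequence of observed user acts. It is not HTK code and does not use HTK's interface: the state names, the observation labels (hello, question, answer, bye) and every probability are assumptions chosen purely for illustration.

```python
# Hypothetical dialogue HMM. All states, observation labels and probabilities
# are made-up illustrations, not values taken from HTK or from any real system.
STATES = ["opening", "asking", "informing", "closing"]

# Probability of starting the dialogue in each state.
START = {"opening": 0.9, "asking": 0.1, "informing": 0.0, "closing": 0.0}

# TRANS[a][b]: probability of moving from state a to state b.
TRANS = {
    "opening":   {"opening": 0.1, "asking": 0.7, "informing": 0.2, "closing": 0.0},
    "asking":    {"opening": 0.0, "asking": 0.3, "informing": 0.6, "closing": 0.1},
    "informing": {"opening": 0.0, "asking": 0.4, "informing": 0.3, "closing": 0.3},
    "closing":   {"opening": 0.0, "asking": 0.0, "informing": 0.0, "closing": 1.0},
}

# EMIT[s][o]: probability of observing user act o while in state s.
EMIT = {
    "opening":   {"hello": 0.80, "question": 0.10, "answer": 0.05, "bye": 0.05},
    "asking":    {"hello": 0.05, "question": 0.80, "answer": 0.10, "bye": 0.05},
    "informing": {"hello": 0.05, "question": 0.10, "answer": 0.80, "bye": 0.05},
    "closing":   {"hello": 0.05, "question": 0.05, "answer": 0.10, "bye": 0.80},
}


def viterbi(observations):
    """Return the most probable state sequence for a non-empty list of user acts."""
    # best[s] = (probability of the best path ending in s, that path)
    best = {s: (START[s] * EMIT[s][observations[0]], [s]) for s in STATES}
    for obs in observations[1:]:
        step = {}
        for s in STATES:
            # Pick the predecessor that maximizes the path probability into s.
            prob, path = max(
                ((best[p][0] * TRANS[p][s], best[p][1]) for p in STATES),
                key=lambda item: item[0],
            )
            step[s] = (prob * EMIT[s][obs], path + [s])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


decoded = viterbi(["hello", "question", "answer", "bye"])
print(decoded)
```

With these made-up numbers, the observed acts hello, question, answer, bye decode to the state sequence opening, asking, informing, closing. A real system would estimate the probabilities from dialogue data rather than set them by hand.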

Beyond modeling the temporal structure of a dialogue, an HMM can assign a probability to any sequence of states and actions, which is useful for evaluating a dialog system and identifying where it performs poorly. HMMs can also be trained on large collections of recorded dialogues in order to improve the accuracy of the dialog system.
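
The probability of an observed sequence under an HMM can be computed with the forward algorithm, which sums over all hidden state paths. The sketch below is plain Python, not HTK; the two-state model and its probabilities are hypothetical and were chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical two-state dialogue model; every number is an illustrative assumption.
STATES = ["asking", "informing"]

START = {"asking": 0.5, "informing": 0.5}

# TRANS[a][b]: probability of moving from state a to state b.
TRANS = {
    "asking":    {"asking": 0.25, "informing": 0.75},
    "informing": {"asking": 0.50, "informing": 0.50},
}

# EMIT[s][o]: probability of observing act o while in state s.
EMIT = {
    "asking":    {"question": 0.75, "answer": 0.25},
    "informing": {"question": 0.25, "answer": 0.75},
}


def sequence_probability(observations):
    """Forward algorithm: total probability of a non-empty observation sequence."""
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        alpha = {
            s: EMIT[s][obs] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())


print(sequence_probability(["question", "answer"]))
```

Comparing such scores across dialogues is one way to flag interactions the model finds unlikely. For long sequences, an implementation would normally work with log probabilities to avoid numerical underflow.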

Wikipedia:

HTK (software)

Speech input from older users in smart environments: Challenges and perspectives R Vipperla, M Wolters, K Georgila, S Renals – Universal Access in Human- …, 2009 – Springer … They thanked it for providing information, or provided information that was not specified in the task definition and could not be processed by the dialogue system. … HTK version 3.4, http://htk.eng.cam.ac.uk … Cited by 30 Related articles All 9 versions

Statistical methods for building robust spoken dialogue systems in an automobile P Tsiakoulis, M Gašic, M Henderson… – Proceedings of the …, 2012 – mi.eng.cam.ac.uk … dialogue system is implemented using the ATK platform (http://mi.eng.cam.ac.uk/research/dialogue/atk_home.html). ATK is an API designed to facilitate building experimental applications for systems trained using the HTK speech recognition toolkit (see http://htk.eng.cam.ac.uk). … Cited by 7 Related articles All 6 versions

Emotion recognition and adaptation in spoken dialogue systems J Pittermann, A Pittermann, W Minker – International Journal of Speech …, 2010 – Springer … A spoken language dialogue system (SLDS) typically consists of the following main components as depicted in Fig. … Cited by 13 Related articles All 7 versions

Analysis of Czech Web 1T 5-gram corpus and its comparison with Czech National Corpus data V Procházka, P Pollák – Text, Speech and Dialogue, 2010 – Springer … The first recognizers of singular commands or simple dialogue systems are joined by dictation machines converting voice input into written form … Young, S., et al.: The Hidden Markov Model Toolkit (HTK), Version 3.4.1, Cambridge (2009), http://htk.eng.cam.ac.uk … Cited by 3 Related articles All 7 versions

The Speech Recognition Virtual Kitchen: An Initial Prototype. F Metze, E Fosler-Lussier – INTERSPEECH, 2012 … believe that this infrastructure may be useable by fields other than core ASR that are data intensive (synthesis, dialog systems, NLP, computer … Available: http://htk.eng.cam.ac.uk [3] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlíček … Cited by 4 Related articles All 5 versions

Large vocabulary continuous speech recognition for Urdu H Sarfraz, S Hussain, R Bokhari, AA Raza… – Proceedings of the 8th …, 2010 – dl.acm.org … For a practical dialog system, a speech recognizer should be able to recognize speech from an average speaker in a normal environment. … F. Mihelic and J. Zibert. I-Tech (ed.), Vienna, Austria. November 2008. [16] HTK, http://htk.eng.cam.ac.uk, accessed July 2010. … Cited by 9 Related articles All 5 versions

The SEMAINE API: towards a standards-based framework for building emotion-oriented systems M Schröder – Advances in human-computer interaction, 2010 – dl.acm.org … The project aims to build a multimodal dialogue system with an emphasis on nonverbal skills—detecting and emitting vocal and facial signs related to the interaction, such as backchannel signals, in order to register and express information such as continued presence … Cited by 66 Related articles All 8 versions

Ageing voices: The effect of changes in voice parameters on ASR performance R Vipperla, S Renals, J Frankel – … Journal on Audio, Speech, and Music …, 2010 – dl.acm.org … from the difference in acoustics, older people also appear to differ in linguistic characteristics when interacting with Spoken Dialogue Systems (SDS) [16]. … built a state of the art ASR system using the Hidden Markov Model Toolkit (HTK) (HTK version 3.4 http://htk.eng.cam.ac.uk/). … Cited by 19 Related articles All 17 versions

A comparison of audio-free speech recognition error prediction methods. P Jyothi, E Fosler-Lussier – INTERSPEECH, 2009 … All these results are very encouraging considering that it is useful in certain ASR applications (for example, spoken dialogue systems) to be able to correctly … [8] Young, S., “The HTK Hidden Markov Model Toolkit: Design and Philosophy”, Online: http://htk.eng.cam.ac.uk, 1993. … Cited by 6 Related articles All 3 versions

Design and Implementation of Robot Audition System’HARK’—Open Source Software for Listening to Three Simultaneous Speakers K Nakadai, T Takahashi, HG Okuno… – Advanced …, 2010 – Taylor & Francis … In addition, we also provide some sample applications like dialog systems that connect with a HARK-based robot audition system. … be done in a GUI environment although famous packages having feature extraction functions such as HTK (http://htk.eng.cam.ac.uk/) require text … Cited by 62 Related articles All 5 versions

The speech recognition virtual kitchen. F Metze, E Fosler-Lussier, R Bates – INTERSPEECH, 2013 – cs.cmu.edu … believe that this infrastructure may be usable by fields other than core ASR that are data intensive (synthesis, dialog systems, NLP, computer … Available: http://htk.eng.cam.ac.uk [5] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlíček … Cited by 2 Related articles All 4 versions

D3 toolkit: a development toolkit for daydreaming spoken dialog systems D Lee, K Kim, C Lee, J Choi, GG Lee – Spoken Dialogue Systems for …, 2010 – Springer … Speech Communication 45(4), 455–470 (2005) 9. Hidden Markov Toolkit (HTK), http://htk.eng.cam.ac.uk/ 10. … Lee, C., Jung, S., Kim, S., Lee, GG: Example-based Dialog Modeling for Practical Multi-domain Dialog System. Speech Communication 51(5), 466–484 (2009) Cited by 1 Related articles All 9 versions

Recognition of multiple language voice navigation queries in traffic situations G Sárosi, T Mozsolics, B Tarján, A Balog… – Analysis of Verbal and …, 2011 – Springer … Gellért Sárosi, Tamás Mozsolics, Balázs Tarján, András Balog, Péter Mihajlik, and Tibor Fegyó … Department of Telecommunications … Cited by 1 Related articles All 5 versions

ASR performance analysis of an experimental call routing system T Modipa, F De Wet, M Davel – 2009 – researchspace.csir.co.za … Keywords: Call routing, proper name recognition, automatic speech recognition, spoken dialogue system. … Kershaw, D., Liu, X., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V, Woodland, P., “The HTK book”, Available, [Online], http://htk.eng.cam.ac.uk/, Accessed: Jan … Cited by 1 Related articles All 5 versions

Development of web-based voice interface to identify child users based on automatic speech recognition system R Nisimura, S Miyamori, L Kurihara… – … Interaction. Users and …, 2011 – Springer … http://htk.eng.cam.ac.uk/ … http://www.chasen.org/~taku/software/TinySVM/ … In: Proc. APSIPA ASC 2010, pp. 470–473 (2010) 3. Hempel, T.: Usability of telephone-based speech dialog systems as experienced by user groups of different age and background. In: Proc. … Cited by 1 Related articles All 3 versions

Design of the new prognosis wearable system-prototype for health monitoring of people at risk A Pantelopoulos, N Bourbakis – Advances in Biomedical Sensing, …, 2010 – Springer … Moreover, by incorporating an automated intelligent and interactive dialogue system, additional health-status feedback can be obtained from the user in terms of described symptoms, captured using a voice recognition module, which can further enhance the autonomous … Cited by 8 Related articles All 4 versions

A tutorial dialogue system with unrestricted spoken input. P Bell, M Dzikovska, A Isard – INTERSPEECH, 2012 … the capability for natural, unrestricted spoken interaction to BEETLE II, our existing typed tutorial dialogue system [5]. The … line speech parametrisation, voice activity detection and speech recognition in real-time using a multi-threaded design … http://htk.eng.cam.ac.uk … Related articles All 3 versions

Person-machine Dialogue Systems R de Córdoba Herralde – die.upm.es … Towards developing general models of usability with PARADISE. Natural Language Engineering: Special Issue on Best Practice in Spoken Dialogue Systems, 2000. … HTK (http://htk.eng.cam.ac.uk/) is a toolkit for estimating and using hidden Markov models. Related articles

Designing a spoken language interface for a tutorial dialogue system. P Bell, M Dzikovska, A Isard – INTERSPEECH, 2012 … simulator. 3. Language modelling In many spoken dialogue systems, ASR is performed using hand-crafted finite-state networks selected according to the dialogue state. This is … filled pauses, repetitions … http://htk.eng.cam.ac.uk … Cited by 1 Related articles All 8 versions

Word Activation Forces-Based Language Modeling and Smoothing M Qin, G Liu, B Li, Y Lu – Intelligent Human-Machine Systems …, 2013 – ieeexplore.ieee.org … [1] F. Zamora, MJ Castro, and R. De-Mori, “Cache neural network language models based on long-distance dependencies for a spoken dialog system,” IEEE ICASSP, 2012, pp. 4993-4996. … [12] Introduction to the HTK tools is available at http://htk.eng.cam.ac.uk/ [13] F. Jelinek … Related articles All 2 versions

Gnome desktop management by voice A Corpas, M Cámara, G Pérez – Advances in New Technologies, …, 2011 – Springer … José Luis Pérez Pujadas, s/n 18006, Granada, Spain {alberto.corpas,mario.camara}@juntadeandalucia.es 2 Intelligent Dialogue Systems, Avda. … Information, Communication & Society 9(3), 313–334 (2006) 3. http://htk.eng.cam.ac.uk/ 4. http://cmusphinx.sourceforge.net/ 5. http … Related articles All 5 versions

Acoustic model training for speech recognition over mobile networks J Vojtko, J Kačur, G Rozinaj, J Kőrösi – International Journal of Signal …, 2013 – Inderscience … Int. J. Signal and Imaging Systems Engineering, Vol. 6, No. 2, 2013 … Juraj Vojtko, Juraj Kačur, Gregor Rozinaj and Ján Kőrösi … Related articles All 4 versions

Trends, Challenges and Opportunities in Spoken Dialogue Research M McTear – Spoken Dialogue Systems Technology and Design, 2011 – Springer … Examples of widely used tools in speech research are HTK (http://htk.eng.cam.ac.uk/) for speech recognition and Festival (Black and Lenzo … toolkit (Cole, 1999), which provides an easy-to-use graphical interface for the development of simple spoken dialogue systems but which … Related articles All 2 versions

Automatic speech recognition for assistive technology devices AP Harvey, RJ McCrindle, K Lundqvist, P Parslow – ICDVRAT 2010, 2010 – icdvrat.org … http://www.fastuk.org/research/projview.php?id=216 HTK (2010), Hidden markov model toolkit, http://htk.eng.cam.ac.uk/ INSPIRE (2004 … com/ Patras (2010), Wire Communications Laboratory, http://www.wcl.ece.upatras.gr/ Patras (2010), A Dialogue System for Telephone-based … Cited by 3 Related articles All 4 versions

Survey on Speech, Machine Translation and Gestures in Ambient Assisted Living D Anastasiou – Tralogy, Session, 2011 – d-anastasiou.com … Dialog systems and their components will be also pointed out. … 1 Automatic Speech Recognition of speech recognition and synthesis, and Machine Translation (MT); speech-to-speech translation, dialog systems, and gesture recognition and localization will also be discussed. … Cited by 2 Related articles All 3 versions

Technology and Implementation L Baghai-Ravary, SW Beet – Automatic Speech Signal Analysis for Clinical …, 2013 – Springer … In: Proceedings of American association for artificial intelligence fall symposium on dialog systems for health communication, pp 104 … HTK is available from Cambridge University Engineering Department, through http://htk.eng.cam.ac.uk/. CMUSphinx is an open … Related articles All 2 versions

Survey on Speech, Machine Translation and Gestures in Ambient Assisted Living. http://webcast. in2p3. fr/videos- … D Anastasiou – 2011 – lodel.irevues.inist.fr … We cover initiatives and tools both from Academia and Industry. We also refer to speech-to-speech translation systems which combine speech recognition, machine translation, and text-to-speech. Dialog systems and their components will be also pointed out. … Dialog systems. … Related articles All 2 versions

Speech recognition for resource deficient languages using frugal speech corpus A Imran, K Sunil – Signal Processing, Communication and …, 2012 – ieeexplore.ieee.org … of Oriental COCOSDA, Hsinchu, Taiwan, Oct 2011. [9] M. Plauche, O. Cetin, and U. Nallasamy, How to build a spoken dialog system with limited or no language resources. India: ICFAI University Press, 2008. … [17] HTK, “Htk homepage,” http://htk.eng.cam.ac.uk/. … Cited by 4 Related articles

Real world utterance collection using voice-enabled web system for child speaker identification S Miyamori, R Nisimura, L Kurihara… – Proc. Oriental …, 2010 – desceco.org … http://htk.eng.cam.ac.uk/ … (figure: correct rate (%) for automatic classification and subjective evaluation) … [3] A. Raux, et al., “Doing Research on a Deployed Spoken Dialogue System: One Year of Let’s Go! Experience,” Proc. … Cited by 2 Related articles

Expressive speech synthesis database for emergent messages and warnings generation in critical situations M Rusko, S Darjaa, M Trnka, M Cerňak – Language Resources for Public … – lrec-conf.org … The role of Expressive TTS The goal of the “Expressive speech synthesis” activity is to perform basic and applied research and to develop a system which would be capable of generating information system messages and dialogue system replies in naturally sounding speech … Related articles All 3 versions

Using multiple versions of speech input in phone recognition M Liberman, J Yuan, A Stolcke… – Acoustics, Speech and …, 2013 – ieeexplore.ieee.org … [7] Bohus, D., Zweig, G., Nguyen, P., Li, X., “Joint N-best rescoring for repeated utterances in spoken dialog systems,” Proceedings of … dictionary: http://www.speech.cs.cmu.edu/cgi-bin/cmudict [14] The Hidden Markov Model Toolkit (HTK): http://htk.eng.cam.ac.uk/ [15] Garofolo … Related articles All 19 versions

Lattice parsing to integrate speech recognition and rule-based machine translation S Köprü, A Yazici – Proceedings of the 12th Conference of the European …, 2009 – dl.acm.org … Proceedings of the 12th Conference of the European Chapter of the ACL, pages 469–477, Athens, Greece, 30 March – 3 April 2009. © 2009 Association for Computational Linguistics … Cited by 4 Related articles All 7 versions

Discriminative training of the hidden vector state model for semantic parsing D Zhou, Y He – … and Data Engineering, IEEE Transactions on, 2009 – ieeexplore.ieee.org … IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 6, NO. 1, JANUARY 2008 … Deyu Zhou and Yulan He … Cited by 24 Related articles All 12 versions

Adapting HMMs of distant-talking ASR systems using feature-domain reverberation models A Sehr, M Gardill, W Kellermann – Proc. EUSIPCO, 2009 – eurasip.org … 1. INTRODUCTION Robust distant-talking Automatic Speech Recognition (ASR) is desirable for many applications, like seamless human/machine interfaces, speech dialogue systems, and automatic meeting transcription. … [10] “HTK webpage,” http://htk.eng.cam.ac.uk/. … Cited by 6 Related articles All 3 versions

Spoken Dialogue Systems K Jokinen, M McTear – Synthesis Lectures on Human …, 2009 – morganclaypool.com … Synthesis Lectures on Human … Series ISSN 1947-4040 print, ISSN 1947-4059 electronic … Kristiina Jokinen, University of Helsinki; Michael McTear, University of Ulster … Cited by 36 Related articles All 7 versions

Dynamic programming prediction errors of recurrent neural fuzzy networks for speech recognition CF Juang, CL Lai, CC Tu – Expert Systems with Applications, 2009 – Elsevier … in recent years. Extensive applications in ASR include voice dialog systems for telephone numbers, account numbers, and portable wireless devices (Brems and Wattenbarger, 1994, Hunt, 2001 and Viikki, 2001). In an intelligent … Cited by 4 Related articles All 5 versions

Low Complexity On-Line Adaptation Techniques in Context of Assamese Spoken Query System S Shahnawazuddin, KT Deepak, BD Sarma… – Journal of Signal …, 2014 – Springer … S. Shahnawazuddin, KT Deepak, BD Sarma, A. Deka, SRM Prasanna, Rohit Sinha … Received: 11 March 2014 / Revised … Related articles

Automatic optimization of speech decoder parameters A El Hannani, T Hain – Signal Processing Letters, IEEE, 2010 – ieeexplore.ieee.org … In both cases the correlation with computational complexity is ignored, but in [13] the basis is a dialogue system and one can assume that an RTF less than one is required. … Written by Gunnar Evermann, available at http://htk.eng.cam.ac.uk. … Cited by 14 Related articles All 3 versions

Identifying problematic dialogs in a human-computer dialog system HC Truong – 2010 – espace.etsmtl.ca … Hoang Cuong Truong, Identifying Problematic Dialogs in a Human-Computer Dialog System, Montreal, December 10, 2010 … Related articles

Affirmative cue words in task-oriented dialogue A Gravano, J Hirschberg, Š Beňuš – Computational Linguistics, 2012 – MIT Press … These words pose a challenge for spoken dialogue systems because of their ambiguity: They may be used for agreeing with what the interlocutor has said, indicating continued attention, or for cueing the start of a new topic, among other meanings. … Cited by 18 Related articles All 6 versions

Automatic detection of known advertisements in radio broadcast with data-driven ALISP transcriptions H Khemiri, G Chollet… – Multimedia Tools and …, 2013 – Springer … HTK: Hidden Markov Model ToolKit v 3.2.1. http://htk.eng.cam.ac.uk. … His main research interests are in phonetics, automatic audio-visual speech processing, speech dialog systems, multimedia, pattern recognition, biometrics, digital signal processing, etc. … Cited by 8 Related articles All 8 versions

Real-time speech-driven lip synchronization K Mu, J Tao, J Che, M Yang – … Symposium (IUCS), 2010 4th …, 2010 – ieeexplore.ieee.org … as input to a facial animation system: text and audio, i.e., text and speech-driven face animation [3]. Text-driven face animation is often applied to an automatic and intelligent dialogue system, in which … [17] S. Young, et al., The HTK book, Citeseer, 2000, http://htk.eng.cam.ac.uk/. … Cited by 2 Related articles All 2 versions

Noise suppression method for preprocessor of time-lag speech recognition system based on bidirectional optimally modified log spectral amplitude estimation Y Obuchi, R Takeda, M Togami – Acoustical Science and Technology, 2013 – jlc.jst.go.jp … Yasunari Obuchi, Ryu Takeda and Masahito Togami, Central … Related articles All 3 versions

Comparison of Methods for Topic Classification of Spoken Inquiries R Torres, H Kawanami, T Matsui, H Saruwatari… – Information and Media …, 2013 – jlc.jst.go.jp … Information and Media Technologies 8(2): 438-448 (2013), reprinted from Journal of Information Processing 21(2): 157-167 (2013) © Information Processing Society of Japan … Related articles All 6 versions

Cognitive aspects of communicating information with conversational fillers in Slovak S Benus – … ), 2013 IEEE 4th International Conference on, 2013 – ieeexplore.ieee.org … Improved understanding of these aspects brings great potential for improving the quality and naturalness of human-machine interactions through dialogue systems and interactive voice response applications. … [25] http://htk.eng.cam.ac.uk/ … Related articles All 2 versions

State of Research of Speech Recognition M Sarma, KK Sarma – … -Based Speech Segmentation using Hybrid Soft …, 2014 – Springer … [69] where a spoken dialog system is designed to use in agricultural commodities task domain using real-world speech data collected from two linguistically similar languages of India, Hindi and Marathi. … Available via http://htk.eng.cam.ac.uk/. … Related articles

Robot Audition: Missing Feature Theory Approach and Active Audition HG Okuno, K Nakadai, HD Kim – Robotics Research, 2011 – Springer … A set of examples of MFM are shown in Figure 7. … http://htk.eng.cam.ac.uk/ … intervals (Figure 9). A speech dialog system specialized for this task was connected with the HARK. … Cited by 1 Related articles All 4 versions

Reactive Statistical Mapping: Towards the Sketching of Performative Control with Data N d’Alessandro, J Tilmanne, M Astrinaki… – Innovative and Creative …, 2014 – Springer … We think that there is a lack in the literature about how these trajectories should react according to very short-term gestures, as most of the research is focusing on a larger time window and the overall discussion of dialog systems. … Cited by 2 Related articles All 6 versions

Avoz-Speech Recognition Of Elderly Voices VK Hedayati – 2013 – inesc-id.pt … A.4 List of appreciation sentences predefined in Dialog System … use Hidden Markov Models to find the best path through the combined constraints of the acoustic, lexical, and language models. … www.htk.eng.cam.ac.uk … Related articles

Emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information and semantic labels CH Wu, WB Liang – Affective Computing, IEEE Transactions on, 2011 – ieeexplore.ieee.org … However, the applications of SDSs are still limited to simple informational dialog systems, such as navigation systems and air travel information systems [1], [2]. To enable more complex applications (e.g., home nursing [3], educational/tutoring, and chatting [4]), new capabilities … Cited by 41 Related articles All 6 versions

Using Wizard of Oz to Collect Interaction Data for Voice Controlled Home Care and Communication Services S Schlogl, G Chollet, P Milhorat, J Deslis… – Proc. IASTED …, 2013 – vassist.cure.at … A gradual integration of components is planned so that by the end of next year all the modules of a spoken dialogue system (i.e. ASR, NLU, Dialogue Management, Natural Language Generation, TTS) are available and can be combined. … http://htk.eng.cam.ac.uk/ … Cited by 1 Related articles All 3 versions

Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge B Schuller, A Batliner, S Steidl, D Seppi – Speech Communication, 2011 – Elsevier More than a decade has passed since research on automatic recognition of emotion from speech has become a new field of research in line with its ‘big brothers. Cited by 190 Related articles All 10 versions

The MoveOn database: motorcycle environment speech and noise database for command and control applications T Kostoulas, T Winkler, T Ganchev, N Fakotakis… – Language resources …, 2013 – Springer … Keywords. Speech database Noise database Spoken dialogue interaction Open-air noise environment. 1 Introduction. One of the most challenging tasks in the design of dialogue systems operating in outdoor environment is the development of a noise-robust interaction interface … Related articles All 5 versions

Automatic speech recognition, with large vocabulary, robustness, independence of speaker and multilingual processing DRS Caon – portais4.ufes.br … validation. In order to perform a new demonstration in Dutch, acoustic and language models were built and the system was integrated with other auxiliary modules (such as voice activity detector and the dialogue system). Results … Related articles All 3 versions

Learning, generation and recognition of motions by reference-point-dependent probabilistic models K Sugiura, N Iwahashi, H Kashioka… – Advanced …, 2011 – Taylor & Francis … Advanced Robotics 25 (2011) 825–848, brill.nl/ar … Komei Sugiura, Naoto Iwahashi, Hideki Kashioka … Cited by 13 Related articles All 10 versions

Unifying Speech Resources for Tone Languages: A Computational Perspective ME Ekpenyong, EAE Urua, VJ Ekong, OU Obot… – International Journal of …, 2011 – ijcis.info … (ii) Training and test data: for speech recognition and synthesis researchers (iii) Corpora and corpora processing tools: for researchers working on dialog systems (iv) Corpora from spoken domain: computational linguists working with text. … Related articles All 3 versions

Hybrid approach to real-time speech driven facial gesturing of virtual characters G Zorić (Goranka Zorić, University of Zagreb, Faculty of Electrical Engineering and Computing) – ieee.hr … manually and given as input. Text driven systems for facial gesturing There are numerous visual text-to-speech (VTTS) and dialogue systems based on the … in [38] developed a new talking head with the purpose of acting as an interactive agent in a dialogue system. … Related articles All 3 versions

GF: A Multilingual Grammar Formalism A Ranta – Language and Linguistics Compass, 2009 – Wiley Online Library … different languages. They are also a powerful tool for natural language engineering, used for applications ranging from a translator of mathematical exercises to a dialogue system usable for talking with devices in a car. As a … Cited by 7 Related articles All 2 versions

The GF resource grammar library A Ranta – Linguistic Issues in Language Technology, 2009 – elanguage.net … The library can be used as a resource for language processing tasks, such as translation, multilingual generation, software localization, natural language interfaces, and spoken dialogue systems. … The TALK project on spoken dialogue systems (Ljunglöf et al., 2006) … Cited by 73 Related articles All 9 versions

Automatic Speech Recognition K Samudravijaya – 2009 – speech.tifr.res.in … IEEE, 88, No. 8, 2000, pp. 1142-1165. 2. http://htk.eng.cam.ac.uk/docs/docs.shtml … 17 (1995) 249-262. 13. E. Nöth and A. Horndasch, “Experiences with Commercial Telephone-based Dialogue Systems”, IT Information Technology, 46 (2004) 6, pp. 315-321. 14. … Related articles

TaleTUC: Automatic Speech Recognition for a Bus Route Information System R Andersstuen, CJ Marcussen – 2012 – diva-portal.org … 1.1 Background and Motivation One of the main research foundations for the work is the existing ASR system Buster (Hartvigsen et al., 2007). Buster is a spoken dialogue system which interacts with BusTUC, and provides route suggestions through a calling interface. … Related articles All 3 versions

Automatic speech recognition for under-resourced languages: A survey L Besacier, E Barnard, A Karpov, T Schultz – Speech Communication, 2014 – Elsevier Speech processing for under-resourced languages is an active field of research, which has experienced significant progress during the past decade. We propose, i. Cited by 12 Related articles All 4 versions

Speech synthesis based on hidden Markov models K Tokuda, Y Nankaku, T Toda, H Zen… – Proceedings of the …, 2013 – ieeexplore.ieee.org … voice-over functions for the visually impaired, and communication aids for the speech impaired. More recent applications include spoken dialog systems, communicative robots, singing speech synthesizers, and speech-to-speech translation systems. … Cited by 19 Related articles All 3 versions

Discriminative input stream combination for conditional random field phone recognition I Heintz, E Fosler-Lussier, C Brew – Audio, Speech, and …, 2009 – ieeexplore.ieee.org … This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. … Cited by 10 Related articles All 5 versions

Cross-modality semantic integration with hypothesis rescoring for robust interpretation of multimodal user interactions PY Hui, HM Meng – Audio, Speech, and Language Processing, …, 2009 – ieeexplore.ieee.org … 5) Ease of cross-modal integration as a front-end preprocess of an existing spoken dialog system, thereby enabling it to handle bimodal (speech and pen) inputs, as well as unimodal (speech-only or text-only) inputs. 6) Ease of portability to different information domains. … Cited by 2 Related articles All 11 versions

Automatic Speech Recognition for ageing voices R Vipperla – 2011 – era.lib.ed.ac.uk … Springer, 2009. (Chapter 4) • Maria Wolters, Ravichander Vipperla, and Steve Renals. Age Recognition for Spoken Dialogue Systems: Do We Need It? In Proceedings of Interspeech, Brighton, 2009. (Chapter 7) • Ravichander Vipperla, Steve Renals, and Joe Frankel. … Cited by 3 Related articles All 9 versions

Automatic robust classification of speech using analytical feature techniques G Calvo Pérez – 2009 – upcommons.upc.edu … These systems enable spoken dialog systems with a range of input and output modalities for ease-of-use and flexibility in handling adverse environments where speech might not be as suitable as other input-output modalities. …

Spoken language identification in resource-scarce environments M Peché – 2009 – dspace.up.ac.za … S-LID systems in particular are important as they can play a useful role in data collection, enabling additional applications to be developed later, including spoken dialog systems in domains such as government service delivery or healthcare [28, 29]. …

A Software Testbed for Assessing Human-Robot Verbal Interaction H Bouraoui – 2010 – uwspace.uwaterloo.ca … Having these concepts in mind and addressing human-robot verbal issues, speech dialogue systems should be well developed to be robust enough for use in effective human-robot verbal interaction. … 1 http://htk.eng.cam.ac.uk/ last accessed: 23 March 2010 …

Reverberation model-based decoding in the logmelspec domain for robust distant-talking speech recognition A Sehr, R Maas, W Kellermann – Audio, Speech, and …, 2010 – ieeexplore.ieee.org … Voice dialogue systems, …

SpringerBriefs in Electrical and Computer Engineering Speech Technology A Neustein – Springer … Some of the topics covered in this series include the presentation of real life commercial deployment of spoken dialog systems, contemporary methods of speech parameterization, developments in information security for automated speech, forensic speaker recognition, use …

Speech Recognition S Renals, T Hain – The Handbook of Computational Linguistics …, 2010 – books.google.com … Speech-to-text transcription has a number of applications including the dictation of office documents, spoken dialogue systems for call centers, hands-free interfaces to computers, and the development of speech-to-speech translation systems. …

[BOOK] Cross-Word Modeling for Arabic Speech Recognition D AbuZeina, M Elshafei – 2011 – books.google.com … Some of the topics covered in this series include the presentation of real life commercial deployment of spoken dialog systems, contemporary methods of speech parameterization, developments in information security for automated speech, forensic speaker recognition, use …

Modelling speech dynamics with trajectory-HMMs L Zhang – 2009 – era.lib.ed.ac.uk Doctor of Philosophy thesis, The Centre for Speech Technology Research, Institute for Communicating and Collaborative Systems, School of Informatics, The University of Edinburgh …

[BOOK] Automatic Speech Signal Analysis for Clinical Diagnosis and Assessment of Speech Disorders L Baghai-Ravary, SW Beet – 2012 – books.google.com … Some of the topics covered in this series include the presentation of real life commercial deployment of spoken dialog systems, contemporary methods of speech parameterization, developments in information security for automated speech, forensic speaker recognition, use …

Automatic oral proficiency assessment of second language speakers of South African English PFV Müller – 2010 – ir1.sun.ac.za Thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Engineering at Stellenbosch University …

Discriminative Training of the Hidden Vector State Model for Semantic Parsing D Zhou, Y He – IEEE TRANSACTIONS ON KNOWLEDGE AND …, 2009 – cse.seu.edu.cn In this paper, we discuss how discriminative training can be applied to the hidden vector state (HVS) model in different task domains. …

Robust speech recognition based on dereverberation parameter optimization using acoustic model likelihood R Gomez, T Kawahara – Audio, Speech, and Language …, 2010 – ieeexplore.ieee.org

A multimodal alignment framework for spoken documents D Mekhaldi, D Lalanne, R Ingold – Multimedia Tools and Applications, 2012 – Springer Published online: 13 July 2011. Abstract: We present a multimodal …

[BOOK] Hierarchical Neural Network Structures for Phoneme Recognition D Vasquez, R Gruhn, W Minker – 2013 – Springer … Moreover, there are more advanced tasks such as call-centers where a fully interaction between humans and machines is required. In these tasks, a Spoken Dialog System is needed [Minker 04] where the ASR module plays an important role in the robustness of the system. …