Julius LVCSR 2013


See also:

Best CMUSphinx Videos | CMUSphinx 2012 | CMUSphinx 2013 | HTK (Hidden Markov Model Toolkit) & Dialog Systems | Julius LVCSR 2014 | Julius LVCSR 2015 | Kaldi ASR


Robust facial expression recognition of a speaker using thermal image processing and updating of fundamental training data Y Nakanishi, Y Yoshitomi, T Asada, M Tabuse – Artificial Life and Robotics, 2013 – Springer … 3.2 Speech recognition and dynamic image analysis. We use a speech recognition system named Julius (http://julius.sourceforge.jp/) to obtain the timing positions of the start of speech and the first and last vowels in a WAV file [8–10]. … Cited by 3 Related articles All 5 versions

An experimental environment for analyzing collaborative learning interaction Y Hayashi, Y Ogawa, YI Nakano – Human Interface and the Management …, 2013 – Springer … The pen performs in two types of modes. One is a mobile mode such that 2 UA-1000: Roland Co., http://www.roland.com/products/en/UA-1000/ 3 Adintool: Julius development team, http://julius.sourceforge.jp/ 4 airpenPocket: Pentel Inc., http://www.airpen.jp/ … Cited by 4 Related articles All 2 versions

Speech synthesis of emotions using vowel features K Boku, T Asada, Y Yoshitomi, M Tabuse – Software Engineering, Artificial …, 2013 – Springer … of Information Processing Society of Japan 50(3), 1181–1191 (2009) (in Japanese) 5. Open-Source Large Vocabulary CSR Engine Julius, http://julius.sourceforge.jp/en_index.php?q=index-en.html 6. Voice Sommelier Neo, http://hitachi-business.com/products/package/sound … Cited by 1 Related articles All 4 versions

Visualization System for Analyzing Collaborative Learning Interaction Y Hayashi, Y Ogawa, YI Nakano – Procedia Computer Science, 2013 – Elsevier … 945–949 (in Japanese). [11]; Tobii Glasses Eye tracker and Tobii Studio: Tobii Technology, http://www.tobii.com/ in press. [12]; Adintool: Julius development team, http://julius.sourceforge.jp/ in press. [13]; airpenPocket: Pentel Inc., http://www.airpen.jp/ in press. … Cited by 1 Related articles All 2 versions

Multi-step Natural Language Understanding P Milhorat, S Schlögl, G Chollet, J Boudy – SIGdial 2013: 14th Annual …, 2013 – aclweb.org … The framework includes ASR performed by the Julius Large Vocabulary Continuous Speech Recognition engine4, dialog management based on the Disco DM library (Rich, 2009; Rich 4http://julius.sourceforge.jp/en_index.php … Related articles All 7 versions

High priority in highly ranked documents in spoken term detection K Konno, Y Itoh, K Kojima, M Ishigame… – … Annual Summit and …, 2013 – ieeexplore.ieee.org … distance between subword models”, ASJ vol. 2, pp.239-240, 2011-3. [13] Corpus of Spontaneous Japanese, http://www.ninjal.ac.jp/csj/ [14] Hidden Markov Model Toolkit, http://htk.eng.cam.ac.uk/ [15] palmkit, http://palmkit.sourceforge.net/ [16] Julius, http://julius.sourceforge.jp/ … Related articles All 2 versions

Analysis and Combination of Forward and Backward Based Decoders for Improved Speech Transcription D Jouvet, D Fohr – Text, Speech, and Dialogue, 2013 – Springer … ASRU 2011, IEEE Workshop on Automatic Speech Recognition and Understanding, Hawaii, USA (2011) 6. Sphinx (2011), http://cmusphinx.sourceforge.net/ 7. Julius, http://julius.sourceforge.jp/en_index.php 8. Galliano, S., Gravier, G., Chaubard, L.: The Ester 2 evaluation … Cited by 1 Related articles All 4 versions

Constructing Language Models for Spoken Dialogue Systems from Keyword Set K Komatani, S Mori, S Sato – … Challenges and Solutions in Applied Artificial …, 2013 – Springer … The utterances contained 14554 words including 1454 domain-specific words. The criteria we used was word accuracy (Acc.) for all and domain-specific words. 1 http://julius.sourceforge.jp … Related articles All 2 versions

Incorporating semantic information to selection of web texts for language model of spoken dialogue system K Yoshino, S Mori, T Kawahara – Acoustics, Speech and Signal …, 2013 – ieeexplore.ieee.org … In the Kyoto sightseeing domain, however, the combination of the two measures has a synergetic effect. … 3http://julius.sourceforge.jp/ Fig. 2. APP by LMs with selected texts (baseball domain) Fig. 3. APP by LMs with selected texts (Kyoto domain) … Cited by 2 Related articles All 6 versions

Context-based conversational hand gesture classification in narrative interaction S Okada, M Bono, K Takanashi, Y Sumi… – Proceedings of the 15th …, 2013 – dl.acm.org … armrests are not available. Thus, participants rest their hands on their thighs when not gesturing. On the other hand, we define segments … 1 Julius: http://julius.sourceforge.jp Figure 4: Coordinates relative to the center of participants … Cited by 1 Related articles

Time-reversed reverberation yields lower speech recognition rate by human and machine T Arai – Acoustical Science and Technology, 2013 – jlc.jst.go.jp … Forum Acusticum, Sevilla (2002). [12] Julius homepage: http://julius.sourceforge.jp/. [13] S. Greenberg and T. Arai, ”What are the essential cues for understanding spoken language?,” IEICE Trans. Inf. Syst., E87-D, 1059–1070 (2004). … Related articles All 3 versions

Object Recognition by Integrated Information Using Web Images H Nishimura, Y Ozasa, Y Ariki… – … (ACPR), 2013 2nd IAPR …, 2013 – ieeexplore.ieee.org … Workshop on Algorithmic Learning Theory, pp. 77-86, 1992. [20] A. Lee, T. Kawahara and K.Shikano, “Julius – an Open Source Real-Time Large Vocabulary Recognition Engine,” in Proc. EUROSPEECH, pp. 1691-1694, 2001. http://julius.sourceforge.jp/ … Cited by 1 Related articles All 2 versions

Disambiguation in unknown object detection by integrating image and speech recognition confidences Y Ozasa, Y Ariki, M Nakano, N Iwahashi – Computer Vision–ACCV 2012, 2013 – Springer … Understanding in Physical Interaction. Journal of Artificial Intelligence 25(25), 670–682 (2010) 7. Julius, http://julius.sourceforge.jp/ 8. Jiang, H.: Confidence Measures for Speech Recognition: A survey. Speech Communication … Cited by 3 Related articles All 5 versions

Recognition of a Baby’s Emotional Cry Towards Robotics Baby Caregiver S Yamamoto, Y Yoshitomi, M Tabuse… – Int J Adv …, 2013 – cdn.intechopen.com … Proc. of Human Interface Symposium, 25-28 (in Japanese). [10] Open-Source Large Vocabulary CSR Engine Julius. http://julius.sourceforge.jp/en_index.php?q=index-en.html. Accessed 2012 June 3. … Related articles All 3 versions

Object Recognition for Service Robots through Verbal Interaction Based on Ontology H Fukuda, S Mori, Y Kobayashi, Y Kuno… – Advances in Visual …, 2013 – Springer … humanoid. In: IEEE Int. Conf. on Robotics and Automation, pp. 769–774 (2009) 13. Kinect for Windows, http://kinectforwindows.org/ 14. Open-Source Large Vocabulary CSR Engine Julius, http://julius.sourceforge.jp/index-en.html Related articles

Communication for task completion with heterogeneous robots D Erickson, M DeWees, J Lewis, ET Matson – … Intelligence Technology and …, 2013 – Springer … (2012), http://www.irobot.com 4. Julius Speech Recognition (2012), http://julius.sourceforge.jp 5. VoxForge (2012), http://code.google.com/p/voxforge 6. Matson, ET, Min, BC: M2M infrastructure to integrate humans, agents and robots into collectives. … Cited by 3 Related articles

A Speech Recognition Client-Server Model for Control of Multiple Robots K Shrivastava, N Singhal, PK Das, SB Nair – Proceedings of Conference …, 2013 – dl.acm.org … August 2011. Vol. 3, No. 8. Pages 2926-2934. [10] Julius. Open-Source Large Vocabulary CSR Engine Julius. Retrieved June 1, 2013 from Sourceforge: http://julius.sourceforge.jp/en_index.php [11] HTK. HTK Speech Recognition Toolkit. … Related articles

Comparison of Methods for Topic Classification of Spoken Inquiries R Torres, H Kawanami, T Matsui, H Saruwatari… – Information and Media …, 2013 – jlc.jst.go.jp Page 1. Information and Media Technologies 8(2): 438-448 (2013) reprinted from: Journal of Information Processing 21(2): 157-167 (2013) © Information Processing Society of Japan 438 Regular Paper Comparison of Methods for Topic Classification of Spoken Inquiries … Related articles All 6 versions

Expectation-Based Command Recognition Off the Shelf: Publicly Reproducible Experiments with Speech Input D Ertl, J Falb, H Kaindl, R Popp… – … (HICSS), 2013 46th …, 2013 – ieeexplore.ieee.org … When using this straight-forward approach, the command recognition of analogue modalities is only dependent on the toolkit (and its configuration) and does not take expectations of the machine into account. 2http://julius.sourceforge.jp/en_index.php?q=en/index.html … Related articles All 4 versions

The SPPAS Participation to the Forced-Alignment Task of Evalita 2011 B Bigi – Evaluation of Natural Language and Speech Tools for …, 2013 – Springer … individual schedules. One of the characteristics of Spontaneous Speech is an important gap between a word’s phonological form and its phonetic realisations. 2 http://julius.sourceforge.jp … Specific realisation … Related articles All 6 versions

Automated Segmentation and Tagging of Lecture Videos R Raipuria – 2013 – it.iitb.ac.in Page 1. Automated Segmentation and Tagging of Lecture Videos Submitted in partial fulfillment of the requirements for the degree of Master of Technology by Ravi Raipuria Roll No: 113050077 Supervisor: Prof. DB Phatak Department of Computer Science and Engineering … Related articles All 2 versions

Automatic selection of compiler options using genetic techniques for embedded software design M Nagiub, W Farag – … and Informatics (CINTI), 2013 IEEE 14th …, 2013 – ieeexplore.ieee.org … 20-%20Datasheet/MIO- 5250_DS(06.11.12)20120621201706.pdf [12] Beagleboard xM C embedded computer board for ARM, http://beagleboard.org/static/BBxMSRM_latest.pdf [13] Julius Continuous Speech Recognition Engine, http://julius.sourceforge.jp/en_index.php. … Related articles

The AXES submissions at TrecVid 2013 R Aly, R Arandjelovic, K Chatfield, M Douze… – 2013 – doras.dcu.ie … The language model include online news and newswire articles as well as patents. The vocabulary uses the most frequent 130k words and provides multiple pronunciations. Decoding is performed by the Julius recognition engine (http://julius.sourceforge.jp/en_index. … Cited by 8 Related articles All 10 versions

Semi-blind algorithm for joint noise suppression and dereverberation based on higher-order statistics and acoustic model likelihood FD Aprilyanti, H Saruwatari, K Shikano… – … Annual Summit and …, 2013 – ieeexplore.ieee.org … [11] Julius, an open-source large vocabulary csr engine – http://julius.sourceforge.jp. [12] R. Miyazaki, H. Saruwatari, T. Inoue, Y. Takahashi, K. Shikano, K. Kondo, “Musical-noise-free speech enhancement based on optimized iterative spectral subtraction,” IEEE Trans. … Related articles All 2 versions

Extracting Social Semantics From Multimodal Meeting Content S Semantics – computer.org … The features are extracted from the raw audio-visual and motion data. For example, speech tone and speaking time are automatically determined using the Julius speech recognition engine (http://julius.sourceforge.jp/en). … Related articles

Trial realization of human-centered multimedia navigation for video retrieval M Haseyama, T Ogawa – International Journal of Human- …, 2013 – Taylor & Francis … the speech recognition results. Specifically, Julius (http://julius.sourceforge.jp/), a large vocabulary continuous speech recognition system, is utilized for speech recognition in the proposed system. The language model, acoustic … Cited by 1 Related articles All 5 versions

AXES at TRECVid 2013 RBN Aly, R Arandjelovic, K Chatfield, M Douze… – 2013 – eprints.eemcs.utwente.nl … The language model include online news and newswire articles as well as patents. The vocabulary uses the most frequent 130k words and provides multiple pronunciations. Decoding is performed by the Julius recognition engine (http://julius.sourceforge.jp/en_index. … Cited by 1 Related articles All 3 versions

A gesture-centric android system for multi-party human-robot interaction Y Kondo, K Takemura… – Journal of Human- …, 2013 – humanrobotinteraction.org … However, this system employs a word spotting approach with scored keywords. For example, in case of 4http://sourceforge.net/projects/opencvlibrary/ 5http://julius.sourceforge.jp/ 6http://developer.yahoo.co.jp/webapi/jlp/keyphrase/v1/extract.html … Cited by 6 Related articles All 5 versions

Effectiveness of Gaze-Based Engagement Estimation in Conversational Agents R Ishii, R Ooko, YI Nakano, T Nishida – Eye Gaze in Intelligent User …, 2013 – Springer … In: Graesser A, Gernsbacher M, Goldman S (eds) The handbook of discourse processes. Erlbaum, Hillsdale, pp 243–286. Footnotes. 1 julius-4.0.2. Available from http://julius.sourceforge.jp/forum/viewtopic.php?f=13&t=53. … Related articles All 3 versions

Confidence estimation and keyword extraction from speech recognition result based on Web information H Kensuke, S Hideki, K Tetsuya… – … Annual Summit and …, 2013 – ieeexplore.ieee.org … S., Isahara. H., ”Spontaneous speech corpus of Japanese” Proc. LREC2000, pp.947-952, 2000. [9] NTCIR9, http://research.nii.ac.jp/ntcir/ntcir-9/ [10] Julius, http://julius.sourceforge.jp/ [11] Mainichi newspaper data, http://www.nichigai.co.jp/sales/mainichi/mainichi-data.html Related articles All 2 versions

Evaluation Framework Design of Spoken Term Detection Study at the NTCIR-9 IR for Spoken Documents Task H Nishizaki, T Akiba, K Aikawa, T Kawahara… – Information and Media …, 2013 – jlc.jst.go.jp Page 1. Information and Media Technologies 8(1): 59-80 (2013) reprinted from: Journal of Natural Language Processing 19(4): 329-350 (2012) © The Association for Natural Language Processing 59 Evaluation Framework Design of Spoken Term Detection … Related articles All 2 versions

HARK version 1.2.0 Cookbook HG Okuno, K Nakadai, T Takahashi, R Takeda… – winnie.kuis.kyoto-u.ac.jp Page 1. HARK version 1.2.0 Cookbook Hiroshi G. Okuno Kazuhiro Nakadai Toru Takahashi Ryu Takeda Keisuke Nakamura Takeshi Mizumoto Takami Yoshida Angelica Lim Takuma Otsuka Kohei Nagira Tatsuhiko Itohara … Related articles All 3 versions

A-STAR: Toward translating Asian spoken languages S Sakti, M Paul, A Finch, S Sakai, TT Vu… – Computer Speech & …, 2013 – Elsevier This paper outlines the first Asian network-based speech-to-speech translation system developed by the Asian Speech Translation Advanced Research (A-STAR) consortium. … Cited by 1 Related articles All 4 versions

Children’s Turn-Taking Behavior Adaptation in Multi-Session Interactions with a Humanoid Robot I Kruijff-Korbayová, I Baroni, M Nalin, H Cuayáhuitl… – deib.polimi.it Page 1. International Journal of Humanoid Robotics © World Scientific Publishing Company Children’s Turn-Taking Behavior Adaptation in Multi-Session Interactions with a Humanoid Robot … Cited by 1 Related articles All 2 versions

Using Wizard of Oz to Collect Interaction Data for Voice Controlled Home Care and Communication Services S Schlogl, G Chollet, P Milhorat, J Deslis… – Proc. IASTED …, 2013 – vassist.cure.at … 14http://julius.sourceforge.jp/en_index.php … allowing for a better control over the recognition process and more flexibility). Similar to other HMM-based toolkits (e.g. HTK15) JULIUS requires an initial training period to be operational. … Cited by 1 Related articles All 3 versions

Getting Past the Language Gap: Innovations in Machine Translation R Delmonte – Mobile Speech and Advanced Natural Language …, 2013 – Springer … Cited by 1 Related articles All 3 versions

A speech retrieval system based on fuzzy logic and knowledge-base filtering M Singh, US Tiwary, TJ Siddiqui – … , Signal Processing and …, 2013 – ieeexplore.ieee.org Page 1. A Speech Retrieval System based on Fuzzy logic and Knowledge-base filtering Malay Singh, Uma Shanker Tiwary, Tanveer J. Siddiqui Abstract—The objective of this paper is two-fold. First, to adapt the earlier cognitive … Related articles

Temporally variable multi-aspect N-way morphing based on interference-free speech representations H Kawahara, M Morise, H Banno… – Signal and Information …, 2013 – ieeexplore.ieee.org Page 1. Temporally variable multi-aspect N-way morphing based on interference-free speech representations Hideki Kawahara, Masanori Morise, Hideki Banno, and Verena G. Skuk. Department of Design Information … Cited by 2 Related articles All 2 versions

Speech Recognition Software and Vidispine T Nilsson – 2013 – diva-portal.org Page 1. Speech Recognition Software and Vidispine Tobias Nilsson April 2, 2013 Master’s Thesis in Computing Science, 30 credits Supervisor at CS-UmU: Frank Drewes Examiner: Fredrik Georgsson Umeå University Department … Cited by 1 Related articles All 3 versions

Characterization and use of the PAC7001 camera with the posit algorithm on an AVR microcontroller Á González García – 2013 – academica-e.unavarra.es Page 1. ESCUELA TÉCNICA SUPERIOR DE INGENIEROS INDUSTRIALES Y DE TELECOMUNICACIÓN (School of Industrial and Telecommunication Engineering). Degree: Telecommunication Engineering. Project title: CHARACTERIZATION AND USE OF THE PAC7001 CAMERA …

Object recognition for service robots based on human description of object attributes H Fukuda, S Mori, K Sakata, Y Kobayashi… – IEEJ Transactions on Electronics, Information and Systems, 2013 – jlc.jst.go.jp Page 1. IEEJ Transactions on Electronics, Information and Systems Vol.133 No.1 pp.18–27 DOI: 10.1541/ieejeiss.133.18 Paper Object Recognition for Service Robots Based on Human Description of Object Attributes … Cited by 1 Related articles All 4 versions

Admissible Stopping in Viterbi Beam Search for Unit Selection Speech Synthesis S Sakai, T Kawahara – IEICE TRANSACTIONS on Information and …, 2013 – search.ieice.org Page 1. IEICE TRANS. INF. & SYST., VOL.E96–D, NO.6 JUNE 2013 1359 PAPER Admissible Stopping in Viterbi Beam Search for Unit Selection Speech Synthesis Shinsuke SAKAI †a) and Tatsuya KAWAHARA † , Members … Related articles All 4 versions

Gaze awareness in conversational agents: Estimating a user’s conversational engagement from eye gaze R Ishii, YI Nakano, T Nishida – ACM Transactions on Interactive …, 2013 – dl.acm.org Page 1. 11 Gaze Awareness in Conversational Agents: Estimating a User’s Conversational Engagement from Eye Gaze RYO ISHII, Kyoto University, Seikei University, and NTT Corporation YUKIKO I. NAKANO, Seikei University TOYOAKI NISHIDA, Kyoto University … Related articles

Agentslang: A fast and reliable platform for distributed interactive systems O Serban, A Pauchet – Intelligent Computer Communication …, 2013 – ieeexplore.ieee.org Page 1. AgentSlang: A Fast and Reliable Platform for Distributed Interactive Systems Ovidiu Serban and Alexandre Pauchet, LITIS Laboratory, INSA de Rouen, Avenue de l’Université – BP 8, 76801 Saint-Étienne-du-Rouvray … Cited by 2 Related articles

Multi-label automatic indexing of music by cascade classifiers W Jiang, ZW Ras – Web Intelligence and Agent Systems, 2013 – IOS Press Page 1. Web Intelligence and Agent Systems: An International Journal 11 (2013) 149–170 DOI 10.3233/WIA-130268 IOS Press Multi-label automatic indexing of music by cascade classifiers Wenxin Jiang and Zbigniew … Cited by 1 Related articles All 4 versions