SRILM (SRI Language Modeling Toolkit) 2011

SRILM – The SRI Language Modeling Toolkit

Notes:

SRILM (SRI Language Modeling Toolkit) is an open source, extensible, C++-based toolkit for language modeling. It is used to estimate n-gram language models (LMs), including local language models, and to build and interpolate them. SRILM has an API for computing word language model probabilities, and Disambig is one of its modules. Perplexity values can be computed with SRILM, and the package includes a standard script, compute-best-mix. LM weights computed with SRILM have been used when training language models. With an SRILM extension, maximum entropy language models with n-gram features can be estimated efficiently. SRILM can prune language models using an entropy criterion, even when the models are relatively small. N-gram models may be estimated with SRILM for all of the possible combinations considered in a study.

SRILM reads and writes the standard ARPA (Advanced Research Projects Agency) file format for n-gram models, so it can be used to build n-gram language models in ARPA format. Standard n-gram language models may be trained with SRILM using interpolated modified Kneser-Ney smoothing. SRILM can be used to build bigram language models from various corpora, such as the English Gigaword corpus. It has also been used on a monolingual training corpus of 48,000,000 sentences.
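To make the ARPA format and the perplexity values mentioned above more concrete, here is a minimal pure-Python sketch. It is not SRILM code and does not call SRILM. The tiny model in `ARPA_EXAMPLE`, its probabilities, and the function names `parse_arpa`, `logprob`, and `perplexity` are all made up for illustration. The sketch assumes log10 probabilities with optional back-off weights, as in the ARPA layout, and it ignores sentence-boundary tokens and unknown words, which a real toolkit would handle.

```python
# Hypothetical toy model in ARPA layout: log10 prob, n-gram, optional back-off weight.
ARPA_EXAMPLE = r"""\data\
ngram 1=3
ngram 2=1

\1-grams:
-0.5 a -0.3
-0.5 b
-1.0 </s>

\2-grams:
-0.2 a b

\end\
"""


def parse_arpa(text):
    """Return {ngram_tuple: (log10_prob, log10_backoff)} from ARPA-style text."""
    table = {}
    order = 0
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "\\data\\" or line.startswith("ngram "):
            continue  # skip blank lines and the header counts
        if line == "\\end\\":
            break
        if line.startswith("\\") and line.endswith("-grams:"):
            order = int(line[1:line.index("-")])  # e.g. "\2-grams:" -> 2
            continue
        fields = line.split()
        words = tuple(fields[1:1 + order])
        # The back-off weight is optional; treat a missing one as log10(1) = 0.
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else 0.0
        table[words] = (float(fields[0]), backoff)
    return table


def logprob(table, context, word):
    """log10 P(word | context), backing off to shorter contexts when needed."""
    ngram = context + (word,)
    if ngram in table:
        return table[ngram][0]
    if not context:
        raise KeyError(word)  # unknown word; not handled in this sketch
    backoff = table.get(context, (0.0, 0.0))[1]
    return backoff + logprob(table, context[1:], word)


def perplexity(table, words, order=2):
    """Perplexity of a token list, without sentence-boundary tokens."""
    total = 0.0
    for i, word in enumerate(words):
        context = tuple(words[max(0, i - order + 1):i])
        total += logprob(table, context, word)
    return 10 ** (-total / len(words))


model = parse_arpa(ARPA_EXAMPLE)
print(logprob(model, ("a",), "b"))   # bigram found directly
print(logprob(model, ("a",), "a"))   # back-off weight of "a" plus unigram "a"
print(perplexity(model, ["a", "b"]))
```

The second call shows the back-off path: the bigram "a a" is absent, so the score is the back-off weight stored with "a" plus the unigram log probability of "a".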

A bigram language model used in recognition systems was generated using SRILM with modified Kneser-Ney back-off discounting. Trigram LMs may be estimated using SRILM with the default Good-Turing discounting method. In one system, the language model is a capitalization-invariant trigram language model with Good-Turing discounting, acquired from the training corpus using SRILM. Modified Kneser-Ney (KN) models may be estimated on training set count files and applied to the test set using SRILM. A 4-gram target LM with unmodified Kneser-Ney back-off discounting was generated using SRILM. SRILM was used to train a 5-gram language model on the English sentences of the FBIS (Foreign Broadcast Information Service) corpus. A 5-gram language model generated by SRILM can be used in the cube-pruning process. VMM (variable memory modeling) may be implemented within SRILM and compared to the default n-gram models. SRILM can also be used to train a 7-gram model on a training set. As another example, SRILM may be used to estimate individual language models for truthful and deceptive opinions. Translation models and generation models may be trained by the Moses toolkit. IRSTLM is another, similar language modeling toolkit. N-gram language models may be scored using z-scores. For example, z-scores have been used to compare documents by examining how many standard deviations each n-gram differs from its mean occurrence in a large collection, or text corpus, of documents (which form the “background” vector).
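The z-score comparison described above can be sketched as follows. This is an illustrative assumption, not a method taken from SRILM or from any of the cited papers: the function names `bigrams` and `ngram_zscores` are hypothetical, raw bigram counts are used rather than relative frequencies, and n-grams whose background standard deviation is zero are skipped.

```python
from collections import Counter
from statistics import mean, pstdev


def bigrams(tokens):
    """List of adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))


def ngram_zscores(doc_tokens, background_docs):
    """z-score of each bigram count in a document against a background collection."""
    background_counts = [Counter(bigrams(doc)) for doc in background_docs]
    doc_counts = Counter(bigrams(doc_tokens))
    scores = {}
    for gram, count in doc_counts.items():
        # Count of this bigram in each background document (0 if absent).
        values = [counts[gram] for counts in background_counts]
        mu = mean(values)
        sigma = pstdev(values)
        if sigma > 0:  # skip n-grams with no variation in the background
            scores[gram] = (count - mu) / sigma
    return scores


# Toy background collection and document, made up for illustration.
background = [["a", "b", "a", "b"], ["a", "b"], ["b", "a"]]
document = ["a", "b", "a", "b", "a", "b"]
scores = ngram_zscores(document, background)
print(scores[("a", "b")])
```

Here the bigram ("a", "b") occurs 3 times in the document, while its background counts are 2, 1, and 0, so its score is the distance from the background mean of 1 in units of the background standard deviation.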

References:

Spoken Language Understanding: Systems for Extracting Semantic Information from Speech (2011)

See also:

IRSTLM (IRST Language Modeling) Toolkit


Relevance language modeling for speech recognition [PDF] from 140.124.72.88 KY Chen… – Acoustics, Speech and Signal Processing …, 2011 – ieeexplore.ieee.org … used in this paper was estimated from a background text corpus consisting of 170 million Chinese characters collected from Central News Agency (CNA) in 2001 and 2002 (the Chinese Gigaword Corpus released by LDC) using the SRI Language Modeling Toolkit (SRILM) [17]. … Cited by 2 – Related articles – All 4 versions

[PDF] The Kaldi speech recognition toolkit [PDF] from idiap.ch D Povey, A Ghoshal, G Boulianne… – ASRU, Big Island, …, 2011 – publications.idiap.ch … 3Available from: http://hlt.fbk.eu/en/irstlm 4Available from: http://www.speech.sri.com/projects/ srilm/ … VIII. DECODERS We have several decoders, from simple to highly optimized; more will be added to handle things like on-the-fly language model rescoring and lattice generation. … Cited by 4 – View as HTML

Gender-dependent acoustic models fusion developed for automatic subtitling of parliament meetings broadcasted by the Czech TV [PDF] from cvut.cz J Vanek… – Text, Speech and Dialogue, 2011 – Springer … Good-Turing discounting. The language model was trained on about 10M tokens of normalized Czech Parliament transcriptions. The SRI Language Modeling Toolkit (SRILM) [11] was used for training. The model contains 186k … Cited by 4 – Related articles – All 5 versions

[PDF] The Karlsruhe Institute of Technology translation systems for the WMT 2011 [PDF] from aclweb.org T Herrmann, M Mediani, J Niehues… – Proceedings of the Sixth …, 2011 – aclweb.org … It is a 4-gram SRI language model using Kneser-Ney smoothing. … Probabilistic Part-of-Speech Tag- ging Using Decision Trees. In International Con- ference on New Methods in Language Processing, Manchester, UK. … 2002. SRILM – An Extensible Lan- guage Modeling Toolkit. … Cited by 3 – Related articles – View as HTML – All 9 versions

[PDF] Detecting Structural Irregularity in Electronic Dictionaries Using Language Modeling [PDF] from trojina.si P Rodrigues, D Zajic, D Doermann… – Proceedings of …, 2011 – trojina.si … For our experiments, we used the SRI Language Modeling Toolkit (SRILM) (Stolcke, 2002). … SRILM reads and writes to a standard ARPA (Advanced Research Projects Agency) file format for n-gram models. There are other language modeling toolkits available. … Cited by 1 – View as HTML

English-Latvian SMT: the challenge of translating into a free word order language [PDF] from upc.edu M Khalilov, JA Rodríguez Fonollosa, I Skadina… – 2011 – upcommons.upc.edu … A 4-gram target LM with unmodified Kneser-Ney backoff discounting was generated using the SRI Language Modeling Toolkit [20] and was used in all the experiments. … [20] A. Stolcke, “SRILM: an extensible language modeling toolkit,” in Proceedings of the International … Cited by 2 – Related articles – All 10 versions

Automatic topic identification for large scale language modeling data filtering [PDF] from zcu.cz L Skorkovská, P Ircing, A Pražák… – Text, Speech and Dialogue, 2011 – Springer … is less effective. All the language models described in the following paragraphs are trigram LMs estimated using the SRI Language Modeling Toolkit (SRILM) [6] employing the default Good-Turing discounting method. The re … Cited by 1 – Related articles – All 5 versions

Slovak Language Model from Internet Text Data J Staš, D Hládek, M Pleva… – … Autonomous, Adaptive, and …, 2011 – books.google.com … models (smoothed, pruned, interpolated, etc.) in the standard ARPA format [2] are generated with the vocabulary by using SRI Language Modeling Toolkit [6]. Evaluation … 73-77 (2008) ISBN 978-80-553-0066-5 6. Stolcke, A.: SRILM – An Extensible Language Modeling Toolkit. … Cited by 2 – Related articles

Slovak language model from internet text data J Staš, D Hládek, M Pleva… – Toward Autonomous, Adaptive, and …, 2011 – Springer … language models (smoothed, pruned, interpolated, etc.) in the standard ARPA format [2] are generated with the vocabulary by using SRI Language Modeling Toolkit [6]. … 73-77 (2008) ISBN 978-80-553-0066-5 6. Stolcke, A.: SRILM – An Extensible Language Modeling Toolkit. … Cited by 1 – Related articles – All 2 versions

[PDF] System Description of BJTU-NLP SMT for NTCIR-9 PatentMT [PDF] from nii.ac.jp J Jiang, J Xu, Y Lin… – Proceedings of NTCIR, 2011 – research.nii.ac.jp … Therefore, attention should first be given on phrase-based translation model. 5 http://code.google. com/p/giza-pp/ 6 http://www.speech.sri.com/projects/srilm Input Output … 6. REFERENCES [1] Andreas Stolcke. 2002. SRILM-an extensible language modeling toolkit. … Cited by 1 – View as HTML

[PDF] UPM system for the translation task [PDF] from tecnologiasaccesibles.com V López-Ludeña… – Corpus, 2011 – tecnologiasaccesibles.com … This program is a beam search decoder for phrase-based statistical machine translation models. In order to obtain a 3-gram language model, the SRI language modeling toolkit has been used (Stolcke, 2002). … “SRILM – An Extensible Language Modelling Toolkit”. … Cited by 1 – Related articles – View as HTML – All 10 versions

Comparative Analysis of Tools Available for Developing Statistical Approach Based Machine Translation System A Kumar… – Information Systems for Indian Languages, 2011 – Springer … 126-129. Association for Computational Linguistics (2006) [P7] Stolcke, A.: Srilm -An Extensible Language Modeling Toolkit, Speech Technology and Research Laboratory SRI International, Menlo Park, CA, USA (2002), http://www-speech.sri.com/papers … Related articles – All 2 versions

Discriminative language modeling for speech recognition with relevance information B Chen… – Multimedia and Expo (ICME), 2011 IEEE …, 2011 – ieeexplore.ieee.org … LVCSR system was estimated from a background text corpus consisting of 170 million Chinese characters collected from Central News Agency (CNA) in 2001 and 2002 (the Chinese Gigaword Corpus released by LDC) using the SRI Language Modeling Toolkit (SRILM) [11]. … Cited by 1 – Related articles

[CITATION] Unknown Words Modelling in Training and Using Language Models for Russian LVCSR System. M Korenevsky, A Bulusheva… Related articles

[PDF] SRILM at Sixteen: Update and Outlook [PDF] from sri.com A Stolcke, J Zheng, W Wang… – www-speech.sri.com … M. Kurimo, “Efficient estimation of maximum entropy language models with N-gram features: An SRILM extension”, in … ing agreement for SRI’s language modeling toolkit”, http://www.systransoft. com/systran/news-and-events/press-release/sri- international-and-systran-announce … Related articles – View as HTML – All 5 versions

[PDF] ATT-0: Submission to Generation Challenges 2011 Surface Realization Shared Task [PDF] from aclweb.org A Stent – … at the 13th European Workshop on Natural Language … – aclweb.org … The language model is a capitalization-invariant tri- gram language model with Good-Turing discount- ing acquired from the training corpus using the SRI language modeling toolkit (Stolcke, 2002). … 2002. SRILM – an extensible language modeling toolkit. … View as HTML

[PDF] The DCU Multi-Engine MT System for CWMT’2011 [PDF] from dcu.ie X Wu, J Li, J Jiang, Y He… – nclt.computing.dcu.ie … A 5-gram language model generated by the SRI Language Modeling toolkit (SRILM) [Stolcke, 2002] is used in the cube-pruning process. The search space is pruned with a chart cell size limit of 50. … SRILM – An Extensible Language Modeling Toolkit. … View as HTML

Empirical evaluation and combination of advanced language modeling techniques T Mikolov, A Deoras, S Kombrink… – … Annual Conference of …, 2011 – isca-speech.org … [12] Andreas Stolcke. SRILM – language modeling toolkit. http://www.speech.sri.com/projects/ srilm/ [13] Tanel Alumäe and Mikko Kurimo. Efficient Estimation of Maxi- mum Entropy Language Models with N-gram features: an SRILM extension , In: Proc. INTERSPEECH 2010. …

[PDF] Improvement of Translation Quality by Application of Translation Rules Based on Generalization of Bilingual Sentences to Phrase-Based SMT [PDF] from hokudai.ac.jp R Terashima, H Echizen-ya… – arakilab.media.eng.hokudai.ac.jp … our method is that system using it improves the translation quality by using acquired translation rules without any analytical tools in the … system At first, we construct the phrase-based SMT system based on Moses[1], GIZA++[2], and SRILM(SRI Language Modeling Toolkit)[3]. Next … Related articles – View as HTML

[PDF] CONTEXT SPECIFIC LANGUAGE MODELING FOR SPOKEN ROUTE DIRECTIONS [PDF] from cmu.edu A Pappu… – tts.speech.cs.cmu.edu … We ran different experiments with language modeling using SRI-LM[10] toolkit on both the GDir corpus and the external cor- pora. 4.1. … IEEE, 2006, vol. 1. [10] A. Stolcke, “Srilm-an extensible language modeling toolkit,” in Proceedings of the ICSLP, 2002, vol. … Related articles – View as HTML

[PDF] Documentation, Code & Data for Incremental Syntactic Language Models for Phrase-based Translation [PDF] from umn.edu L Schwartz, C Callison-Burch, W Schuler, S Wu… – www-users.cs.umn.edu … cate our results: • SRI Language Modeling Toolkit (Stolcke, 2002) … Andreas Stolcke. 2002. SRILM – an extensible lan- guage modeling toolkit. In Proceedings of the In- ternational Conference on Spoken Language Pro- cessing, September. Related articles – View as HTML

[PDF] Data Mining on Chinese Duilian [PDF] from ust.hk LU Zhongqi – 2011 – ihome.ust.hk … 1073445.1073462 [4] The Moses (2010) – A Statistical Machine Translation System Available: http://www.statmt.org/moses/ [5] SRILM (2009) – The SRI Language Modeling Toolkit Available: http://www.speech.sri.com/projects/srilm/ [6 … Related articles – View as HTML

[PDF] Moses on Windows 7 [PDF] from washington.edu A Axelrod – ssli.ee.washington.edu … SRILM can be downloaded here: http://www-speech.sri.com/projects/srilm/download.html Further instructions for running SRILM (or any … http://statmt.org/wmt11/baseline.html for a step-by-step guide with examples for preparing data, building a language model, training a … Related articles – View as HTML

[PDF] A scalable probabilistic classifier for language modeling [PDF] from aclweb.org J Lang – … for Computational Linguistics: Human Language …, 2011 – aclweb.org … All experiments were conducted using the SRI Lan- guage Modeling Toolkit (SRILM, Stolcke (2002)), ie we implemented5 the VMM within SRILM and compared to default N-Gram models supplied with SRILM. … SRILM – An Extensible Language Modeling Toolkit. … Related articles – View as HTML – All 8 versions

Speaker-clustered acoustic models evaluated on GPU for on-line subtitling of parliament meetings J Psutka, J Vanek… – Text, Speech and Dialogue, 2011 – Springer … Stolcke, A.: SRILM – An Extensible Language Modeling Toolkit. In: International Confer- ence on Spoken Language Processing (ICSLP 2002), Denver, USA (2002) 12. … Stolcke, A., et al.: The SRI March 2000 Hub-5 Conversational Speech Transcription System. In: Proc. … Related articles – All 3 versions

[PDF] Context-aware Language Modeling for Conversational Speech Translation [PDF] from cmu.edu A Saluja, I Lane… – ece.cmu.edu … SRI-LM was also used for perplexity mea- surements, evaluated on the English LMs. … 1996. Modeling long range de- pendencies in languages: Topic mixtures vs. dynamic cache models. In International Conference on Spoken Language Processing, pages 236-239. … Related articles – View as HTML – All 5 versions

Fast phonetic/lexical searching in the archives of the Czech holocaust testimonies: advancing towards the MALACH project visions [PDF] from zcu.cz J Psutka, J Švec, J Psutka, J Vanek… – Text, Speech and …, 2011 – Springer … The resulting trigram language model with modified Kneser-Ney smoothing contains 252k words (308k phonetical variants). Language models were estimated using the SRI Language Modeling Toolkit (SRILM) [10]. 3.4 Word and Phoneme Lattices … Related articles – All 5 versions

Parallelizing a machine translation decoder for multicore computer L Chen, W Huo, H Mi, Z Zhang… – … (ICNC), 2011 Seventh …, 2011 – ieeexplore.ieee.org … Chiero integrates the open source SRI Language Modeling Toolkit named SRILM. … The remaining cache misses occurred mainly within functions such as LHash::locate() in SRILM, strcmp() in glibc and … [5] A. Stolcke, SRILM – an extensible language modeling toolkit, in: In Pro … Related articles

Applying Grapheme, Word, and Syllable Information for Language Identification in Code Switching Sentences YL Yeong… – Asian Language Processing (IALP), 2011 …, 2011 – ieeexplore.ieee.org … Language Engineering, Singapore, 2009. [7] Documentation of “The SRI Language Modeling Toolkit”, http://www-speech.sri.com/projects/srilm/manpages/ngram- discount.7.html, accessed on 18 March 2011. [8] Tien-Ping Tan …

[PDF] MODELING GENDER DEPENDENCY IN THE SUBSPACE GMM FRAMEWORK [PDF] from google.com NT Vu, T Schultz… – sites.google.com … Page 4. with relatively small language model, we used the SRI lan- guage model toolkit to prune all the language models using an entropy criterion [10]. … [10] A. Stolcke, “SRILM-an extensible language modeling toolkit,” in Proceedings of the international conference on … Related articles – View as HTML

[PDF] Multi Domain Language Model Adaptation using Explicit Semantic Analysis [PDF] from quaero.org K Kilgour, FKS Stüker… – quaero.org … The models are built and interpolated using the SRI Language Modelling Toolkit [14]. … 24, no. 4, pp. 35-43, 2001. [14] A. Stolcke, “SRILM-an extensible language modeling toolkit,” in Seventh International Conference on Spoken Language Pro- cessing. ISCA, 2002. … Related articles – View as HTML

Word prediction for a real-time reader device for blind people [PDF] from upc.edu P Palou Llobera – 2011 – upcommons.upc.edu … 6 2.2. Statistical Language Modeling….. 7 2.2.1. Language models ….. 7 2.2.1.1. … 15 2.3.1. SRILM Toolkit ….. … Related articles – All 6 versions

[PDF] HPB SMT of FRDC Assisted by Paraphrasing for the NTCIR-9 PatentMT [PDF] from nii.ac.jp Z Zheng, N Ge, Y Meng… – research.nii.ac.jp … For the language model, we used the SRI Language Mod- eling Toolkit (SRILM) [1] to train 4-gram language models on the target portion of each training set. … Page 5. 5. REFERENCES [1] Andreas Stolcke. SRIM – An Extensible Language Modeling Toolkit. … View as HTML

[PDF] Factored Translation Models for improving a Speech into Sign Language Translation System [PDF] from uiuc.edu V López-Ludeña, R San-Segundo… – proceedings of the …, 2011 – mickey.ifp.uiuc.edu … This program is a beam search decoder for phrase-based statistical machine translation models. In order to obtain a 3- gram language model, the SRI language modeling toolkit has been used [18]. … “SRILM – An Extensible Language Modelling Toolkit”. Proc. Intl. … Related articles – All 4 versions

[PDF] Language Modeling [PDF] from mt-archive.info K Heafield – 2011 – mt-archive.info … Empirical Use part of the corpus to estimate the probability. Andreas Stolcke, SRILM’s lead author, recommends empirical. … Bigger models Conserve memory SRI doesn’t compile Distribute and compile with decoders … Example Language Model … View as HTML

[PDF] Part-of-Speech Tagging for Under-Resourced and Morphologically Rich Languages-The Case of Amharic [PDF] from aflat.org MY Tachbelie, ST Abate… – aflat.org … Since their concern was on the accuracy of the tag- gers, they used SVM-based taggers to tag their text for language modeling experiment. … Disambig is a module in SRI Language Mod- eling toolkit (SRILM) (Stolcke, 2010). … Related articles – View as HTML – All 3 versions

ENGLISH TO HINDI STATISTICAL MACHINE TRANSLATION SYSTEM [PDF] from thapar.edu N Sharma, P Bhatia… – 2011 – dspace.thapar.edu … UNIX software tools designed to facilitate Language Modeling work for research purposes. It was written by Roni Rosenfeld, and released in 1994 [26]. SRILM SRILM is a toolkit for building and applying statistical Language Models (LMs) developed by SRI Speech Technology … Related articles

[PDF] RASR-The RWTH Aachen University Open Source Speech Recognition Toolkit [PDF] from rwth-aachen.de D Rybach, S Hahn, P Lehnen, D Nolden… – www-i6.informatik.rwth-aachen.de … However, the decoder supports N-gram lan- guage models in the ARPA format, produced eg by the SRI Language Modeling Toolkit [25]. … 105-108. [25] A. Stolcke, “SRILM – an extensible language modeling toolkit,” in ICSLP, Denver, CA, USA, Sep. 2002. … View as HTML

[PDF] The 2011 KIT QUAERO Speech-to-Text System for Russian [PDF] from quaero.org Y Titov, K Kilgour, S Stüker… – quaero.org … This was done using the SRI Language Modelling Toolkit [25]. … [25] A. Stolcke, “Srilm – an extensible language modeling toolkit,” in ICSLP, 2002. [26] Q. Jin and T. Schultz, “Speaker segmentation and clustering in meetings,” in ICSLP, 2004. … Related articles – View as HTML

[PDF] Forecasting Conflicts using N-gram Models [PDF] from umass.edu A Bakhtiari, C Besse… – cs.umass.edu … (b) F1-measure Figure 1: Results for the top 11 actors with a 28 day forecast period. Using the event datasets, N-gram models were estimated for all of the possible combinations de- scribed above using the SRI Language Modeling toolkit (SRILM) [10]. … View as HTML

[PDF] LIUM’s Statistical Machine Translation System for the NTCIR Chinese/English PatentMT [PDF] from nii.ac.jp H Schwenk, S Abdul-Rauf… – research.nii.ac.jp … by keeping all observed n-grams, ie using a cut-off value of 1. The perplexity on the development data of this huge LM is 79.5, and the file occupies 40 GBytes on the disk in the binary representation of the SRI toolkit. … SRILM – an extensible language modeling toolkit. … View as HTML

[PDF] The ICT’s Patent MT System Description for NTCIR-9 [PDF] from nii.ac.jp H Xiong, L Song, F Meng, Y Lü… – Proceedings of NTCIR, 2011 – research.nii.ac.jp … 3.4 Experiments of Language Model We use the SRI Language Modeling Toolkit [11] to train the Japenese/English 5-gram, 6-gram, 7-gram language model with Kneser-Ney smoothing on the Japenese/English side of … Srilm – an extensible language modeling toolkit. … Cited by 1 – View as HTML

[PDF] Half-Context Language Models [PDF] from aclweb.org M Walsh – Computational Linguistics, 2011 – aclweb.org … test sets. A modified KN model (Chen and Goodman 1998), termed P(KN), was estimated on the training set count files and applied to the test set using srilm, the SRI language modeling toolkit (Stolcke 2002). The same count … View as HTML

Half-context language models H Schütze… – Computational Linguistics, 2011 – MIT Press … test sets. A modified KN model (Chen and Goodman 1998), termed P(KN), was estimated on the training set count files and applied to the test set using srilm, the SRI language modeling toolkit (Stolcke 2002). The same count … Related articles

Quaero Speech-to-Text and Text Translation Evaluation Systems [PDF] from quaero.org S Stüker, K Kilgour… – High Performance Computing in Science …, 2011 – Springer … For each of the text sources in Table 1 we built, using the SRI Language Modeling Toolkit [23], a modified Kneser-Ney smoothed 4 gram language model. … 23. A. Stolcke. SRILM – An Extensible Language Modeling Toolkit. … Cited by 3 – Related articles – All 2 versions

[PDF] Reordering with source language collocations [PDF] from ffzg.hr Z Liu, H Wang, H Wu, T Liu… – … Language Technologies-Volume 1, 2011 – hnk.ffzg.hr … 4.2 Settings We use the FBIS corpus (LDC2003E14) to train a Chinese-to-English phrase-based translation model. And the SRI language modeling toolkit (Stolcke, 2002) is used to train a 5-gram language model on the English sentences of FBIS corpus. … Related articles – View as HTML – All 9 versions

[PDF] Domain adaptation via pseudo in-domain data selection [PDF] from aclweb.org A Axelrod, X He… – Proc. of EMNLP, 2011 – aclweb.org … We used the SRI Language Model- ing Toolkit (Stolcke, 2002) was used for LM train- ing in all cases: corpus selection, MT tuning, and decoding. … Andreas Stolcke. 2002. SRILM – An Extensible Lan- guage Modeling Toolkit. Spoken Language Process- ing. … Cited by 6 – Related articles – View as HTML – All 7 versions

A Rule-Based Source-Side Reordering on Phrase Structure Subtrees F Liang, L Chen, N Miao Li – … on Asian Language …, 2011 – doi.ieeecomputersociety.org … All 3-gram language models with Kneser-Ney smoothing are built by the SRI language modeling toolkit [14]. And the translation model and generation model are trained by the Moses toolkit. … [14] A. Stolcke, “SRILM – an extensible language modeling toolkit,” in Proc. …

A Simplified-Traditional Chinese Character Conversion Model Based on Log-Linear Models Y Chen, X Shi… – … Language Processing (IALP), 2011 …, 2011 – ieeexplore.ieee.org … Language model was trained using SRI Language Modeling Toolkit [14] with modified Kneser-Ney smoothing [15]. We carry out two experiments. … 29, No. 1, pp. 19-51, 2003. [14] Andreas Stolcke. “Srilm – An Extensible Language Modeling Toolkit”. …

A Spoken Document Retrieval System for TV Broadcast News in Spanish and Basque [PDF] from ujaen.es A Varona, S Nieto, LJ Rodriguez-Fuentes… – … de Lenguaje Natural, 2011 – sinai.ujaen.es … Again text processing tools (lemmatization) are applied to represent the query q in the same way as segments dj, ie as a feature vector. … The SRI Language Modeling Toolkit SRILM (Stolcke, 2002) was used to estima- te n-grams Language Models (LM). … Related articles – All 3 versions

A spoken document retrieval system for TV broadcast news in Spanish and Basque [PDF] from ua.es A Varona Fernández, S Nieto Nieto… – 2011 – rua.ua.es … Again text processing tools (lemmatization) are applied to represent the query q in the same way as segments dj, ie as a feature vector. … The SRI Language Modeling Toolkit SRILM (Stolcke, 2002) was used to estima- te n-grams Language Models (LM). … Related articles

A Guide to Jane, an Open Source Hierarchical Translation Toolkit [PDF] from dfki.de D Stein, D Vilar, S Peitz, M Freitag… – The Prague Bulletin of …, 2011 – Versita … Jane supports four formats for n-gram language models: the ARPA format, the SRI toolkit binary format (Stolcke, 2002), randomized LMs as described in Talbot and Osborne (2007), using the open source … SRILM – an Extensible Language Modeling Toolkit. In Proc. … Cited by 1 – Related articles – All 6 versions

[PDF] English to Sinhala Machine Translation: Towards Better information access for Sri Lankans [PDF] from hltd.org J Liyanapathirana… – hltd.org … 6.2.1 Language Model SRILM toolkit with its language model specific tools was used for this … the Language Technology Research Laboratory of University of Colombo School of Computing, Sri Lanka for pro … An empirical study of smoothing techniques for language modelling. … Related articles – View as HTML – All 2 versions

[PDF] Source Language Categorization for improving a Speech into Sign Lan-guage Translation System [PDF] from aclweb.org V López-Ludeña, R San-Segundo, S Lutfi… – aclweb.org … Not only the data but also new practice (Forster et al., 2010) and new uses of traditional annotation tools (Cras- born et al., 2010) have been developed. … In order to obtain a 3- gram language model, the SRI language modeling toolkit has been used (Stolcke, 2002). … Related articles – View as HTML – All 6 versions

[HTML] Recognizing Temporal Information in Korean Clinical Narratives through Text Normalization [HTML] from nih.gov Y Kim… – Healthcare informatics research, 2011 – ncbi.nlm.nih.gov … We used the SRILM (SRI Language Model Tookit) to build the bigram language model from the corpus [10]. … Available from: http://tokuteicorpus.jp/result/pdf/2006_006.pdf. 10. Stolcke A. SRILM-an extensible language modeling toolkit; Proceedings of 7th International …

Lexical word similarity for re-ranking in Vietnamese-English named entity back transliteration THD Le… – … Conference on Asian Language …, 2011 – doi.ieeecomputersociety.org … This toolkit include EM alignment and Moses decoder. As there’s no reordering during transliteration, the parameter distortion limit is set to 0, eg there’s no reorder- ing required. For the language model, the SRI Language Modeling Toolkit (SRILM) 4 is used. …

The IBM 2009 GALE Arabic speech transcription system [PDF] from 140.124.72.88 B Kingsbury, H Soltau, G Saon… – … , Speech and Signal …, 2011 – ieeexplore.ieee.org … [15] A. Stolcke et al., “The SRI March 2000 Hub-5 conversational speech transcription system,” in Proc. NIST Speech Transcrip- tion Workshop, 2000. [16] A. Stolcke, “SRILM – an extensible language modeling toolkit,” in Proc. ICSLP, 2002, pp. 901-904. … Cited by 1 – Related articles – All 6 versions

[PDF] I3A Language Recognition System for Albayzin 2010 LRE [PDF] from uiuc.edu D Martinez, J Villalba, A Miguel, A Ortega… – 2011 – mickey.ifp.uiuc.edu … [18] A. Stolcke, “SRILM – An Extensible Language Modeling Toolkit”, in Proc. ICSLP, pp. 901-904, 2002. http://www.speech.sri.com/projects/srilm. [19] MA Zissman, “Comparison of Four Approaches to Auto- matic Language Identification of Telepone Speech”, IEEE Trans. … Cited by 1 – Related articles

Authorship attribution of web forum posts SR Pillay… – eCrime Researchers Summit (eCrime), …, 2011 – ieeexplore.ieee.org … Online tools like emails, news groups, blogs, and web forums provide an effective communication platform for millions of users around … The SRI Language Modeling toolkit was used both for training the language models and computing the perplexity values on the test data [14]. … Cited by 5 – Related articles

[PDF] SMT Systems in the University of Tokyo for NTCIR-9 PatentMT [PDF] from nii.ac.jp X Wu, T Matsuzaki… – Proc. NTCIR, 2011 – research.nii.ac.jp … This makes our system easily applicable to any language pairs only if the parallel training corpora are given beforehand … http://ntcir.nii.ac.jp/PatentMT/baselineSystems 4 http://giza-pp.googlecode. com/files/giza-pp-v1.0.3.tar.gz 5 http://www.speech.sri.com/projects/srilm/ 6 http … Cited by 2 – View as HTML

[PDF] Automatic Translation of WordNet Glosses [PDF] from uni-hamburg.de G Rigau – nats-www.informatik.uni-hamburg.de … See (Brown et al., 1993) for a detailed report on the mathematics of Machine Translation. 3 System Description Fortunately, we can count on a number of freely available tools to build a SMT system. We utilized the SRI Language Modeling Toolkit (SRILM) (Stolcke, 2002). … Related articles – View as HTML – All 9 versions

Optimal Translation Boundaries for BTG-Based Decoding X Duan… – … Conference on Asian Language …, 2011 – doi.ieeecomputersociety.org … final”. We use the SRI Language Modeling Toolkit [10] to train a 4-gram language model on Xinhua portion of the English Gigaword3 corpus. … 2001. [10] A. Stolcke. 2002. SRILM – an Extensible Language Modeling Toolkit. In …

Joint reranking of parsing and word recognition with automatic segmentation JG Kahn… – Computer Speech & Language, 2011 – Elsevier … here builds on several active research areas in speech and natural language processing, which … Charniak and Johnson (2001) demonstrated the usefulness of explicit edit modeling in parsing … scoring methods to address this problem with Sparseval, a parse evaluation toolkit. … Related articles – All 3 versions

Free tools and resources for Brazilian Portuguese speech recognition N Neto, C Patrick, A Klautau… – Journal of the Brazilian …, 2011 – Springer … model. 2.3 Used tools The HTK software [37] was used to build and adapt the acoustic models presented in this work. The SRI Language Modeling Toolkit (SRILM) was used to build the n-gram ARPA format language models. … Related articles – All 3 versions

[PDF] Preliminary Experiments on Using Users’ Post-Editions to Enhance a SMT System [PDF] from mt-archive.info M Potet, E Esperança-Rodier, H Blanchon… – 2011 – mt-archive.info … research. Although the fully automatic approach has shown some effectiveness, the tools are intrinsi- Mikel … sentences. 162 Page 3. the SRI Language Modeling toolkit (Stolcke, 2002) on a monolingual training corpus of 48M sentences. The … Related articles – View as HTML

[PDF] Using Language Models and Latent Semantic Analysis to Characterise the N400m Neural Response [PDF] from aclweb.org M Parviz, M Johnson, B Johnson… – … the Australasian Language …, 2011 – aclweb.org … Page 2. parser and designed to be useful in psycholinguistic modeling. … This leads us to experiment with lan- guage models trained on larger corpora. Using the SRI-LM toolkit (Stolcke, 2002) we construct a 4-gram language model based on the Gi- gaword corpus (Graff et al … View as HTML

[PDF] Statistical Machine Translation with Local Language Models [PDF] from aclweb.org C Monz – … Conference on Empirical Methods in Natural Language …, 2011 – aclweb.org … To build the local language models, we use the SRILM toolkit (Stolcke, 2002), which is … Integrating our local language modeling approach with a decoder is straightforward. Our baseline decoder already uses SRILM’s API for computing word language model probabilities. … Related articles – View as HTML – All 8 versions

[PDF] System for Fast Lexical and Phonetic Spoken Term Detection in a Czech Cultural Heritage Archive [PDF] from zcu.cz J Psutka, J Švec, JV Psutka, J Vanek, A Prazák… – kky.zcu.cz … bigrams and 1.3M trigrams. Language models were estimated using the SRI Language Modeling Toolkit (SRILM) [13] employing the modified Kneser-Ney smoothing method [14]. Speech Recognition – Generation of Word and Phoneme Lattices … Related articles – View as HTML – All 2 versions

[PDF] An efficient indexer for large N-gram corpora [PDF] from ffzg.hr H Ceylan… – … Computational Linguistics: Human Language …, 2011 – hnk.ffzg.hr … of the most popular toolkits that are also freely available are the CMU Statistical Language Modeling (SLM) Toolkit (Clarkson and Rosenfeld, 1997), and the SRI Language … Language Resources and Evaluation, 43:139-159 … SRILM – an extensible language modeling toolkit … Related articles – View as HTML – All 8 versions

Word-Order Issues in English-to-Urdu Statistical Machine Translation [PDF] from mt-archive.info B Jawaid, D Zeman – The Prague Bulletin of Mathematical Linguistics, 2011 – Versita … 2http://fjoch.com/GIZA++.html 3http://www-speech.sri.com/projects/srilm/ 4http://www.lancs.ac.uk/fass/projects/corpus/emille/MANUAL.htm … Urdu Language Processing (CRULP) under the Creative Commons License. … Related articles – All 2 versions

[PDF] Noisy SMS machine translation in low-density languages [PDF] from aclweb.org V Eidelman, K Hollingshead… – Proceedings of the Sixth …, 2011 – aclweb.org … We trained a 5-gram language model using the SRI language modeling toolkit (Stolcke, 2002) from the English monolingual News Commentary and News Crawl language modeling training data provided for the … SRILM – an extensible language modeling toolkit. … Cited by 1 – Related articles – View as HTML – All 8 versions

[PDF] Multi-granularity Word Alignment and Decoding for Agglutinative Language Translation [PDF] from mt-archive.info Z Wang, Y Lu… – mt-archive.info … For the language model, we use the SRI Language Modeling Toolkit (Stolcke, 2002) to train a 4-gram model with the target side of training corpus. … Andreas Stolcke. 2002. Srilm – an extensible language modeling toolkit. In ICSLP 2002, pages 311-318. … Related articles – View as HTML – All 3 versions

[PDF] Introduction to Machine Translation [PDF] from videolectures.net JA Sánchez – videolectures.net … Maximum likelihood • Maximum entropy. Smoothing. Extensions: cache, triggers, categories, etc. Widely used toolkits for n-grams: • SRILM – The SRI Language Modeling Toolkit http://www.speech.sri.com/projects/srilm/ • The CMU Statistical Language Modeling (SLM) Toolkit … Related articles – View as HTML

[PDF] Survey on Speech, Machine Translation and Gestures in Ambient Assisted Living [PDF] from uni-sb.de D Anastasiou – ai.cs.uni-sb.de … 1 We present only some and not all of the existing tools. 2 http://research.microsoft.com/en-us/projects/language-modeling/default.aspx 3 http://www.speech.sri.com/projects/srilm/ … it was originally developed as a collaborative project of Language Technology lab … Related articles – View as HTML – All 2 versions

[PDF] Lexical-based Reordering Model for Hierarchical Phrase-based Machine Translation [PDF] from mt-archive.info Z Zheng, Y Meng… – mt-archive.info … final” method (Koehn, 2003). For the language model, we used the SRI Language Modeling Toolkit (SRILM) (Stolcke, 2002) to train a 4-gram model on the target portion of the training set. We used the minimum … Related articles – View as HTML

UT-Scope: Towards LVCSR under Lombard effect induced by varying types and levels of noisy background [PDF] from nthu.edu.tw H Boril… – Acoustics, Speech and Signal …, 2011 – ieeexplore.ieee.org … A triphone recognizer combining Hidden Markov Model Toolkit (HTK) based acoustic modeling and trigram language model (LM) implemented with the SRI Language Modeling Toolkit (SRILM) is trained on the TIMIT database (16kHz) [11]. … Related articles – All 11 versions

[PDF] Joint WMT submission of the QUAERO project [PDF] from statmt.org M Freitag, G Leusch, J Wuebker, S Peitz… – Proceedings of the …, 2011 – statmt.org … was employed to train word alignments, language models have been created with the SRILM toolkit (Stolcke, 2002 … We use two 4-gram SRI language models, one trained on the News Shuffle corpus and one … An empirical study of smoothing techniques for language modeling. … Cited by 1 – Related articles – View as HTML – All 11 versions

[PDF] CMU Syntax-Based Machine Translation at WMT 2011 [PDF] from aclweb.org G Hanneman… – Proceedings of the Sixth Workshop on …, 2011 – aclweb.org … We built a 5-gram language model from it with the SRI language modeling toolkit (Stolcke, 2002). … SRILM – an extensible language modeling toolkit. In Proceedings of the Seventh International Conference on Spoken Language Processing, pages 901-904, Denver, CO … Cited by 1 – Related articles – View as HTML – All 12 versions

[PDF] A Forensic Authorship Classification in SMS Messages: A Likelihood Ratio Based Approach Using N-gram [PDF] from aclweb.org S Ishihara – Proceedings of the Australasian Language Technology …, 2011 – aclweb.org … The backoff technique was used for the calculation of log probabilities (Jurafsky and Martin, 2000). An ‘open-vocabulary’ N-gram language model (N = 1,2,3) was built for each group of messages. … 5http://incubator.apache.org/opennlp/ 6http://www.speech.sri.com/projects/srilm/ … View as HTML

Powerful extensions to CRFs for grapheme to phoneme conversion [PDF] from nthu.edu.tw S Hahn, P Lehnen… – Acoustics, Speech and Signal …, 2011 – ieeexplore.ieee.org … The LM is weighted using a. The SRI LM Toolkit has been used to train the language models [7]. Experimental results are reported in the next section. … 282-289. [7] A. Stolcke, “SRILM – An Extensible Language Modeling Toolkit,” Denver, CO, USA, Sept. 2002, pp. … Cited by 2 – Related articles – All 8 versions

[PDF] Speech Recognition System of Slovenian Broadcast News [PDF] from intechopen.com MS Maucec… – intechopen.com … All bigrams were included in the model. Katz back-off with Good-Turing discounting was used for smoothing. Language models were trained using SRI LM Toolkit (Stolcke, 2002). … 405-408. Stolcke, A. (2002). SRILM an Extensible Language Modeling Toolkit. Proc. … Related articles – View as HTML

[PDF] Evaluation Methodology and Results for English-to-Arabic MT [PDF] from mt-archive.info O Hamon… – mt-archive.info … 3 http://www.statmt.org/moses/ 4 http://www-speech.sri.com/projects/srilm/ 5 http … CU (El Kholy & Habash, 2010) used a language model based on the IRSTLM toolkit (Federico et al., 2008), Moses for training and decoding and the Penn Arabic Treebank tokenization scheme … Related articles – View as HTML

[PDF] Data Sampling and Dimensionality Reduction Approaches for Reranking ASR Outputs Using Discriminative Language Models [PDF] from boun.edu.tr E Dikici, M Semerci, M Saraçlar… – Proc. Interspeech, 2011 – busim.ee.boun.edu.tr … 1http://www2.research.att.com/~fsmtools/{fsm,dcd}/ 2http://www.speech.sri.com/projects/srilm/ 3http://www.cis.hut.fi/projects … In this paper, we presented some approaches to enhance discriminative language modeling performance for Turkish broadcast news transcription. … Related articles – View as HTML – All 2 versions

[PDF] Building a Web-based parallel corpus and filtering out machine-translated text [PDF] from aclweb.org A Antonova… – ACL HLT 2011, 2011 – aclweb.org … 3 http://www.statmt.org/moses/ trained on target side of the first corpus using SRI Language Modeling Toolkit (Stolcke, 2002). … Andreas Stolcke. 2002. SRILM-an extensible language modeling toolkit. Proceedings ICSLP, vol. 2, pp. 901-904, Denver, Sep. … Related articles – View as HTML – All 14 versions

[PDF] Detecting Levels of Interest from Spoken Dialog with Multistream Prediction Feedback and Similarity Based Hierarchical Fusion Learning [PDF] from cmu.edu WY Wang… – Proceedings of the SIGDIAL 2011 …, 2011 – cs.cmu.edu … We train trigram language models on the training set using the SRI Language Modeling Toolkit (Stolcke, 2002). In the testing stage, the log likelihood and perplexity scores are used as language modeling features. … SRILM-an extensible language modeling toolkit. … Related articles – View as HTML – All 10 versions

Automatic generation of a pronunciation dictionary with rich variation coverage using SMT methods P Karanasou… – Computational Linguistics and Intelligent Text …, 2011 – Springer … It has been used by numerous laboratories. SRI, Philips Aachen, ICSI and Cambridge University have reported improving the performance of their systems using this dictionary. Page 6. … Stolcke, A.: SRILM-An extensible language modeling toolkit. Proc. … Related articles – All 2 versions

[PDF] Rich Linguistic Knowledge for Empirical Machine Translation [PDF] from upc.edu JAG Linares – lsi.upc.edu … chunk 4The GIZA++ SMT Toolkit may be freely downloaded at http://www.fjoch.com/GIZA++.html. 5The SRI Language Modeling Toolkit may be freely downloaded at http://www.speech.sri.com/projects/srilm/download.html. 6The … Related articles – View as HTML – All 6 versions

[PDF] English to Arabic Statistical Machine Translation System Improvements using Preprocessing and Arabic Morphology Analysis [PDF] from wseas.us SA Ghaffar, MW FAKHR… – wseas.us … Introduction to Arabic Natural Language Processing. … Segmentation for English-to-Arabic Statistical Machine Translation, In Proceedings of ACL’08 [5] http://www1.ccls.columbia.edu/~cadim/MADA.html [6] http://www-speech.sri.com/projects/srilm/ [7] Franz Josef Och … View as HTML

[PDF] Using deep morphology to improve automatic error detection in Arabic handwriting recognition [PDF] from 59.108.48.12 N Habash… – … Computational Linguistics: Human Language …, 2011 – 59.108.48.12 … pos and lem information. The models are built using the SRI Language Modeling Toolkit (Stolcke, 2002). Each word in a hypothesis can then be assigned a probability by each of these nine models. We reduce these probabilities … Related articles – View as HTML – All 8 versions

The TALP-UPC phrase-based translation system for EACL-WMT 2009 [PDF] from upc.edu JA Rodríguez Fonollosa, M Khalilov… – 2011 – upcommons.upc.edu … The scale factor values are automatically optimized to obtain the lowest perplexity ppl(w) produced by the interpolated LM P(w). We used the standard script compute-best-mix from the SRI LM package (Stolcke … SRILM: an extensible language modeling toolkit. … Related articles – All 5 versions

[PDF] Feedback Selecting of Manually Acquired Rules Using Automatic Evaluation [PDF] from mt-archive.info X Li, Y Lü, Y Meng, Q Liu… – mt-archive.info … We used the SRI Language Modeling Toolkit (Stolcke, 2002) to train a 7-gram model on training set. We evaluated the translation results using case-insensitive BLEU metric (Papineni et al., 2002). … Andreas Stolcke. 2002. Srilm-an Extensible Language Modeling Toolkit. In Proc. … Related articles – View as HTML

[PDF] Contextual bearing on linguistic variation in social media [PDF] from aclweb.org S Gouws, D Metzler, C Cai… – ACL HLT 2011, 2011 – newdesign.aclweb.org … Algorithm 1 Main cleanser algorithm pseudo code. The decode () command converts the confusion network (CN) into PFSG format and decodes it using the lattice-tool of the SRI-LM toolkit. … A. Stolcke. 2002. SRILM-An Extensible Language Modeling Toolkit. … Cited by 2 – Related articles – View as HTML – All 17 versions

Analyzing language samples of Spanish-English bilingual children for the automated prediction of language dominance T Solorio, M Sherman, Y Liu… – Natural Language …, 2011 – Cambridge Univ Press … of features, which in turn might lead to the discovery of new clinical markers and the development of more robust assessment tools. … of 36 N-gram LM features per retell, ((6 + 6) × 3). For training and testing these LMs we used the SRI Language Modeling Toolkit (Stolcke 2002). … Cited by 2 – Related articles – All 4 versions

Design, development and field evaluation of a Spanish into sign language translation system R San-Segundo, JM Montero, R Córdoba… – Pattern Analysis & … – Springer … sign-lang.uni-hamburg.de/esign/) [41] have been two of the most significant research efforts in developing tools for the automatic … As regards the language model, the recognition module uses statistical language modelling: 2-gram, as the database is not large enough to … Related articles

[PDF] TUD Palladian Overview [PDF] from 141.76.40.242 D Urbansky, K Muthmann, P Katz… – 2011 – 141.76.40.242 … 20. NaCTeM Software Tools [40] are programs for natural language processing and text mining that are made available by the National Centre for Text Mining. … 27. SRILM – The SRI Language Modeling Toolkit [55] is a C++-based toolkit for language modeling. … Cited by 1 – Related articles – View as HTML – All 7 versions

picoTrans: an icon-driven user interface for machine translation on mobile devices [PDF] from ntu.edu.tw W Song, AM Finch, K Tanaka-Ishii… – Proceedings of the 15th …, 2011 – dl.acm.org … In our experiments we used the SRI language modeling toolkit [20] to implement our language model … cess we use the language model in a similar manner; the user is presented with a list of the top-5 (partial … Word alignment was performed using GIZA++ [16] and MOSES [11] tools. … Related articles – All 2 versions

Pronunciation variants generation using SMT-inspired approaches [PDF] from 140.124.72.88 P Karanasou… – Acoustics, Speech and Signal …, 2011 – ieeexplore.ieee.org … based on additional phonemic contextual information expressed by 5-gram phoneme LM already used by Moses for the g2p conversion. The SRI toolkit served for the reranking. … [10] A. Stolcke, “Srilm-an extensible language modeling toolkit,” ICSLP, 2002. … Cited by 1 – Related articles – All 6 versions

[PDF] Generative Models of Monolingual and Bilingual Gappy Patterns [PDF] from mt-archive.info KGNA Smith – mt-archive.info … for phrase pair extraction. On the monolingual side, researchers have taken inspiration from trigger-based language modeling for speech recognition (Rosenfeld, 1996). Recently Xiong et al. (2011) used monolingual trigger … Related articles – View as HTML – All 12 versions

[PDF] METEOR-Tuned Phrase-Based SMT: CMU French-English and Haitian-English Systems for WMT 2011 [PDF] from cmu.edu M Denkowski… – 2011 – cs.cmu.edu … A SRI 5-gram language model (Stolcke, 2002) with modified Kneser-Ney smoothing is estimated from monolingual data. … In Proc. of AMTA 2006. Andreas Stolcke. 2002. SRILM – an Extensible Language Modeling Toolkit. In Proc. of ICSLP 2002. Omar F. Zaidan. 2009. … Cited by 2 – Related articles – View as HTML – All 2 versions

Improving offline handwritten text recognition with hybrid hmm/ann models S Espana-Boquera, M Castro-Bleda… – Pattern Analysis …, 2011 – ieeexplore.ieee.org … Lines: Finally, the 6,161 IAM training lines were also added. The bigram language model used in the recognition systems was generated, using the SRI Language Modeling Toolkit [52] with the modified Kneser-Ney back-off discounting. … Cited by 19 – Related articles – All 7 versions

[PDF] Authorship Identification with Modality Specific Meta Features [PDF] from uni-weimar.de T Solorio, S Pillay… – PAN, 2011 – uni-weimar.de … For training the language models and computing perplexity values we used the SRI-LM toolkit [13]. … 13. Andreas Stolcke. SRILM – an extensible language modeling toolkit. pages 901-904, 2002. 14. Ian H. Witten and Eibe Frank. … Related articles – View as HTML – All 3 versions

[PDF] Effective use of function words for rule generalization in forest-based translation [PDF] from aclweb.org X Wu, T Matsuzaki… – … Human Language Technologies-Volume 1, 2011 – aclweb.org … and symmetrizing strategy (Koehn et al., 2007) on the training set to obtain alignments. The SRI Language Modeling Toolkit (Stolcke, 2002) was employed to train a five-gram Japanese LM on the training set. We evaluated the … Cited by 1 – Related articles – View as HTML – All 9 versions

[PDF] Speech Translation with Grammar Driven Probabilistic Phrasal Bilexica Extraction [PDF] from ust.hk M Saers, D Wu, C Lo… – 2011 – cs.ust.hk … To build the baseline we used the freely available Moses toolkit [7], with standard training settings, and an SRI trigram language model [8]. To compare systems we use BLEU [9]. This baseline … [8] A. Stolcke, “SRILM – an extensible language modeling toolkit,” in Proceedings … Related articles – All 3 versions

GREAT: open source software for statistical machine translation J González… – Machine Translation, 2011 – Springer … When a language modelling framework based on finite-state models is used, an SFSA is inferred from such a synthetic corpus. That model can also be seen as an SFST for Pr(s, t) as described in Sect. … 1 Available from http://www.speech.sri.com/projects/srilm/. 123 Page 6. … Related articles

[PDF] Quasi-Synchronous Phrase Dependency Grammars for Machine Translation [PDF] from cmu.edu KGNA Smith – cs.cmu.edu … We used a max phrase length of 7 when extracting phrases. Trigram language models were estimated using the SRI language modeling toolkit (Stolcke, 2002) with modified Kneser-Ney smoothing (Chen and Goodman, 1998). … Related articles – View as HTML – All 9 versions

[PDF] Front-End Compensation Methods for LVCSR Under Lombard Effect [PDF] from vutbr.cz H Boril, F Grézl… – 2011 – fit.vutbr.cz … 3.2. Experimental Setup A triphone recognizer combining Hidden Markov Model Toolkit (HTK) based acoustic modeling and trigram language model (LM) implemented with the SRI Language Modeling Toolkit (SRILM) is trained on the TIMIT database (16 kHz) [25]. … Related articles – All 8 versions

Utilizing gestures to improve sentence boundary detection [PDF] from jhu.edu L Chen… – Multimedia Tools and Applications, 2011 – Springer … Multimed Tools Appl (2011) 51:1035-1067 1037 … play an important role in human communication but use quite different expressive mechanisms than spoken language. … are likely to provide additional important information that can be exploited when modeling sentence structure … Related articles – All 8 versions

Finding deceptive opinion spam by any stretch of the imagination [PDF] from arxiv.org M Ott, Y Choi, C Cardie… – Arxiv preprint arXiv:1107.4557, 2011 – arxiv.org … al. (2008) are equivalent. Thus, following Zhou et al. (2008), we use the SRI Language Modeling Toolkit (Stolcke, 2002) to estimate individual language models, Pr( x | y = c), for truthful and deceptive opinions. We consider … Cited by 5 – Related articles – All 32 versions

Representing n-gram language models for compact storage and fast retrieval C Chelba, T Brants – US Patent 7,877,258, 2011 – Google Patents … 156-159, Oct. 1-3, Sentosa, Singapore 2003 * SRI International, “SRILM-The SRI Language Modeling Toolkit”, 2006, Menlo Park, CA (2 pages). Clarkson, et al. “Statistical Language Modeling Toolkit”, Jun. 1999, Cambridge (5 pages). … Related articles – All 3 versions

[PDF] Extracting Hierarchical Rules from a Weighted Alignment Matrix [PDF] from aclweb.org Z Tu, Y Liu, Q Liu… – aclweb.org … al., 2006). We train a 4-gram language model on the Xinhua portion of GIGAWORD corpus using the SRI Language Modeling Toolkit (Stolcke, 2002) with modified Kneser-Ney smoothing (Kneser and Ney, 1995). We opti … Related articles – View as HTML – All 6 versions

A Risk-Aware Modeling Framework for Speech Summarization B Chen… – Audio, Speech, and Language Processing, …, 2011 – ieeexplore.ieee.org … summarization framework are very competitive with existing summarization methods. Keywords: speech summarization, decision-making, risk minimization, loss functions, language modeling Copyright (c) 2010 IEEE. Personal use of this material is permitted. … Cited by 1 – Related articles

[PDF] The RWTH Aachen System for NTCIR-9 PatentMT [PDF] from rwth-aachen.de M Feng, C Schmidt… – Proceedings …, 2011 – www-i6.informatik.rwth-aachen.de … 2.4 Language Models All language models are standard n-gram language models trained with the SRI toolkit [17] using interpolated modified Kneser-Ney smoothing. … 2010. [17] A. Stolcke. SRILM – an extensible language modeling toolkit. In Proc. Int. Conf. … Cited by 1 – View as HTML

[PDF] Extracting Pre-ordering Rules from Chunk-based Dependency Trees for Japanese-to-English Translation [PDF] from mt-archive.info X Wu, K Sudoh, K Duh, H Tsukada… – Proceedings of MT …, 2011 – mt-archive.info … 5http://ntcir.nii.ac.jp/PatentMT/ 6http://ntcir.nii.ac.jp/PatentMT/baselineSystems 7http://www.statmt.org/moses/ 8http://giza-pp.googlecode.com/files/giza-pp-v1.0.3.tar.gz 9http://www.speech.sri.com/projects/srilm/ 10http … Srilm-an extensible language modeling toolkit. … Cited by 1 – Related articles – View as HTML

[PDF] Training data in statistical machine translation-the more, the better [PDF] from rug.nl M Gavrila… – … of the RANLP-2011 Conference, Hissar, …, 2011 – acl.eldoc.ub.rug.nl … html. 4http://www.statmt.org/moses/, (Koehn et al., 2007). 5http://www.speech.sri.com/projects/srilm/, (Stolcke, 2002). … 2002. Srilm – an extensible language modeling toolkit. In Proc. Intl. Conf. Spoken Language Processing, pages 901-904, Denver, Colorado, September. … View as HTML

[PDF] A Novel Dependency-to-String Model for Statistical Machine Translation [PDF] from aclweb.org J Xie, H Mi… – Proceedings of EMNLP 2011 – aclweb.org Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 216-226, Edinburgh, Scotland, UK, July 27-31, 2011. © 2011 Association for Computational Linguistics A Novel Dependency … Cited by 2 – Related articles – View as HTML – All 9 versions

Ncode: an Open Source Bilingual N-gram SMT Toolkit [PDF] from quaero.org JM Crego, F Yvon… – The Prague Bulletin of Mathematical …, 2011 – Versita … 1http://www.speech.sri.com/projects/srilm/ 2http://kheafield.com/code/kenlm/ 3http://www.openfst.org 4http://www … The resulting file contains all the modeling information needed by the decoder to translate sentences, with the exception of n-gram language models scores. … Related articles – All 4 versions

[PDF] Nyelvimodell-adaptáció ügyfélszolgálati beszélgetések gépi leiratozásához [Language model adaptation for machine transcription of customer service conversations] [PDF] from bme.hu TKKN Kft – mycite.omikk.bme.hu … 3.3 Training and decoding The language models examined were built with modified Kneser-Ney smoothing [3] using the SRI Language Modeling Toolkit (SRILM) [10]. In the resulting word-based 3-gram models, entropy-based pruning was in no case … View as HTML

[PDF] ILLC-UvA translation system for EMNLP-WMT 2011 [PDF] from statmt.org M Khalilov… – Proceedings of the Sixth Workshop on Statistical …, 2011 – statmt.org … GIZA++/mkcls (Och, 2003; Och, 1999) for word alignment. • SRI LM (Stolcke, 2002) for language modeling. A 3-gram target language model was estimated and smoothed with modified Kneser-Ney discounting. … 2002. SRILM: an extensible language modeling toolkit. … Cited by 1 – Related articles – View as HTML – All 9 versions

[PDF] Target-aware Lattice Rescoring for Dialect Recognition [PDF] from uiuc.edu R Tong, B Ma, H Li… – 2011 – mickey.ifp.uiuc.edu … The SRI lattice tool kit [11] is further used to derive n-gram counts from the lattice. … 2237-2240, 2005 [11] A. Stolcke, “SRILM – An Extensible Language Modeling Toolkit”, In ICSLP, 2002, pp. 901–904, 2002 [12] M. Bacchiani and B. Roark. … Related articles

Leveraging Kullback-Leibler divergence measures and information-rich cues for speech summarization [PDF] from ntnu.edu.tw SH Lin, YM Yeh… – Audio, Speech, and Language …, 2011 – ieeexplore.ieee.org … such as language model smoothing or topic modeling, for better sentence model estimation [28]. Nevertheless, these approaches are restricted in the context of speech summarization, since they strive … language modeling approach [28-29]. … Cited by 1 – Related articles – All 5 versions

Semantic Frame-Based Spoken Language Understanding YY Wang, L Deng… – Spoken Language …, 2011 – Wiley Online Library … of Colorado, Carnegie Mellon University, IBM, Lucent Bell Labs, MIT, and SRI participated in the … Its major objective is the development of a robust SLU toolkit for dialogue … that focuses on multilingual SLU and language specific aspects for language modeling and understanding … Cited by 1 – Related articles – All 2 versions

[PDF] Extracting Pre-ordering Rules from Predicate-Argument Structures [PDF] from aclweb.org X Wu, K Sudoh, K Duh, H Tsukada… – aclweb.org … SRILM 8 (Stolcke, 2002): version 1.5.12 for training a 5-gram language model using the target sentences in the total training set; … In particular, nearly half of the 8http://www.speech.sri.com/projects/srilm/ 9http://homepages.inf.ed.ac.uk/jschroe1/how-to/scripts.tgz 10http … Related articles – View as HTML – All 3 versions

[PDF] Reliability-weighted acoustic model adaptation using crowd sourced transcriptions [PDF] from usc.edu K Audhkhasi, P Georgiou… – Proc. Interspeech2011, …, 2011 – www-scf.usc.edu … Signal Analysis and Interpretation Lab (SAIL) Electrical Engineering Department University of Southern California, Los Angeles, CA 90089-2564, USA audhkhas@usc.edu, {georgiou, shri}@sipi.usc … [15] A. Stolcke, “SRILM – an extensible language modeling toolkit,” in Proc. … Cited by 2 – Related articles – View as HTML – All 2 versions

[PDF] Predicting Responses and Discovering Social Factors in Scientific Literature [PDF] from cmu.edu BR Routledge… – lti.cs.cmu.edu … As such, natural language processing is well-positioned to provide tools for understanding the scientific process, by analyzing the textual artifacts (papers, proceedings, etc.) that it produces. This report is about modeling collections of scientific documents to understand how … Related articles – View as HTML – All 2 versions

Creación de una interfaz gráfica de traducción automática para las lenguas de signos [Creation of a graphical machine translation interface for sign languages] [PDF] from upc.edu J Marin Rey – 2011 – upcommons.upc.edu … MERT Minimum Error Rate Training MySQL Database management system NIST National Institute of Standards and Technology SiGML Signing Gesture Markup Language SMT Statistical Machine Translation SRILM SRI Language Modeling Toolkit TA Translation … All 3 versions

[PDF] A PERPLEXITY BASED COVER SONG MATCHING SYSTEM FOR SHORT LENGTH QUERIES [PDF] from ismir.net E Unal, E Chew, P Georgiou… – ismir2011.ismir.net … Narayanan3 1TÜBITAK BILGEM 2Queen Mary, University of London 3University of Southern California 1unal@uekae.tubitak.gov.tr 2elaine.chew@eecs.qmul.ac.uk 3{georgiou,shri}@sipi.usc … [20] A. Stolcke: “Srilm – an Extensible Language Modeling Toolkit,” Proceedings of … Related articles – View as HTML

[PDF] ERRON: A PHRASE-BASED MACHINE TRANSLATION APPROACH TO CUSTOMIZED SPELLING CORRECTION [PDF] from osu.edu DJ Hovermale – 2011 – ling.osu.edu … 3.3 Modeling JWEFL Spelling Correction as a SMT task … 5.2.3 SRILM – The SRI Language Modeling Toolkit … “Processing Unexpected Input”. We noticed that Natural Language Processing (NLP) … Related articles – View as HTML – All 2 versions

Error handling approach using characterization and correction steps for handwritten document analysis [PDF] from archives-ouvertes.fr S Quiniou, M Cheriet… – International Journal on Document …, 2011 – Springer … Finally, the final output phrase is retrieved, thanks to a correction step that used the characterized error hypotheses and a designed word-to-class backoff language model. … The combined language model, as well as its use on the pruned word graph, is presented in Sect. 7. … Related articles – All 2 versions

[PDF] Text Modification Methods for Natural Language Generation [PDF] from valls.name JV Vargas – josep.valls.name … in many NLP and NLG applications[5]. N-grams are a common and successful tool for language modeling[6]. An … graphical data in a text format[9]. The problem of programming computers to produce natural language explanations and … Several tools are available for these tasks. … Related articles – View as HTML

Cardinality pruning and language model heuristics for hierarchical phrase-based translation [PDF] from dfki.de D Vilar… – Machine Translation, 2011 – Springer … 123 Page 9. Cardinality pruning and language model heuristics … (12) These equations are the ones guiding the extraction process as implemented eg in the open source hierarchical toolkit Jane (Vilar et al. 2010). … 123 Page 10. D. Vilar, H. Ney 4.2 Log-linear modelling for MT …

TechWare: Mobile Media Search Resources [Best of the Web] Z Liu… – Signal Processing Magazine, IEEE, 2011 – ieeexplore.ieee.org … OpenFst (www.openfst.org): Full-featured library of finite state transducers manipulation algorithms that can be used to implement large vocabulary speech recognition systems as well as natural language processing systems. • SRILM [www-speech.sri.com/proj … All 2 versions

Analysis of MLP-Based Hierarchical Phoneme Posterior Probability Estimator [PDF] from posterous.com J Pinto, S Garimella, M Magimai-Doss… – … , and Language …, 2011 – ieeexplore.ieee.org … SUBMITTED TO IEEE AUDIO, SPEECH AND LANGUAGE PROCESSING … SRILM toolkit [47] and phoneme recognition is performed using the weighted finite state transducer … Table III shows the phoneme recognition accuracies obtained by hierarchical modeling (system S2) in … Cited by 23 – Related articles – All 9 versions

[PDF] Integrating Source-Language Context into Log-Linear Models of Statistical Machine Translation [PDF] from dcu.ie R Haque – 2011 – computing.dcu.ie … MaTrEx Machine Translation Using Examples EM Expectation Maximization MBR Minimum Bayes Risk SRILM Stanford Research Institute Language Modeling CYK Cocke Younger Kasami IG Information Gain GR Gain Ratio IB Instance-Based … Related articles – View as HTML – All 3 versions

[PDF] [Chinese-language title garbled in extraction] [PDF] from aclweb.org [author names garbled]… – aclweb.org … Empirical Comparisons of Various Discriminative Language Models for Speech Recognition … Translation, MT) … (Natural Language Processing, NLP) … View as HTML

Handwritten Chinese Text Recognition by Integrating Multiple Contexts QF Wang, F Yin… – IEEE Transactions on …, 2011 – doi.ieeecomputersociety.org … E-mail: {wangqf, fyin, liucl}@nlpr.ia.ac.cn. later works on the same dataset, using character classifiers and statistical language models based on over-segmentation, reported a character-level correct rate of 78.44% [10] and 73.97% [11], respectively. …

[PDF] Extracting Verbs with PP/NP Variation from the Large 3-gram Corpus [PDF] from savba.sk M Kopotev, N Kochetkova… – Natural Language … – korpus.juls.savba.sk … Semantics is the content of the language. Secondly, when modeling the semantics, one must take into consideration the process of continuous language change, the change of the semantic content of which is the most important variable element. … View as HTML

Soft syntactic constraints for Arabic-English hierarchical phrase-based translation Y Marton, D Chiang… – Machine Translation, 2011 – Springer … 2005). Language models were built using the SRI Language 4 We refer the reader to Marton and Resnik (2008) for details of related Chinese to English experiments. 123 Page 9. … Modeling Toolkit (Stolcke 2002) with modified Kneser-Ney smoothing (Chen and Goodman 1998). … Related articles

ENHANCED SPEECH-TO-SPEECH TRANSLATION SYSTEM AND METHODS A Waibel… – US Patent 20,110,307,241, 2011 – freepatentsonline.com … Other MT modules could be used such as those developed by IBM Corporation, SRI, BBN or at … C. Dyer, O. Bojar, A. Constantin, and E. Herbst, ‘Moses: Open source toolkit for statistical … Human Language Technology and Empirical Methods in Natural Language Processing, pp. … Cached

[BOOK] Spoken Language Understanding: Systems for Extracting Semantic Information from Speech G Tur, R De Mori – 2011 – books.google.com … Her PhD thesis is on statistical language modeling for agglutinative languages. … on machine translation during her visit to Carnegie Mellon University, Language Technologies Institute … In 1998 and 1999, she visited SRI International, Speech Technology and Research Labs, and … Cited by 7

Accurate transcription of broadcast news speech using multiple noisy transcribers and unsupervised reliability metrics [PDF] from 140.124.72.88 K Audhkhasi, P Georgiou… – Acoustics, Speech and …, 2011 – ieeexplore.ieee.org … Signal Analysis and Interpretation Lab (SAIL) Electrical Engineering Department University of Southern California, Los Angeles, CA, USA Email: audhkhas@usc.edu, {georgiou, shri}@sipi.usc.edu ABSTRACT … [11] A. Stolcke, “SRILM – an extensible language modeling toolk … Cited by 3

CCG contextual labels in hierarchical phrase-based SMT [PDF] from dcu.ie H Almaghout, J Jiang… – 2011 – doras.dcu.ie … www1.ccls.columbia.edu/ cadim/MADA.html 3 http://fjoch.com/GIZA++.html 4 http://www-speech.sri.com/projects/srilm/ 5 http://www … of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Prague … Cited by 1

[PDF] Oracle-based Training for Phrase-based Statistical Machine Translation [PDF] from dcu.ie AK Srivastava, Y Ma… – computing.dcu.ie … tures (d1 through d7), 1 language model feature (lm), 5 translation model features (tm1 through tm5), 1 word penalty (w), and 1 unknown word penalty feature. Note that the unknown word fea- 1http://www.statmt.org/wmt09/ 2http://www-speech.sri.com/projects/srilm/ 3http://code …

[PDF] Semi-Automatic Translation of Medical Terms from English to Swedish [PDF] from diva-portal.org CT SNOMED – 2011 – liu.diva-portal.org … CT [6]. A subset is a piece of SNOMED CT, derived for instance to get a collection of all concepts available in a specific language or … source language is translated into the interlingua and then translated from the interlingua into the target language. There …

[CITATION] Automatic Modeling of Logical Connectors by Statistical Analysis of Context/Modélisation automatique de connecteurs logiques par analyse statistique du … E Charton, JM Torres-Moreno – Canadian Journal of …, 2011 – University of Toronto Press

[PDF] Report on CWMT2011 MT Translation Evaluation [PDF] from ict.ac.cn H Zhao, Y Lu, G Ben, Y Huang… – nlp.ict.ac.cn … Hongmei Zhao, Yajuan Lu, Guosheng Ben, Yun Huang, Qun Liu Natural Language Processing Research Group, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, … Natural Language Processing Laboratory, Northeastern University …

Topic Identification TJ Hazen – Spoken Language Understanding, 2011 – Wiley Online Library … When using a mismatched ASR system, one potentially serious problem for topic ID is that many of the important content-bearing words in the domain of interest may not be included in the lexicon and language model used by the ASR system. …

[PDF] Modality Specific Meta Features for Authorship Attribution in Web Forum Posts [PDF] from aclweb.org T Solorio, S Pillay, S Raghavan… – aclweb.org … For training the language models and computing perplexity values we used the SRI-LM toolkit (Stolcke, 2002). Frequencies of character n-grams have also been successfully used to build author profiles (Keselj et al., 2003). …

Evaluation and Optimisation of Incremental Processors [PDF] from elanguage.net T Baumann, O Buß… – Dialogue & Discourse, 2011 – swimt.elanguage.net … Typical processing components in dialogue systems are speech recognition, parsers (often with grammars that are more semantically than syntactically motivated), dialogue act recognition, dialogue management, language generation and text-to-speech synthesis (Allen et al. … Cited by 1

Training Speech Translation from Audio Recordings of Interpreter-Mediated Communication M Paulik… – Computer Speech & Language, 2011 – Elsevier … The impressive advances made in ST can mostly be attributed to the statistical modeling schemes that nowadays … and MT Systems The employed ASR systems are developed with the Janus Recognition Toolkit (JRTk … The SRI Language Model Toolkit [13] is used for LM training …

[PDF] The RWTH Aachen machine translation system for WMT 2011 [PDF] from statmt.org M Huck, J Wuebker, C Schmidt, M Freitag… – Proceedings of the …, 2011 – statmt.org … to train word alignments, language models have been created with the SRILM toolkit (Stolcke, 2002). … For the English and German language models, we applied the data selection method proposed in … We used a 3-gram trained with the SRI toolkit to compute the cross-entropy. … Cited by 3

The TALP & I2R SMT Systems for IWSLT 2008 [PDF] from upc.edu H Li, A Aw, M Zhang, M Khalilov, MR Costa-Jussà… – 2011 – upcommons.upc.edu … and case information as proposed on the IWSLT’08 web page, using standard SRI LM[13 … and tokenized using the Freeling toolkit[20], an open source tool for language analysis … on the next step, the case information is restored, using the disambig tool from SRILM following the …

Modelos Conexionistas para el Procesado del Lenguaje Natural [PDF] from upv.es FJ ZAMORA MARTÍNEZ – 2011 – riunet.upv.es PROGRAMA DE MASTER Inteligencia Artificial, Reconocimiento de Formas e Imágen Digital Departamento de Sistemas Informáticos y Computación Universidad Politécnica de Valencia Tesis de Master: Modelos Conexionistas para el Procesado del Lenguaje Natural …

[PDF] Automatic Dialect and Accent Recognition and its Application to Speech Recognition [PDF] from columbia.edu F Biadsy – 2011 – academiccommons.columbia.edu … 28 5.1 Parallel phone recognition followed by language modeling (Parallel PRLM) for dialect recognition. … allowed direct comparison to their work. I am also thankful to Dimitra Vergyri for sharing with me the details of the SRI Arabic pronunciation dictionary. I would like to thank …

[PDF] Ad-Hoc Meeting Transcription on Clusters of Mobile Devices [PDF] from uiuc.edu M Cossalter, P Sundararajan… – 2011 – mickey.ifp.uiuc.edu … For the AMI task, a 12k vocabulary, trigram language model was built using the SRILM toolkit [12]. … was performed applying the acoustic model described in Section 5.1 and a 64k language model trained … K. Boakye, O. Çetin, J. Frankel, and J. Zheng, “The ICSI-SRI Spring 2006 …

[PDF] Context-Sensitive Syntactic Source-Reordering by Statistical Transduction [PDF] from uva.nl M Khalilov… – staff.science.uva.nl … model in (Li et al., 2007) is explicitly aimed at long-distance reorderings (English-Chinese), prunes the alignment matrix gradually to fit the source syntactic parse and employs Maximum-Entropy modeling to choose … All language models were trained with SRI LM toolkit …

[PDF] Social Network Analysis for Automatic Role Recognition [PDF] from epfl.ch S Favre – 2011 – biblion.epfl.ch … DDM Duration Distribution Modeling ECA Embodied Conversational Agent GMM Gaussian Mixture Models … SAN Social Affiliation Networks SLM Statistical Language Models … Nowadays, computers are leaving their original role of improved versions of old tools [Vinciarelli 09a] …

[PDF] A New Framework to Deal with OOV Words in SLT System [PDF] from aia-i.com Y Zhou, F Zhai, P Liu… – aia-i.com … Then we will explain why we can solve OOV-3 problem with Pinyin-based model. † http://www.speech.sri.com/projects/srilm/manpages/ … In Proceedings of the International Symposium on Chinese Spoken Language Processing (ISCSLP), December 16-19, 2008. …

[PDF] Unsupervised Mining of Lexical Variants from Noisy Text [PDF] from aclweb.org S Gouws, D Hovy… – EMNLP 2011, 2011 – aclweb.org … The most likely (Viterbi) path through this lattice represents the decoded clean output. We use SRI-LM (Stolcke, 2002) for this. … Springer Berlin/Heidelberg. A. Stolcke. 2002. SRILM-an extensible language modeling toolkit. …

Detección automática de plagio en texto [PDF] from upv.es LA Barrón Cedeño – 2011 – riunet.upv.es … evaluations. The results that we have obtained with techniques based on different concepts such as Language Models, different text comparison techniques, and statistical methods for the search space reduction, are promising. … Cited by 3

High-quality bilingual subtitle document alignments with application to spontaneous speech translation A Tsiartas, P Ghosh, P Georgiou… – … Speech & Language, 2011 – Elsevier … experiments using the aligned subtitle data obtained by the proposed alignment approach Email address: tsiartas@usc.edu, prasantg@usc.edu, georgiou@sipi.usc.edu, shri@sipi.usc.edu … The trigram language models were built using the SRILM toolkit (Stolcke, 2002 …

[PDF] Paraphrase and Textual Entailment Recognition and Generation [PDF] from aueb.gr P Malakasiotis – 2011 – aueb.gr … Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language … Cited by 1

[PDF] Identifying the Gist of Conversational Text: Automatic Keyword Extraction and Summarization [PDF] from utdallas.edu F Liu, Y Liu, C Busso, S Harabagiu… – 2011 – hlt.utdallas.edu … We propose to use topic labels and speaker-dependent characteristics (such as verboseness, gender, native language, role in the meeting) to improve extractive meeting summarization … and aligned them at the character level for modeling training. For Twitter topic summariza- …

[PDF] Recovering Capitalization and Punctuation Marks on Speech Transcriptions [PDF] from inesc-id.pt F Batista – 2011 – inesc-id.pt … special issue on “New Frontiers in Rich Transcription”, part of the IEEE Transactions on Audio, Speech, and Language Processing publication … The Metadata Extraction and Modeling task, described in the project, aims at introducing structural information into the ASR output, as a …

Learning to tell tales: automatic story generation from Corpora [PDF] from ed.ac.uk ND McIntyre – 2011 – era.lib.ed.ac.uk … 3.3.3 Modelling Coherence … 2.7 Overview of the natural language generation pipeline as described in Reiter and Dale (2000). … In addition to creating richer computer games, there is also the potential for automatic story generation to be integrated into tools for authors. …