Word Lattice


A word lattice is a directed acyclic graph that compactly represents many alternative word sequences for the same input, such as competing transcriptions of an utterance or competing segmentations of a sentence. It is widely used in natural language processing (NLP), particularly in speech recognition, machine translation, and language understanding.

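A word lattice consists of nodes connected by directed edges. In the most common convention, each edge carries a word hypothesis, often with a score such as an acoustic or language-model probability, and each node marks a point in the input (for example, a time in the audio) where one word can end and the next can begin. Every path from the start node to the end node spells out one candidate word sequence, and hypotheses that share words share edges, which is what makes a lattice more compact than a plain list of alternatives. Some formulations place the words on the nodes instead, with edges indicating which word may follow which.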
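The structure can be sketched in a few lines of Python. This is a minimal illustration, not a standard format: it assumes words sit on the edges, nodes are plain integers, and the toy lattice (`EDGES`) and helper names (`build_lattice`, `all_paths`) are invented for the example.

```python
from collections import defaultdict

def build_lattice(edges):
    """Map each source node to its outgoing (target, word) pairs."""
    outgoing = defaultdict(list)
    for source, target, word in edges:
        outgoing[source].append((target, word))
    return outgoing

def all_paths(outgoing, start, end):
    """Yield every word sequence from start to end (depth-first)."""
    if start == end:
        yield []
        return
    for target, word in outgoing.get(start, []):
        for rest in all_paths(outgoing, target, end):
            yield [word] + rest

# Hypothetical toy lattice: node 0 is the start, node 4 is the end.
EDGES = [
    (0, 3, "recognize"),
    (0, 1, "wreck"),
    (1, 2, "a"),
    (2, 3, "nice"),
    (3, 4, "speech"),
    (3, 4, "beach"),
]

lattice = build_lattice(EDGES)
sentences = [" ".join(words) for words in all_paths(lattice, 0, 4)]
for sentence in sentences:
    print(sentence)
```

Six edges encode four candidate sentences here. Enumerating every path is only practical for small lattices; real systems usually search for the best few paths instead.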

Word lattices can represent several kinds of ambiguity in text or speech input. For example, they can hold the alternative words a recognizer hears in the same stretch of audio, the alternative segmentations or morphological analyses of a sentence, or the alternative readings of an ambiguous word or phrase.

In NLP pipelines, a word lattice lets downstream components work with this ambiguity instead of committing to a single guess too early. For example, a speech recognition system might emit a lattice of competing transcriptions and then rescore it to choose the final text, while a chatbot might read the lattice to identify the intent of a user’s message and generate an appropriate response.
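Choosing a transcription from a lattice amounts to finding its highest-scoring path. The sketch below assumes each edge carries a log-probability and that node ids are integers numbered in topological order; the scores and the function name `best_path` are made up for illustration.

```python
import math

def best_path(edges, start, end):
    """Return (log_prob, words) of the highest-scoring path.

    Assumes every edge goes from a lower node id to a higher one,
    so processing edges sorted by source visits nodes in a valid order.
    """
    best = {start: (0.0, [])}
    for source, target, word, log_prob in sorted(edges):
        if source not in best:
            continue  # source is not reachable from the start node
        score, words = best[source]
        candidate = (score + log_prob, words + [word])
        if target not in best or candidate[0] > best[target][0]:
            best[target] = candidate
    return best[end]

# Hypothetical scored lattice: (source, target, word, log-probability).
SCORED_EDGES = [
    (0, 3, "recognize", math.log(0.6)),
    (0, 1, "wreck", math.log(0.4)),
    (1, 2, "a", math.log(1.0)),
    (2, 3, "nice", math.log(1.0)),
    (3, 4, "speech", math.log(0.7)),
    (3, 4, "beach", math.log(0.3)),
]

score, words = best_path(SCORED_EDGES, 0, 4)
print(" ".join(words), round(math.exp(score), 2))
```

Because the lattice is acyclic, one pass over the edges is enough; the cost grows with the number of edges rather than with the number of paths.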

Word lattices are also used in dialog systems. The user’s message reaches the system as text or, in a spoken dialog system, as the output of a speech recognizer, and the system must determine the intended meaning of the message in order to respond appropriately.

To do this, the dialog system can keep a word lattice of the user’s message rather than a single best transcription, so that alternative wordings and their scores remain available to the understanding component.

Once the word lattice has been generated, the dialog system can search it for the most likely intent of the user’s message, for example by looking for concepts or keywords along its paths, and then generate a response that fits that intent.

Word lattices are particularly useful in dialog systems because they help the system handle ambiguity and uncertainty in the user’s message. By representing multiple possible interpretations of the message, the word lattice allows the system to weigh a range of options and make a more informed decision about how to respond.
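One way to weigh a range of options is to sum the probability of every lattice path that supports each intent. The following is a sketch under stated assumptions: the keyword table `KEYWORD_INTENTS` stands in for a real intent classifier, and the edge probabilities are invented.

```python
from collections import defaultdict

# Hypothetical keyword table standing in for a real intent classifier.
KEYWORD_INTENTS = {"play": "play_media", "pay": "payment", "paid": "payment"}

def weighted_paths(edges, node, end):
    """Yield (probability, words) for every path from node to end."""
    if node == end:
        yield 1.0, []
        return
    for source, target, word, prob in edges:
        if source == node:
            for rest_prob, rest_words in weighted_paths(edges, target, end):
                yield prob * rest_prob, [word] + rest_words

def classify(words):
    """Return the intent of the first known keyword, else 'unknown'."""
    for word in words:
        if word in KEYWORD_INTENTS:
            return KEYWORD_INTENTS[word]
    return "unknown"

def intent_scores(edges, start, end):
    """Sum path probabilities per intent over the whole lattice."""
    scores = defaultdict(float)
    for prob, words in weighted_paths(edges, start, end):
        scores[classify(words)] += prob
    return dict(scores)

# Hypothetical lattice: (source, target, word, probability).
EDGES = [
    (0, 1, "play", 0.40),
    (0, 1, "pay", 0.35),
    (0, 1, "paid", 0.25),
    (1, 2, "bills", 1.00),
]

scores = intent_scores(EDGES, 0, 2)
best_intent = max(scores, key=scores.get)
one_best = max(weighted_paths(EDGES, 0, 2), key=lambda item: item[0])
print(best_intent, classify(one_best[1]))
```

In this toy case the single best transcription ("play bills") points to one intent, while the combined weight of the other paths points to another, which is the kind of decision a 1-best string cannot support.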

See also:

Anaphora & Dialog Systems 2011 | Bottom-up Parser & Dialog Systems | Classifier & Dialog Systems 2011 | Concept Mapping & Dialog Systems | Conversational Interfaces 2011 | Ellogon | FSG (Finite State Grammar) & Dialog Systems | GRM Library (Grammar Library) | Language Understanding Module | Natural Language Generation & Dialog Systems 2011 | Query Construction Module | Rule-based Language Modeling

Generalizing word lattice translation C Dyer, S Muresan, P Resnik – 2008 – DTIC Document Abstract: Word lattice decoding has proven useful in spoken language translation; we argue that it provides a compelling model for translation of text genres, as well. We extend lattice decoding to hierarchical phrase-based models, providing a unified treatment with phrase- …

Language Recognition with Word Lattices and Support Vector Machines. WM Campbell, FS Richardson, DA Reynolds – ICASSP (4), 2007 – ll.mit.edu ABSTRACT Language recognition is typically performed with methods that exploit phonotactics, a phone recognition language modeling (PRLM) system. A PRLM system converts speech to a lattice of phones and then scores a language model. A standard …

Using word lattice information for a tighter coupling in speech translation systems S Saleem, SC Jou, S Vogel… – Proc. Int. Conf. on …, 2004 – csl.anthropomatik.kit.edu Abstract In this paper we present first experiments towards a tighter coupling between Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT) to improve the overall performance of our speech translation system. In conventional speech …

Learning word-class lattices for definition and hypernym extraction R Navigli, P Velardi – Proceedings of the 48th Annual Meeting of the …, 2010 – dl.acm.org … syntactic structures. In this paper, we propose Word-Class Lattices (WCLs), a generalization of word lattices that we use to model textual definitions. Lattices are learned from a dataset of definitions from Wikipedia. Our method …

OOV detection by joint word/phone lattice alignment H Lin, J Bilmes, D Vergyri… – … Speech Recognition & …, 2007 – ieeexplore.ieee.org … recognition (LVCSR) systems. Our method is based on performing a joint alignment between independently generated word and phone lattices, where the word-lattice is aligned via a recognition lexicon. Based on a similarity …

Word lattices for multi-source translation J Schroeder, T Cohn, P Koehn – Proceedings of the 12th Conference of …, 2009 – dl.acm.org Abstract Multi-source statistical machine translation is the process of generating a single translation from multiple inputs. Previous work has focused primarily on selecting from potential outputs of separate translation systems, and solely on multi-parallel corpora and …

Word lattice reranking for Chinese word segmentation and part-of-speech tagging W Jiang, H Mi, Q Liu – Proceedings of the 22nd International Conference …, 2008 – dl.acm.org Abstract In this paper, we describe a new reranking strategy named word lattice reranking, for the task of joint Chinese word segmentation and part-of-speech (POS) tagging. As a derivation of the forest reranking for parsing (Huang, 2008), this strategy reranks on the …

Conceptual decoding from word lattices: application to the spoken dialogue corpus MEDIA. C Servan, C Raymond, F Béchet… – …, 2006. Abstract Within the framework of the French evaluation program MEDIA on spoken dialogue systems, this paper presents the methods proposed at the LIA for the robust extraction of basic conceptual constituents (or concepts) from an audio message. The conceptual …

From raw corpus to word lattices: robust pre-parsing processing with SXPipe B Sagot, P Boullier – Archives of Control Sciences, 2005 – hal.inria.fr Abstract: We present a robust full-featured architecture to preprocess text before parsing. This architecture, called SxPipe, converts raw noisy corpora into word lattices, one by sentence, that can be used as input by a parser. It includes sequentially named-entity …

Head-driven parsing for word lattices C Collins, B Carpenter, G Penn – … of the 42nd Annual Meeting on …, 2004 – dl.acm.org Abstract We present the first application of the head-driven statistical parsing model of Collins (1999) as a simultaneous language model and parser for large-vocabulary speech recognition. The model is adapted to an online left to right chart-parser for word lattices, …

Towards spoken-document retrieval for the enterprise: Approximate word-lattice indexing with text indexers F Seide, P Yu, Y Shi – Automatic Speech Recognition & …, 2007 – ieeexplore.ieee.org ABSTRACT Enterprise-scale search engines are generally designed for linear text. Linear text is suboptimal for audio search, where accuracy can be significantly improved if the search includes alternate recognition candidates, commonly represented as word lattices. …

What’s in a word graph: evaluation and enhancement of word lattices JW Amtrup, H Heine, U Jost – 2013 – scidok.sulb.uni-saarland.de Abstract During the last few years, word graphs have been gaining increasing interest within the speech community as the primary interface between speech recognizers and language processing modules. Both development and evaluation of graph-producing speech …

Phrase-based translation of speech recognizer word lattices using loglinear model combination E Matusov, H Ney, R Schluter – Automatic Speech Recognition …, 2005 – ieeexplore.ieee.org ABSTRACT This paper presents a phrase-based speech translation system that combines phrasal lexicon, language, and acoustic model features in a loglinear model. Automatic speech recognition and machine translation are coupled by using large word lattices as …

FBK at WMT 2010: Word lattices for morphological reduction and chunk-based reordering C Hardmeier, A Bisazza, M Federico – … of the Joint Fifth Workshop on …, 2010 – dl.acm.org Abstract FBK participated in the WMT 2010 Machine Translation shared task with phrase-based Statistical Machine Translation systems based on the Moses decoder for English-German and German-English translation. Our work concentrates on exploiting the …

ASR word lattice translation with exhaustive reordering is possible E Matusov, B Hoffmeister… – Interspeech, …, 2008. Abstract This paper shows how ASR word lattices can be translated even when exhaustive reordering is required for good translation quality. We propose a method for labeling lattice word hypotheses with position information derived from a confusion network (CN). This …

Conditional use of word lattices, confusion networks and 1-best string hypotheses in a sequential interpretation strategy. B Minescu, G Damnati, F Béchet… – …, 2007. Abstract Within the context of a deployed spoken dialog service, this study presents a new interpretation strategy based on the sequential use of different ASR output representations: 1-best strings, word lattices and confusion networks. The goal is to reject as early as …

Macaon: An NLP tool suite for processing word lattices A Nasr, F Béchet, JF Rey, B Favre… – Proceedings of the 49th …, 2011 – dl.acm.org Abstract MACAON is a tool suite for standard NLP tasks developed for French. MACAON has been designed to process both human-produced text and highly ambiguous word-lattices produced by NLP tools. MACAON is made of several native modules for common …

Solving the pinyin-to-Chinese-character conversion problem based on hybrid word lattice Z Sen – Chinese Journal of Computers, 2007 – en.cnki.com.cn The research and development of the Pinyin-to-Chinese-Character conversion is the core technique of Chinese Input system, Chinese speech recognition and Chinese information processing. First, the state-of-the-art of Pinyin-to-Chinese-Character conversion is briefly …

Word/sub-word lattices decomposition and combination for speech recognition VB Le, S Seng, L Besacier, B Bigi – Acoustics, Speech and …, 2008 – ieeexplore.ieee.org ABSTRACT This paper presents the benefit of using multiple lexical units in the post-processing stage of an ASR system. Since the use of sub-word units can reduce the high out-of-vocabulary rate and improve the lack of text resources in statistical language modeling, …

Combining compound recognition and PCFG-LA parsing with word lattices and conditional random fields M Constant, JL Roux, A Sigogne – ACM Transactions on Speech and …, 2013 – dl.acm.org Abstract The integration of compounds in a parsing procedure has been shown to improve accuracy in an artificial context where such expressions have been perfectly preidentified. This article evaluates two empirical strategies to incorporate such multiword units in a real …

Using word posterior probabilities in lattice translation. V Alabau, A Sanchis, F Casacuberta – IWSLT, 2007. … The system we have developed takes advantage of an improved word lattice representation that uses word posterior probabilities. … In [5] the translation process is performed using as input a word lattice and acoustic recognition scores. …

A decoding algorithm for word lattice translation in speech translation. R Zhang, G Kikui, H Yamamoto, WK Lo – IWSLT, 2005. Abstract We propose a novel statistical machine translation decoding algorithm for speech translation to improve speech translation quality. The algorithm can translate the speech recognition word lattice, where more hypotheses are utilized to bypass the misrecognized …

Efficient determinization of tagged word lattices using categorial and lexicographic semirings I Shafran, R Sproat, M Yarmohammadi… – … (ASRU), 2011 IEEE …, 2011 – ieeexplore.ieee.org Abstract: Speech and language processing systems routinely face the need to apply finite state operations (e.g., POS tagging) on results from intermediate stages (e.g., ASR output) that are naturally represented in a compact lattice form. Currently, such needs are met by …

Integration of speech recognition and machine translation: Speech recognition word lattice translation R Zhang, G Kikui – Speech communication, 2006 – Elsevier An important issue in speech translation is to minimize the negative effect of speech recognition errors on machine translation. We propose a novel statistical machine translation decoding algorithm for speech translation to improve speech translation quality …

Using word lattices to improve translation from morphologically complex languages C Dyer – 2007-04-20, http://www.ling.umd.edu/~redpony/ … – ling.umd.edu Chris Dyer, University of Maryland, April 20, 2007. Outline: … to MT? Prior work; modeling morphology as observational ambiguity; decoding word lattices; experimental results …

Graphical model representations of word lattices G Ji, J Bilmes, K Kirchhoff… – … Workshop, 2006. IEEE, 2006 – ieeexplore.ieee.org ABSTRACT We introduce a method for expressing word lattices within a dynamic graphical model. We describe a variety of choices for doing this, including a technique to relax the time information associated with lattice nodes in a way that trades off hypothesis expansion …

Topic identification from audio recordings using word and phone recognition lattices TJ Hazen, F Richardson… – … Speech Recognition & …, 2007 – ieeexplore.ieee.org … 2. Using phonetic strings generated by phonetic forced alignment of the transcripts (as generated by the MIT SUMMIT recognizer). 3. Using word lattices automatically generated by the MIT SUMMIT word recognition system. …

Lattice generation with accurate word boundary in WFST framework Y Guo, Y Si, Y Liu, J Pan, Y Yan – Image and Signal Processing …, 2012 – ieeexplore.ieee.org … In traditional WFST lattice generation algorithms, the transformation from context-dependent phone lattice to word lattice does not yield accurate time boundaries between words. Meanwhile, this lattice is not a Standard Lattice Format nor is it compatible with existing toolkits. …

Automatic out-of-language detection based on confidence measures derived from LVCSR word and phone lattices. P Motlicek – INTERSPEECH, 2009. … In preliminary work, Cmax measure estimated from LVCSR word lattices has been shown to be the best performing confidence measure for recognition error detection [1]. Frame-based posterior CMs have been used to improve speech recognition performances in hybrid HMM …

Word-lattice based spoken-document indexing with standard text indexers F Seide, K Thambiratnam, RP Yu – … Workshop, 2008. SLT 2008. …, 2008 – ieeexplore.ieee.org ABSTRACT Indexing the spoken content of audio recordings requires automatic speech recognition, which is as of today not reliable. Unlike indexing text, we cannot reliably know from a speech recognizer whether a word is present at a given point in the audio; we can …

Knowledge-Based Word Lattice Rescoring in a Dynamic Context. T Shore, F Faubel, H Helmke… – …, 2012. Abstract Recent advances in automatic speech recognition (ASR) technology continue to be based heavily on data-driven methods, meaning that the full benefits of such research are often not enjoyed in domains for which there is little training data available. Moreover, …

Mining broadcast news data: robust information extraction from word lattices. B Favre, F Béchet, P Nocéra – INTERSPEECH, 2005 – lia.univ-avignon.fr Abstract Fine-grained information extraction performance from spoken corpora is strongly correlated with the Word Error Rate (WER) of the automatic transcriptions processed. Despite the recent advances in Automatic Speech Recognition (ASR) methods, high WER …

Efficient word lattice generation for joint word segmentation and POS tagging in Japanese N Kaji, M Kitsuregawa – Proceedings of IJCNLP, 2013 – aclweb.org Abstract This paper investigates the importance of a word lattice generation algorithm in joint word segmentation and POS tagging. We conducted experiments on three Japanese data sets to demonstrate that the previously proposed pruning-based algorithm is in fact not …

Phrase model training for statistical machine translation with word lattices of preprocessing alternatives J Wuebker, H Ney – Proceedings of the Seventh Workshop on Statistical …, 2012 – dl.acm.org Abstract In statistical machine translation, word lattices are used to represent the ambiguities in the preprocessing of the source sentence, such as word segmentation for Chinese or morphological analysis for German. Several approaches have been proposed to define …

Maximum entropy based normalization of word posteriors for phonetic and LVCSR lattice search P Yu, D Zhang, F Seide – Acoustics, Speech and Signal …, 2006 – ieeexplore.ieee.org … method. Applied to searching LVCSR-based word lattices, the improvement is neglectable, but it is still effective when combining phonetic and word-lattice search in a hybrid mode, yielding an improvement from 46.7% to 65.8%. …

Comparing Different Word Lattice Rescoring Approaches Towards Keyword Spotting J Pinto, H Bourlard, Z De Greve, H Hermansky – 2007 – publications.idiap.ch Abstract: In this paper, we further investigate the large vocabulary continuous speech recognition approach to keyword spotting. Given a speech utterance, recognition is performed to obtain a word lattice. The posterior probability of keyword hypotheses in the …

A hybrid approach to robust word lattice generation via acoustic-based word detection. I Han, C Park, J Cho, J Kim – INTERSPEECH, 2010. Abstract A large-vocabulary continuous speech recognition (LVCSR) system usually utilizes a language model in order to reduce the complexity of the algorithm. However, the constraint also produces side-effects including low accuracy of the out-of-grammar sentences and the …

An Evaluation of Lattice Scoring using a Smoothed Estimate of Word Accuracy MK Omar, L Mangu – Acoustics, Speech and Signal Processing, …, 2007 – ieeexplore.ieee.org … ABSTRACT This paper describes a novel approach for estimating the best hypothesis of a given word lattice, the hypothesis lattice, using another word lattice, the reference lattice, and its application to large vocabulary automatic speech recognition. …

Keyword Spotting in LVCSR Based Word Lattices for Large Multimedia Search Z Tychtl, A Pražák – SPECOM 2007 Proceedings, 2007 – kky.zcu.cz Abstract In the proposed paper the LVCSR based Czech keyword spotting in word lattices is presented in the context of our new scalable distributed system developed for searching the huge multimedia databases. The considered purpose of the system is to allow for the fast …

Statistical Machine Translation without Source-side Parallel Corpus Using Word Lattice and Phrase Extension. T Kusumoto, T Akiba – LREC, 2012 – mt-archive.info Abstract Statistical machine translation (SMT) requires a parallel corpus between the source and target languages. Although a pivot-translation approach can be applied to a language pair that does not have a parallel corpus directly between them, it requires both source– …

Syllable Based Keyword Search: Transducing Syllable Lattices To Word Lattices H Su, J Hieronymus, Y He, E Fosler-Lussier… – icsi.berkeley.edu ABSTRACT This paper presents a weighted finite state transducer (WFST) based syllable decoding and transduction framework for keyword search (KWS). Acoustic context dependent phone models are trained from word forced alignments. Then syllable …

Robust parsing for word lattices in Continuous Speech Recognition systems S Momtazi, H Sameti, M Fazel-Zarandi… – … Processing and Its …, 2007 – ieeexplore.ieee.org ABSTRACT One of the roles of a Natural Language Processing (NLP) model in Continuous Speech Recognition (CSR) systems is to find the best sentence hypothesis by ranking all n-best sentences according to the grammar. This paper describes a robust parsing …

Algorithm of Word-Lattice Parsing Based on Improved CYK-Algorithm Y Sun, L Zhou, Q He, Y Gu, L Jia – Web Information Systems …, 2010 – ieeexplore.ieee.org Abstract: Through thoroughly researching on CYK-algorithm (Cocke-Younger-Kasami) parsing to normal sentence, especially on generating algorithm about initial CYK-table of word-lattice structure, and improving CYK-algorithm by regarding the span of time …

Revising word lattice using support vector machine for Chinese word segmentation M Zhong, S Wang, M Wu – … of the 14th International Conference on …, 2012 – dl.acm.org Abstract This paper presents a novel Chinese word segmentation approach combining both dictionary-based and statistics-based techniques. First, we transform a linear sentence to a word lattice based on dictionary. Then we apply classification method based on support …

An improved minimum word error approach to lattice rescoring and system combination H Xu, J Zhu, X Bao – TENCON 2010-2010 IEEE Region 10 …, 2010 – ieeexplore.ieee.org … presented. The approximated BR criterion is implemented by performing recursive edit distance computation over word lattice, while the decoding result that is obtained by traversing CN satisfies the criterion minimization. Based …

Word-Lattice Parsing Parallel Algorithm S Guodong, G Yuwan, S Yuqiang… – … (DBTA), 2010 2nd …, 2010 – ieeexplore.ieee.org CYK-algorithm (Cocke-Younger-Kasami) of parsing normal sentence, in particular, it is analyzed thoroughly for the generating algorithm about initial CYK-table of word-lattice structure, and CYK-algorithm is improved by the attribute of span of time sequence after …

Lattice-based lexical cues for word fragment detection in conversational speech K Audhkhasi, P Georgiou… – … Speech Recognition & …, 2009 – ieeexplore.ieee.org … task. We hypothesize that the confusion in the word lattice generated by the ASR system can be exploited for detecting word fragments. Two … work. In section III, we discuss the proposed word lattice-based lexical features. Section …

Statistical Word Lattice Translation C Dyer – ling.umd.edu Abstract This paper makes two contributions. First, I show that algorithms for parsing word lattices with context-free grammars (CFGs) can be utilized by MT systems which operate by parsing with a synchronous CFG to enable efficient machine translation of ambiguous …

Shallow parsing on word lattices B Zaborowski – Pacific Voice Conference (PVC), 2014 XXII …, 2014 – ieeexplore.ieee.org Abstract: The article presents preliminary results of a project which aims at developing a new algorithm for shallow parsing of a natural language. The main feature is that the algorithm allows efficient processing of word lattices. The main application of this feature …

Faster graphical model identification of tandem mass spectra using peptide word lattices S Wang, JT Halloran, JA Bilmes, WS Noble – arXiv preprint arXiv: …, 2014 – arxiv.org Abstract: Liquid chromatography coupled with tandem mass spectrometry, also known as shotgun proteomics, is a widely-used high-throughput technology for identifying proteins in complex biological samples. Analysis of the tens of thousands of fragmentation spectra …

Improved open-vocabulary spoken content retrieval with word and subword lattices using acoustic feature similarity H Lee, P Chou, L Lee – Computer Speech & Language, 2014 – Elsevier … in the prior works (Chen et al., 2010, Chen et al., 2011, Tu et al., 2011 and Lee and Lee, 2013) they were formulated based on a relatively limited task in which the query includes only a single in-vocabulary (IV) word, and the whole retrieval process was based on word lattices. …