Neural Language Models 2016


Notes:

  • Bigram neural language model
  • Class-based neural language model
  • Conditional neural language model
  • Context-aware neural language model
  • Factored neural language model
  • Feature-based neural language model
  • Hierarchical neural language model
  • Log-bilinear neural language model
  • Multimodal neural language model
  • Neural language modeling
  • Neural language modelling
  • Neural network language model
  • Probabilistic neural language model
  • Recurrent neural language model
  • Structure-content neural language model

Resources:

  • rwthlm .. a toolkit for feedforward and long short-term memory neural network language modeling

Wikipedia:

  • Convolutional neural network .. a type of feed-forward artificial neural network and a variation of the multilayer perceptron, designed to minimize preprocessing.

See also:

100 Best Deep Learning Videos | 100 Best GitHub: Deep Learning | Deep Learning & Dialog Systems | DNLP (Deep Natural Language Processing) | Natural Language Image Recognition | SkipGrams 2013 | Word2vec Neural Network


Character-Aware Neural Language Models.
Y Kim, Y Jernite, D Sontag, AM Rush – AAAI, 2016 – aaai.org
Abstract We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given to a
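
The architecture sketched in this abstract (a character-level CNN whose pooled output passes through a highway layer before word-level prediction) can be pictured in a few lines of PyTorch. A minimal, illustrative sketch, not the authors' code; all sizes are assumptions:

    import torch
    import torch.nn as nn

    class CharAwareEncoder(nn.Module):
        """Character CNN + highway layer producing a word vector, in the
        spirit of Kim et al. (2016); sizes are illustrative."""
        def __init__(self, n_chars=100, char_dim=15, n_filters=100, kernel=3):
            super().__init__()
            self.char_emb = nn.Embedding(n_chars, char_dim)
            self.conv = nn.Conv1d(char_dim, n_filters, kernel_size=kernel)
            # Highway layer: y = t * relu(Wx) + (1 - t) * x
            self.transform = nn.Linear(n_filters, n_filters)
            self.gate = nn.Linear(n_filters, n_filters)

        def forward(self, char_ids):            # (batch, word_len)
            x = self.char_emb(char_ids)         # (batch, word_len, char_dim)
            x = x.transpose(1, 2)               # (batch, char_dim, word_len)
            x = torch.relu(self.conv(x))        # (batch, n_filters, L')
            x = x.max(dim=2).values             # max-over-time pooling
            t = torch.sigmoid(self.gate(x))
            return t * torch.relu(self.transform(x)) + (1 - t) * x

In the paper, the per-word vectors produced this way feed an LSTM that predicts the next word.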

CUED-RNNLM—An open-source toolkit for efficient training and evaluation of recurrent neural network language models
X Chen, X Liu, Y Qian, MJF Gales… – Acoustics, Speech and …, 2016 – ieeexplore.ieee.org
Abstract: In recent years, recurrent neural network language models (RNNLMs) have become increasingly popular for a range of applications including speech recognition. However, the training of RNNLMs is computationally expensive, which limits the quantity of

Polyglot neural language models: A case study in cross-lingual phonetic representation learning
Y Tsvetkov, S Sitaram, M Faruqui, G Lample… – arXiv preprint arXiv …, 2016 – arxiv.org
Abstract: We introduce polyglot language models, recurrent neural network models trained to predict symbol sequences in many different languages using shared representations of symbols and conditioning on typological information about the language to be predicted. We
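
The conditioning described here (shared parameters across languages plus typological information) can be sketched roughly as follows; the concatenation point, sizes, and the source of the typology vector are assumptions, not the paper's exact design:

    import torch
    import torch.nn as nn

    class PolyglotLM(nn.Module):
        """Sketch: a shared recurrent LM conditioned on a per-language
        typology vector concatenated to each input embedding."""
        def __init__(self, vocab=5000, emb=64, typ_dim=16, hidden=128):
            super().__init__()
            self.embed = nn.Embedding(vocab, emb)
            self.rnn = nn.LSTM(emb + typ_dim, hidden, batch_first=True)
            self.out = nn.Linear(hidden, vocab)

        def forward(self, symbol_ids, typology):    # typology: (batch, typ_dim)
            x = self.embed(symbol_ids)              # (batch, T, emb)
            t = typology.unsqueeze(1).expand(-1, x.size(1), -1)
            h, _ = self.rnn(torch.cat([x, t], dim=-1))
            return self.out(h)                      # logits per position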

Efficient training and evaluation of recurrent neural network language models for automatic speech recognition
X Chen, X Liu, Y Wang, MJF Gales… – … /ACM Transactions on …, 2016 – ieeexplore.ieee.org
Abstract: Recurrent neural network language models (RNNLMs) are becoming increasingly popular for a range of applications including automatic speech recognition. An important issue that limits their possible application areas is the computational cost incurred in training

Minimum word error training of long short-term memory recurrent neural network language models for speech recognition
T Hori, C Hori, S Watanabe… – Acoustics, Speech and …, 2016 – ieeexplore.ieee.org
Abstract: This paper describes minimum word error (MWE) training of recurrent neural network language models (RNNLMs) for speech recognition. RNNLMs are usually trained to minimize a cross entropy of estimated word probabilities against the correct word sequence,
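
For reference, the cross-entropy criterion the abstract contrasts with MWE training minimizes the negative log-probability of the reference word sequence:

    \mathcal{L}_{\mathrm{CE}}(\theta) = -\sum_{t=1}^{T} \log P_{\theta}(w_t \mid w_1, \dots, w_{t-1})

MWE training instead optimizes an objective tied to the expected word error of the recognizer's hypotheses.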

Generalizing and hybridizing count-based and neural language models
G Neubig, C Dyer – arXiv preprint arXiv:1606.00499, 2016 – arxiv.org
Abstract: Language models (LMs) are statistical models that calculate probabilities over sequences of words or other discrete symbols. Currently two major paradigms for language modeling exist: count-based n-gram models, which have advantages of scalability and test-
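
The simplest instance of such hybridization is linear interpolation of the two distributions; a toy Python sketch (the fixed weight lam is an assumption, the paper studies more general, learned mixtures):

    def interpolate(p_ngram, p_neural, lam=0.5):
        """Linear interpolation of an n-gram and a neural LM distribution.
        Both inputs are dicts mapping words to probabilities over the same
        vocabulary; lam is a fixed mixing weight."""
        vocab = set(p_ngram) | set(p_neural)
        return {w: lam * p_ngram.get(w, 0.0) + (1 - lam) * p_neural.get(w, 0.0)
                for w in vocab}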

Two efficient lattice rescoring methods using recurrent neural network language models
X Liu, X Chen, Y Wang, MJF Gales… – IEEE/ACM Transactions …, 2016 – dl.acm.org
Abstract An important part of the language modelling problem for automatic speech recognition (ASR) systems, and many other related applications, is to appropriately model long-distance context dependencies in natural languages. Hence, statistical language

Improving neural language models with a continuous cache
E Grave, A Joulin, N Usunier – arXiv preprint arXiv:1612.04426, 2016 – arxiv.org
Abstract: We propose an extension to neural network language models to adapt their prediction to the recent history. Our model is a simplified version of memory augmented networks, which stores past hidden activations as memory and accesses them through a dot
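
The cache mechanism described above can be sketched in a few lines; the flatness parameter theta and the final interpolation step are assumptions based on the abstract, not the exact published recipe:

    import torch

    def cache_distribution(h_t, past_h, past_words, vocab_size, theta=0.3):
        """Continuous-cache sketch: score each stored hidden state by a
        dot product with the current one, softmax the scores, and sum
        the resulting mass per past word.
        past_h: (n_past, d) float tensor; past_words: (n_past,) LongTensor."""
        scores = torch.softmax(theta * (past_h @ h_t), dim=0)   # (n_past,)
        p_cache = torch.zeros(vocab_size)
        p_cache.index_add_(0, past_words, scores)
        return p_cache

    # Final prediction interpolates with the base LM:
    # p = (1 - lam) * p_model + lam * p_cache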

A Text Clustering Approach of Chinese News Based on Neural Network Language Model
Z Fan, S Chen, L Zha, J Yang – International Journal of Parallel …, 2016 – Springer
Abstract Text clustering plays an important role in data mining and machine learning. After years of development, clustering technology has produced a series of theories and methods. However, in the text clustering of Chinese news, the mainstream LDA method suffers from a high

Authorship Attribution Using a Neural Network Language Model.
Z Ge, Y Sun, MJT Smith – AAAI, 2016 – aaai.org
Abstract In practice, training language models for individual authors is often expensive because of limited data resources. In such cases, Neural Network Language Models (NNLMs) generally outperform the traditional non-parametric N-gram models. Here we

Semantic word embedding neural network language models for automatic speech recognition
K Audhkhasi, A Sethy… – Acoustics, Speech and …, 2016 – ieeexplore.ieee.org
Abstract: Semantic word embeddings have become increasingly important in natural language processing tasks over the last few years. This popularity is due to their ability to easily capture rich semantic information through a distributed representation and the

Incorporating Side Information into Recurrent Neural Network Language Models.
CDV Hoang, T Cohn, G Haffari – HLT-NAACL, 2016 – aclweb.org
Abstract Recurrent neural network language models (RNNLM) have recently demonstrated vast potential in modelling long-term dependencies for NLP problems, ranging from speech recognition to machine translation. In this work, we propose methods for conditioning

Using Factored Word Representation in Neural Network Language Models.
J Niehues, TL Ha, E Cho, A Waibel – WMT, 2016 – pdfs.semanticscholar.org
Abstract Neural network language and translation models have recently shown their great potential in improving the performance of phrase-based machine translation. At the same time, word representations using different word factors have been shown to improve translation quality and are

Investigation on log-linear interpolation of multi-domain neural network language model
Z Tüske, K Irie, R Schlüter, H Ney – Acoustics, Speech and …, 2016 – ieeexplore.ieee.org
Abstract: Inspired by the success of multi-task training in acoustic modeling, this paper investigates a new architecture for a multi-domain neural network based language model (NNLM). The proposed model has several shared hidden layers and domain-specific output
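
The shared-plus-domain-specific architecture the abstract describes might look roughly like this in PyTorch; layer types, sizes, and the domain count are illustrative assumptions:

    import torch.nn as nn

    class MultiDomainLM(nn.Module):
        """Shared hidden layers with one output layer per domain, in the
        spirit of the multi-domain NNLM described above."""
        def __init__(self, vocab=10000, emb=128, hidden=256, n_domains=3):
            super().__init__()
            self.embed = nn.Embedding(vocab, emb)
            self.shared = nn.LSTM(emb, hidden, batch_first=True)
            self.heads = nn.ModuleList(
                [nn.Linear(hidden, vocab) for _ in range(n_domains)])

        def forward(self, word_ids, domain):
            h, _ = self.shared(self.embed(word_ids))
            return self.heads[domain](h)   # logits; the paper studies
                                           # log-linear interpolation of
                                           # the domain outputs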

Poster: Learning models of a network protocol using neural network language models
B Aichernig, R Bloem, F Pernkopf, F Röck… – IEEE Secur …, 2016 – ieee-security.org
Abstract—We present an automatic method to learn models of network protocol implementations which uses only network traces for learning. We employ language modelling techniques to infer a model for legitimate network communication which can then

Multimodal representation: Kneser-ney smoothing/skip-gram based neural language model
M Song, CD Yoo – Image Processing (ICIP), 2016 IEEE …, 2016 – ieeexplore.ieee.org
Abstract: For image retrieval and caption generation, this paper considers a multimodal representation that associates image with its text description (caption) by defining a neural language model as the conditional probability of the next word given both n past words in a

Unsupervised Adaptation of Recurrent Neural Network Language Models.
SR Gangireddy, P Swietojanski, P Bell… – …, 2016 – pdfs.semanticscholar.org
Abstract Recurrent neural network language models (RNNLMs) have been shown to consistently improve Word Error Rates (WERs) of large vocabulary speech recognition systems employing n-gram LMs. In this paper we investigate supervised and unsupervised

Recurrent Neural Network Language Model with Incremental Updated Context Information Generated Using Bag-of-Words Representation.
MA Haidar, M Kurimo – INTERSPEECH, 2016 – pdfs.semanticscholar.org
Abstract Recurrent neural network language model (RNNLM) is becoming popular in state-of-the-art speech recognition systems. However, it cannot remember long-term patterns well due to the so-called vanishing gradient problem. Recently, Bag-of-words (BOW)

Compressing neural language models by sparse word representations
Y Chen, L Mou, Y Xu, G Li, Z Jin – arXiv preprint arXiv:1610.03950, 2016 – arxiv.org
Abstract: Neural networks are among the state-of-the-art techniques for language modeling. Existing neural language models typically map discrete words to distributed, dense vector representations. After information processing of the preceding context words by hidden
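
The compression idea can be pictured concretely: a toy numpy sketch under the assumption that a rare word's vector is stored as a sparse linear combination of common-word embeddings, so only the few nonzero coefficients need to be kept:

    import numpy as np

    def sparse_rare_embedding(common_emb, coeffs):
        """common_emb: (n_common, d) matrix of common-word embeddings.
        coeffs: dict mapping common-word index -> weight (mostly empty).
        Returns the reconstructed rare-word vector."""
        vec = np.zeros(common_emb.shape[1])
        for idx, w in coeffs.items():
            vec += w * common_emb[idx]
        return vec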

Automated structure discovery and parameter tuning of neural network language model based on evolution strategy
T Tanaka, T Moriya, T Shinozaki… – … (SLT), 2016 IEEE, 2016 – ieeexplore.ieee.org
Abstract: Long short-term memory (LSTM) recurrent neural network based language models are known to improve speech recognition performance. However, significant effort is required to optimize network structures and training configurations. In this study, we
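
Structure search by evolution strategy can be pictured as a simple mutate-evaluate-select loop; a generic sketch in which the mutation scheme and the fitness function (e.g., validation perplexity after training a candidate) are assumptions:

    def evolution_strategy(init_config, mutate, fitness,
                           generations=10, offspring=8):
        """Generic (1+lambda) evolution strategy over hyperparameter
        configurations: mutate the current best, keep the fittest.
        fitness() would train a model and return validation perplexity."""
        best = init_config
        best_score = fitness(best)
        for _ in range(generations):
            for _ in range(offspring):
                cand = mutate(best)
                score = fitness(cand)
                if score < best_score:          # lower perplexity is better
                    best, best_score = cand, score
        return best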

Convolutional Neural Network Language Models.
NQ Pham, G Kruszewski, G Boleda – EMNLP, 2016 – aclweb.org
Abstract Convolutional Neural Networks (CNNs) have been shown to yield very strong results in several Computer Vision tasks. Their application to language has received much less attention, and it has mainly focused on static classification tasks, such as sentence

On training bi-directional neural network language model with noise contrastive estimation
T He, Y Zhang, J Droppo, K Yu – Chinese Spoken Language …, 2016 – ieeexplore.ieee.org
Abstract: Although the uni-directional recurrent neural network language model (RNNLM) has been very successful, it’s hard to train a bi-directional RNNLM properly due to the generative nature of language models. In this work, we propose to train a bi-directional RNNLM with noise
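
For context, noise contrastive estimation trains the model to discriminate observed words from samples drawn from a noise distribution, avoiding a full softmax over the vocabulary; a generic sketch of the per-word NCE loss (not the paper's exact formulation):

    import math
    import torch

    def nce_loss(score_data, scores_noise, log_q_data, log_q_noise, k):
        """Generic NCE loss for one target word. score_data: model
        log-score of the observed word; scores_noise: log-scores of the
        k sampled noise words; log_q_*: log-probabilities under the
        noise distribution q."""
        log_k = math.log(k)
        # P(data | w) = sigmoid(score(w) - log(k * q(w)))
        p_data = torch.sigmoid(score_data - log_q_data - log_k)
        p_noise = torch.sigmoid(scores_noise - log_q_noise - log_k)
        return -(torch.log(p_data) + torch.log(1 - p_noise).sum())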

Evolutionary optimization of long short-term memory neural network language model
T Tanaka, T Moriya, T Shinozaki… – The Journal of the …, 2016 – asa.scitation.org
Recurrent neural network language models (RNN-LMs) have recently been proven to produce better performance than conventional N-gram based language models in various speech recognition tasks. In particular, long short-term memory recurrent neural network language

Representation of Relations by Planes in Neural Network Language Model
T Ebisu, R Ichise – International Conference on Neural Information …, 2016 – Springer
Abstract Whole brain architecture (WBA) which uses neural networks to imitate a human brain is attracting increased attention as a promising way to achieve artificial general intelligence, and distributed vector representations of words is becoming recognized as the

A Character-Word Compositional Neural Language Model for Finnish
M Lankinen, H Heikinheimo, P Takala, T Raiko… – arXiv preprint arXiv …, 2016 – arxiv.org
Abstract: Inspired by recent research, we explore ways to model the highly morphological Finnish language at the level of characters while maintaining the performance of word-level models. We propose a new Character-to-Word-to-Character (C2W2C) compositional

Automated Structure Discovery and Parameter Tuning of Neural Network Language Model Based on Evolutionary Strategy
T Takano, T Moriya, T Shinozaki, S Watanabe, T Hori – 2016 – pdfs.semanticscholar.org
Abstract Long short-term memory (LSTM) recurrent neural network based language models are known to improve speech recognition performance. However, significant effort is required to optimize network structures and training configurations. In this study, we

Optimization of topic estimation for the domain adapted neural network language model
A Hagiwara, H Ito, M Ichiki, T Mishima… – The Journal of the …, 2016 – asa.scitation.org
We present a neural network language model adapted for topics fluctuating in broadcast programs. Topic adapted n-gram language models constructed by using latent Dirichlet allocation for topic estimation are widely used. The conventional method estimates topics by

Stochastic Dropout: Activation-level Dropout to Learn Better Neural Language Models
A Nie – cs224d.stanford.edu
Abstract Recurrent Neural Networks are very powerful computational tools that are capable of learning many tasks across different domains. However, they are prone to overfitting and can be very difficult to regularize. Inspired by Recurrent Dropout [1] and Skip-connections [2], we

Opening the vocabulary of neural language models with character-level word representations
M Labeau, A Allauzen – 2016 – openreview.net
Abstract: This paper introduces an architecture for an open-vocabulary neural language model. Word representations are computed on-the-fly by a convolutional network followed by a pooling layer. This allows the model to consider any word, in the context or for the prediction.

Iit bombay’s english-indonesian submission at wat: Integrating neural language models with smt
S Singh, A Kunchukuttan, P Bhattacharyya – Proceedings of the 3rd …, 2016 – aclweb.org
Abstract This paper describes the IIT Bombay’s submission as part of the shared task in WAT 2016 for the English–Indonesian language pair. The results reported here are for both directions of the language pair. Among the various approaches experimented with, Operation

Multi-GPU Based Recurrent Neural Network Language Model Training
X Zhang, N Gu, H Ye – … of Young Computer Scientists, Engineers and …, 2016 – Springer
Abstract Recurrent neural network language models (RNNLMs) have been applied in a wide range of research fields, including natural language processing and speech recognition. One challenge in training RNNLMs is the heavy computational cost of the

Neural Network Language Models for Candidate Scoring in Hybrid Multi-System Machine Translation
M Rikters – The 26th International Conference on …, 2016 – anthology.aclweb.org
Abstract This paper presents the comparison of how using different neural network based language modelling tools for selecting the best candidate fragments affects the final output translation quality in a hybrid multi-system machine translation setup. Experiments were

Integrating Encyclopedic Knowledge into Neural Language Models
Y Zhang, J Niehues, A Waibel – workshop2016.iwslt.org
Abstract Neural models have recently shown big improvements in the performance of phrase-based machine translation. Recurrent language models, in particular, have been a great success due to their ability to model arbitrary long context. In this work, we integrate

IIT Bombay’s English-Indonesian submission at WAT: Integrating Neural Language Models with SMT
S Singh, A Kunchukuttan, P Bhattacharyya – WAT 2016, 2016 – aclweb.org
Abstract This paper describes the IIT Bombay’s submission as part of the shared task in WAT 2016 for the English–Indonesian language pair. The results reported here are for both directions of the language pair. Among the various approaches experimented with, Operation

Compositional Neural Network Language Models for Agglutinative Languages.
E Arisoy, M Saraclar – INTERSPEECH, 2016 – pdfs.semanticscholar.org
Abstract Continuous space language models (CSLMs) have been proven to be successful in speech recognition. With proper training of the word embeddings, words that are semantically or syntactically related are expected to be mapped to nearby locations in the

Recurrent Neural Network Language Model Adaptation Derived Document Vector
W Li, B Kan, W Mak – arXiv preprint arXiv:1611.00196, 2016 – arxiv.org
Abstract: In many natural language processing (NLP) tasks, a document is commonly modeled as a bag of words using the term frequency-inverse document frequency (TF-IDF) vector. One major shortcoming of the frequency-based TF-IDF feature vector is that it ignores
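
The TF-IDF baseline this abstract builds on is easy to state concretely; a minimal sketch:

    import math
    from collections import Counter

    def tfidf(docs):
        """Bag-of-words TF-IDF vectors: term frequency weighted by
        inverse document frequency. docs is a list of token lists."""
        df = Counter(w for doc in docs for w in set(doc))
        n = len(docs)
        vectors = []
        for doc in docs:
            tf = Counter(doc)
            vectors.append({w: (c / len(doc)) * math.log(n / df[w])
                            for w, c in tf.items()})
        return vectors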

The TALP–UPC Spanish–English WMT Biomedical Task: Bilingual Embeddings and Char-based Neural Language Model Rescoring in a Phrase-based …
C Escolano, JAR Fonollosa – pdfs.semanticscholar.org
Abstract This paper describes the TALP–UPC system in the Spanish–English WMT 2016 biomedical shared task. Our system is a standard phrase-based system enhanced with vocabulary expansion using bilingual word embeddings and a character-based neural

The TALP–UPC Spanish–English WMT biomedical task: bilingual embeddings and char-based neural language model rescoring in a phrase-based system
M Ruiz Costa-Jussà, C España-i-Bonet… – ACL 2016: the 54th …, 2016 – upcommons.upc.edu
Abstract This paper describes the TALP–UPC system in the Spanish–English WMT 2016 biomedical shared task. Our system is a standard phrase-based system enhanced with vocabulary expansion using bilingual word embeddings and a character-based neural

Spoken keyword detection using recurrent neural network language model
S Koike, A Lee – The Journal of the Acoustical Society of America, 2016 – asa.scitation.org
Recently, spoken keyword detection (SKD) systems that listen to live audio and try to capture users’ utterances containing specific keywords have been extensively studied, in order to realize a truly usable hands-free speech interface in our lives: “Okay Google” in Google products, “Hey,

Multi-Language Neural Network Language Models.
A Ragni, E Dakin, X Chen, MJF Gales… – …, 2016 – pdfs.semanticscholar.org
Abstract In recent years there has been considerable interest in neural network based language models. These models typically consist of vocabulary dependent input and output layers and one, or more, hidden layers. A standard problem with these networks is that large

Research data supporting “CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models”
X Chen, X Liu, Y Qian, M Gales, P Woodland – 2016 – repository.cam.ac.uk

Interactive Multi-System Machine Translation with Neural Language Models.
M Rikters – DB&IS (Selected Papers), 2016 – books.google.com
Abstract. The tool described in this article has been designed to help machine translation (MT) researchers to combine and evaluate various MT engine outputs through a web-based graphical user interface using syntactic analysis and language modelling. The tool supports

Neural Network Language Models–an Overview
A Popov – The Workshop on Deep Language Processing for …, 2016 – di.fc.ul.pt
Abstract. This article presents a brief overview of different approaches to training language models for natural language processing, using neural networks. It contrasts the advantages and disadvantages of traditional count-based approaches and newer, neural ones. Then it

WAT: Integrating neural language models with SMT
S Singh, A Kunchukuttan, P Bhattacharyya – 2016 – lotus.kuee.kyoto-u.ac.jp
IIT Bombay’s English-Indonesian submission at WAT: Integrating neural language models with SMT. Sandhya Singh, Anoop Kunchukuttan, Pushpak Bhattacharyya …

Convolutional neural network language models
G Boleda Torrent, G Kruszewski – Proceedings of the 2016 …, 2016 – repositori.upf.edu
Convolutional Neural Networks (CNNs) have been shown to yield very strong results in several Computer Vision tasks. Their application to language has received much less attention, and it has mainly focused on static classification tasks, such as sentence classification for

Integrating Encyclopedic Knowledge into Neural Network Language Models
Y Zhang – 2016 – isl.anthropomatik.kit.edu
Abstract Machine translation deals with translation done by computers and has been an ongoing research topic for decades. Recently, phrase-based machine translation was revolutionized by significant improvements of neural models. Recurrent language models, in

Word Sense Disambiguation with Neural Language Models.
D Yuan, R Doherty, J Richardson, C Evans, E Altendorf – pdfs.semanticscholar.org
Abstract Determining the intended sense of words in text–word sense disambiguation (WSD)–is a long-standing problem in natural language processing. In this paper, we present WSD algorithms which use neural network language models to achieve state-of-the-art

Comparison of Various Neural Network Language Models in Speech Recognition
L Zuo, X Wan, J Liu – Information Science and Control …, 2016 – ieeexplore.ieee.org
Abstract: In recent years, research on language modeling for speech recognition has increasingly focused on the application of neural networks. However, the performance of neural network language models strongly depends on their architectural structure. Three

Deep-Deep Neural Network Language Models for Predicting Mild Cognitive Impairment.
SO Orimaye, JSM Wong, JSG Fernandez – BAI@ IJCAI, 2016 – pdfs.semanticscholar.org
Abstract Early diagnosis of Mild Cognitive Impairment (MCI) remains a challenge. Currently, MCI is diagnosed using specific clinical diagnostic criteria and neuropsychological examinations. As such, we propose an automated diagnostic technique

Domain Specific Author Attribution Based on Feedforward Neural Network Language Models
Z Ge, Y Sun – arXiv preprint arXiv:1602.07393, 2016 – arxiv.org
Abstract: Authorship attribution refers to the task of automatically determining the author based on a given sample of text. It is a problem with a long history and a wide range of applications. Building author profiles using language models is one of the most successful
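
Profile-based attribution with language models usually reduces to a perplexity comparison: train one model per author and pick the best-fitting one. A sketch under that standard assumption (the per-author model interface is hypothetical):

    def attribute_author(text, author_models):
        """author_models: dict mapping author -> language model exposing
        a perplexity(text) method (illustrative interface). The text is
        attributed to the author whose model fits it best."""
        return min(author_models,
                   key=lambda a: author_models[a].perplexity(text))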

A Vietnamese language model based on Recurrent Neural Network
VT Tran, KH Nguyen, DH Bui – Knowledge and Systems …, 2016 – ieeexplore.ieee.org
… B. Evaluation We built four neural language models on the Tensorflow framework [18], with three variable … This paper presents a neural language model for Vietnamese … [7] T. Mikolov and G. Zweig, “Context dependent recurrent neural network language model,” in 2012 IEEE …

Evolution Strategy Based Neural Network Optimization and LSTM Language Model for Robust Speech Recognition
T Tanaka, T Shinozaki, S Watanabe, T Hori – cit. on, 2016 – spandh.dcs.shef.ac.uk
… of neural network language model based on evolution strategy,” in IEEE Workshop on Spoken Language Technology (SLT), 2016, (accepted) … and JR Hershey, “Minimum word error training of long short-term memory recurrent neural network language models for speech …

A hybrid language model based on a recurrent neural network and probabilistic topic modeling
MS Kudinov, AA Romanenko – Pattern Recognition and Image Analysis, 2016 – Springer
… All measures are summarized in Table 1. The recurrent neural network language model and hybrid maximum entropy model were trained on a subcorpus of the Lenta.ru corpus … this technique to recurrent neural network language models …

Improving accented Mandarin speech recognition by using recurrent neural network based language model adaptation
H Ni, J Yi, Z Wen, B Liu, J Tao – Chinese Spoken Language …, 2016 – ieeexplore.ieee.org
… However, neural network language model (NNLM) [1, 2] can represent the non-linear relationship between input words and word probabilities through projecting input words into a continuous space and estimating word probabilities in a softmax layer …
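
The NNLM mechanism this snippet names, projecting context words into a continuous space and scoring the next word with a softmax, is the classic feedforward design; an illustrative PyTorch sketch with assumed sizes:

    import torch.nn as nn

    class FeedForwardNNLM(nn.Module):
        """Classic feedforward NNLM: embed the context words, pass the
        concatenated embeddings through a hidden layer, and estimate
        next-word probabilities with a softmax (applied in the loss)."""
        def __init__(self, vocab=10000, emb=100, hidden=200, context=3):
            super().__init__()
            self.embed = nn.Embedding(vocab, emb)
            self.net = nn.Sequential(
                nn.Linear(context * emb, hidden),
                nn.Tanh(),
                nn.Linear(hidden, vocab))

        def forward(self, context_ids):        # (batch, context)
            x = self.embed(context_ids).flatten(1)
            return self.net(x)                 # logits over the vocabulary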

Recurrent neural network-based language models with variation in net topology, language, and granularity
TH Yang, TH Tseng, CP Chen – Asian Language Processing …, 2016 – ieeexplore.ieee.org
… Mikolov, S. Kombrink, L. Burget, JH Cernocky, and S. Khudanpur, ”Extensions of Recurrent Neural Network Language Model,” Proc … CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models,” In Proceedings …

Recurrent Neural Network Based Language Model Adaptation for Accent Mandarin Speech
H Ni, J Yi, Z Wen, J Tao – Chinese Conference on Pattern Recognition, 2016 – Springer
… However, neural network language model (NNLM) [1, 2] can represent the non-linear relationship between input words and word probabilities through projecting input words into a continuous space and estimating word probabilities in a softmax layer …

NN-grams: Unifying neural network and n-gram language models for speech recognition
B Damavandi, S Kumar, N Shazeer… – arXiv preprint arXiv …, 2016 – arxiv.org
… Unlike other neural network language modeling approaches [4, 5, 13], there is no explicit softmax over the vocabulary. 2.1. Model Estimation … 2007. [6] E. Arisoy, TN Sainath, B. Kingsbury, and B. Ramabhadran, “Deep neural network language models,” in Proceedings of the …

Investigation of Back-off Based Interpolation Between Recurrent Neural Network and n-gram Language Models (Author’s Manuscript)
X Chen, X Liu, MJF Gales, PC Woodland – 2016 – dtic.mil
… cam.ac.uk ABSTRACT Recurrent neural network language models (RNNLMs) have become an increasingly popular choice for speech and language processing tasks including automatic speech recognition (ASR). As the gener …

Fast Gated Neural Domain Adaptation: Language Model as a Case Study.
J Zhang, X Wu, A Way, Q Liu – COLING, 2016 – pdfs.semanticscholar.org
… In our experiments, we use the recurrent neural network language model (LM) as a case study … 2.2 Neural Language Model In a nutshell, the Recurrent Neural Network (RNN) language model (LM) uses the previous words to estimate the probability of the next word …

Parallel Randomized Block Coordinate Descent for Neural Probabilistic Language Model with High-Dimensional Output Targets
X Liu, J Yan, X Wang, H Zha – Chinese Conference on Pattern …, 2016 – Springer
… to the neural network with high-dimensional output targets, such as the neural language model … Morin, F., Bengio, Y.: Hierarchical probabilistic neural network language model, pp … Schwenk, H., Gauvain, JL: Training neural network language models on very large corpora, pp …

Adaptation of Language Models for SMT Using Neural Networks with Topic Information
Y Zhao, S Huang, XY Dai, J Chen – ACM Transactions on Asian and …, 2016 – dl.acm.org
… Neural network language models (LMs) are shown to be effective in improving the performance … of the state-of-the-art feedforward neural language model with topic … Key Words and Phrases: Statistical machine translation, feedforward neural network language model, topic model …

A Primer on Neural Network Models for Natural Language Processing.
Y Goldberg – J. Artif. Intell. Res.(JAIR), 2016 – jair.org
Journal of Artificial Intelligence Research 57 (2016) 345–420. Submitted 9/15; published 11/16. A Primer on Neural Network Models for Natural Language Processing. Yoav Goldberg, Computer Science Department, Bar-Ilan University, Israel …
