Topic Modeling & Natural Language Generation

Notes:

Topic modeling is a statistical method for identifying the underlying topics in a collection of documents. Topic models are variously described as Bayesian, generative, latent, or probabilistic; these labels overlap, and a single model such as LDA is all four. Such models can be used to describe each document as a mixture of topics discovered from the data or, in the generative case, to produce new documents based on the underlying structure of the data.

Three related tasks appear throughout these notes. Dialog modeling is a machine learning technique for modeling the structure and content of dialogues between humans or between humans and machines. Query modeling is a machine learning technique for predicting the likelihood that a user will issue a specific query, based on their previous queries and other contextual information. Topic summarization is the process of generating a summary of a collection of documents based on their underlying topics, using either extractive or abstractive methods.

Topic modeling can be used as a component of natural language generation systems to generate coherent and relevant text based on a given topic or theme. In this context, topic modeling is used to identify the underlying topics in a collection of documents and to estimate the probability distribution over those topics. This information can be used to guide the generation of text by selecting words and phrases that are relevant to the given topic and by ensuring that the generated text is coherent and flows naturally.

For example, a natural language generation system that is generating a news article about a specific topic might use topic modeling to identify the key themes and ideas in a collection of related articles. It could then use this information to generate a new article that covers the same themes and ideas, but with a different perspective or emphasis.

Topic modeling can also improve the relevance of generated text by steering word choice toward vocabulary that is appropriate for the given topic. This can be done by training a topic model on a large collection of documents relevant to the target topic and using the learned topic-specific word distributions to bias generation. Because most topic models treat a document as a bag of words, they inform what the text is about rather than its syntax, which must come from the generator itself.
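As a minimal sketch of how topic-specific word distributions could bias word choice, the Python below samples words from a hand-written distribution. The topics, words, probabilities, and the names TOPIC_WORDS and sample_words are all assumptions made for illustration; a real system would learn the distributions from data and combine them with a language model.

```python
import random

# Hypothetical topic-word distributions, standing in for what a trained
# topic model might produce. Each topic's probabilities sum to 1.
TOPIC_WORDS = {
    "sports": {"team": 0.4, "game": 0.3, "score": 0.2, "coach": 0.1},
    "finance": {"market": 0.4, "stock": 0.3, "bank": 0.2, "rate": 0.1},
}


def sample_words(topic, n, rng):
    """Draw n words from the word distribution of the given topic."""
    dist = TOPIC_WORDS[topic]
    words = list(dist)
    weights = [dist[w] for w in words]
    return rng.choices(words, weights=weights, k=n)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
finance_words = sample_words("finance", 5, rng)
print(finance_words)
```

The sampled words are on-topic but unordered, which illustrates why the topic model guides content while fluency has to come from the generator.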

Neural architectures have been shown to be effective in many natural language processing tasks, including topic modeling. In recent years there has been a trend toward neural topic models, because they can learn complex relationships between words and topics and can be trained on large datasets.

There are several approaches to using neural networks for topic modeling. One approach is an encoder-decoder (autoencoder-style) architecture, in which the encoder maps each input document to a topic distribution and the decoder reconstructs the document's words from that distribution. Another approach uses a transformer, a type of neural network that has been successful in a variety of natural language processing tasks, for example to produce document representations from which topics are derived.

Research in this area is ongoing, and the performance of different approaches varies with the task and dataset. In general, however, neural architectures have been shown to be effective for topic modeling and have the potential to outperform traditional approaches.

Bayesian topic modeling identifies the latent topics in a collection of documents using Bayesian statistics, which treats unknown quantities as random variables with prior distributions and updates them in light of observed data. The goal is to infer the posterior distribution over the latent topics given the observed words. The model is usually specified as a probabilistic graphical model, which represents the dependencies among the variables in the model.
Dialog modeling is a machine learning technique for modeling the structure and content of dialogues between humans or between humans and machines. It is used in applications such as chatbots and intelligent assistants to understand user input and generate natural language responses.
A generative topic model is a statistical model that describes how the documents in a collection could have been produced from a set of underlying topics. Because it learns the underlying structure of the data as a probability distribution, it can be used both to identify the topics of existing documents and to generate new documents similar to those in the original collection.
A latent topic model is a statistical model that identifies the underlying topics in a collection of documents without the topics being specified in advance. The model learns the topics from the data and represents them as latent variables, that is, hidden variables that are not directly observed. The goal is to infer the distribution over these latent variables given the observed data.
LDA (Latent Dirichlet Allocation) is a probabilistic topic model. It assumes that each document is a mixture of a small number of topics and that each word in the document is drawn from one of those topics. Fitting an LDA model means estimating the probability distribution over topics for each document and the probability distribution over words for each topic. The model is defined by a generative process: for each document, a topic mixture is drawn from a Dirichlet prior; then, for each word position, a topic is drawn from that mixture and the word is drawn from that topic's word distribution.
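The generative process in the LDA entry can be sketched in a few lines of Python. This is an illustrative sketch only: the vocabulary, the two topic-word distributions, and the value of alpha are invented, and the topic-word distributions are held fixed here, whereas full LDA also places a Dirichlet prior on them and learns everything from data.

```python
import random


def sample_dirichlet(alphas, rng):
    """Sample from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, vocab, alpha, length, rng):
    """Generate one document by following the LDA generative story."""
    num_topics = len(topic_word)
    # 1. Draw the document's topic mixture theta ~ Dirichlet(alpha).
    theta = sample_dirichlet([alpha] * num_topics, rng)
    words = []
    for _ in range(length):
        # 2. Draw a topic assignment z ~ Categorical(theta).
        z = rng.choices(range(num_topics), weights=theta, k=1)[0]
        # 3. Draw a word w ~ Categorical(topic_word[z]).
        words.append(rng.choices(vocab, weights=topic_word[z], k=1)[0])
    return theta, words


# Hypothetical vocabulary and per-topic word probabilities (rows sum to 1).
vocab = ["goal", "match", "team", "stock", "market", "bank"]
topic_word = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # a "sports"-like topic
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],  # a "finance"-like topic
]

rng = random.Random(42)
theta, doc = generate_document(topic_word, vocab, alpha=0.5, length=8, rng=rng)
print(theta, doc)
```

Inference in LDA runs this story in reverse: given only the words, it estimates the topic mixtures and topic-word distributions that most plausibly produced them.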
A pattern-based topic model is a statistical model that identifies the underlying patterns or structures in a collection of documents, typically representing topics by patterns of co-occurring words rather than by individual words. It is based on the idea that the words in a document are not independent of each other, but rather are related to each other in some way.
A probabilistic topic model is a statistical model that identifies the underlying topics in a collection of documents and estimates the probability distribution over those topics given the observed data. It is called probabilistic because the relationships among documents, topics, and words are expressed as probabilities rather than as hard assignments.
Query modeling is a machine learning technique for predicting the likelihood that a user will issue a specific query, based on their previous queries and other contextual information. It is used in search engines and other information retrieval systems to improve the accuracy and relevance of search results.
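One very simple way to make the query modeling entry concrete is a bigram model over query sessions, which estimates the probability of the next query from the previous one alone. The sessions and the function names train and next_query_probability below are made up for illustration; practical systems use far richer context than the preceding query.

```python
from collections import Counter, defaultdict


def train(sessions):
    """Count how often each query follows each previous query."""
    counts = defaultdict(Counter)
    for session in sessions:
        for prev, nxt in zip(session, session[1:]):
            counts[prev][nxt] += 1
    return counts


def next_query_probability(counts, prev, nxt):
    """Maximum-likelihood estimate of P(next query | previous query)."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


# Hypothetical search sessions, each an ordered list of queries.
sessions = [
    ["lda", "lda python", "gensim lda"],
    ["lda", "lda python"],
    ["lda", "topic model"],
]

counts = train(sessions)
p = next_query_probability(counts, "lda", "lda python")
print(p)
```

In this toy data, "lda" is followed by "lda python" in two of three sessions, so the estimate is two thirds; a previous query never seen in training yields 0.0.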
A sentence-based topic model is a statistical model that identifies the underlying topics in a collection of documents by analyzing the words and phrases in each sentence, treating the sentence as a unit of analysis alongside the document. It is based on the idea that the meaning of a sentence is conveyed by the combination of words and phrases it contains, and that these are related to one another.
A supervised topic model is a statistical model that identifies the underlying topics in a collection of labeled documents. It is called supervised because it requires labeled data: the documents in the collection have been annotated, for example with categories or tags. Supervised topic models can be used to predict the labels of new, unseen documents based on the labeled training data.
A topic model is a statistical model that identifies the underlying topics in a collection of documents. It is based on the idea that the words in a document are not independent of each other, but tend to co-occur in ways that reflect what the document is about. The goal of a topic model is to discover these co-occurrence patterns and use them to describe each document in terms of a set of topics learned from the data.
Topic model clustering is a method of grouping similar documents together based on their topic distribution. It is often used in conjunction with a topic model to group together documents that are about the same topic or that share similar themes. Clustering can be performed using various techniques, such as k-means clustering or hierarchical clustering.
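A rough sketch of clustering documents by their topic distributions, using a small hand-rolled k-means. The document-topic vectors are invented for illustration and the helper names are hypothetical; in practice the vectors would come from a fitted topic model and a library implementation of k-means would normally be used.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means; returns one cluster index per input point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


# Hypothetical document-topic distributions over two topics.
doc_topics = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
    [0.10, 0.90], [0.20, 0.80], [0.15, 0.85],
]

labels = kmeans(doc_topics, k=2)
print(labels)
```

The first three documents, dominated by the first topic, end up in one cluster and the last three in the other; which cluster receives index 0 depends on the random initialization.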
Topic modeling is the technique of fitting a topic model to a collection of documents in order to identify its underlying topics. The topics are discovered from word co-occurrence patterns in the data rather than defined in advance, and each document is then described in terms of those topics.
Topic modelling is the British spelling of topic modeling. It is the process of using a statistical model to identify the underlying topics in a collection of documents.
Topic models are statistical models that are used to identify the underlying topics in a collection of documents. They can be based on various approaches, such as latent Dirichlet allocation (LDA), probabilistic latent semantic analysis (PLSA), or non-negative matrix factorization (NMF).
Topic summarization is the process of generating a summary of a collection of documents based on their underlying topics. It is often used to extract the main points or themes from a large number of documents and to present them in a concise and easy-to-understand format. There are various techniques that can be used for topic summarization, such as extractive summarization, which involves selecting and condensing the most important sentences from the documents, and abstractive summarization, which involves generating a new summary by paraphrasing and synthesizing the content of the documents.
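To illustrate the extractive side of summarization, the sketch below scores sentences by the average corpus frequency of their non-stopword terms and keeps the top-scoring ones. This frequency heuristic is a deliberately simple stand-in, not a topic-model-based method, and the stopword list, sample text, and function name are assumptions made for the example.

```python
import re
from collections import Counter

# Small illustrative stopword list; real systems use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "on", "it"}


def summarize(text, num_sentences=1):
    """Return the highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tokenized = [
        [w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOPWORDS]
        for s in sentences
    ]
    # Word frequencies over the whole text act as a crude importance signal.
    freq = Counter(w for words in tokenized for w in words)
    scores = [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for words in tokenized
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:num_sentences])]


text = (
    "Topic models find topics in documents. "
    "Topic models assign documents to topics. "
    "The weather was pleasant yesterday."
)
summary = summarize(text, num_sentences=1)
print(summary)
```

An abstractive summarizer would instead generate new sentences, which requires a natural language generation component rather than sentence selection.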

Resources:

radimrehurek.com/gensim .. topic modelling for humans

A review of natural language processing techniques for opinion mining systems
S Sun, C Luo, J Chen – Information Fusion, 2017 – Elsevier

Coherent Dialogue with Attention-Based Language Models.
H Mei, M Bansal, MR Walter – AAAI, 2017 – aaai.org
… distance memory. We promote further coherence via topic modeling-based reranking. Introduction … latent topics. We employ the learned topic model to select the best continuation based on document-level topic-matching. We …

GuessWhat?! Visual object discovery through multi-modal dialogue
H De Vries, F Strub, S Chandar, O Pietquin… – Proc. of …, 2017 – openaccess.thecvf.com
Harm de Vries, University of Montreal, mail@harmdevries.com; Florian Strub, Univ. Lille, CNRS, Centrale Lille, Inria, UMR 9189 CRIStAL, florian.strub@inria.fr …

Text summarization techniques: A brief survey
M Allahyari, S Pouriyeh, M Assefi, S Safaei… – arXiv preprint arXiv …, 2017 – arxiv.org
… KEYWORDS text summarization, knowledge bases, topic models ACM Reference format: Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, Saeid … rization methods cope with problems such as semantic representation, inference and natural language generation which are …

Towards automating data narratives
Y Gil, D Garijo – Proceedings of the 22nd International Conference on …, 2017 – dl.acm.org
… The dataset to be described is a topic model as a term-topic matrix generated with a Latent Dirichlet Allocation (LDA) algorithm [6] and visualized with Termite [11] … Explanation Patterns We follow natural language generation techniques based on explanation patterns [12] …

Natural language processing in mental health applications using non-clinical texts
RA Calvo, DN Milne, MS Hussain… – Natural Language …, 2017 – cambridge.org
… Topic models created with the Twitter data improved the accuracy of life satisfaction predictions based on the demographic controls (county-by-county scores for age, sex, ethnicity, income, education) which were in turn more predictive than the prevalence of words from emotion …

Piecewise latent variables for neural variational text processing
IV Serban, AG Ororbia, J Pineau… – Proceedings of the 2017 …, 2017 – aclweb.org
… With respect to document modeling, neural architectures have been shown to outperform well-established topic models such as … Researchers have also investigated latent variable models for dialogue modeling and dialogue natural language generation (Bangalore et al …

Autofolding for source code summarization
J Fowkes, P Chanthirasegaran, R Ranca… – IEEE Transactions …, 2017 – ieeexplore.ieee.org
… Recent work has applied language modelling [20], [21], [22], [23], [24], natural language generation [12], [25], machine translation [26], and topic modelling [27] to the text of source code from large software projects. A main challenge in this area Fig. 1. Original source code …

SMS spam filtering and thread identification using bi-level text classification and clustering techniques
NK Nagwani, A Sharaff – Journal of Information Science, 2017 – journals.sagepub.com
SMS spam detection is an important task where spam SMS messages are identified and filtered. As greater numbers of SMS messages are communicated every day, it i…

A roadmap for natural language processing research in information systems
D Liu, Y Li, MA Thomas – … of the 50th …, 2017 – hl-128-171-57-22.library.manoa …
… 23, 2010, pp. 321-344. [8] E. Reiter, R. Dale and Z. Feng, Building natural language generation systems, MIT Press, 2000. [9] T. Winograd, “Understanding natural language”, Cognitive psychology, 3, 1972, pp. 1-191. [10] J. Cuzzola …

Training end-to-end dialogue systems with the ubuntu dialogue corpus
RT Lowe, N Pow, IV Serban, L Charlin… – Dialogue & …, 2017 – dad.uni-bielefeld.de
… Luan et al. (2016) investigate several models that incorporate participant roles, using topic-modelling based approaches with LDA. Li et al. (2016a) use an embedding for each separate speaker in the conversation, which is used to condition the decoder in an LSTM model …

Cross-media analysis and reasoning: advances and directions
Y Peng, W Zhu, Y Zhao, C Xu, Q Huang, H Lu… – Frontiers of Information …, 2017 – Springer
… learning techniques. The topic model is another frequently used technique in cross-media uniform representation learning tasks, assuming that heterogeneous data containing the same semantics shares some latent topics. For …

Natural language processing
K Sirts – 2017 – courses.cs.ut.ee
… for NLP • NLTK (general text processing) • Gensim (word embeddings, topic models) • Log-linear models/CRF • Keras/tensorflow (for neural networks) 16 … Information retrieval • Machine translation • Natural language generation • Text summarization • Dialog systems 23 …

Data-driven broad-coverage grammars for opinionated natural language generation (ONLG)
T Cagan, SL Frank, R Tsarfaty – Proceedings of the 55th Annual Meeting …, 2017 – aclweb.org
… Opinionated natural language generation (ONLG) is a new, challenging, NLG task in which we aim to automatically generate human-like … We also show that conditioning the generation on topic models makes generated responses more relevant to the document content …

Event detection and semantic storytelling: generating a travelogue from a large collection of personal letters
G Rehm, JM Schneider, A Srivastava… – Proceedings of the …, 2017 – aclweb.org
Proceedings of the Events and Stories in the News Workshop, pages 42–51, Vancouver, Canada, August 4, 2017. © 2017 Association for Computational Linguistics …

Semantic summary automatic generation in news event
W Liu, X Luo, J Zhang, R Xue… – … : Practice and Experience, 2017 – Wiley Online Library
… based approaches are consistent with the logic of human generating summarization, it is difficult to automatically generate summarization with human logic due to the limitations of natural language generation technology … 2.3 Probabilistic topic model-based approaches …

Sounding Board–University of Washington’s Alexa Prize Submission
H Fang, H Cheng, E Clark… – Alexa Prize …, 2017 – pdfs.semanticscholar.org
… The instruction speech acts are help messages depending on the dialogue state and error detection. 2.3 Natural language generation … arXiv:1602.03606 [cs.CL], 2016. [13] Radim Řehůřek and Petr Sojka. Software framework for topic modelling with large corpora. In Proc …

Generating and Evaluating Summaries for Partial Email Threads: Conversational Bayesian Surprise and Silver Standards
J Johnson, V Masrani, G Carenini, R Ng – Proceedings of the 18th …, 2017 – aclweb.org
… Others make use of more advanced methods including topic modeling, latent semantic analysis or rhetorical parsing (Nagwani, 2015; Kireyev, 2008; Hirao et al., 2013) … Our silver standard system also makes use of topic modeling …

Multilingual extension and evaluation of a poetry generator
HG Oliveira, R Hervás, A Díaz… – Natural Language …, 2017 – cambridge.org
… Abstract Poetry generation is a specific kind of natural language generation where several sources of knowledge are typically exploited to handle features on different levels, such as syntax, semantics, form or aesthetics. But …

Computer Vision and Natural Language Processing: Recent Approaches in Multimedia and Robotics
P Wiriyathammabhum, D Summers-Stay… – ACM Computing …, 2017 – dl.acm.org
Peratham Wiriyathammabhum, University of Maryland, College Park; Douglas Summers-Stay, US …

Alleviating overfitting for polysemous words for word representation estimation using lexicons
Y Ke, M Hagiwara – Neural Networks (IJCNN), 2017 …, 2017 – ieeexplore.ieee.org
… reported useful in many other tasks such as recognizing textual entailment [31], [32], measuring the semantic similarity [33]–[35], monolingual alignment [36], [37], and natural language generation [38] … 17] R. Das, M. Zaheer, and C. Dyer, “Gaussian lda for topic models with word …

Conclusion and Future Work
R Shah, R Zimmermann – … Analysis of User-Generated Multimedia Content, 2017 – Springer
… Thus, in the future, we would also like to focus on the topic modeling for different segments in lecture videos … [31, 32] used topic modeling to map videos (eg, YouTube and VideoLectures.Net) and blogs (Wikipedia and Edublogs) in the common semantic space of topics …

Variational Deep Semantic Hashing for Text Documents
S Chaidaroon, Y Fang – arXiv preprint arXiv:1708.03436, 2017 – arxiv.org
… It has attracted more and more attention in recent years. For example, Wang et al. [36] propose Semantic Hashing using Tags and Topic Modeling (SHTTM) to incorporate tags to obtain more effective hashing codes via a matrix factorization formulation …

A conceptual review of Automatic Question Generation from a given Punjabi Text
AS Gill, G kaur Virk, A Bhandari – 2017 – ijetsr.com
… research areas like machine translation, information retrieval, text summarization, question answering, information extraction, topic modeling, and opinion … 1.2 NLG AND CONCEPTS 1.2.1 Natural Language Generation (NLG) David Lindberg, 2010 explains NLG as part of …

Modeling and Prediction of People’s Needs (Vision Paper)
R Li, A Züfle, L Zhao, G Lamprianidis – … on Analytics for Local Events and …, 2017 – dl.acm.org
… One may then build temporal geographical topic models as introduced in Section 3.2 for those two groups … For other locations with the same geospatial property, natural language generation approaches [13] can be adopted and trained based on current petitions to generate …

Short Text Summarization using Topic Modeling Algorithm
N Sahu, V Chawra – ijetmas.com
… Short Text Summarization using Topic Modeling Algorithm … In general, abstraction can shorten a text more strongly than extraction summaries can do, but the programs that can do is harder to generate as they need for the usage of natural language generation technology, which …

Semantic/Content Analysis/Natural Language Processing
P Nulty – Encyclopedia of Big Data, 2017 – Springer
… Language modeling is particularly important for natural language generation and speech recognition problems … Topic modeling (Blei 2012) is a widely used generative technique to discover a set of topics that influence the generation of the texts, and explore how they are …

Building Emotional Conversation Systems Using Multi-task Seq2Seq Learning
R Zhang, Z Wang, D Mai – National CCF Conference on Natural Language …, 2017 – Springer
… We train a LDA topic model using gensim, and calculate text similarity using both topic similarity and cosine similarity … Wen, TH, Gasic, M., Mrksic, N., Su, PH, Vandyke, D., Young, S.: Semantically conditioned LSTM-based natural language generation for spoken …

Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
R Levy, L Specia – Proceedings of the 21st Conference on …, 2017 – aclweb.org
… 195 An Automatic Approach for Document-level Topic Model Evaluation Shraey Bhatia, Jey Han Lau and Timothy Baldwin … 432 Natural Language Generation for Spoken Dialogue System using RNN Encoder-Decoder Networks Van-Khanh Tran and Le-Minh Nguyen …

Natural Language Processing: State of The Art, Current Trends and Challenges
D Khurana, A Koli, K Khatter, S Singh – arXiv preprint arXiv:1708.05148, 2017 – arxiv.org
… 3. Natural Language Generation Natural Language Generation (NLG) is the process of producing phrases, sentences and paragraphs that are meaningful from an internal representation … Few techniques are as follows– – Bayesian Sentence based Topic Model (BSTM) uses …

Study on Multi Document Summarization by Machine Learning Technique for Clustered Documents
S Kasundra, DL Kotak – 2017 – pdfs.semanticscholar.org
… Where Abstractive summarization requires natural language processing techniques such as semantic representation, natural language generation, and compression techniques … In Paper [3] , In this paper, they proposed a novel pattern-based topic model (PBTMSum) for the task …

An assisted literature review using machine learning models to identify and build a literature corpus
R Brisebois, A Abran, A Nadembega… – International Journal of …, 2017 – ijesi.org
… For example, Carlos and Thiago [2] developed a supervised MLM-based solution for text mining scientific articles using the R language in “Knowledge Extraction and Machine Learning” based on social network analysis, topic models and bipartite graph approaches …

Tree and word embedding based sentence similarity for evaluation of good answers in intelligent tutoring system
E Brajković, D Vasić – Software, Telecommunications and …, 2017 – ieeexplore.ieee.org
… Recently there has been large advances in Natural Language Understanding (NLU) and Natural Language Generation (NLG), these advances can be used to enhance communication between … [9] R. Řehůřek and P. Sojka, “Software Framework for Topic Modelling with Large …

Doctoral Advisor or Medical Condition: Towards Entity-Specific Rankings of Knowledge Base Properties
S Razniewski, V Balaraman, W Nutt – International Conference on …, 2017 – Springer
… the properties that do not yet have facts are the ones that should be ranked, while for the natural language generation task, it is … To compute semantic similarity, we rely on standard latent topic models, in particular Latent Semantic Indexing (LSI) [9] and Latent Dirichlet Allocation …

I Read the News Today, Oh Boy
T Veale, H Chen, G Li – International Conference on Distributed, Ambient …, 2017 – Springer
… Řehůřek, R., Sojka, P.: Software framework for topic modeling with large corpora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pp. 45–50 (2010). Reiter, E., Dale, R.: Building Natural Language Generation Systems …

Towards Abstractive Multi-Document Summarization Using Submodular Function-Based Framework, Sentence Compression and Merging
Y Chali, M Tanvee, MT Nayeem – Proceedings of the Eighth …, 2017 – aclweb.org
… On the other hand, abstractive summarization is a way of natural language generation and using this approach, it is possible to produce human-like summaries (Rush et al., 2015; Chopra et al., 2016; … 2009. Multi-document summarization using sentence-based topic models …

Summarization Of Software Artifacts
S Gupta, SK Gupta – aircconline.com
… Fowkes et al. [24] Source Code Topic Models (extension of TopicSum) Storm, elasticSearch, Spring-framework, libgdx, bigbluebutton, netty … In Semantic Based Approach, semantic information about the document is used and is fed into the Natural Language Generation system …

Novel Methods for Natural Language Generation in Spoken Dialogue Systems
O Dušek – 2017 – dspace.cuni.cz
… Ondřej Dušek Novel Methods for Natural Language Generation in Spoken Dialogue Systems Institute of Formal and Applied Linguistics Supervisor: Ing … Title: Novel Methods for Natural Language Generation in Spoken Dialogue Systems Author: Ondřej Dušek …

Type-Aware Question Answering over Knowledge Base with Attention-Based Tree-Structured Neural Networks
J Yin, WX Zhao, XM Li – Journal of Computer Science and Technology, 2017 – Springer
Yin J, Zhao WX, Li XM. Type-aware question answering over knowledge base with attention-based tree-structured neural networks. Journal of Computer Science and Technology 32(4): 805–813 July 2017. DOI 10.1007/s11390-017-1761-8 …

Low-Rank RNN Adaptation for Context-Aware Language Modeling
A Jaech, M Ostendorf – arXiv preprint arXiv:1710.02603, 2017 – arxiv.org
… The embeddings can either be learned off-line using a topic model (Mikolov and Zweig, 2012) or end-to-end as part of the adapted LM (Tang et al., 2016). Here, we use end-to-end learning, where the context embedding is the output of a feed-forward network with a …

Automatically Difficulty Grading Method Based on Knowledge Tree
J Zhang, C Liu, H Yang, F Feng, X Gong – International Conference on …, 2017 – Springer
… In: NTCIR, Citeseer (2013). 5. Gkatzia, D., Lemon, O., Rieser, V.: Natural language generation enhances human decision-making with uncertain information … Wan, X., Wang, T.: Automatic labeling of topic models using text summaries …

Knowledge based Automatic Summarization
ATB Choi – 2017 – researchgate.net
… Statistical methods based on Latent Semantic Analysis (LSA), Bayesian topic modelling, Hidden Markov Model (HMM) and Conditional random field (CRF) derive underlying … Unlike Cyc, WordNet does not have reasoning engine and natural language generation capabilities …

Sentiment Analysis of Moroccan Tweets using Naive Bayes Algorithm
A EL ABDOULI, L HASSOUNI, H ANOUN – academia.edu
… form the training set for the Naive Bayes algorithm to identify the sentiment within the new collected tweets, then we apply topic modeling using LDA … The other is Natural Language Generation (NLG) translate information from computer databases into readable human language …

Extracting Information from Social Network using NLP
C Virmani, A Pillai, D Juneja – International Journal of …, 2017 – ripublication.com
… searching, part-of-speech tagging, named entity extraction, translation, information grouping, natural language generation, feedback analysis … It is a Java package that provides Latent Dirichlet Allocation, document classification, clustering, topic modeling, information extraction …

Opinion Summarization and Visualization
G Murray, E Hoque, G Carenini – Sentiment Analysis in Social Networks, 2017 – Elsevier
… This type of abstraction is clearly more difficult than extraction, not least because it requires a natural language generation component, but it also offers the hope of more fluent and informative summaries than extractive techniques can offer …

An improved textual storyline generating framework for disaster information management
Q Zhou, R Yuan, T Li – Intelligent Systems and Knowledge …, 2017 – ieeexplore.ieee.org
… The former one usually adopts natural language generation algorithm such as sentence compression and reformulation, and information fusion … other algorithms including latent semantic analysis, non-negative matrix factorization and sentence-based topic models, most existing …

Text Mining with R
Y Zhao – 2017 – rdatamining.com
… Latent Dirichlet Allocation (LDA): the most widely used topic model … with major points of the original document • Approaches • Extraction: select a subset of existing words, phrases or sentences to build a summary • Abstraction: use natural language generation techniques to …

Conversational Exploratory Search via Interactive Storytelling
S Vakulenko, I Markov, M de Rijke – arXiv preprint arXiv:1709.05298, 2017 – arxiv.org
… networks (RNNs): (1) event representations are extracted from text using dependency parsing, stemming and topic modeling; (2) event2event … systems suggest that it is feasible to develop a computational model able to learn natural language generation and communication …

Neural relevance-aware query modeling for spoken document retrieval
TH Lo, YW Chen, KY Chen, HM Wang… – … (ASRU), 2017 IEEE, 2017 – ieeexplore.ieee.org
… [6] and the topic models [7], among … Since the former requires more sophisticated natural language processing techniques, including semantic representation and inference, as well as natural language generation, most efforts have been concentrated on launching the query …

Cognitive Approach to Natural Language Processing
B Sharp, F Sedes, W Lubaszewski – 2017 – books.google.com
… Chapter 10. Benchmarking n-grams, Topic Models and Recurrent Neural Networks by Cloze Completions, EEGs and Eye Movements. Markus J. Hofmann, Chris Biemann and Steffen Remus. 10.1. Introduction …

Response selection from unstructured documents for human-computer conversation systems
Z Yan, N Duan, J Bao, P Chen, M Zhou, Z Li – Knowledge-Based Systems, 2017 – Elsevier
This paper studies response selection for human-computer conversation systems. Existing retrieval-based human-computer conversation systems are intended to repl.

Social Media Summarization
V Varma, LJ Kurisinkel, P Radhakrishnan – A Practical Guide to Sentiment …, 2017 – Springer
… convert the source text into an internal semantic representation which in turn is utilized by Natural Language Generation techniques to … Approaches to content analysis include generative topic models (Haghighi and Vanderwende 2009; Celikyilmaz and Hakkani-Tur 2010; Li et …

A Survey on Dialogue Systems: Recent Advances and New Frontiers
H Chen, X Liu, D Yin, J Tang – arXiv preprint arXiv:1711.01731, 2017 – arxiv.org
… It learns the next action based on current dialogue state. • Natural language generation (NLG) … 2.1.4 Natural Language Generation The natural language generation component converts an abstract dialogue action into natural language surface utterances …

A survey on: Extractive text document summarization techniques
CS Yadav, R Kumar, PSS Aydav, HP Singh – advancedjournal.com
… This produce abstract and the synthesis phase, so here generally involves Natural Language Generation from a semantic or discourse level representation … Algorithm-5 TOPIC MODEL 1. Decompose the given document D into sentences {S1, S2 …

Event phase oriented news summarization
C Wang, X He, A Zhou – World Wide Web, 2017 – Springer
… For example, Peng et al. [33] propose a Central Topic Model (CenTM) to track dynamic topics in microblog streams … Abstraction-based methods employ the technique of natural language generation to create comprehensive summaries expressed in a more natural way …

Benben: A Chinese Intelligent Conversational Robot
WN Zhang, T Liu, B Qin, Y Zhang, W Che… – Proceedings of ACL …, 2017 – aclweb.org
… Concretely, the natural language understanding, dialogue management and natural language generation in spoken dialogue systems are corresponding to the 1), 2) and 3), 4 … 2009. Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora …

Different Types of Automated and Semi-automated Semantic Storytelling: Curation Technologies for Different Sectors
G Rehm, J Moreno-Schneider, P Bourgonje… – … Conference of the …, 2017 – Springer
… First we perform topic modeling using a bag-of-words representation with the vectors based on tf/idf values (Sect … and the ability to automatically visualise or generate a story from this semantic representation using some form of Natural Language Generation (NLG) (Rishes et al …

A Unified Latent Variable Model for Contrastive Opinion Mining
E Ibeke, C Lin, A Wyner… – Frontiers of Computer …, 2017 – researchgate.net
… Keywords: Contrastive opinion mining, Sentiment analysis, Topic modelling … Mukherjee and Liu [10] proposed several topic models for mining contentions from discussions and debates …

Analytics as an Enabler of Advanced Manufacturing: A Critical Review of Tools and Applications
RS Kenett, I Yahav, A Zonnenshain – 2017 – papers.ssrn.com
… According to the Forrester’s analysis (https://go.forrester.com/) and the IDC list of decision management tools, the top analytical technologies and their use in Manufacturing 4.0 are: Natural language generation, natural language processing and text mining: Producing natural …

Extractive Multi-document Summarization Using Multilayer Networks
JV Tohalino, DR Amancio – arXiv preprint arXiv:1711.02608, 2017 – arxiv.org
… In the case of English MDS, we considered the following works: DUC-best, which is the system with highest ROUGE scores for DUC conferences; BSTM [39], which uses a Bayesian sentence-based topic model for summarization; FGB [40], which proposes …

Method Level Text Summarization for Java Code Using Nano-Patterns
S Rai, T Gaikwad, S Jain… – Asia-Pacific Software …, 2017 – ieeexplore.ieee.org
… well. Key Words Java Methods, Source Code, Natural Language Generation, Text Summarization, Nano-Patterns. I … importance. Their approach was based on AST folding decisions and scoped topic model for code tokens. These …

A retrieval-based dialogue system utilizing utterance and context embeddings
A Bartl, G Spanakis – arXiv preprint arXiv:1710.05780, 2017 – arxiv.org
… services, or just for the sake of entertainment [4]. The traditional design of Dialogue Systems [5] follows a modular approach, splitting the system usually into a Natural Language Understanding (NLU) module, a Dialogue Manager and a Natural Language Generation (NLG) …

Automated Question Generation from Configuration Knowledge Bases
A Shehadeh, A Felfernig, Müslüm Atas – 19th International Configuration …, 2017 – ieseg.fr
… Mazidi et al. [14] described an approach to automatic question generation from natural language understanding (NLU) to natural language generation (NLG) … The proposed technique selects informative sentences based on topic modeling and parse structure similarity …

An Adaptive Semantic Descriptive Model for Multi-Document Representation to Enhance Generic Summarization
NA Dief, AE Al-Desouky, AA Eldin… – International Journal of …, 2017 – World Scientific
… Nada A. Dief, Ali E. Al-Desouky and Amr Aly Eldin, Department of Computer Engineering and Systems …

An Assisted Literature Review using Machine Learning Models to Recommend a Relevant Reference Papers List
R Brisebois, A Abran, A Nadembega, P N’techobo – researchgate.net
… For example, Carlos and Thiago [2] developed a supervised MLM-based solution for text mining scientific articles using the R language in “Knowledge Extraction and Machine Learning” based on social network analysis, topic models and bipartite graph approaches …

Self-Guiding Multimodal LSTM-when we do not have a perfect training dataset for image captioning
Y Xian, Y Tian – arXiv preprint arXiv:1709.05038, 2017 – arxiv.org
… The third group of approaches integrates image understanding and natural language generation into a unified pipeline. In general, image content in terms of objects, actions, and attributes is represented based on a set of visual features …

Deep Learning Models For Multiword Expression Identification
W Gharbieh, V Bhavsar, P Cook – … of the 6th Joint Conference on Lexical …, 2017 – aclweb.org
… an emerging class of machine learning models that have recently achieved promising results on a range of NLP tasks such as machine translation (Bahdanau et al., 2015; Sutskever et al., 2014), named entity recognition (Lample et al., 2016), natural language generation (Li et …

Twitter Data Analysis
S Chitrakala – Modern Technologies for Big Data Classification …, 2017 – books.google.com
… A few of the analytics that do this include sentiment analysis, trending topic analysis, topic modeling, information diffusion modeling … In semantic based technique, linguistics illustration of document (s) is employed to feed into natural language generation (NLG) system …

Doctoral Advisor or Medical Condition: Towards Entity-specific Rankings of Knowledge Base Properties [Extended Version]
S Razniewski, V Balaraman, W Nutt – arXiv preprint arXiv:1709.06907, 2017 – arxiv.org
… task, the properties that do not yet have facts are the ones that should be ranked, while for the natural language generation task, it is … as qualifier to the participant” for the property goals scored.5 To compute semantic similarity, we rely on standard latent topic models, in particular …

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
R Barzilay, MY Kan – Proceedings of the 55th Annual Meeting of the …, 2017 – aclweb.org
… call attention to two key ones here. In the review process, we pioneered the use of the Toronto Paper Matching System, a topic model based approach to the assignment of reviewers to papers. We hope this decision will spur …

Automatic Neural Question Generation using Community-based Question Answering Systems
T Baghaee – 2017 – uleth.ca
… Tina Baghaee, Bachelor of Science, Shahid Beheshti University, 2011. A Thesis Submitted to the …

A framework for the automated generation of paradigm-adaptive summaries of games
BJ Sandesh, G Srinivasa – International Journal of …, 2017 – inderscienceonline.com
… blogs (Lee et al., 2008). For instance, topic models relevant to a user query and time- ordered tweets that are relevant are extracted to generate a summary of an event (Chong et al., 2013). Not just language processing, but …

Deep Memory Networks for Natural Conversations
??? – 2017 – s-space.snu.ac.kr

Subtopic annotation and automatic segmentation for news texts in Brazilian Portuguese
PCF Cardoso, TAS Pardo, M Taboada – Corpora, 2017 – euppublishing.com
… Riedl and Biemann (2012), based on TextTiling, proposed the TopicTiling algorithm that segments documents using the Latent Dirichlet Allocation (LDA) topic model (Blei et al., 2003) … The topic model must be trained on documents similar in content to the test documents …

Creating a reference data set for the summarization of discussion forum threads
S Verberne, E Krahmer, I Hendrickx, S Wubben… – Language Resources …, 2017 – Springer
… An alternative approach to thread summarization is topic modeling (Ren et al. 2011; Llewellyn et al. 2014) … The authors find that for the task of summarizing newspaper comments topic model clustering gave the best results when compared to a human reference summary …

Event Extraction for Document-Level Structured Summarization
A Hsi – 2017 – cs.cmu.edu
… Topic models like LDA [2] have latent factors that are difficult to interpret, and do not offer insight into the fillers of arguments … Extractive summarization avoids the problem of natural language generation by instead extracting subsets of the original text in order to create a summary …

Adaptive News Video Uploading
R Shah, R Zimmermann – … Analysis of User-Generated Multimedia Content, 2017 – Springer
… 2016. Videopedia: Lecture Video Recommendation for Educational Blogs Using Topic Modeling. In Proceedings of the Springer International Conference on Multimedia Modeling, 238–250 … 2016. Fuzzy Clustering of Lecture Videos Based on Topic Modeling …

Lecture Video Segmentation
R Shah, R Zimmermann – … Analysis of User-Generated Multimedia Content, 2017 – Springer
In multimedia-based e-learning systems, the accessibility and searchability of most lecture video content is still insufficient due to the unscripted and spontaneous speech of the speakers. Thus, it i.

Semantic action recognition by learning a pose lexicon
L Zhou, W Li, P Ogunbona, Z Zhang – Pattern Recognition, 2017 – Elsevier
… Apart from learning latent action semantics by topic models, another approach focused on defining explicit semantic elements to … assisted video analysis: The main tasks of linguistics assisted video analysis include action recognition, natural language generation of videos …

Automatic acquisition of adjective lexicalizations of restriction classes: a machine learning approach
S Walter, C Unger, P Cimiano – Journal on Data Semantics, 2017 – Springer
… lot of work on clustering adjectives by their semantics [4, 6]. Further work has addressed the extraction of hyponyms and hypernyms from corpora [14] and of identifying the particular meaning of adjectives in adjective–noun phrase compounds, eg relying on topic models [11, 12 …

A support vector approach for cross-modal search of images and texts
Y Verma, CV Jawahar – Computer Vision and Image Understanding, 2017 – Elsevier
… both (Kuznetsova et al., 2012; Ordonez et al., 2011)). This information is then fused using some Natural Language Generation (NLG) technique to construct image descriptions. All these works have shown that though generating …

Recent advances in document summarization
J Yao, X Wan, J Xiao – Knowledge and Information Systems, 2017 – Springer
… A later work [21] also utilizes a hLDA-style model to devise a sentence-level probabilistic topic model and a … compactness and informativeness, such as paraphrasing and sentence fusion [9]. Due to the immatureness of current natural language generation techniques, some of …

Developing, evaluating, and refining an automatic generator of diagnostic multiple choice cloze questions to assess children’s comprehension while reading
J Mostow, YT Huang, H Jang, A Weinstein… – Natural Language …, 2017 – cambridge.org
Natural Language Engineering 23 (2): 245–294. © Cambridge University Press 2016. doi:10.1017/S1351324916000024 …

The Reverse Association Task
R Rapp – Cognitive Approach to Natural Language Processing, 2017 – Elsevier
… It implements this by using a topic model … In our approach, we have eliminated all (for this particular task) unnecessary sophistication, such as Latent Semantic Analysis (which we used extensively in previous work) or Topic Modeling, resulting in a simple yet effective algorithm …

Understanding Knowledge Graphs
H Wu, R Denaux, P Alexopoulos, Y Ren… – Exploiting Linked Data …, 2017 – Springer
… consume. In this sense, natural language generation techniques can be applied here. Alternatively, a less ambitious approach can be to convert RDF data into a semi-structured representation such as an (HTML) list. This …

Soundtrack Recommendation for UGVs
R Shah, R Zimmermann – … Analysis of User-Generated Multimedia Content, 2017 – Springer
Capturing videos anytime and anywhere, and then instantly sharing them online, has become a very popular activity. However, many outdoor user-generated videos (UGVs) lack a certain appeal because thei.

Evaluative language beyond bags of words: Linguistic insights and computational applications
F Benamara, M Taboada, Y Mathieu – Computational Linguistics, 2017 – MIT Press

Semantic Analysis for Human Motion Synthesis
??? – 2017 – s-space.snu.ac.kr
… Natural language generation is the task of generating natural language from machine representation … readable text based on sales data [4]. Advancement of deep learning technology has affected also natural language generation …

A Survey of Machine Learning for Big Code and Naturalness
M Allamanis, ET Barr, P Devanbu, C Sutton – arXiv preprint arXiv …, 2017 – arxiv.org
… and natural language generation systems. Code representational models are analogous to systems for named entity recognition, text classification, and sentiment analysis in NLP. Finally, code pattern mining models are analogous to probabilistic topic models and ML tech …

A review of spatial reasoning and interaction for real-world robotics
C Landsiedel, V Rieser, M Walter, D Wollherr – Advanced Robotics, 2017 – Taylor & Francis
… a dialogue system has three modules, one each for input, output and control, as shown in Figure 1 after [1 Rieser V, Lemon O. Reinforcement learning for adaptive dialogue systems: a data-driven methodology for dialogue management and natural language generation …

Event Understanding
R Shah, R Zimmermann – … Analysis of User-Generated Multimedia Content, 2017 – Springer
The rapid growth in the amount of photos/videos online necessitates for social media companies to automatically extract knowledge structures (concepts) from photos and videos to provide diverse multim.

Learning Semantics for Image Annotation
A Tariq, H Foroosh – arXiv preprint arXiv:1705.05102, 2017 – arxiv.org
… These words may also be used to generate proper phrases generated by natural language generation techniques … These words may also be used to generate phrases if used with some natural language generation process …

Automatic Summarization of Fiction by Generating Character Descriptions
W Zhang – 2017 – digitool.library.mcgill.ca
… annotation and frames (Elsner, 2012; Bamman et al., 2014; Vala et al., 2015), narrative analysis (Halpin et al., 2004; Piper, 2015), topic modeling (Jockers and Mimno … Chang et al. (2009) proposed another probabilistic topic model, the Nubbi …

Tag Recommendation and Ranking
R Shah, R Zimmermann – … Analysis of User-Generated Multimedia Content, 2017 – Springer
Social media platforms such as Flickr allow users to annotate photos with descriptive keywords, called, tags with the goal of making multimedia content easily understandable, searchable, and discovera.

Workshop Program
A ANANDKUMAR, FEI SHA – 2017 – pdfs.semanticscholar.org
… Yishu Miao, Wang Ling, Tsung-Hsien Wen, Kris Cao, Daniela Gerz, Phil Blunsom, Chris Dyer C4.11, Thu Aug 10, 08:30 AM Research on natural language generation is rapidly growing due to the increasing demand for human-machine communication in natural language …

Explainable Recommendations
RC Kanjirathinkal – 2017 – cs.cmu.edu
… Thesis Proposal: Explainable Recommendations. Rose Catherine Kanjirathinkal, November 2017. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213. Thesis Committee: Prof. William W. Cohen, Chair, Carnegie Mellon University Prof …

A systematic review and taxonomy of explanations in decision support and recommender systems
I Nunes, D Jannach – User Modeling and User-Adapted Interaction, 2017 – Springer
With the recent advances in the field of artificial intelligence, an increasing number of decision-making tasks are delegated to software systems. A key requirement for the success and adoption of suc.

Dissertations in Forestry and Natural Sciences
EA KOLOG – epublications.uef.fi
… NLG Natural Language Generation NLU Natural Language Understanding NLP Natural Language Processing NLTK Natural Language Tool Kit NRC National Research Council Ghana PAD Pleasure, Arousal and Dominance PD Participatory Design POS Part-Of-Speech RQ …

A novel X-FEM based fast computational method for crack propagation
Z Cheng, H Wang, PMB Vitanyi, N Chater, M Barzegari… – arxiv.org
… Vision and Pattern Recognition (cs.CV). arXiv:1708.01677 [pdf, other] Title: A network approach to topic models … bio.GN). arXiv:1708.01759 [pdf, other] Title: Referenceless Quality Estimation for Natural Language Generation …

Lexicography as Feature Engineering
O Montoya – 2017 – bir.brandeis.edu
… Beyond making more useful thesauruses, semantically sensitive lexical selection is a requirement of effective natural language generation (NLG), and finer-grained knowledge of … although in the course of this research I have seen some clear affinities with topic modeling …

Using Syntactic Patterns to Enhance Text Analytics
BB Meyer – 2017 – search.proquest.com
… I would like to thank Dr. Kaushik for his insights and suggestions regarding different topic modeling approaches. I would like to thank Dr. Xing Fang, for the many vibrant discussions and debates … Sorted Topic Model Semantic Similarity Scores …

Proceedings of the Student Research Workshop at the 15th Conference of the European Chapter of the Association for Computational Linguistics
FA Kunneman, U Iñurrieta, JJ Camilleri, MC Ardanuy – 2017 – repository.ubn.ru.nl
PDF hosted at the Radboud Repository of the Radboud University Nijmegen. http://hdl.handle.net/2066/180459 …

Supporting Study Selection of Systematic Literature Reviews in Software Engineering with Text Mining
Q Zhong – 2017 – jultika.oulu.fi
… University of Oulu, Information Processing Science, Master’s Thesis, Qianhui Zhong, 07.04.2017 … Abstract …

Machine translation of domain-specific expressions within ontologies and documents
M Arcan – 2017 – aran.library.nuigalway.ie

Multimodal Analysis of User-Generated Multimedia Content
R Shah, R Zimmermann – 2017 – Springer
… Socio-Affective Computing, Volume 6. Rajiv Shah, Roger Zimmermann. Series Editor: Amir Hussain, University of Stirling, Stirling, UK …

Helping users learn about social processes while learning from users: developing a positive feedback in social computing
VSS Pillutla – 2017 – search.proquest.com
… The summary generated by abstraction techniques is not limited to the explicit words of the text; natural language generation techniques are used to generate the summary [120] … and the KL divergence model. We will also discuss the advantage of using topic modeling over …

Deciphering clinical text: concept recognition in primary care text notes
AD Savkov – 2017 – sro.sussex.ac.uk
A University of Sussex PhD thesis. Available online via Sussex Research Online: http://sro.sussex.ac.uk/ …

Entity-Centric Discourse Analysis and Its Applications
X Wang – 2017 – repository.kulib.kyoto-u.ac.jp
… machine learning as we have witnessed in recent years, like the Support Vector Machine (SVM) model for classification tasks, the topic models for clustering as well as the recent boom of the neural networks for many applications. The rapid …

Query Log Anonymization by Differential Privacy
S Zhang – 2017 – repository.library.georgetown.edu
… A Thesis submitted to the Faculty of the Graduate School of Arts and Sciences of Georgetown University in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science By …

Text Mining in Financial Industry: Implementing Text Mining Techniques on Bank Policies
D Ferati – 2017 – dspace.library.uu.nl
… Master’s Thesis, Masters of Business Informatics, Faculty of Science, Department of Information and Computing Science …

Context-aware recommender systems in mobile environment: On the road of future research
IB Sassi, S Mellouli, SB Yahia – Information Systems, 2017 – Elsevier

An Information-theoretic Approach to Machine-oriented Music Summarization
F Raposo, DM de Matos, R Ribeiro – pdfs.semanticscholar.org
… Francisco Raposo, David Martins de Matos, Senior Member, IEEE, Ricardo Ribeiro. Abstract: Applying generic media-agnostic summarization …

Exploring the Internal Statistics: Single Image Super-Resolution, Completion and Captioning
Y Xian – 2017 – search.proquest.com
Image enhancement has drawn increasing attention in improving image quality or interpretability. It aims to modify images …

Speech-Based Real-Time Presentation Tracking Using Semantic Matching
R Asadi – 2017 – search.proquest.com
Oral presentations are an essential yet challenging aspect of academic and professional life. To date, many commercial and research products …

Learning from Temporally-Structured Human Activities Data
ZC Lipton – 2017 – search.proquest.com
Despite the extraordinary success of deep learning on diverse problems, these triumphs are too often confined to large, clean datasets and well-defined objectives …