Notes:
Statistical natural-language processing uses stochastic, probabilistic, and statistical methods. Disambiguation methods often rely on corpora and Markov models. The techniques behind statistical NLP come mainly from machine learning and data mining.
A statistical language model is a probability distribution over sequences of words. Topic modeling is a form of text mining that identifies patterns in a corpus: the corpus is run through a tool that groups words which tend to occur together across its documents into ‘topics’.
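As a rough illustration of a language model as a probability distribution over word sequences, here is a minimal sketch of a bigram (first-order Markov) model estimated by maximum likelihood. The function names, the `<s>` and `</s>` boundary markers, and the toy corpus are assumptions made up for this example; the sketch applies no smoothing, so any sequence containing an unseen word pair gets probability zero.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Estimate P(word | previous word) by maximum likelihood.

    `sentences` is a list of token lists. Hypothetical helper for illustration.
    """
    pair_counts = defaultdict(Counter)
    for tokens in sentences:
        # Pad with boundary markers so sentence start and end are modeled too.
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            pair_counts[prev][word] += 1

    model = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        model[prev] = {word: count / total for word, count in counter.items()}
    return model


def sequence_probability(model, tokens):
    """Multiply the conditional probabilities of each word given its predecessor."""
    padded = ["<s>"] + tokens + ["</s>"]
    prob = 1.0
    for prev, word in zip(padded, padded[1:]):
        # Unseen pairs get probability 0.0 because no smoothing is applied.
        prob *= model.get(prev, {}).get(word, 0.0)
    return prob


# Toy corpus, invented for this example only.
toy_corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]

model = train_bigram_model(toy_corpus)
p_seen = sequence_probability(model, ["the", "cat", "sat"])
p_unseen = sequence_probability(model, ["sat", "the", "cat"])
print(p_seen, p_unseen)
```

On this toy corpus the sentence seen in training receives a nonzero probability, while the reordered one receives zero because no training sentence starts with "sat". A practical model would add smoothing or back-off so that unseen pairs are not ruled out entirely.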
Wikipedia:
- Category:Language modeling
- Category:Statistical natural language processing
- Language model
- Topic model
See also:
Language Modeling & Dialog Systems 2014 | PCFG (Probabilistic Context Free Grammar) & Dialog Systems | PLSA (Probabilistic Latent Semantic Analysis) & Dialog Systems | PNN (Probabilistic Neural Network) & Dialog Systems | POMDP (Partially Observable Markov Decision Process) & Dialog Systems | Probabilistic Graphical Models & Dialog Systems | Probabilistic Parser & Dialog Systems