Statistical Natural Language Processing


Notes:

Statistical natural-language processing (NLP) uses stochastic, probabilistic, and statistical methods. Disambiguation methods often rely on corpora and Markov models. The techniques of statistical NLP come mainly from machine learning and data mining.

A statistical language model is a probability distribution over sequences of words: it assigns a probability to any given word sequence. Topic modeling is a form of text mining, a way of identifying patterns in a corpus: a topic-modeling tool takes the corpus and groups words that tend to occur together across it into ‘topics’.
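To make the language-model definition above concrete, here is a minimal sketch of a bigram (first-order Markov) model estimated by plain counting. The toy corpus, the function names, and the `<s>` / `</s>` boundary markers are illustrative assumptions, not taken from any particular toolkit, and no smoothing is applied.

```python
from collections import Counter

def train_bigram_model(sentences):
    """Count contexts and bigrams over whitespace-tokenized sentences.

    Each sentence is padded with hypothetical boundary markers <s> and </s>.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts

def sequence_probability(sentence, context_counts, bigram_counts):
    """Probability of a word sequence as a product of bigram probabilities.

    Uses unsmoothed maximum-likelihood estimates, so any bigram that never
    occurred in training drives the whole product to zero.
    """
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if context_counts[prev] == 0:
            return 0.0  # context never seen in training
        prob *= bigram_counts[(prev, word)] / context_counts[prev]
    return prob

# Toy corpus, assumed purely for illustration.
toy_corpus = ["the cat sat", "the dog sat", "the cat ran"]
context_counts, bigram_counts = train_bigram_model(toy_corpus)

p_seen = sequence_probability("the cat sat", context_counts, bigram_counts)
p_unseen = sequence_probability("the dog ran", context_counts, bigram_counts)
print(p_seen, p_unseen)
```

On this toy corpus the sentence "the dog ran" gets probability zero because the bigram "dog ran" was never observed, even though every word in it was. Practical models use some form of smoothing to avoid such zeros.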

See also:

Language Modeling & Dialog Systems 2014 | PCFG (Probabilistic Context Free Grammar) & Dialog Systems | PLSA (Probabilistic Latent Semantic Analysis) & Dialog Systems | PNN (Probabilistic Neural Network) & Dialog Systems | POMDP (Partially Observable Markov Decision Process) & Dialog Systems | Probabilistic Graphical Models & Dialog Systems | Probabilistic Parser & Dialog Systems