Notes:
In the 2000s, neural networks shifted from exploratory to influential in NLP. Bengio et al. (2001; JMLR 2003) introduced the neural probabilistic language model, which jointly learned distributed word vectors with a feed-forward predictor. Scalability improved via hierarchical softmax, a tree factorization of the output layer (Morin & Bengio, 2005), and via log-bilinear models (Mnih & Hinton, late 2000s), both lowering normalization costs over large vocabularies. Collobert & Weston (2008) showed that a single deep architecture with shared representations could support tagging, chunking, NER, and SRL. By decade's end, practical CUDA-class GPUs sped training, even though datasets and models remained modest. Building on 1990s connectionist work (Elman/Jordan RNNs, RAAM, TDNNs), these advances preceded reusable stand-alone embeddings: learned vectors remained internal, task-tied parameters, bridging toward the embedding-centric era.
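The Bengio-style architecture can be sketched in a few lines: a shared embedding table, a tanh hidden layer, and a softmax over the vocabulary. This is a minimal illustrative sketch with made-up dimensions and random weights, not the paper's exact configuration or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

V = 50      # vocabulary size (illustrative)
d = 16      # embedding dimension
n = 3       # context length: predict the next word from the previous n words
h = 32      # hidden units

C = rng.normal(0, 0.1, (V, d))        # embedding table, learned jointly with the predictor
H = rng.normal(0, 0.1, (n * d, h))    # input-to-hidden weights
U = rng.normal(0, 0.1, (h, V))        # hidden-to-output weights

def predict(context):
    """Return a probability distribution over the next word given n context word ids."""
    x = C[context].reshape(-1)        # look up and concatenate the context embeddings
    a = np.tanh(x @ H)                # hidden layer
    logits = a @ U
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

p = predict([3, 17, 42])
```

The key point for the timeline above: the word vectors in `C` are ordinary parameters of this one model, trained end-to-end for next-word prediction, which is why they stayed internal and task-tied rather than being shipped as stand-alone embeddings.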
See also:
[Aug 2025]
- 2001 (published 2003): Bengio et al. introduce the neural probabilistic language model, jointly learning distributed word vectors and a feed-forward predictor as a smoother alternative to n-grams.
- 2005: Morin and Bengio propose tree-based factorization (hierarchical softmax) to reduce the computational cost of normalizing over large vocabularies.
- 2006–2007: Unsupervised pretraining and renewed interest in deep architectures broaden feasibility for neural NLP, though datasets and models remain small by later standards.
- 2007–2009: Mnih and Hinton develop log-bilinear and hierarchical approaches that further improve scalability and training efficiency for neural language modeling.
- 2008: Collobert and Weston demonstrate a unified deep architecture with shared representations handling POS tagging, chunking, NER, and SRL within one framework.
- 2009: Increasing practicality of CUDA-class GPUs accelerates training and experimentation with neural models for language.
- Late 2000s: Learned vectors are primarily internal, task-tied parameters rather than portable, stand-alone embeddings, setting the stage for the embedding-centric era of the 2010s.
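The hierarchical-softmax idea from the 2005 entry above can be illustrated with a balanced binary tree over the vocabulary: each internal node carries a logistic unit, and a word's probability is the product of branch probabilities along its root-to-leaf path, so prediction touches O(log V) nodes instead of normalizing over all V words. This is a toy sketch under simplifying assumptions (a perfect binary tree, random weights), not Morin and Bengio's exact formulation, which used learned tree structures.

```python
import numpy as np

rng = np.random.default_rng(0)

V = 8          # vocabulary size (power of two, so the tree is perfect)
d = 4          # dimension of the context/hidden vector
depth = int(np.log2(V))
W = rng.normal(0, 0.1, (V - 1, d))   # one logistic weight vector per internal node

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def word_prob(word, hvec):
    """P(word | context) as a product of branch probabilities on the root-to-leaf path."""
    p, node = 1.0, 0                     # start at the root (internal node 0)
    for bit in range(depth - 1, -1, -1):
        go_right = (word >> bit) & 1     # the word index's bits encode its path
        s = sigmoid(W[node] @ hvec)      # probability of branching right at this node
        p *= s if go_right else (1.0 - s)
        node = 2 * node + 1 + go_right   # heap-style child indexing
    return p

hvec = rng.normal(size=d)
total = sum(word_prob(w, hvec) for w in range(V))  # leaf probabilities sum to 1 by construction
```

Because left and right branch probabilities sum to 1 at every internal node, the distribution over leaves is normalized without ever computing a V-way softmax, which is exactly the cost reduction the 2005 entry refers to.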