100 Best GitHub: N-gram



See also:

100 Best GitHub: Ngram | 100 Best N-gram Videos | ConcGrams | N-gram & Tag Clouds | N-gram Dialog Systems | N-gram Grammars | N-gram Transducers (NGT)


“n-gram” [100x Aug 2014]

  • first20hours/google-10000-english .. This repo contains a list of the 10,000 most common English words in order of frequency, as determined by n-gram frequency analysis of Google’s Trillion Word Corpus.
  • cedias/word2vec .. Tool for computing continuous distributed representations of words. Modified to learn n-grams.
  • rockymadden/stringmetric .. String metrics and phonetic algorithms for Scala (e.g. Dice/Sorensen, Hamming, Jaccard, Jaro, Jaro-Winkler, Levenshtein, Metaphone, N-Gram, NYSIIS, Overlap, Ratcliff/Obershelp, Refined NYSIIS, Refined Soundex, Soundex, Weighted Levenshtein).
  • proycon/colibri-core .. Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool…
  • Pomax/nrGrammar .. The Nihongo Resources grammar book: “An Introduction to Japanese; Syntax, Grammar & Language”
  • ai-ku/fastsubs .. Generate most likely substitutes for words in a given text based on an n-gram language model.
  • LogIN-/datagram .. datagram is meant to be an n-gram data-set categorization library for Node; it can detect languages or any other data…
  • ibyron/n-gram .. Ansible playbook that installs a Hadoop multi-node cluster over GRNET ~okeanos and computes N-gram frequencies.
  • scarafoni/Opinion_Bot .. An n-gram Markov chatbot that can keep track of its internal probability transition table.
  • jaideepcoder/ngram .. Software that creates an n-gram (1-5) maximum likelihood probabilistic language model with Laplace add-1 smoothing and stores it in hashable dictionary form.
  • bisserlis/ngram .. A command-line application for generating sequences using n-gram models.
  • cidles/pressagio .. Pressagio is a library that predicts text based on n-gram models. For example, you can send a string and the library will return the most likely word completions for the last token in the string.
  • IriZnoj/n_grams .. Implementation of efficient data structures: implement suitable data structures usable for efficient searching in data. The data may be, for example, n-grams extracted from texts, or DNA sequences. Then test these data structures on a suitable data collection and…
  • bgawalt/acrostician .. A Twitter bot that seeks out n-grams to try and fill in acrostic poems given a target word
  • milk1000cc/trigram .. Compute the similarity of two strings based on the trigram (n-gram) method
  • BigFav/n-grams .. My Python n-gram language model from an NLP course. Since there are so many public implementations, I feel free to post mine.
  • hcadavid/LanguageIdentificationTool .. This is a language identifier for text documents, based on the Cavnar and Trenkle paper[1]. Author: Héctor Fabio Cadavid R. This software consists of two tools: the NGrams statistics database generator (1), and the language identifier for documents (2). Both tools provide information about the required…
  • ConstantineLignos/LingTools .. Simple tools for studying language: data preprocessing, frequency norms, n-gram models, working with CHILDES data, and more.
  • vsiivola/variKN .. A toolkit for producing n-gram language models. The highlights are the implementation of Kneser-Ney growing and revised Kneser pruning methods.
  • esafak/suggest .. An n-gram based autocompletion module for words entered with the phone keypad.
  • sunpinyin/open-gram .. an open solution for collecting n-gram Chinese lexicon and n-gram statistics
  • wlmiller/ChatBot .. A simple ChatterBot which builds random responses using n-grams.
  • Hins/Colosseum .. N-gram library with simple smoothing algorithm, now supporting Bigram and Trigram
  • vspandan/QueryExpansion .. Query Expansion is a project developed at IIIT Hyderabad as part of course work. We reviewed existing applications, proposed an approach combining n-grams and a Markov model, implemented it, and tested our results on a data set comprising “The Telegraph – Calcutta”…
  • yashaswinis/Language-Modeling .. An open-ended programming project to implement a collection of n-gram-based language models. Smoothing, random sentence generation, and prediction of truthfulness of hotel reviews. Technology used: Python.
  • matey-jack/keylayout-eval .. A simple evaluator for computer keyboard layouts. Measures typing effort for most frequent words and most frequent n-grams.
  • zvelo/ngrams .. Library for Character/Word n-gram Analysis
  • dyv/ngram_comparison .. The purpose of this project is to compare Cavnar’s out-of-place distance metric with Damashek’s cosine distance metric for comparing n-gram profiles for genre classification of English texts.
  • naiaden/LSALM .. Latent semantic analysis language model interpolated with n-grams
  • buerki/ngramprocessor .. The N-Gram Processor is a set of scripts and a Perl module allowing the creation and processing of n-gram lists out of text files.
  • pmyers88/tweet_roulette .. A Django app that uses n-grams to generate tweets from the corpus of tweets of a given twitter user.
  • buerki/SubString .. The SubString package is an open-source set of Unix Shell scripts used for substring reduction and frequency consolidation of word n-grams of different length. In the process, the frequencies of substrings are reduced by the frequencies of their superstrings and a consolidated list with n-grams of…
  • zab88/python-tweet-clustering .. With the help of the OpenCV library for Python, we perform simple clustering of tweets using a character n-gram method applied to hashes.
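
Several of the projects listed above rest on the same few building blocks: extracting n-grams, estimating add-1 (Laplace) smoothed probabilities, and comparing strings by their character trigrams. The Python sketch below illustrates those ideas in their simplest form. It is not taken from any of the listed repositories; the function names (`ngrams`, `laplace_bigram_prob`, `trigram_similarity`) are hypothetical, and the use of the Dice coefficient for trigram similarity is an assumption, since each project may define similarity differently.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a sequence as a list of tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def laplace_bigram_prob(tokens, prev, word):
    """Add-1 smoothed estimate of P(word | prev) from a list of tokens."""
    vocab_size = len(set(tokens))
    bigram_counts = Counter(ngrams(tokens, 2))
    # Count each token as a bigram context (the last token starts no bigram).
    context_counts = Counter(tokens[:-1])
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)


def trigram_similarity(a, b):
    """Dice coefficient over the sets of character trigrams of two strings.

    Assumption: Dice is only one possible choice of overlap measure.
    """
    grams_a, grams_b = set(ngrams(a, 3)), set(ngrams(b, 3))
    if not grams_a and not grams_b:
        # Both strings are shorter than three characters.
        return 1.0 if a == b else 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


if __name__ == "__main__":
    corpus = "the cat sat on the mat".split()
    print(ngrams(corpus, 2))
    print(laplace_bigram_prob(corpus, "the", "cat"))
    print(trigram_similarity("ngram", "ngrams"))
```

Keeping n-grams as tuples lets the same `ngrams` helper serve both word-level models (lists of tokens) and character-level comparison (plain strings), and tuples can be used directly as `Counter` keys.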