Which NLP Library is most suitable for use and further development for a text mining startup?
See the answers to a virtually identical question from 2011:
· Which NLP library among the ones below is most mature and should be used by a startup for its NLP needs?
Edit: > I wanted to ask more specifically which NLP libraries are most development-friendly.
Purely in terms of development-friendliness, it would have to be NLTK in Python. However, more robust production systems seem to be built in Java these days, with OpenNLP leading and MALLET and Stanford NLP close behind. If it has to be Ruby, look at FreeLing before TreeTagger.
In light of the updated comment, here is a rough GitHub-based metric for each toolkit, listed as name, language, and count in parentheses (see the reference below):
· NLTK (Natural Language Toolkit) .. Python (240)
· OpenNLP .. Java (38)
· MALLET (MAchine Learning for LanguagE Toolkit) .. Java (34)
· Stanford NLP .. Java (31)
· LingPipe .. Java (5)
· FreeLing .. Ruby (4)
· TreeTagger .. Ruby (2)
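To make the comparison easier to check, here is a minimal Python sketch that groups the counts above by language and ranks the toolkits within each group. The counts and language labels are copied from the list as given; the names `TOOLKITS`, `top_by_language`, and `totals_by_language` are just illustrative, and since the answer does not say exactly what each count measures, the result should be read as a rough signal only.

```python
from collections import defaultdict

# Counts and language labels copied from the list above. What each count
# measures is not stated, so treat them as a rough popularity signal.
TOOLKITS = [
    ("NLTK", "Python", 240),
    ("OpenNLP", "Java", 38),
    ("MALLET", "Java", 34),
    ("Stanford NLP", "Java", 31),
    ("LingPipe", "Java", 5),
    ("FreeLing", "Ruby", 4),
    ("TreeTagger", "Ruby", 2),
]


def top_by_language(toolkits):
    """Return {language: [(name, count), ...]} with each list sorted by count, descending."""
    grouped = defaultdict(list)
    for name, language, count in toolkits:
        grouped[language].append((name, count))
    return {
        language: sorted(entries, key=lambda entry: entry[1], reverse=True)
        for language, entries in grouped.items()
    }


def totals_by_language(toolkits):
    """Return {language: sum of counts} over all toolkits with that label."""
    totals = defaultdict(int)
    for _, language, count in toolkits:
        totals[language] += count
    return dict(totals)


if __name__ == "__main__":
    for language, entries in top_by_language(TOOLKITS).items():
        print(language, entries)
    print(totals_by_language(TOOLKITS))
```

Grouping by language first matters here because a startup usually picks the language before the library, so the ranking within a language is more useful than one global ordering.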
References:
· Natural Language Processing Toolkits | Meta-Guide.com
Note: I would like to know which NLP toolkits are missing from this list. Any feedback on the validity of this GitHub metric would also be useful.