TweetNLP

Notes:

TweetNLP is a suite of natural language processing (NLP) tools and resources built specifically for analyzing tweets. It is a free, open-source toolkit developed by the ARK research group at Carnegie Mellon University, and it supports NLP tasks relevant to tweets, such as tokenization, part-of-speech tagging, and parsing.

TweetNLP's tools target the informal language of tweets. They include a tokenizer that splits a tweet into words and other units (such as hashtags, @-mentions, URLs, and emoticons), a part-of-speech tagger that labels the grammatical category of each token, and a dependency parser that analyzes the grammatical structure of the tweet.
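To make the tokenization step concrete, here is a minimal sketch of a tweet-aware tokenizer in Python. It is not TweetNLP's own tokenizer and does not reproduce its rules; the pattern, the name TOKEN_PATTERN, and the function tokenize_tweet are assumptions made for illustration. It only shows why tweets need special handling: URLs, @-mentions, hashtags, and emoticons have to be matched before generic word and punctuation rules split them apart.

```python
import re

# Hypothetical pattern for illustration only; not the rule set used by TweetNLP.
# Order matters: earlier alternatives win, so URLs, mentions, hashtags, and
# emoticons are matched before the generic word and symbol rules.
TOKEN_PATTERN = re.compile(
    r"""
    https?://\S+          # URLs
  | [@\#]\w+              # @mentions and hashtags
  | [:;=]-?[()DPp]        # a few Western-style emoticons such as :) ;-( =D
  | \w+(?:'\w+)*          # words, including contractions like don't
  | [^\w\s]               # any other single non-space symbol
    """,
    re.VERBOSE,
)


def tokenize_tweet(text):
    """Split a tweet into a list of tokens using TOKEN_PATTERN."""
    return TOKEN_PATTERN.findall(text)


if __name__ == "__main__":
    print(tokenize_tweet("@user I luv #NLP :) see http://t.co/abc !!"))
    # prints ['@user', 'I', 'luv', '#NLP', ':)', 'see', 'http://t.co/abc', '!', '!']
```

A whitespace split would keep "#NLP" intact but could not separate punctuation from words, while a generic word tokenizer would break ":)" and the URL into pieces. This sketch has known gaps (for example, a URL immediately followed by punctuation keeps that punctuation), which is one reason a trained, purpose-built toolkit is preferable in practice.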

TweetNLP also includes annotated corpora and web-based annotation tools that support the development and evaluation of NLP models for tweets. These resources give researchers and developers both data and software for tweet-oriented NLP tasks.

Overall, TweetNLP is a useful resource for anyone analyzing tweets with natural language processing techniques: a free, open-source toolkit of purpose-built tools and data, developed at Carnegie Mellon University.

See also:

100 Best Earthquake Twitter Bots | 100 Best Twitter Quote Bots | Marcus Endicott’s Twitter Bot Timeline (2008 – 2015) | RapidMiner & Twitter | Third Party Twitter Tools & Services | Twitter Bot News Timeline (2009 – 2019) | Twitter Bots 2018 | Twitter Bots Meta Guide | Twitter Semantics | Twitter4J & Natural Language 2019

Towards building large-scale distributed systems for twitter sentiment analysis VN Khuc, C Shivade, R Ramnath… – Proceedings of the 27th …, 2012 – dl.acm.org … to zero. If we denote as Ci the set of indices c for which the coordinates k’ic of vector w’i are different from zero, then (1) can be rewritten as: [cosine_sim equation] … 1 http://code.google.com/p/ark-tweet-nlp/ … Cited by 11 Related articles All 2 versions

Entity-centric topic-oriented opinion summarization in twitter X Meng, F Wei, X Liu, M Zhou, S Li… – Proceedings of the 18th …, 2012 – dl.acm.org … The aim is to assess the coherency of the produced #hashtags topics. We conduct the tokenization on the evaluation corpus with ark-tweet-nlp [10], and then remove stopwords, numbers (1.2, 100, $5 etc.), URL and words starting with “@” (accounts in Twitter). … Cited by 16 Related articles All 2 versions

Cross-lingual geo-parsing for non-structured data J Gelernter, W Zhang – Proceedings of the 7th Workshop on Geographic …, 2013 – dl.acm.org … 12 The compiled code is on Google Code; http://code.google.com/p/ark-tweet-nlp/downloads/list and the full code is on GitHub https://github.com/brendano/ark-tweet-nlp/ 13 http://code.google.com/p/mate-tools/downloads/list … Cited by 3 Related articles All 3 versions

Part-of-Speech is (almost) enough: SAP Research & Innovation at the #Microposts2014 NEEL Challenge D Dahlmeier, N Nandan, W Ting – Making Sense of Microposts (# …, 2014 – ceur-ws.org … We perform tokenization and POS tagging using the Tweet NLP toolkit [4], lookup word cluster indicators for each token from the Brown clusters released by Turian et al. [6], and annotate the tweets with the DBpedia Spotlight web API. … Related articles All 2 versions

Interest mining from user tweets T Vu, V Perez – Proceedings of the 22nd ACM international conference …, 2013 – dl.acm.org … First, we consider all the preprocessing used in [6]. Second, we propose using the Tweet NLP Tagger1 with our normalization strategies to … 1 http://www.ark.cs.cmu.edu/TweetNLP/ 2 http://www.speech.sri.com/projects/srilm 3 http://www-formal.stanford.edu/jsierra/cs1931-project … Related articles

Sarcasm Detection on Czech and English Twitter T Ptáček, I Habernal, J Hong – anthology.aclweb.org … The Ark-tweet-nlp tool (Gimpel et al., 2011) offers precisely that and although it was developed and tested in English, it yields satisfactory results in Czech as well. … Part of speech tagging was done using the Ark-tweet-nlp tool (Gimpel et al., 2011). 5 Results …

Sentiment analysis in czech social media using supervised machine learning I Habernal, T Ptáček, J Steinberger – Proceedings of the 4th Workshop …, 2013 – aclweb.org … of social media. Although Ark-tweet-nlp tool (Gimpel et al., 2011) was developed and tested in English, it yields satisfactory results in Czech as well, according to our initial experiments on the Facebook corpus. Its significant … Cited by 7 Related articles All 7 versions

Mobile sentiment analysis L Chambers, E Tromp, M Pechenizkiy… – Proceedings of the 16th …, 2012 – eprints.port.ac.uk … Future work is to include experimentation on a wider set of Android mobile phones with differing memory and Android operating systems, and inclusion of analysis on other POS taggers implemented in Java and those specifically aimed at Twitter data such as Tweet NLP. … Cited by 3 Related articles All 5 versions

Approaches to Automatically Constructing Polarity Lexicons for Sentiment Analysis on Social Networks VN Khuc – 2012 – rave.ohiolink.edu … 3.4.1. Co-occurrence matrix. Co-occurrence matrix A (N is the number of words) contains information about how many times the word i co-occurs with the word j. The MapReduce job for calculating the co-occurrence … 1 http://code.google.com/p/ark-tweet-nlp/ … Related articles All 3 versions

Weakly Supervised User Profile Extraction from Twitter J Li, A Ritter, E Hovy – 2014 – cs.cmu.edu … attributes are randomly assigned. It is interesting to note that Education exhibits a much stronger HOMOPHILY property than Job. Such affinity demonstrates that … 14 https://code.google.com/p/ark-tweet-nlp/downloads/list … Cited by 2 Related articles All 2 versions

How noisy social media text, how diffrnt social media sources T Baldwin, P Cook, M Lui, A MacKinlay… – Proceedings of the 6th …, 2013 – aclweb.org … In line with the findings of Read et al. (2012a) based on experimentation with a selection of sentence tokenisers over user-generated content, we sentence-tokenise with tokenizer.4 Finally, we tokenise and POS tag the datasets using TweetNLP 0.3 (Owoputi et al., 2013). … Cited by 13 Related articles All 6 versions

Supervised sentiment analysis in Czech social media I Habernal, T Ptáček, J Steinberger – Information Processing & …, 2014 – Elsevier … of social media. Although Ark-tweet-nlp tool (Gimpel et al., 2011) was developed and tested in English, it yields satisfactory results in Czech as well, according to our initial experiments on the Facebook corpus. Its significant …

Mining divergent opinion trust networks through latent dirichlet allocation N Dokoohaki, M Matskin – … of the 2012 International Conference on …, 2012 – dl.acm.org … We have used Carnegie Mellon university’s TweetNLP 3 tool set [28]. In this tool set authors propose for a tag set, annotated data and features. We used TweetNLP for tokenization and part-of-speech tagging. Figure 2 shows a 3TweetNLP,http://www.ark.cs.cmu.edu/TweetNLP/ … Cited by 2 Related articles All 5 versions

Text Insights: Natural Language Analytics for Understanding Social Media Engagement F Grimm, M Hartung, P Cimiano – Proceedings of the …, 2014 – pub.uni-bielefeld.de … 2. Part-of-Speech Tagging. NLTK and Ark-Tweet-NLP [3]5 part-of-speech taggers are used to assign word classes to each token. … 5 http://www.ark.cs.cmu.edu/TweetNLP/, version 0.3.2. 6 http://www.nltk.org/api/nltk.stem.html 7 https://www.facebook.com/thehypertensionhub …

Part-of-speech tagging for Twitter: Word clusters and other advances O Owoputi, B O’Connor, C Dyer, K Gimpel, N Schneider – 2012 – ra.adm.cs.cmu.edu … released as TweetNLP version 0.3, along with the new annotated data and large-scale word clusters at http://www.ark.cs.cmu.edu/TweetNLP. … 6 Included with the tagger, and accessible online at https://github.com/brendano/ark-tweet-nlp/blob/master/docs/annot_guidelines.md … Cited by 16 Related articles All 10 versions

Automatic Domain-Specific Sentiment Lexicon Generation with Label Propagation YJ Tai, HY Kao – … of International Conference on Information Integration …, 2013 – dl.acm.org … For the goal to extract our candidate words, we use the Ark-TweetNLP [19] as our POS tagging parser. … The main difference between Ark-TweetNLP and other parsers is that Ark-TweetNLP is built from tweet corpus. They manually annotate tweet corpus and train the POS tagger. … Related articles

Semantic sentiment analysis of twitter H Saif, Y He, H Alani – The Semantic Web–ISWC 2012, 2012 – Springer … In this work, we build various NB classifiers trained using a combination of word unigrams and POS features and use them as baseline models. We extract the POS features using the TweetNLP POS tagger,9 which is trained specifically from tweets. … Cited by 47 Related articles All 10 versions

Evaluation datasets for twitter sentiment analysis H Saif, M Fernandez, Y He, H Alani – Proceedings, 1st Workshop …, 2013 – researchgate.net … To extract the number of unigrams, we use the TweetNLP tokenizer [7], which is specifically built to work on tweets data.9 Note that we considered all tokens found in the tweets including words, numbers, URLs, emoticons, and special characters (eg, question marks, intensifiers … Cited by 2 Related articles All 4 versions

Experiments with crowdsourced re-annotation of a POS tagging data set D Hovy, B Plank, A Søgaard – dirkhovy.com … 2 http://www.ark.cs.cmu.edu/TweetNLP/ 3 http://crowdflower.com 3 Crowdsourcing Sequential Annotation … The crowdsourced annotations 5 https://code.google.com/p/wikily-supervised-pos-tagger/ 6 http://www.ark.cs.cmu.edu/TweetNLP/ … Related articles All 2 versions

Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters. O Owoputi, B O’Connor, C Dyer, K Gimpel, N Schneider… – HLT-NAACL, 2013 – aclweb.org … Tagging software, annotation guidelines, and large-scale word clusters are available at: http://www.ark.cs.cmu.edu/TweetNLP This paper describes release 0.3 of the “CMU Twitter Part-of-Speech Tagger” and annotated data. 1 Introduction … Cited by 78 Related articles All 10 versions

Event Detection in Twitter using Aggressive Filtering and Hierarchical Tweet Clustering. G Ifrim, B Shi, I Brigadir – SNOW-DC@ WWW, 2014 – insight-centre.org … Note the log in the denominator, allowing the current document frequency to have more weight than the previous/historical average frequency. Another important focus is on tweet NLP in order to recognize named entities. … 4 http://www.ark.cs.cmu.edu/TweetNLP/ … Cited by 2 Related articles All 2 versions

Sentiment Analysis and Opinion Mining of Microblogs K Shah, N Munshi, P Reddy – 2013 – cs.uic.edu … [5] Weka 3, machine learning software suite http://www.cs.waikato.ac.nz/ml/weka/ [6] LibSVM, a library of SVM http://www.csie.ntu.edu.tw/~cjlin/libsvm/ [7] TweetNLP and POS tagging http://www.ark.cs.cmu.edu/TweetNLP/ … Related articles

Lexical normalization for social media text B Han, P Cook, T Baldwin – … on Intelligent Systems and Technology (TIST …, 2013 – dl.acm.org Page 1. 5 Lexical Normalization for Social Media Text BO HAN, NICTA Victoria Research Laboratory and The University of Melbourne PAUL COOK, The University of Melbourne TIMOTHY BALDWIN, NICTA Victoria Research Laboratory and The University of Melbourne … Cited by 16 Related articles All 5 versions

SwatCS: Combining simple classifiers with estimated accuracy S Clark, R Wicentowski – Atlanta, Georgia, USA, 2013 – aclweb.org … tweets. For parts of this study, their data was supplemented with external data (Go et al., 2009). As part of pre-processing, all tweets were part-of-speech tagged using the ARK TweetNLP tools (Owoputi et al., 2013). All punctuation … Cited by 3 Related articles All 7 versions

Adapting taggers to Twitter with not-so-distant supervision B Plank, D Hovy, R McDonald, A Søgaard – 2014 – aclweb.org … 4 http://www.chokkan.org/software/crfsuite/ 5 http://www.ark.cs.cmu.edu/TweetNLP/ 6 http://nlp.stanford.edu/software/CRF-NER.shtml … github.com/miso-belica/jusText 14 ftp://ftp.cs.cornell.edu/pub/smart/english.stop 15 https://github.com/brendano/ark-tweet-nlp … Cited by 1

Twitter Part-of-Speech Tagging for All: Overcoming Sparse and Noisy Data. L Derczynski, A Ritter, S Clark, K Bontcheva – RANLP, 2013 – aclweb.org … of tags that are accurately assigned. Where possible we report performance on “unknown” words – those that 1http://www.ark.cs.cmu.edu/TweetNLP/ 2https://github.com/aritter/twitter nlp Tagger T-dev D-dev Token Sentence … Cited by 17 Related articles All 7 versions

SU-FMI: System Description for SemEval-2014 Task 9 on Sentiment Analysis in Twitter B Velichkov, B Kapukaranov, I Grozev… – SemEval …, 2014 – anthology.aclweb.org … Sentiment of the last token; • Overall sentiment terms count. 3 http://snowball.tartarus.org/ 4 http://www.ark.cs.cmu.edu/TweetNLP/cluster_viewer.html 5 http://github.com/percyliang/brown-cluster We further used the following …

A pipeline tweet contextualization system at inex 2013 K Ansary, AT Tran, NK Tran – 2013 – ims-sites.dei.unipd.it … be found at http://www. ark.cs.cmu.edu/TweetNLP/annot_guidelines.pdf. After tokenizing the tweet, we employed several heuristics to detect the key phrases as overlapping consecutive tokens. For example, we restricted that … Cited by 1 Related articles

NER from Tweets: SRI-JU System @ MSM 2013 A Das, U Burman, AR Balamurali… – Making Sense of …, 2013 – ceur-ws.org … 1 http://www.ark.cs.cmu.edu/TweetNLP/ 2 http://dsal.uchicago.edu/dictionaries/biswas-bengali/ 3 http://www.csse.unimelb.edu.au/~tim/etc/emnlp2012-lexnorm.tgz … Cited by 1 Related articles All 4 versions

Data-Mining Twitter and the Autism Spectrum Disorder: A Pilot Study A Beykikhoshk, O Arandjelovic, D Phung, S Venkatesh… – deakin.edu.au … In this context it is beneficial to have different inflections of the same word normalized and represented by a single term. In linguistics this process is referred to as lemmatization and we apply it automatically using the freely available TweetNLP software package [18]. …

Predicting the NFL using Twitter S Sinha, C Dyer, K Gimpel, NA Smith – arXiv preprint arXiv:1310.6998, 2013 – arxiv.org … have been attributed to bettors overvaluing recent success and undervaluing recent failures [24], cases in which home teams are underdogs [5], large-audience games, including Super Bowls [6], and extreme gameday temperatures [3]. 5 www.ark.cs.cmu.edu/TweetNLP … Cited by 1 Related articles All 11 versions

Indian Institute of Technology-Patna: Sentiment Analysis in Twitter V Singh, AM Khan, A Ekbal – SemEval 2014, 2014 – anthology.aclweb.org … 1 http://alt.qcri.org/semeval2014/task9/ 2 http://www.ark.cs.cmu.edu/TweetNLP/ 2.2 Approach Our approach is based on supervised machine learning. We explored different models such as naive Bayes, decision tree and support vector machine. …

RTRGO: Enhancing the GU-MLT-LT System for Sentiment Analysis of Short Messages J Vancoppenolle, R Johansson – SemEval 2014, 2014 – aclweb.org … ark.cs.cmu.edu/TweetNLP • The word stems of the normalized tokens, reducing inflected forms of a word to a common form. The stems were computed using the Porter stemmer algorithm (Porter, 1980) • The IDs of the token’s word clusters. …

GU-MLT-LT: Sentiment Analysis of Short Messages using Linguistic Features and Stochastic Gradient Descent O Wijksgatan, L Furrer – Atlanta, Georgia, USA, 2013 – aclweb.org … 3 http://www.ark.cs.cmu.edu/TweetNLP … 3.3 Machine Learning Methods For the classification of the messages into the positive, negative and neutral classes we use three linear models, which were trained in an one-vs.-all manner. … Related articles All 6 versions

UT-DB: an experimental study on sentiment analysis in twitter Z Zhu, D Hiemstra, PMG Apers, A Wombacher – 2013 – eprints.eemcs.utwente.nl … So we map the words ranging from −5 to −1 in SentiStrength to negative in our grading system, and the words ranging from +1 to +5 to positive. The rest are mapped to neutral. 2 http://www.ark.cs.cmu.edu/TweetNLP/ 3 http://sentistrength.wlv.ac.uk/ … Related articles All 14 versions

Dynamic Language Models for Streaming Text D Yogatama, C Wang, BR Routledge, NA Smith… – cs.cmu.edu … We look at tweets from the period 2011-01-01 to 2012-09-30 (639 days). As a result, we have approximately 100–800 tweets per day. We tokenized the tweets using the CMU ARK TweetNLP tools,12 numerical terms are mapped to a single word, and all letters are downcased. … Related articles All 3 versions

Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut N Schneider, E Danchik, C Dyer, NA Smith – cs.cmu.edu Page 1. Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut Nathan Schneider Emily Danchik Chris Dyer Noah A. Smith School of Computer Science Carnegie Mellon University Pittsburgh, PA … Related articles All 5 versions

A Simple Bayesian Modelling Approach to Event Extraction from Twitter D Zhou, L Chen, Y He – Atlantis, 2011 – demeter.inf.ed.ac.uk … The generative process of LEM is shown below. 4http://www.ark.cs.cmu.edu/TweetNLP • Draw the event distribution ?e ? Dirichlet(?) • For each event e ? {1..E}, draw multinomial distri- butions ?e ? Dirichlet(?), ?e ? Dirichlet(?), ?e ? Dirichlet(?), ?e ? Dirichlet(?). … Related articles All 2 versions

TeamX: A Sentiment Analyzer with Enhanced Lexicon Mapping and Weighting Scheme for Unbalanced Data Y Miura, S Sakaki, K Hattori, T Ohkuma – SemEval 2014, 2014 – aclweb.org … A Logistic Regression is trained using the features of Section 3.2 with the three polarities (positive, negative, and neutral) as labels. 6 The total number of lexical features is 7 × 2 × 4 = 56. 7 http://www.ark.cs.cmu.edu/TweetNLP/ …

Part-of-speech tagging for twitter: Annotation, features, and experiments K Gimpel, N Schneider, B O’Connor, D Das… – Proceedings of the 49th …, 2011 – dl.acm.org … only 1.7% absolute. 5 Conclusion We have developed a part-of-speech tagger for Twitter and have made our data and tools available to the research community at http://www.ark.cs.cmu.edu/TweetNLP. More generally, we … Cited by 218 Related articles All 18 versions

A Participant-based Approach for Event Summarization Using Twitter Streams. C Shen, F Liu, F Weng, T Li – HLT-NAACL, 2013 – aclweb.org … We formulate the participant detection in a hierarchical agglomerative clustering framework. The CMU TweetNLP tool (Gimpel et al., 2011) was used for proper noun tagging. The proper nouns (aka, mentions) are grouped into clusters in a bottom-up fashion. … Cited by 3 Related articles All 7 versions

“Translation can’t change a name”: Using Multilingual Data for Named Entity Recognition M Faruqui – arXiv preprint arXiv:1405.0701, 2014 – arxiv.org Page 1. “Translation can’t change a name”: Using Multilingual Data for Named Entity Recognition Manaal Faruqui Language Technologies Institute Carnegie Mellon University Pittsburgh, PA, 15213, USA mfaruqui@cs.cmu.edu Abstract … Related articles All 2 versions

Contextual Sentiment Analysis in Social Media Using High-Coverage Lexicon A Muhammad, N Wiratunga, R Lothian… – … and Development in …, 2013 – Springer … lemma. We use Stanford CoreNLP 1 pipeline for sentence split and lemmatization. However, we use TweetNLP 2 [ 9 ] for tokenization and PoS tagging because it recognises social media symbols such as emoticons. Stemming … Related articles All 2 versions

Bi-directional Inter-dependencies of Subjective Expressions and Targets and their Value for a Joint Model. R Klinger, P Cimiano – ACL (2), 2013 – aclweb.org … Sec- 2http://verbs.colorado.edu/jdpacorpus/ 3http://nlp.uned.es/˜damiano/datasets/ entityProfiling_ORM_Twitter.html 4In version 0.3, http://www.ark.cs.cmu.edu/ TweetNLP/ Car Camera Twitter Texts 457 178 9238 Targets 11966 4516 1418 Subjectives 15056 5128 1519 … Cited by 2 Related articles All 5 versions

ECNU: Expression- and Message-level Sentiment Orientation Classification in Twitter Using Multiple Effective Features J Zhao, M Lan, TT Zhu – SemEval 2014, 2014 – aclweb.org … ark.cs.cmu.edu/TweetNLP/ … data set directly from last year, Twitter2013 is a twitter data set directly from last year, Twitter2014 is a new twitter data set and Twitter2014Sarcasm is a collection of tweets that contain sarcasm. …

Improving Twitter Retrieval by Exploiting Structural Information. Z Luo, M Osborne, S Petrovic, T Wang – AAAI, 2012 – aaai.org … The tag “URL_I” also has low F1 value. The reason is that some of links has been wrongly tokenized by Twitter 4We used a part-of-speech tweet tagger http://www.ark.cs.cmu. edu/TweetNLP 5http://flexcrfs.sourceforge.net/ 650 Page 4. tokenizer (O’Connor et al. 2010). … Cited by 3 Related articles All 5 versions

Crowdsourcing and annotating NER for Twitter #drift H Fromreide, D Hovy, A Søgaard – lrec-conf.org … 3 https://github.com/percyliang/brown-cluster 4 http://wacky.sslmit.unibo.it/ 5 http://www.ark.cs.cmu.edu/TweetNLP/cluster_viewer.html 6 http://geonames.org 7 http://optima.jrc.it/data/entities.gzip … Cited by 1 Related articles All 2 versions

Detecting Natural Disaster Events on Twitter across Languages. A Zielinski – IIMSS, 2013 – books.google.com … earthquake-topic 3 http://www.ark.cs.cmu.edu/TweetNLP/ 4 http://code.google.com/p/language-detection/ … 3.3. CLTC/Features in different … Related articles All 5 versions

A Comparison of Sequential and Topic models for Named Entity Recognition on Tweets Y Chen, X Yan, W Zhang – cs.cmu.edu … 3.1 Tokenizing The tokenizer for tweets originally comes from Ark-Tweet-NLP [16], though we did minor modifications on it. Table 2 shows an example of how the tokenizer for tweets could tokenize the special strings that appear in tweets (eg the emotional icon “:)”) … Related articles All 2 versions

Early Informal Language Structure Extraction Prototype ZL THU, SW THU, CD Date – 2012 – xlike.org … the pipeline. In particular we have assembled a new pipeline for informal English, targeted at tweets, consisting of: • Tokenization and PoS tagging, using the TweetNLP tools from Carnegie Mellon University [6]. • A named entity … Related articles

Twitter Sentiment Analysis: On Feature Engineering, Classifier Performance and Realtime Tracking N Haldenwang – 2013 – inf.uos.de Page 1. Institut für Informatik Masterarbeit Twitter Sentiment Analysis: On Feature Engineering, Classifier Performance and Realtime Tracking Nils Haldenwang September 2013 Erstgutachter: Prof. Dr. Oliver Vornberger Zweitgutachterin: Prof. Dr. Elke Pulvermüller Page 2. … Related articles All 4 versions

Towards automatic assessment of the social media impact of news content T De Nies, G Haesendonck, F Godin… – Proceedings of the …, 2013 – dl.acm.org … sentences. Therefore, standard approaches such as NER fail. Newly developed algorithms such as TweetNLP [6] could potentially offer solution here. Additionally, we need to take multiple languages into account. For example … Related articles All 7 versions

Sentiment Analysis of Microblogs T Günther – Examensarbete handlett av R. Johansson, MLT- …, 2013 – tobias.io Page 1. Thesis for the degree of Master in Language Technology Sentiment Analysis of Microblogs Tobias Günther Supervisor: Richard Johansson Examiner: Prof. Torbjörn Lager June 2013 Page 2. Contents 1 Introduction 3 1.1 Sentiment Analysis . . . . . … Cited by 2 Related articles All 2 versions

CRF to find stock price correlation with company related Twitter sentiment E SHABUNINA – 2013 – politesi.polimi.it Page 1. POLITECNICO DI MILANO Scuola di Ingegneria dell’Informazione POLO TERRITORIALE DI COMO Master of Science in Computer Engineering CRF to find stock price correlation with company-related Twitter sentiment Supervisor: Prof. Marco Brambilla … All 2 versions

KLUE: Simple and robust methods for polarity classification T Proisl, P Greiner, S Evert, B Kabashi – Proceedings of the seventh …, 2013 – aclweb.org … the SMS test set. 14http://www. ark. cs. cmu. edu/TweetNLP/ 400 Page 437. References Thorsten Brants and Alex Franz. 2006. Web 1T 5-gram Version 1. Linguistic Data Consortium, Philadelphia, PA. Kevin Gimpel, Nathan … Cited by 5 Related articles All 7 versions

Geo-Coding for the Mapping of Documents and Social Media Messages J Gelernter – 2013 – DTIC Document … Proceedings of the Annual Meeting of the Association for Computational Linguistics, Portland, OR, USA, June 2011, 42-47. Http://www.ark.cs.cmu.edu/TweetNLP Goldberg, DW, Cockburn, MG (2010a). Improving Geocode Accuracy with Candidate Selection Criteria. … Related articles

Meta-Level Sentiment Models for Big Social Data Analysis F Bravo-Marquez, M Mendoza, B Poblete – Knowledge-Based Systems, 2014 – Elsevier People react to events, topics and entities by expressing their personal opinions and emotions. These reactions can correspond to a wide range of intensities, f. Related articles All 2 versions

Propagated Opinion Retrieval in Twitter Z Luo, J Tang, T Wang – Web Information Systems Engineering–WISE …, 2013 – Springer … It shows that humans are capable of judging which tweets are propagated opinions from those which are not. 8 http://www.ark.cs.cmu.edu/TweetNLP/ Page 9. 24 Z. Luo, J. Tang, and T. Wang 6.2 Experimental Settings and Baselines … Related articles

The utility of social and topical factors in anticipating repliers in twitter conversations J Schantl, R Kaiser, C Wagner… – … of the 5th Annual ACM Web …, 2013 – dl.acm.org … information. We calculate the similarity of the concept-vector of user a and the concept vector of user c using the cosine similarity which 4http://www.ark.cs.cmu.edu/TweetNLP/ 5http://dbpedia.org 379 Page 5. is defined as follows: … Cited by 2 Related articles All 6 versions

Learning part-of-speech taggers with inter-annotator agreement loss B Plank, D Hovy, A Søgaard – Proceedings of EACL, 2014 – cst.dk … The cost-sensitive model is 5http://www.ark.cs.cmu.edu/TweetNLP/ 6http://oak.dcs.shef.ac.uk/ msm2013/ie_ challenge/ able to improve performance on two out of the three test sets, while being slightly below baseline performance on the MSM challenge data. … Cited by 3 Related articles All 4 versions

Time-aware topic-based contextualization NK Tran – Proceedings of the companion publication of the 23rd …, 2014 – dl.acm.org … events) and temporal information. 1https://inex.mmci.uni-saarland.de/tracks/qa/ 2http://www.ark.cs.cmu.edu/TweetNLP/ 3http://www.lemurproject.org/ 4http://www. summarization.com/mead/ 18 Page 5. 6. CONCLUSION In this … Related articles

A unified model for topics, events and users on Twitter Q Diao, J Jiang – 2013 – ink.library.smu.edu.sg Page 1. Singapore Management University Institutional Knowledge at Singapore Management University Research Collection School Of Information Systems School of Information Systems 10-2013 A unified model for topics, events and users on Twitter … Cited by 3 Related articles All 5 versions

Detecting Newsworthy Topics in Twitter. S Van Canneyt, M Feys, S Schockaert… – SNOW-DC@ …, 2014 – ceur-ws.org … Non-English tweets were removed using LDIG2. To calculate the term frequencies in the obtained tweets, TweetNLP [7] was used to tokenize the tweets and to remove words 2https://github.com/ shuyo/ldig Page 6. related to punctuations, URLs, determiners, etc. … Cited by 1 Related articles All 2 versions

MITEXTEXPLORER: Linked brushing and mutual information for exploratory text data analysis B O’Connor – anyall.org … 4For traditional text, the tool currently uses Stanford CoreNLP; for Twitter, CMU ARK TweetNLP. Page 8. Figure 6: MITEXTEXPLORER for paper titles in the ACL Anthology (Radev et al., 2009). Y-axis is venue (conference or journal name), X-axis is year of publication. … Cited by 1 Related articles All 3 versions

TUGAS: Exploiting Unlabelled Data for Twitter Sentiment Analysis S Amir, M Almeida, B Martins, J Filgueiras… – SemEval …, 2014 – anthology.aclweb.org … networks capable of producing continuous representations of words (Mikolov 1 http://www.ark.cs.cmu.edu/TweetNLP/ … Lexicon #1-grams #2-grams #pairs Bing Liu 6789 – MPQA 8222 – SentiStrength 2546 – NRC …

Scat: A System For Concept Annotation Of Tweets S Sachidanandan – 2014 – web2py.iiit.ac.in … semantics contained in it, especially because, we are defining the pertinent concepts based on the hashtags, phrases and proper nouns present in the tweet. We use tweet NLP pos tagger proposed in [10] to encode the tweets based on their POS tag sequences and uses …

MLSA13 – Proceedings of “Machine Learning and Data Mining for Sports Analytics”, workshop @ ECML/PKDD 2013 A Zimmermann, J Davis – CW Reports, 2013 – lirias.kuleuven.be … have been attributed to bettors overvaluing recent success and undervaluing recent failures [24], cases in which home teams are underdogs [5], large-audience games, including Super Bowls [6], and extreme gameday temperatures [3]. 5 www.ark.cs.cmu.edu/TweetNLP … Related articles All 4 versions

HG-RANK: A Hypergraph-based Keyphrase Extraction for Short Documents in Dynamic Genre A Bellaachia, M Al-Dhelaan – Making Sense of Microposts (# …, 2014 – ceur-ws.org … Then, for each word w in each document d, we resample the 6 http://www.ark.cs.cmu.edu/TweetNLP/ 7 http://tartarus.org/martin/PorterStemmer/ · #Microposts2014 · 4th Workshop on Making Sense of Microposts · @WWW2014 … Cited by 1 Related articles All 2 versions

Scalable topic-specific influence analysis on microblogs B Bi, Y Tian, Y Sismanis, A Balmin, J Cho – Proceedings of the 7th ACM …, 2014 – dl.acm.org Page 1. Scalable Topic-Specific Influence Analysis on Microblogs Bin Bi UCLA bbi@cs.ucla.edu Yuanyuan Tian IBM Almaden Research Center ytian@us.ibm.com Yannis Sismanis? Google yannis@google.com Andrey Balmin* GraphSQL andrey@graphsql.com … Related articles All 7 versions

Short text keyphrase extraction with hypergraphs A Bellaachia, M Al-Dhelaan – Progress in Artificial Intelligence, 2014 – Springer Page 1. Prog Artif Intell DOI 10.1007/s13748-014-0058-1 REGULAR PAPER Short text keyphrase extraction with hypergraphs Abdelghani Bellaachia · Mohammed Al-Dhelaan Received: 17 March 2014 / Accepted: 1 August 2014 © Springer-Verlag Berlin Heidelberg 2014 …

STRAP: System for Twitter Ranking and Prediction KT Chen, A Kubati, R Mead, S Paul, M Marcus – seas.upenn.edu Page 1. STRAP: System for Twitter Ranking and Prediction Kuan-Ting Chen kche@seas.upenn. edu Univ. of Pennsylvania Philadelphia, PA Alen Kubati kubatial@seas.upenn.edu Univ. of Pennsylvania Philadelphia, PA Robert Mead robmead@seas.upenn.edu Univ. … Related articles All 2 versions

Sentiment Analysis Meets Information Retrieval U Barman – 2013 – dspace.jdvu.ac.in Page 1. Sentiment Analysis Meets Information Retrieval Thesis submitted to the Faculty of Engineering & Technology, Jadavpur University In partial fulfillment of the requirements for the Degree Of Master of Technology in Computer Technology … Related articles All 2 versions

Fad or here to stay: Predicting product market adoption and longevity using large scale, social media data S Tuarob, CS Tucker – ASME 2013 …, 2013 – … .asmedigitalcollection.asme.org … punctuation. All terms in the tweet content is tagged with part-of-speech using the Carnegie Mellon ARK Twitter POS Tagger8 [9], and only noun terms 8 http://www.ark.cs.cmu.edu/TweetNLP/ … Cited by 7 Related articles All 9 versions

Social Media in Disaster Relief PM Landwehr, KM Carley – Data Mining and Knowledge Discovery for Big …, 2014 – Springer Page 1. WW Chu (ed.), Data Mining and Knowledge Discovery for Big Data, Studies in Big Data 1, 225 DOI: 10.1007/978-3-642-40837-3_7, © Springer-Verlag Berlin Heidelberg 2014 Social Media in Disaster Relief Usage Patterns … Related articles All 2 versions

Social Media in Disaster Relief Usage Patterns, Data Mining Tools, and Current Research Directions PM Landwehr, KM Carley – cs.cmu.edu Page 1. Social Media in Disaster Relief Usage Patterns, Data Mining Tools, and Current Research Directions Peter M. Landwehr and Kathleen M. Carley Carnegie Mellon University Pittsburgh, Pennsylvania e-mail: {plandweh, kathleen.carley}@cs.cmu.edu … Related articles All 2 versions

An Examination of High School and College Students’ Chatspeak Use in Twitter and Tumblr MJ Charters – 2014 – surface.syr.edu Page 1. Syracuse University SURFACE …

A framework for (under)specifying dependency syntax without overloading annotators N Schneider, B O’Connor, N Saphra… – arXiv preprint arXiv: …, 2013 – arxiv.org Page 1. A Framework for (Under)specifying Dependency Syntax without Overloading Annotators Nathan Schneider Brendan O’Connor Naomi Saphra David Bamman Manaal Faruqui Noah A. Smith Chris Dyer School of Computer Science Carnegie Mellon University … Cited by 3 Related articles All 9 versions

A Learning Method to Geocode Location Expressions in Twitter Messages J Gelernter, W Zhang – Journal of Spatial Information Science, 2014 – josis.org Page 1. Paper under review Please do not cite JOURNAL OF SPATIAL INFORMATION SCIENCE © by the authors Licensed under Creative Commons Attribution 3.0 License A Preference Learning Method to Geocode Location Expressions in Twitter Messages … Related articles All 5 versions

Recurrent Chinese Restaurant Process with a Duration-based Discount for Event Identification from Twitter Q Diao, J Jiang – SIAM Page 1. Recurrent Chinese Restaurant Process with a Duration-based Discount for Event Identification from Twitter Qiming Diao Jing Jiang Abstract Due to the fast development of social media on the Web, Twitter has become … Related articles

Web Information Systems Engineering–WISE 2013 X Lin, Y Manolopoulos, D Srivastava, G Huang – Springer Page 1. Xuemin Lin Yannis Manolopoulos Divesh Srivastava Guangyan Huang (Eds.) LNCS 8181 14th International Conference Nanjing, China, October 2013 Proceedings, Part II Web Information Systems Engineering – WISE 2013 Page 2. … Related articles All 2 versions

An algorithm for local geoparsing of microtext J Gelernter, S Balaji – GeoInformatica, 2013 – Springer Page 1. An algorithm for local geoparsing of microtext Judith Gelernter & Shilpa Balaji Received: 22 March 2012 /Revised: 12 October 2012 Accepted: 5 November 2012 /Published online: 27 January 2013 © Springer Science+Business Media New York 2013 … Cited by 7 Related articles All 4 versions

Lexical Semantic Analysis in Natural Language Text N Schneider – 2012 – cs.cmu.edu Page 1. Lexical Semantic Analysis in Natural Language Text ph.d. thesis proposal Nathan Schneider Language Technologies Institute School of Computer Science Carnegie Mellon University November 17, 2012 0 Introduction … Related articles All 2 versions