Apache OpenNLP 2016


Notes:

The Apache OpenNLP library is a machine-learning-based toolkit for processing natural language text. It includes maximum entropy and perceptron-based machine learning.
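
As a rough illustration of what "perceptron based" learning means, the sketch below trains a minimal binary perceptron in plain Java. It is not OpenNLP's implementation and does not use the OpenNLP API; the class name PerceptronSketch, the toy feature names, and the training data are all hypothetical, chosen only to show the mistake-driven weight update.

```java
import java.util.Arrays;

/** Minimal binary perceptron; an illustration only, not OpenNLP's code. */
final class PerceptronSketch {
    private final double[] weights;
    private double bias;

    PerceptronSketch(int featureCount) {
        this.weights = new double[featureCount];
    }

    /** Returns +1 or -1 depending on the sign of the linear score. */
    int predict(double[] features) {
        double score = bias;
        for (int i = 0; i < weights.length; i++) {
            score += weights[i] * features[i];
        }
        return score >= 0 ? 1 : -1;
    }

    /** Mistake-driven update: weights change only on misclassified examples. */
    void train(double[][] inputs, int[] labels, int epochs) {
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int n = 0; n < inputs.length; n++) {
                if (predict(inputs[n]) != labels[n]) {
                    for (int i = 0; i < weights.length; i++) {
                        weights[i] += labels[n] * inputs[n][i];
                    }
                    bias += labels[n];
                }
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical toy features: [startsWithCapital, followsPeriod]
        double[][] x = {{1, 1}, {1, 0}, {0, 1}, {0, 0}};
        // Label +1 only when both features fire (a linearly separable AND)
        int[] y = {1, -1, -1, -1};
        PerceptronSketch p = new PerceptronSketch(2);
        p.train(x, y, 20);
        for (double[] row : x) {
            System.out.println(Arrays.toString(row) + " -> " + p.predict(row));
        }
    }
}
```

The toy data is linearly separable, so the update rule settles on weights that classify all four rows correctly well within the 20 epochs used here. A production tagger would use many sparse features and multiple classes rather than two dense features and a binary label.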

Resources:

Wikipedia:

See also:

100 Best Apache OpenNLP Videos | Apache OpenNLP & Coreference Resolution 2016 | Apache OpenNLP & Dialog Systems | OpenCCG (OpenNLP CCG Library)


TextImager: a Distributed UIMA-based System for NLP.
W Hemati, T Uslu, A Mehler – COLING (Demos), 2016 – aclweb.org
… This includes, for example, UIMA (Ferrucci and Lally, 2004), DKPro (Eckart de Castilho and Gurevych, 2014), OpenNLP (OpenNLP, 2010) and Gate (Cunningham et al., 2011). … OpenNLP. 2010. Apache OpenNLP, http://opennlp.apache.org. …

UIMA-Based JCoRe 2.0 Goes GitHub and Maven Central: State-of-the-Art Software Resource Engineering and Distribution of NLP Pipelines.
U Hahn, F Matthies, E Faessler, J Hellrich – LREC, 2016 – lrec-conf.org
… NLTK: https://github.com/nltk/nltk, the primary site sits on GITHUB • OPENNLP: https://github.com/apache/opennlp, GITHUB repository is a mirror of an SVN repository – their source code is located at https://svn.apache.org/repos/asf/opennlp/trunk/ …

Lemmatization and Morphological Tagging in German and Latin: A Comparison and a Survey of the State-of-the-art.
S Eger, R Gleim, A Mehler – LREC, 2016 – lrec-conf.org
… Page 3. User-defined features Large output spaces external resources label dependencies FLORS Lapos MarMoT Mate OpenNLP Stanford TnT TreeTagger Table 1: Systems and selected properties. … Page 4. FLORS Lapos MarMoT Mate OpenNLP Stanford TnT TreeTagger …

UIMA Installation Guide
B Rudzewitz – 2016 – sfs.uni-tuebingen.de
… In order to have examples for NLP tools, this document also explains how to install the OpenNLP UIMA tools. 1 UIMA Prerequisites • UIMA is a Java framework. … 1 Page 2. 4 NLP Tools Installation • Get the OpenNLP binaries from here …

A Novel Data Cleaning with Data Matching
KS Cheng, JL Hong – 2016 – onlinepresent.org
… The Apache OpenNLP library uses machine learning to process natural language text, such as segmenting sentences, tokenizing, and tagging part-of-speech. … We implemented certain natural language processing features using OpenNLP. …

OAUC at CLEF2016 SBS Lab: Using Appeal Elements to Improve Automatic Book Recommendation-Proof of Concept.
M Preminger, G Fludal – CLEF (Working Notes), 2016 – pdfs.semanticscholar.org
… It is these reviews (free texts) that constitute the most important data of this paper. In order to prepare the data to adjective based analysis, we have so far been taking the following steps: – POS-tagging of all free texts of the reviews using the Apache OpenNlp5 – Collecting all …

TOP10 TOOLS FOR NATURAL LANGUAGE PROCESSING (NLP)-RESEARCH AND DEVELOPMENT
A Nayyar, V Puri – The CSI Vision: “IT for Masses” – csi-india.org
… Tools Available to Natural Language Processing In this section, various Top Tools available for Natural Language Processing are being highlighted: 1. Apache OpenNLP: Apache OpenNLP is a machine learning based toolkit for processing natural language text. … opennlp. …

UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing.
M Straka, J Hajic, J Straková – LREC, 2016 – ufal.mff.cuni.cz
… A considerable number of natural language processing pipelines are available, e.g. OpenNLP2 or Natural Language Processing Toolkit (NLTK), (Bird et al., 2009)3; however, our aim was to develop an extremely simple tool to be easily used by users with no language specific …

sisinflab: an Ensemble of Supervised and Unsupervised Strategies for the NEEL-IT Challenge at Evalita 2016.
V Cozza, W La Bruna, T Di Noia – CLiC-it/EVALITA, 2016 – pdfs.semanticscholar.org
… cleaning consisting of replacing URLs with the keyword URL as well emoticons with EMO; This has been implemented with ad hoc rules; (2) sentence splitter and tokenizer, implemented by the well known linguistic pipeline available for the Italian language: “openNLP”8, with its …

An Analysis Framework for Hybrid Authorship Verification.
S Mechti, M Jaoua, R Faiz, LH Belguith – Research in Computing …, 2016 – rcs.cic.ipn.mx
… proposed method. Thus, we used the Delta rule in the extraction module of the sub corpus to calculate the distance between two texts. Also, we used the OpenNLP for the extraction of the stylistic and statistical features. To calculate …

Complementarity, F-score, and NLP Evaluation.
L Derczynski – LREC, 2016 – derczynski.com
… For example, one may want to recognise names of all locations, people and System Precision Recall F1 Standard ANNIE 68.20 83.72 75.17 OpenNLP 81.45 53.29 64.43 Complementary Comp(ANNIE, OpenNLP) 78.15 14.22 24.06 Comp(OpenNLP, ANNIE) 20.00 56.04 31.67 …

Open NLP based Refinement of Software Requirements
M Mohanan, P Samuel – … Journal of Computer Information Systems and …, 2016 – mirlabs.net
… Here a neoteric approach is proposed to generate object oriented items from SRS. For NL processes like sentence detection, tokenization, parts of speech tagging and parsing of requirement specifications we incorporate an open natural language processing (OpenNLP) …

Learning to answer biomedical questions: Oaqa at bioasq 4b
ZYYZE Nyberg – ACL 2016, 2016 – aclweb.org
… of each token, the semantic type of each concept in the question, the dependency label of each token, combination of semantic type labels and dependency labels, etc., where the concepts are identified from MetaMap, LingPipe NER, and Apache OpenNLP Chunker12 (noun …

Corp: Coreference resolution for portuguese
E Fonseca, R Vieira, A Vanin – Proceedings of the International …, 2016 – ontolp.inf.pucrs.br
… As we are developing a system in Java, we have used Java based open source tools such as Cogroo [2] and OpenNLP1. OpenNLP provides POS tagging and named entities recognition, while Cogroo provides noun phrase chunks and shallow structure. …

An Automatic Approach for Discovering and Geocoding Locations in Domain-Specific Web Data
CA Mattmann, M Sharan – memex.jpl.nasa.gov
… Our approach builds upon the Apache Tika, Apache OpenNLP, and Apache Lucene frameworks. Tika is used to extract text and metadata from any file. The text and metadata are provided to Apache OpenNLP and its location entity extraction model. …

New York University 2016 system for KBP event nugget: A deep learning approach
TH Nguyen, A Meyers, R Grishman – Proceedings of Ninth Text Analysis …, 2016 – tac.nist.gov
… In order to prepare the input documents for neural networks, our preprocessing steps include sentence detection and tokenization using the OpenNLP toolkit1, and dependency parsing for the detected sentences using the CoreNLP toolkit2 from Stanford University. …

UTHealth at SemEval-2016 Task 12: an End-to-End System for Temporal Information Extraction from Clinical Notes.
HJ Lee, H Xu, J Wang, Y Zhang, S Moon… – SemEval@ NAACL …, 2016 – m-mitchell.com
… Please note that we utilized the following tools to construct our system: 1) CLAMP toolkit (http://clinicalnlptool.com/index.php) for tokenization, 2) OpenNLP toolkit (http://opennlp.sourceforge.net/) for Part-Of-Speech (POS) tagging and constituency parsing, and 3) ClearNLP …

CDE-IIITH at SemEval-2016 Task 12: Extraction of Temporal Information from Clinical documents using Machine Learning techniques.
VR Chikka – SemEval@ NAACL-HLT, 2016 – anthology.aclweb.org
… POS and Chunk tags: Parts of speech and chunk tags of the word. OpenNLP tagger is used for POS and Chunk tagging (Baldridge, 2005). … In Proceedings of NAACL-HLT. Jason Baldridge. 2005. The opennlp project. URL: http://opennlp.apache.org/index. …

Nanomedicine Entity Extraction.
BT McInnes, R Murphy, GN Jones, M Hodson, T Izadi… – AMIA, 2016 – people.vcu.edu
… Results Evaluation • 10-fold cross validation • Evaluation Metrics: Precision; Recall; F-measure • Compare with state-of-the-art entity extractions: StanfordNER and OpenNLP … Table 2. F-measure results using the StanfordNER and OpenNLP entity extractors …

Opportunities for analyzing hardware specifications with NLP techniques
A Rago, C Marcos, A Diaz-Pace – 3rd Workshop on Design …, 2016 – alejandrorago.com.ar
… Page 3. Figure 2. Linguistic Analyses made with the UIMA Framework available. Well-known packages include: OpenNLP2, Stanford CoreNLP3, and Mate-Tools4. For example, OpenNLP provides algorithms for sentence splitting, token splitting and POS tagging. …

Exploring the Intersection of Short Answer Assessment, Authorship Attribution, and Plagiarism Detection.
B Rudzewitz – BEA@ NAACL-HLT, 2016 – m-mitchell.com
… Task NLP Tool Sentence Detection OpenNLP (Baldridge, 2005) Tokenization OpenNLP (Baldridge, 2005) Lemmatization TreeTagger (Schmid, 1994) Spell Checking Edit distance (Levenshtein, 1966) igerman98 word list POS Tagging TreeTagger (Schmid, 1994) NP Chunking …

Domain-specific Entity Spotting: Curation Technologies for Digital Humanities and Text Analytics
P Bourgonje, J Moreno-Schneider, G Rehm – pdfs.semanticscholar.org
Page 1. Domain-specific Entity Spotting: Curation Technologies for Digital Humanities and Text Analytics Peter Bourgonje, Julián Moreno-Schneider, and Georg Rehm Language Technology Lab, DFKI Alt-Moabit 91c, 10559 …

Articulate2: Toward a Conversational Interface for Visual Data Exploration
J Aurisano, A Kumar, A Gonzalez, J Leigh… – pdfs.semanticscholar.org
… Apache OpenNLP [12] was used to generate unigrams, bigrams, trigrams, chunking, and tagged unigrams, while Stanford Parsers implemented Collins rules [2] were used to obtain the headword. The feature vector is comprised of 7,244 total features. … [12] A. OpenNLP. …

A Constituent Syntactic Parse Tree Based Discourse Parser.
Z Li, H Zhao, C Pang, L Wang, H Wang – CoNLL Shared Task, 2016 – aclweb.org
… (3) There are different ways to process text, and our work shows that using constituent parse tree is a proper method in this task or similar ones. * MaxEnt classifier of OpenNLP, an open-source toolkit. See http://opennlp.apache.org/ 61 Page 72. 3.1. …

Processing Natural Language Queries to Disambiguate Named Entities and Extract Users’ Goals: Application to e-Tourism.
S Kamath, L Goeuriot, MC Fauvet – CORIA-CIFED, 2016 – pdfs.semanticscholar.org
… 5.3. Named Entity Recognition and Disambiguation Apache OpenNLP toolkit also provides a tool for Named Entity Recognition. The external knowledge sources we use are WolframAlpha6, Google places7, Foursquare8. 3 …

ConvKN at SemEval-2016 Task 3: Answer and question selection for question answering on Arabic and English fora
S Joty, A Moschitti, FA Al Obaidli, S Romeo… – Proceedings of …, 2016 – aclweb.org
… More precisely, we used OpenNLP’s tokenizer, POS-tagger and chunk annotator6, and Stanford’s lemmatizer (Manning et al., 2014), all accessible through DKPro Core. … net. 5https://dkpro.github.io/dkpro-core/ 6https://opennlp.apache.org/ 7http://stanfordnlp.github.io/CoreNLP …

GoWvis: a web application for Graph-of-Words-based text visualization and summarization
AJP Tixier, K Skianis, M Vazirgiannis – ACL 2016, 2016 – aclweb.org
… In addition to R built-in functions, the stringr package (Wickham, 2015) is used here. Also, text is split into sentences using the implementation of the Apache OpenNLP Maxent sentence detector offered by the openNLP R package (Hornik, 2015). …

Benchmarking mi-pos: Malay part-of-speech tagger
BCM Xian, M Lubani, LK Ping… – International …, 2016 – umexpert.um.edu.my
… OpenNLP [14] is an open source NLP code library with pre-trained models to perform different NLP tasks such as tokenization, POS tagging, Named-Entity recognition (NER), chunking, parsing and coreference resolution. Although …

Towards a rhizomatic narrative
JN Vilaplana – users.cecs.anu.edu.au
… Named Entity Recognition: identification and classification of Named Entities (NE) in each segment. We applied the OpenNLP Named Entity recogniser [2], which distilled four types of entities, Person, Location, Organization and Others. … The opennlp project. …

Mining Health Data using Weighted Approach
P Priyanga, NC Naveen – Communications – caeaccess.org
… Relevant sentence extraction module is used to detect sentences using Apache OpenNLP to check for relevance to the search key. These relevant sentences are then stored as an array of strings. The OpenNLP tokenizer segments an input character sequence into tokens. …

A web application for Graph-of-Words-based text visualization and summarization
AJP Tixier, K Skianis, M Vazirgiannis – lix.polytechnique.fr
… In addition to R built-in functions, the stringr package (Wickham, 2015) is used here. Also, text is split into sentences using the implementation of the Apache OpenNLP Maxent sentence detector offered by the openNLP package (Hornik, 2015). …

SoMaJo: State-of-the-art tokenization for German web and social media texts
T Proisl, P Uhrig – ACL 2016, 2016 – aclweb.org
… There are, however, also systems that use supervised or unsupervised machine learning techniques, e.g. the maximum entropy tokenizer offered by the Apache OpenNLP project4 or the HMM-based one presented by Jurish and Würzner (2013). … sed 4https://opennlp.apache. …

AI for Online Criminal Complaints: From Natural Dialogues to Structured Scenarios
F Bex, J Peters, B Testerink – Artificial Intelligence for Justice …, 2016 – ecai2016.org
… Since we are interested in name recognition, we do make use of this. OpenNLP 5 provides models for finding the names for persons, locations and organisations for both English and Dutch. We adopt the OpenNLP module …

Sentiment Analysis on Product Reviews using Hadoop
J Mehta, J Patil, R Patil, M Somani… – … Journal of Computer …, 2016 – pdfs.semanticscholar.org
… General Terms Algorithms Keywords Sentiment Analysis, Opinion Mining, Product Reviews, Hadoop, MapReduce, OpenNLP, SentiWordNet. … Apache’s OpenNLP has been used to perform Sentence Detection and POS tagging. …

Automatic Analysis of Flaws in Pre-Trained NLP Models
RE de Castilho – WLSI-OIAF4HLT 2016, 2016 – aclweb.org
… Since the license of the TreeTagger models does not allow for redistribution, we can unfortunately not make them available. 4http://opennlp.apache.org 21 Page 34. Tool ID Product Tool Languages Series Models C-TAG CoreNLP …

Enabling Technology Modules: Final Version
E Berndl, T Kurz, T Köllmer – 2016 – mico-project.eu
… 7 2.3.1 OpenNLP Text Classification Extractor (TE–213) – New . . . . . 7 2.3.2 Competence Classification (TE–213) – New . . . . . … 12 2.3.4 OpenNLP Named Entity Recognition (TE-220) – New . . . . . …

JATE 2.0: Java Automatic Term Extraction with Apache Solr.
Z Zhang, J Gao, F Ciravegna – LREC, 2016 – researchgate.net
… Many text mining and NLP tools have integrated with Solr to benefit from these features as well as contributing additional support for document indexing (e.g., OpenNLP plugins10, SolrTextTagger11, and UIMA12). However, no plugin is available for …

Impact of MWE Resources on Multiword Recognition
M Riedl, C Biemann – ACL 2016, 2016 – aclweb.org
… For retrieving POS tags, we apply the OpenNLP POS tagger4. The lemmatization is performed using the WordNetLemmatizer, contained in nltk (Loper and Bird, 2002). … chokkan.org/software/crfsuite 4We use the version 1.6 available from: https://opennlp.apache.org. …

Log Analysis and Document Classification Toolkit (First Version)
L Dolamic, S Metallidis, J Palotti, C Boyer, A Hanbury – kconnect.eu
… In the light of the specificity of the given vocabulary for the Date Attribution criterion, we opted to replace the machine learning classifier for this criterion by the Named Entity Recognition (NER) tool from the OpenNLP toolkit [6]. The previously mentioned corpus of 2794 excerpts …

Different Applications and Techniques for Sentiment Analysis
S Alhojely – International Journal of Computer …, 2016 – pdfs.semanticscholar.org
… 3. OpenNLP: play out the most widely recognized NLP assignments, for example, POS labeling, named substance extraction, lumping, what’s more, co-reference determination. http://opennlp.apache.org/ StanfordCoreNLP: If …

TextPro-AL: An Active Learning Platform for Flexible and Efficient Production of Training Data for NLP Tasks.
B Magnini, AL Minard, MRH Qwaider… – COLING …, 2016 – pdfs.semanticscholar.org
… 1http://stanfordnlp.github.io/CoreNLP/ 132 Page 3. Stanford University, the OpenNLP pipeline2 and LingPipe3. … 2http://opennlp.apache.org/index.html 3http://alias-i.com/lingpipe/index.html 4http://textpro.fbk.eu/ 5The IOB2 tagging format is a common format for text chunking. …

Mining Real Time Tweets for Cover Picture Suggestion by Text Mining Techniques
NM Zalavadiya – 2016 – irjet.net
… data preparation, results validation, visualization and optimization. • OpenNLP: The Apache OpenNLP library is a machine learning based toolkit for the dealing out of natural language text. It supports the foremost common NLP …

Easily Accessible Language Technologies for Slovene, Croatian and Serbian
N Ljubešic, T Erjavec, D Fišer, T Samardzic, M Milicevic… – 2016 – sdjt.si
… 1http://nl.ijs.si/tei/convert/ 2http://eng.slovenscina.eu/tehnologije/oznacevalnik 2012), OpenNLP (Apache Software Foundation, 2014), there are two main reasons why they do not suit our needs. … 2014. openNLP Natural Language Processing Library. http://opennlp.apache.org/. …

Assessing the Quality of Unstructured Data: An Initial Overview.
C Kiefer – LWDA, 2016 – ceur-ws.org
… To get these confidence values, follow the documentation of the OpenNLP library (see footnote 2, eg, for the 1 https://dkpro.github.io/dkpro-similarity/ 2 https://opennlp.apache.org/ Page 8. 8 Assessing the Quality of Unstructured Data: An Initial Overview …

Creation of comparable corpora for English-Urdu, Arabic, Persian.
M Abouammoh, K Shah, A Aker – LREC, 2016 – staffwww.dcs.shef.ac.uk
… To extract such core terms we first extract from the source document all nouns using the OpenNLP toolkit4. Then each noun is ranked according whether it is mentioned in the title, in the first sentence of the source document, in the following 5 sentences after the first sentence …

TLT-CRF: A Lexicon-supported Morphological Tagger for Latin Based on Conditional Random Fields.
T vor der Brück, A Mehler – LREC, 2016 – researchgate.net
… Regarding the taggers of our evaluation, only MarMoT and TLT-CRF natively support the use of a lexicon (OpenNLP supports a lexicon but this use is not documented). … Page 6. TLT-CRF MarMoT Lapos TnT Mate TreeTagger Stanford OpenNLP Max.Entr Perceptron …

University of Alicante at the NTCIR-12: Mobile Click.
F Llopis, E Lloret, JM Gómez – NTCIR, 2016 – research.nii.ac.jp
… Once the input is splitted into sentences, using the OpenNLP Java library3, each of them is tokenized to subsequently filter stopwords. … ftp://ftp.cis.upenn.edu/pub/chunker/ 3https://opennlp.apache.org/ 4http://sourceforge.net/projects/jwordnet them. …

The Scielo Corpus: a Parallel Corpus of Scientific Publications for Biomedicine.
ML Neves, A Jimeno-Yepes, A Névéol – LREC, 2016 – lrec-conf.org
… We decided to use the HANA database tool for sentence splitting after finding that it compared favorably to the OpenNLP library8 on a sample of documents. … php?script=sci_abstract&pid= S0874-48902010000300006&lng=pt&nrm=iso& tlng=pt 8https://opennlp.apache.org/ …

Phrase Detectives Corpus 1.0 Crowdsourced Anaphoric Coreference.
J Chamberlain, M Poesio, U Kruschwitz – LREC, 2016 – lrec-conf.org
… the case of frequent errors: 1. A pre-processing step normalised the input, applied a sentence splitter and ran a tokeniser over each sentence (developed from the openNLP toolkit11); 2. A custom-developed processing step …

ASU: An Experimental Study on Applying Deep Learning in Twitter Named Entity Recognition
MN Gerguis, C Salama, MW El-Kharashi – WNUT 2016, 2016 – aclweb.org
… 1 POS Tags Our experimentation started by deciding which POS tagger to use. We experimented with OpenNLP model (Morton et al., 2005), TweetNLP model (Owoputi et al., 2013; Gimpel et al., 2011), and Ritter’s model (Ritter et al., 2011). … Opennlp: A java-based nlp toolkit. …

Extracting process graphs from medical text data.
A Niekler, C Kahmann – ODLS, 2016 – keki2016.linguistic-lod.org
… procedures. Thus, the process for identifying and 2 OpenNLP was used to process the text sources for this paper. http://opennlp.apache.org/ 3 The “*” implies a minimum occurrence of 0 and an unbounded maximum occurrence. …

A Tool for Efficient Content Compilation.
BA Galitsky – COLING (Demos), 2016 – pdfs.semanticscholar.org
… of written documents on a wide variety of topics is available at http://mail3.fvds.ru/wrt_latest/. The source code can be obtained at https://github.com/bgalitsky/relevance-based-on-parse-trees under Apache Licence and is a sub-project of Apache OpenNLP https://opennlp.apache …

Checking Eligibility of Google and Microsoft Machine Learning Tools for use by JACK e-Learning System
R Kabbani – researchgate.net
… Figure 2: OpenNLP, NELA, and NLP Checkers integrated within JACK Backend [3]. _____ 4 … These important functions are provided by a library, also implemented in Java, this tool is called NELA, which is in turn based on OpenNLP library. …

Non-uniform Language Detection in Technical Writing
AJ Soto, EE Milios – anthology.aclweb.org
… length to represent the length of each candidate pair. 4OpenNLP: https://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html 1894 Page 4. Input : User Manual Output: Threshold-Length_List [(T1, L1), … 1 begin …

Sentiment Analysis and Opinion Mining: A Survey
S Alhojely – pdfs.semanticscholar.org
… Language analysis modules for developers contribute various languages are available to be applied plugged in your pipeline. 3. 3. OpenNLP: perform the most common NLP tasks, such as POS tagging, named entity extraction, chunking, and co-reference resolution. …

A multi-lingually applicable journalist toolset for the big-data era
G Kiomourtzis, G Giannakopoulos, V Karkaletsis… – iit.demokritos.gr
… We combine a set of entity lists (gazetteers) from a variety of sources (name/surname lists, organizations, etc.) with NER models (based on the OpenNLP toolkit [Baldridge, 2005]) to identify entities in the journalist article. … The opennlp project. URL: http://opennlp.apache. …

Using Peer Assessment Data to Help Improve Teaching and Learning Outcomes
Z Jin – 2016 – cs.anu.edu.au
… Based on the above criteria, a comparison of five document analysis tools, KH Coder, tm (Text Mining Infrastructure in R), Natural Language Toolkit (NLTK), KNIME, and OpenNLP was conducted; the results of which are shown in Table 1, below. Page 15. 15 …

Implicit Aspect Detection in Restaurant Reviews
R Panchendrarajan, MNN Ahamed… – Proceedings of NAACL …, 2016 – aclweb.org
… For explicit aspect identification, a standard maximum entropy classifier (Opennlp.apache.org, 2016) is used to create the model M1 using our annotated corpus. … ACM, 2005. 135 Page 9. Opennlp.apache.org,. “Apache Opennlp – Welcome To Apache Opennlp”. Np, 2016. Web. …

Commonsense in Parts: Mining Part-Whole Relations from the Web and Image Tags
NTCHJ Urbani, ARMRG Weikum – 2016 – ai2-s2-pdfs.s3.amazonaws.com
… Our novel contribution is to add a new layer on top of IMS to solve this problem. First, we perform noun phrase chunking on the input sentence where the assertion occurs. We use the widely used OpenNLP Chunker (opennlp.apache.org). …

CoNLL 2016 Shared Task on Multilingual Shallow Discourse Parsing.
N Xue, HT Ng, S Pradhan, A Rutherford… – CoNLL Shared …, 2016 – aclweb.org
… 5), Convolutional Network (implicit discourse senses) syntactic parses, word embeddings no devenshu (Jain and Majumder, 2016) DA-IICT Maxent (openNLP) syntactic parses no ecnucs (Wang and Lan, 2016) ECNU Liblinear, convolutional network for implicit relation (for …

The Denoised Web Treebank: Evaluating Dependency Parsing under Noisy Input Conditions.
J Daiber, R van der Goot – LREC, 2016 – let.rug.nl
… Our experiments use a standard trigram HMM tagger3 (Brants, 2000) and the OpenNLP maximum entropy tagger.4 Impact on parse quality Table 2 shows the influence of POS tagging on the performance of the MST parser on the development part of our dataset. …

The ACL RD-TEC 2.0: A Language Resource for Evaluating Term Extraction and Entity Recognition Methods.
B QasemiZadeh, AK Schumann – LREC, 2016 – pdfs.semanticscholar.org
… machine translation, speech recognition, . . . 2 Tool and Library Names of implemented (actualised) methods and libraries OpenNLP, Sphinx, . . . … 10Using OpenNLP pre-trained sentence splitting (https://opennlp.apache.org/.) 11Optical Character Recognition. 1866 Page 6. …

Get Semantic With Me! The Usefulness of Different Feature Types for Short-Answer Grading.
U Padó – COLING, 2016 – nlpado.de
… 4 Method We pre-processed the corpora with the DKPro pipeline (Eckart de Castilho and Gurevych, 2014), using the OpenNLP segmenter5, the TreeTagger for POS tags and lemmas (Schmid, 1995) and the MaltParser (Nivre, 2003) for dependency parses. …

VCU at Semeval-2016 Task 14: Evaluating similarity measures for semantic taxonomy enrichment
BT McInnes – Proceedings of SemEval, 2016 – aclweb.org
… To obtain the POS of the words in the OOV descriptions, we used the OpenNLP POS Tagger (Baldridge et al., 2002). 3 Evaluation Metrics The VCU system was evaluated using four metrics: WuP, Lemma Match, Recall and F1. … 2002. The opennlp maximum entropy package. …

Neural Attention for Learning to Rank Questions in Community Question Answering.
S Romeo, G Da San Martino, A Barrón-Cedeno… – COLING, 2016 – aclweb.org
… 2In a set of preliminary experiments, we compared, a true re-ranker, SVMrank (Joachims, 2002), with a standard SVM, the results were comparable. 3We have used the OpenNLP tool to build the trees: https://opennlp.apache.org. 1737 Page 5. …

Identifying Multiple Topics in Texts.
M Mouine, D Inkpen… – Int. J. Comput …, 2016 – pdfs.semanticscholar.org
… Most of the meaningful noun phrases are between 2 to 3 words long, without stopwords. We use the part of speech tagger (POS) of the OpenNLP java library5 to extract nouns phrases (NPs). … 5 https://opennlp.apache.org/documentation/manual/opennlp.html Page 14. …

The Language Application Grid and Galaxy.
N Ide, K Suderman, J Pustejovsky, M Verhagen… – …, 2016 – pdfs.semanticscholar.org
… Figure 3 shows a simple workflow configuration in LAPPS/Galaxy that invokes a chain of processors from different sources (in this example, GATE, Stanford NLP tools, and OpenNLP tools) to perform named entity recognition. …

LiMoSINe pipeline: Multilingual UIMA-based NLP platform
O Uryupina, B Plank, G Barlacchi, FV Albacete… – ACL 2016, 2016 – aclweb.org
… In addition, many research groups publicly release their pre-processing modules. 1http://opennlp.apache.org … 3.3 Spanish We have tested two publicly available toolkits supporting language processing in Spanish: OpenNLP and IXA (Agerri et al., 2014). …

Mining Consumption Intent from Social Data: A Survey
F Khan, S Borah, A Pradhan – researchgate.net
… 16 and bag of words. Renowned toolkits include StanfordNLP [17] and OpenNLP [18]. 4.4 Classification This step classifies a given set of posts into predefined categories. … 20 [18] Baldridge, Jason. “The opennlp project.” URL: http://opennlp.apache.org/index. …

OPINION MINING: APPLICATIONS, TECHNIQUES, TOOLS, CHALLENGES AND FUTURE TRENDS OF SENTIMENT ANALYSIS
N Mago – ijcea.com
… They are available to be used plugged in the pipeline. • OpenNLP: perform the most common NLP tasks, such as POS tagging, named entity extraction, chunking and co-reference resolution. http://opennlp.apache.org/. StanfordCoreNLP. …

IDENTIFYING OPINION FEATURES USING INTRINSIC AND EXTRINSIC DOMAIN RELEVANCE
AP Nimbhore, SB Siledar – ijtra.com
… Keywords- Opinion Mining, Opinion Feature Mining, Online Reviews, OpenNLP. I. Introduction … The contribution of this paper is to extract relevant features during the process of sentiment analysis and applying proper part of speech tagging is performed by using OpenNLP. …

An Integrated Approach to Answer Selection in Question Answering: Exploring Multiple Information Sources and Domain Adaptation
B Rudzewitz – 2016 – sfs.uni-tuebingen.de
… Sentence Detection OpenNLP2 DKPro Tokenization OpenNLP DKPro Web Token Annotation ArkTweetNLP (Gimpel et al., 2011) CoMiC … Lemmatization TreeTagger (Schmid, 2013) DKPro Chunking OpenNLP DKPro Dependency Parsing MaltParser (Nivre et al., 2007) CoMiC …

From CATs to KATs
F do Carmo, L Trigo, B Maia – pdfs.semanticscholar.org
… engine. A toolkit like OpenNLP (https://opennlp.apache.org/) is appropriate for this task, since it extends this feature recognition to locations, dates and other elements that may be tagged for processing separately by the MT engine. …

Text Analytics: the convergence of Big Data and Artificial Intelligence.
A Moreno, T Redondo – IJIMAI, 2016 – researchgate.net
… names, etc. These are several tools relevant for this task: Apache OpenNLP [2], Stanford Named Entity Recognizer [3] [4], LingPipe [5]. Fig. 1. Overview of a Text Mining Framework B. Topic Tracking and Detection) Keywords …

Using Ontology-Driven Methods to Develop Frameworks for Tackling NLP Problems.
T Kostareva, S Chuprina, A Nam – AIST (Supplement), 2016 – ceur-ws.org
… construction and refinement. We use the term “semi-automatic ontology engineering” as 1 http://opennlp.apache.org/index.html 2 http://www.nltk.org/ 3 https://gate.ac.uk/ 4 http://nlp.stanford.edu/ Page 3. opposed to ontology …

ltl. uni-due: Stance Detection in Social Media Using Stacked Classifiers
M Wojatzki, T Zesch – ltl.uni-due.de
… hashtag labels that occur at the end of a tweet. For all other hashtags, we additionally apply the OpenNlp PoS tagger5 and overwrite the hashtag label with the syntactic category. Afterwards, we annotate a fixed set of negations …

Stack Overflow Question Analysis using Topic Modeling
N Ganesan – pdfs.semanticscholar.org
… Their model was built using MALLET[2] which a basically a tool built using Java for building the LDA model. For pre-processing the questions, they used a natural language processing tool Apache OpenNLP. Their train their data using LDA and performed over 2000 iterations. …

Cross Lingual Mention and Entity Embeddings for Cross-Lingual Entity Disambiguation
H Shahbazi, C Ma, X Fern, P Tadepalli – tac.nist.gov
… Chinese and Spanish languages. Our annotator uses pre-trained models from Stanford CoreNLP [Manning et al., 2014] and OpenNLP imported in the Reconcile system [Stoyanov et al., 2010]. We make some adjustments …

Between Platform and APIs: Kachako API for Developers
Y Kano – WLSI-OIAF4HLT 2016, 2016 – aclweb.org
… an open source project in Apache UIMA2. Most of UIMA related works are UIMA component implementations, including OpenNLP, JulieLab (Hahn et al., 2008), CCP BioNLP (Baumgartner Jr. et al., 2008), U-Compare (Kano …

On the Acquisition of Intensifier Constructions
M Schweinberger – martinschweinberger.de
… Exchange System) • POS-tagged all utterances in the HSLLD • POS-tagging via the Apache OpenNLP library in R using a Maximum Entropy model (machine learning based toolkit for NLP of text written in Java) • search for …

Should corpus linguists learn to code?
PE Rayson – BAAL Corpus Linguistics SIG, 2016 – ucrel.lancs.ac.uk
… linguistics community – Computational linguistics (NLP) methods and tools – OpenNLP, NLTK, GATE, Stanford CoreNLP – POS, NER, Coreference, Dependencies, Sentiment, IE – The power of R, Python, Java, C … – Visualisations …

Extracting Useful Information from Clinical Notes.
Y Wang, H Fang – TREC, 2016 – trec.nist.gov
… Apache cTAKES1 is a powerful toolkit that designed to extract information from electronic medical records. cTAKES is built using the UIMA framework and OpenNLP. One useful component of cTAKES is the noun phrase detection. …

The Sensitivity of Topic Coherence Evaluation to Topic Cardinality.
JH Lau, T Baldwin – HLT-NAACL, 2016 – aclweb.org
… 2The sub-sampled document collections are lemmatised using OpenNLP and Morpha (Minnen et al., 2001) before topic modelling. Domain N 5 10 15 20 WIKI 2.42 (±0.54) 2.37 (±0.53) 2.35 (±0.51) 2.29 (±0.50) NEWS 2.49 (±0.53) 2.46 (±0.53) 2.42 (±0.51) 2.39 (±0.51) …

QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages.
A Otegi, N Aranberri, A Branco, J Hajic, M Popel… – LREC, 2016 – di.fc.ul.pt
… part of IXA pipes (Agerri et al., 2014). Every model has been trained with the averaged Perceptron algorithm as described in Collins (2002) and as implemented in Apache OpenNLP. The datasets used for training the models …

Enhancing Academic Literature Review through Relevance Recommendation
T Rúbio, C Gulo – 11th Iberian Conference on Information Systems …, 2016 – paginas.fe.up.pt
… such as results and methodological discussions. Apache OpenNLP, Stanford CoreNLP, MALLET, GATE and CiteSpace [8, 9] are just some tools provided for text processing using NLP. Information retrieval techniques enable to …

Overview of the IGGSA 2016 Shared Task on Source and Target Extraction from Political Speeches
J Ruppenhofer, JM Struß… – Bochumer …, 2016 – linguistics.ruhr-uni-bochum.de
… 2http://www.parlament.ch/ab/frameset/d/n/4802/263473/d_n_4802_263473_263632.htm 3http://opennlp.apache.org/ into TigerXML-Format using TIGER-tools (Lezius, 2002). To perform the annotation we used the Salto-Tool (Burchardt et al., 2006).4 …

Twitter-based semantic approach to multi-class classification
KV Rathore, I Mukherjee – academia.edu
… feature vector. 2. Extraction of keyphrases: – WordNet POS tagging and shallow parsing (OpenNLP sentence chunkers) are used to extract noun phrase unigrams, bigrams and trigrams [12] from our tweets. We remove terms …

CodE Alltag: A German-Language E-Mail Corpus.
U Krieg-Holz, C Schuschnig, F Matthies, B Redling… – LREC, 2016 – lrec-conf.org
… For these basic computations, we used the sentence splitter and tokenizer from OPENNLP (with German models)10 and a JAVA wrapper for the lemmatization function of the TREETAGGER software.11 From these basic parameters, we determined the average length of …

Tweet Segmentation and Named Entity Recognition
C Chavan, R Surwanshi – IJSART, 2016 – ijsart.com
… 3.3. Preprocessing This module takes Twitter collected data as input and preprocesses it with the help of OpenNLP in the following steps: • Stopword Removal • Lemmatization • Tokenization • Sentence segmentation • part-of-speech tagging • Named entity extraction 3.4. …

What to Do with an Airport? Mining Arguments in the German Online Participation Project Tempelhofer Feld
M Liebeck, K Esau, S Conrad – Proceedings of the Third Workshop on …, 2016 – aclweb.org
… As there is 8https://opennlp.apache.org … 4.1 Preprocessing First, we tokenize all sentences in our dataset with OpenNLP and use Mate Tools (Björkelund et al., 2010) for POS-tagging and dependency parsing. …

Visualizing and curating knowledge graphs over time and space
T Ge, Y Wang, G de Melo, H Li, B Chen – ACL 2016, 2016 – aclweb.org
… When a pair of entities matches a seed fact, the surface string between the two entities is lifted to a pattern. This is constructed by replacing the entities with 1http://tomcat.apache.org/ 2http://opennlp.apache.org/ 3http://www.postgresql.org/ 4https://www.openstreetmap. …

Cross Media Entity and Concept Driven Search.
C Perera, D Jayakody – SEMANTiCS (Posters, Demos …, 2016 – pdfs.semanticscholar.org
… This extractor is supported by a Diarization extractor, which acts as a preprocessor by splitting large audio files into more manageable ones. • Named Entity Extraction – Performs named entity recognition and linking based on Stanford NLP and OpenNLP language models. …

Corpus-based collocation games
S Wu, M Franken, IH Witten – researchgate.net
… The OpenNLP tagger2 was used to assign part-of-speech tags to five-grams and the tagged five-grams were compared against a chosen set of ten syntactic patterns or collocation types. … 1 http://www.greenstone.org/ 2 http://opennlp.sourceforge.net/ …

Concept Based Sentence Modeling for Extractive Speech Summarization
G keerthana Gaddam – ijercse.com
… processing of data in order to prepare it for further analysis. It involves tokenization, stemming, stop word removal. Clustering is a process of identifying interesting patterns within a document or group of documents. This whole process is done using OpenNLP toolkit which consists …

EXTRACTING ACTIONS FROM INSTRUCTION MANUAL AND TESTING THEIR EXECUTION IN A ROBOTIC SIMULATION
PN Hung, T Yoshimi – seed-net.org
… Delia Rusu et al. [8] presented an approach to extracting subject-predicate-object triplets from English sentences by using four different well-known syntactical parsers including Stanford Parser, OpenNLP, Link Parser, and Minipar. …

POLY: Mining Relational Paraphrases from Multilingual Sentences.
A Grycner, G Weikum – EMNLP, 2016 – people.mpi-inf.mpg.de
… To create this ranking, we perform POS tagging and noun phrase chunking using Stanford CoreNLP (Manning et al., 2014) and Apache OpenNLP 2. For head noun extraction, we use the YAGO Javatools3 and a set of manually crafted regular expressions. …

Clinical Narrative Analytics Challenges
H Ambit, C Gonzalo – Rough Sets: International Joint Conference, IJCRS …, 2016 – Springer
… To evaluate the use of current NLP techniques in an automatic knowledge acquisition domain, a system is introduced in Taboada et al. [27]. The system reuses OpenNLP, Stanford parsers, SemRep and UMLS NormalizeString service as building blocks. …

Achieving High Quality Tweet Segmentation using the HybridSeg Framework
I Kavati, P Dayakar, EA Reddy, VK Thumu – ijcttjournal.org
… Data Preprocessing is a module that takes gathered Twitter information as input and preprocesses it with the assistance of OpenNLP in the following steps: • Tokenization • Sentence division • POS (Parts-of-speech) labeling • Named Entity Recognition • Stopword Removal …

OECD Blue Sky meeting on Science and Innovation Indicators Ghent, 19-21 September 2016 Developing Science Culture Indicators through Text Mining and …
A Suerdem – oecd.org
… We used Apache OpenNLP named entity recognizer which offers a number of pre-trained name finder models to tag the named entities for organizations (https://opennlp.apache.org/documentation/manual/opennlp.html#tools.namefind.recognition …

Preprint: Automated SKOS Vocabulary Design for the Biopharmaceutical Industry
R Hubain, M De Wilde, S van Hooland – academia.edu
… system in which it will be implemented. 3Terminology extraction from raw text can be performed with off-the-shelf tools (TerMine, FiveTerms, AlchemyAPI, etc.) but also with hand-crafted methods based on libraries such as NLTK, OpenNLP, etc. …

Developing science culture indicators through text mining and online media monitoring
MW Bauer, A Suerdem – 2016 – eprints.lse.ac.uk
MW Bauer and A. Suerdem Developing science culture indicators through text mining and online media monitoring Conference Item Original citation: Bauer, MW and Suerdem, A. (2016) Developing science culture indicators …

Visualizing the spatiality in fictional narratives
JE Stange, M Dörk – 2016 – mariandoerk.de
… 2.3 Data corpus After a few unsuccessful attempts with the tools Stanford NLP and OpenNLP to apply Named Entity Recognition to extract place names from the three novels, we decided to pursue a manual approach. Being …

SemAligner: A Method and Tool for Aligning Chunks with Semantic Relation Types and Semantic Similarity Scores
N Maharjan, R Banjade, NB Niraula, V Rus – CRF, 2016 – lrec-conf.org
… The accuracy on the test data was comparable at 85.13% chunk level and 66.15 sentence level. Both the EO-NLP and CRF chunkers are available as part of the SemAligner tool. 4 http://opennlp.apache.org/cgi-bin/download.cgi …

A Comparative Classification of Approaches and Applications in Opinion Mining
R Asgarnezhad, K Mohebbi – iaiest.com
… available are: Sentiment140, Opinion Crawl, OpenAmplify, Amplified Analytics, SAS Sentiment Analysis Manager, Twittratr, IBM Social Sentiment Index, SAS Sentiment Analysis Studio, TweetSentiments, Red Opal, Review Seer tool, OpinionFinder, Weka, and OpenNLP [13-15]. …

Summ-it++: an enriched version of the Summ-it corpus
A Antonitsch, A Figueira, D Amaral, E Fonseca… – of the Language …, 2016 – inf.pucrs.br
… As a pre-processing phase, the POS tagging was provided through the use of the OpenNLP parser. With the texts properly tagged, the system is then able to extract and classify the NEs. For the training of the CRF model, the Second HAREM’s Golden Collection was utilized. …

RMIT at the NTCIR-12 MobileClick-2: iUnit Ranking and Summarization Subtasks.
K Ong, RC Chen, F Scholer – NTCIR, 2016 – research.nii.ac.jp
… The occurrences of iUnits are identified by running case-sensitive string matching. To further pinpoint the iUnits at the sentence level, we used the sentence delimiter in Apache OpenNLP to split sentences. 2.2 Evaluation We followed the setting in Yang et al. …

Towards semantic story telling with digital curation technologies
JM Schneider, P Bourgonje, J Nehring, G Rehm… – Proceedings of Natural …, 2016 – dfki.de
… et al., 2015]. Each analysis takes either plain text or NIF as input and outputs NIF, in which the additional semantic information is stored as annotations. Our NER module is based on OpenNLP. The approach combines models …

SoNLP-DP System for ConLL-2016 English Shallow Discourse Parsing.
F Kong, S Li, J Li, M Zhu, G Zhou – CoNLL Shared Task, 2016 – anthology.aclweb.org
… All our classifiers are trained using the OpenNLP maximum entropy package4 with the default pa- 3The PDTB provides annotation for Implicit relations, AltLex relations, entity transition (EntRel), and otherwise no relation (NoRel), which are lumped together as Non-Explicit …

Appraising UMLS Coverage for Summarizing Medical Evidence.
E ShafieiBavani, M Ebrahimi, RK Wong, F Chen – COLING, 2016 – aclweb.org
… sourceforge.net/projects/ebmsumcorpus 4http://www.jfponline.com/articles/clinical-inquiries.html 5Available at http://biotext.berkeley.edu/software.html 6http://nltk.org/ 7http://www.ncbi.nlm.nih.gov/books/NBK3827/table/pubmedhelp.T.stopwords/ 8http://opennlp.sourceforge.net/ …

Syntactic Parsing of Web Queries with Question Intent.
Y Pinter, R Reichart, I Szpektor – HLT-NAACL, 2016 – aclweb.org
… posted on Yahoo Answers; and (c) 100K story bodies from Yahoo News. In our analysis, individual sentences were identified using the OpenNLP sentence splitting tool2, POS-tagged by the Stanford parser (Klein and Manning …

Tools for educational data mining
S Slater, S Joksimovic, V Kovanovic, R Baker… – J Edu Behav …, 2016 – research.ed.ac.uk
… NLP toolkits (Stanford CoreNLP, Python NLTK, Apache OpenNLP) Given that text mining systems typically involve analysis of natural language text, natural language processing (NLP) toolkits represent an important part of the text mining toolset. …

A novel approach for semantic analysis of Arabic texts using an Arabic ontology and Conceptual Graphs
M NASRI, L ABOUENOUR, A KABBAJ, K BOUZOUBAA – researchgate.net
… question). For this purpose, the syntactic tags of the text are identified using OpenNLP and are then used together with the WordNet library in order to respectively construct relations and concepts used to build the result CG. …

IMAI RESEARCH GROUP COUNCIL
EDDJS Carrión, RG Crespo, JP Mestras, A Rocha… – academia.edu
… names, etc. These are several tools relevant for this task: Apache OpenNLP [2], Stanford Named Entity Recognizer [3] [4], LingPipe [5]. Fig. 1. Overview of a Text Mining Framework B. Topic Tracking and Detection) Keywords …

Clustering Urdu News Using Headlines
S Khaliq, W Iqbal, F Bukhari, K Malik – LANGUAGE & TECHNOLOGY – cle.org.pk
… preprint arXiv: 1308.3830 (2013). [5] OpenNLP, Apache. “A Machine Learning Based Toolkit for the Processing of Natural Language Text.” URL http://opennlp.apache.org (Last accessed: 2016-09-18). [6] Manning, Christopher …

Named Entity Recognition from Indian tweets using Conditional Random Fields based Approach
ML Patawar, MA Potey – International Journal of Advanced Research in …, 2016 – ijarcet.org
… Many POS taggers for Indian languages have proposed [18][19]. In proposed system, an OpenNLP POS tagger is used for tweets for assigning tags. C. Algorithmic Steps While extracting NEs from tweets, it is required to normalize them first. …

Visual Analytics for Narrative Text
M John, S Lohmann, S Koch, M Wörner, T Ertl – visualdataweb.org
… Yet other metadata, such as the main characters listed on the overview page, can only be determined by using advanced text analysis, 3http://nlp.stanford.edu/software/corenlp.shtml 4http://opennlp.apache.org/ 5https://gate.ac.uk/ie/annie.html …

Opinion Mining on E-Commerce: Need of the Hour
JS Aravindan – ijtrd.com
… There are several tools available for extracting sentiments. Some of the NLP tools include LingPipe, OpenNLP, Stanford Parser, POS Tagger, OpenFST, NLTK, Opinion Finder, Tawlk/osae, GATE, textir and NLP Toolsuite. 2) Machine Learning Algorithms. …

A Study on Twitter 4j Libraries for Data Acquisition from Tweets
S Singh, TN Manjunath… – International Journal of …, 2016 – pdfs.semanticscholar.org
… and so we removed it. Overlapping features could get the NB accuracy down, so we were not very concerned about the drop with NB. However it didn’t provide any drastic change with OpenNLP either. 4. Part of Speech (POS …

KeLP at SemEval-2016 Task 3: Learning Semantic Relations between Questions and Answers.
S Filice, D Croce, A Moschitti, R Basili – SemEval@ NAACL-HLT, 2016 – aclweb.org
… We used the OpenNLP pipeline for lemmatization, POS tagging and chunking to generate the tree representations described in Section 3.2. All the kernel-based learning models are implemented in KeLP (Filice et al., 2015b). …

418 BIOINFORMATICS: VOLUME II: STRUCTURE, FUNCTION, AND APPLICATIONS Index
C COVER – Volume II: Structure, Function, and Applications … – Springer
… 11 Oligodendrocyte….. 272–276, 282–285, 287–289, 292–294 OpenDMAP….. 141 OpenNLP….. 144, 145 Open PHACTS….. …

Event Detection, version 3 Deliverable D4. 2.3
ALM Vossen – kyoto.let.vu.nl
Event Detection, version 3 Deliverable D4.2.3 Version FINAL Authors: Rodrigo Agerri1, Itziar Aldabe1, Zuhaitz Beloki1, Egoitz Laparra1, German Rigau1, Aitor Soroa1, Marieke van Erp2, Antske Fokkens2, Filip Ilievski2 …

Twitter Feeds Classification and Redundancy Removal in TV Show Domain
MDN Perera, KVL Deshapriya, CDK Ilangasinghe… – iciit.iit.ac.lk
… where: F1, F2 – Feeds to be compared, WSpos – word class noun, verb or adjectives etc, w – word belongs to corresponding word class, maxSim – highest semantic similarity, idf(w) – inverse document frequency of word w 3https://opennlp.apache.org/ …

Large Scale Authorship Attribution of Online Reviews
P Shrestha, A Mukherjee, T Solorio – cs.uh.edu
… character ngrams capture the writing style of an author and have been used by previous researches successfully for authorship attribution [5, 4, 7]. Syntactic: We extract part of speech (POS) tags as well as chunks by using the tagger and chunker available in Apache OpenNLP …

Named Entity Recognition in Swedish Health Records with Character-Based Deep Bidirectional LSTMs
S Almgren, S Pavlov, O Mogren – BioTxtM 2016, 2016 – aclweb.org
… and Gerdin, 2010). Validation was done using the Medical Wikipedia dataset. Training was done using the Adam optimizer (Diederik Kingma, 2015). 4http://opennlp.sourceforge.net/models-1.5/ 4.7 Evaluation Evaluation …

Automatic Ticket Triage Using Supervised Text Classification
BV Dedík – is.muni.cz
… The used dataset consists of 500 bug reports from Mahout, Lucene and OpenNLP projects. This model is able to achieve a weighted precision, recall, F-measure and AUC of 65%, 67%, 62%, and 71% respectively. Xia et al. …

Lexicon Expansion System for Domain and Time Oriented Sentiment Analysis
NRP da Silva Guimarães – 2016 – repositorio-aberto.up.pt
… Part of speech (POS) tags such as the Stanford Log-linear Part-Of-Speech Tagger [116] and OpenNLP3 have been used to identify what words are nouns, verbs and adjectives with the goal of determining if a certain text is subjective and consequently classify it with sentiment. …

Ideede sidumine toetamaks uuendajaid ja ettevõtjaid
K Hambardzumyan – 2016 – dspace.ut.ee
UNIVERSITY OF TARTU FACULTY OF SCIENCE AND TECHNOLOGY INSTITUTE OF COMPUTER SCIENCE SOFTWARE ENGINEERING CURRICULUM KHACHATUR HAMBARDZUMYAN IDEAS MATCHMAKING FOR SUPPORTING …

Short text matching in performance management
M Apte, S Pawar, S Patil, S Baskaran… – … of Data Mar 11-13, 2016 …, 2016 – comad.in
Short Text Matching in Performance Management Manoj Apte Tata Consultancy Services manoj.apte@tcs.com Sachin Pawar* Tata Consultancy Services sachin7.p@tcs.com Sangameshwar Patil† Tata Consultancy Services sangameshwar.patil@tcs. …

Evaluating and combining named entity recognition systems
R Jiang, RE Banchs, H Li – Proceedings of the Sixth Named Entity …, 2016 – aclweb.org
… tools. Kepa et al. 2012 evaluated the efficacy of four NER tools (OpenNLP, Stanford NER, AlchemyAP and OpenCalais) at extracting entities directly from the output of an optical character recognition (OCR) workflow. Their experiments …

ANALYTICS IN POST-GRANT PATENT REVIEW: POSSIBILITIES AND CHALLENGES (PRELIMINARY REPORT)
S Long, EH Ng, C Downing, B Nepal – 2016 – webpages.uncc.edu
… We started by parsing a small number (20) of patent claims (Claim 1) using both on-line and server-based parsing tools (Chen and Manning. 2014, ver. 3.5.2; https://opennlp.apache.org/ vers. 1.6; http://nlp.stanford.edu/software/lex- …

Structural Models for Ranking Tasks of Community Question Answering
S Filice, D Croce, A Moschitti, R Basili – plg2.cs.uwaterloo.ca
… scores are generated with the 10-fold cv. We used the OpenNLP pipeline for lemmatization, POS tagging and chunking to generate the tree representations described in Section 3.2. All the kernel-based learning models are …

Retweetability Analysis and Prediction during Hurricane Sandy.
VK Neppalli, MC Medeiros, C Caragea, D Caragea… – ISCRAM, 2016 – idl.iscram.org
… We assign a feature value of 1 for the 3 https://opennlp.apache.org/ Figure 3. The distribution of users based on the verification status for top 1000 and last 1000 authors extracted for Sandy. …

Domain-Based Sense Disambiguation in Multilingual Structured Data
G Bella, A Zamboni, F Giunchiglia – DIVERSITY@ ECAI 2016, 2016 – ecai2016.org
… Part-of-speech tagger. As conventional learning-based POS taggers (such as OpenNLP) are suboptimal on short text, we use their output cautiously. First, we distinguish between closed-class and open-class words (nouns, verbs, adjectives, adverbs). …

Semi-Supervised Named Entity Recognition of Medical Entities in Swedish
S ALMGREN, S PAVLOV – publications.lib.chalmers.se
Semi-Supervised Named Entity Recognition of Medical Entities in Swedish SIMON ALMGREN SEAN PAVLOV Department of Computer Science and Engineering CHALMERS UNIVERSITY OF TECHNOLOGY Gothenburg, Sweden 2016 …

The Reliable Knowledge Discovery in Textual Database using R Infrastructure
A Yadav – krishisanskriti.org
… The list of open source text tools, Carrot2 (text and search results clustering), GATE (General Architecture for Text Engineering, NLP), Gensim (Topic modelling and extraction of semantic information), OpenNLP (NLP), Natural Language Toolkit (NLTK – NLP), Orange (text mining …

Towards End-to-end Shallow Discourse Parsing
Z Yang, F Kong – paclic30.khu.ac.kr
… All our classifiers are trained using the OpenNLP maximum entropy package6 with the default parameters (i.e., without smoothing and with 100 iterations). As the PDTB corpus is aligned with the PTB corpus, the gold parse trees and sentence boundaries are obtained from PTB. …

An approach for building Web Service Composition Engine for RESTful APIs
S Vasudevamurthy – pdfs.semanticscholar.org
… object. They have used OpenNLP package provided by Stanford University to achieve this success. They have used Neural network and particularly logistic regression to achieve a high confidence score from the extractions. …

Automatised analysis of emergency calls using Natural Language Processing
E Andersson, B Eriksson, S Holmberg, H Hussain… – publications.lib.chalmers.se
… There were, however, several tools for analysing English. OpenNLP [16], Grammatical Framework (GF) [17] and CoreNLP [18] were all candidates for alternative tools. Initial research on each of the applications resulted in CoreNLP and GF as the alternatives to iKnow. …

Deep learning architecture for patient data de-identification in clinical records
AE Shweta, S Saha, P Bhattacharyya – ClinicalNLP 2016, 2016 – pdfs.semanticscholar.org
… in both the models. We divide the major sources of errors in three different categories. Following observations can be made: 5https://opennlp.apache.org/ 6https://taku910.github.io/crfpp/ • MISSED ENTITY: This …

Cross-Platform Text Mining and Natural Language Processing Interoperability
RE de Castilho, S Ananiadou, T Margoni, W Peters… – 2016 – pdfs.semanticscholar.org
LREC 2016 Workshop Cross-Platform Text Mining and Natural Language Processing Interoperability PROCEEDINGS Edited by Richard Eckart de Castilho, Sophia Ananiadou, Thomas Margoni, Wim Peters, Stelios Piperidis 23 May 2016 …

Cross-Platform Text Mining and Natural Language Processing Interoperability-Proceedings of the LREC2016 conference
RE de Castilho, S Ananiadou, T Margoni, W Peters… – 2016 – eprints.gla.ac.uk
LREC 2016 Workshop Cross-Platform Text Mining and Natural Language Processing Interoperability PROCEEDINGS Edited by Richard Eckart de Castilho, Sophia Ananiadou, Thomas Margoni, Wim Peters, Stelios Piperidis 23 May 2016 …

Knowledge graph construction for research literatures
A Oldoni – 2016 – raw.githubusercontent.com
Knowledge graph construction for research literatures Alisson Oldoni A research thesis submitted for the degree of Master of Computing and Information Technology School of Computer Science and Engineering The University of New South Wales …

Named Entity Recognition in Albanian Based on CRFs Approach.
G Kono, K Hoxha – RTA-CSIT, 2016 – ceur-ws.org
… nodes. The conditional probability of a state sequence x = (x1,…,xT ) given an observation sequence y = (y1,…,yT ) calculated as: 5https://opennlp.apache.org/ $p_\lambda(y \mid x) = \frac{1}{Z_\lambda(x)} \exp\left\{ \sum_{t=1}^{T} \sum_{k=1}^{K} \lambda_k f_k(y_{t-1}, y_t, x_t) \right\}$ (1) …

Multilingual Automated Text Anonymization
FMC Dias – 2016 – inesc-id.pt
Multilingual Automated Text Anonymization Francisco Manuel Carvalho Dias Thesis to obtain the Master of Science Degree in Information Systems and Computer Engineering Supervisors: Prof. Dr. Nuno João Neves Mamede Dr. João de Almeida Varelas Graça …

Data Quality Centric Application Framework for Big Data
VN Gudivada, D Rao, WI Grosky – ALLDATA 2016, 2016 – researchgate.net
… IR capability is essential for the DQFA to enable information fusion for knowledge extraction. Open source libraries to consider for this task include Apache OpenNLP, Stanford NLP, NLTK, Apache Lucene, Apache Solr, ElasticSearch, and Splunk. …

Creating a novel geolocation corpus from historical texts
G DeLozier, B Wing, J Baldridge, S Nesbit – LAW X, 2016 – aclweb.org
… predictions. NER inclusive scores (P, R, F-1) are generally much lower for WoTR-Topo than other datasets because the NER systems utilized (Stanford-NER and openNLP-NER) are trained on very different domains. Never …

Unsupervised All-words Sense Distribution Learning
A Bennett – 2016 – minerva-access.unimelb.edu.au
Minerva Access is the Institutional Repository of The University of Melbourne Author/s: Bennett, Andrew Title: Unsupervised all-words sense distribution learning Date: 2016 Persistent Link: http://hdl.handle.net/11343/148422 File Description: Thesis …

One does not simply produce funny memes!–explorations on the automatic generation of internet humor
HG Oliveira, D Costa, A Pinto – Proceedings of 7th …, 2016 – computationalcreativity.net
… 2 https://news.google.com/news?cf=all&hl=pt-PT&pz=1&ned=pt-PT_pt&output=rss 3 https://api.imgflip.com/ 4 https://twitter.com/ 5 http://twitter4j.org/ 6 https://opennlp.apache.org/ 7 http://label.ist.utl.pt/pt/labellex_pt.php …

THE NATIONAL ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT STRATEGIC PLAN
S PLAN – 2016 – raincent.com
October 2016 THE NATIONAL ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT STRATEGIC PLAN National Science and Technology Council Networking and Information Technology Research and Development Subcommittee …

Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge Series
G Rizzo, B Pereira, A Varga, M van Erp… – semantic-web-journal.net
Semantic Web 0 (0) 1–35 1 IOS Press Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge Series Emerging Trends in Mining Semantics from Tweets Editor(s): Andreas Hotho, Julius-Maximilians …

Data Selection for trainable Neural Machine Translation Models
W Abdelmoula – 2016 – academia.edu
Master of Science in Informatics at Grenoble option Artificial intelligence and the web Data Selection for trainable Neural Machine Translation Models Wejdene Abdelmoula 24 June 2016 Research project performed at Laboratory of Informatics of Grenoble …

A method to perform an automated background check on professional football players
T Hendrickx – 2016 – pure.tue.nl
Eindhoven University of Technology MASTER A method to perform an automated background check on professional football players Hendrickx, T. Award date: 2016 Disclaimer This document contains a student thesis …

Automatic Generation of Sports News
JPBM Aires – 2016 – repositorio-aberto.up.pt
FACULDADE DE ENGENHARIA DA UNIVERSIDADE DO PORTO Automatic Generation of Sports News João Pinto Barbosa Machado Aires DISSERTATION Mestrado Integrado em Engenharia Eletrotécnica e de Computadores Supervisor: Prof. Sérgio Sobral Nunes …

Text Analysis and Visualisation: creating deep interfaces to read textual document collections
JN Vilaplana – 2016 – researchgate.net
Text Analysis and Visualisation: creating deep interfaces to read textual document collections Jaume Nualart Vilaplana 9 December 2016 A thesis submitted for the degree of Doctor of Philosophy in Communication University of Canberra …

Health—exploring complexity: an interdisciplinary systems approach HEC2016
E Grill, M Müller, U Mansmann – European journal of epidemiology, 2016 – Springer
ABSTRACTS Health—exploring complexity: an interdisciplinary systems approach HEC2016 28 August–2 September 2016, Munich, Germany Eva Grill • Martin Müller • Ulrich Mansmann © Springer Science+Business Media Dordrecht 2016 …

Large-Scale Semantic Relationship Extraction for Information Discovery
DS Batista – 2016 – researchgate.net
UNIVERSIDADE DE LISBOA INSTITUTO SUPERIOR TÉCNICO Large-Scale Semantic Relationship Extraction for Information Discovery David Soares Batista Supervisor: Doctor Mário Jorge Costa Gaspar da Silva Thesis …

An ontology for human-like interaction systems
EA García – 2016 – core.ac.uk
DOCTORAL THESIS An Ontology for Human-Like Interaction Systems Author: Esperanza Albacete García Supervisors: Francisco Javier Calle Gómez, Elena Castro Galán DEPARTMENT OF COMPUTER SCIENCE Leganés, January 2016 …