Sentence Extractor - Meta-Guide.com

Notes:

Sentence extraction is the process of identifying and extracting sentences from a larger piece of text. Sentence extractors are algorithms or software tools that are designed to perform this task, typically by using a combination of natural language processing techniques and heuristics.

In the context of dialog systems, sentence extractors can help the system understand and process user input. For example, if a user provides a long input containing several sentences, the system can use a sentence extractor to identify the individual sentences and process each one separately. This can help the system determine the user’s intent and generate a more accurate and relevant response.

Beyond understanding user input, sentence extractors can also serve as a pre-processing step for other natural language processing tasks. For example, splitting text into sentences first can improve the performance and accuracy of downstream algorithms such as part-of-speech taggers and syntactic parsers, which typically operate on one sentence at a time.
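As a rough illustration of the heuristic side of sentence extraction described in these notes, the following Python sketch splits text at sentence-final punctuation and then re-joins pieces that end in a known abbreviation. The function name `extract_sentences` and the small `ABBREVIATIONS` set are hypothetical choices made for this example; they are not taken from any of the systems listed below, and real extractors generally rely on much richer rules or trained models.

```python
import re

# Hypothetical, deliberately tiny abbreviation list (an assumption for this
# sketch); a real system would need a far larger, language-specific resource.
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof", "e.g", "i.e", "etc", "vs"}

# Candidate boundary: whitespace that directly follows '.', '!' or '?'.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def extract_sentences(text):
    """Split text into sentences using punctuation plus an abbreviation check."""
    sentences = []
    current = ""
    for chunk in _BOUNDARY.split(text.strip()):
        if not chunk:
            continue
        # Glue the chunk onto any pending piece left by an abbreviation.
        current = f"{current} {chunk}" if current else chunk
        last_word = current.split()[-1].rstrip(".!?").lower()
        # A trailing "Dr." or "e.g." is treated as an abbreviation, not an end.
        if current.endswith(".") and last_word in ABBREVIATIONS:
            continue
        sentences.append(current)
        current = ""
    if current:
        # Keep whatever is left, even if it ends in an abbreviation.
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    for sentence in extract_sentences("Dr. Smith arrived late. Did he call? Yes!"):
        print(sentence)
```

A dialog system could pass each returned string to its intent classifier in turn. The sketch has known blind spots: a sentence that genuinely ends in an abbreviation such as "etc." is merged with the next one, and text without punctuation is returned as a single sentence, which is the failure mode noted in the blog opinion detection entry below.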

Survey of text mining MW Berry, M Castellanos – Computing Reviews, 2004 – Springer … 1 Thesaurus Assistant 6.5.2 Sentence Identifier 6.5.3 Sentence Extractor 6.6 Experimental Results 6.7 Mining Case Excerpts for Hot Topics 6.8 Conclusions References … Cited by 450 Related articles All 15 versions

Knowledge-based weak supervision for information extraction of overlapping relations R Hoffmann, C Zhang, X Ling, L Zettlemoyer… – Proceedings of the 49th …, 2011 – dl.acm.org … features for aggregating the evidence from individual sentences, we demonstrate that aggregating strong sentence-level evidence with a simple deterministic OR that models overlapping relations is more effective, and also enables training of a sentence extractor that … Cited by 87 Related articles All 19 versions

MUSE–a multilingual sentence extractor M Litvak, M Last, M Friedman, S Kisilevich – Computational linguistics & …, 2011 – cs.bgu.ac.il Abstract—MUltilingual Sentence Extractor (MUSE) is aimed at multilingual single-document summarization. MUSE implements the supervised language-independent summarization approach based on optimization of multiple statistical sentence ranking methods. The … Cited by 2 Related articles All 6 versions

The icsi summarization system at tac 2008 D Gillick, B Favre, D Hakkani-Tur – Proceedings of the Text …, 2008 – researchgate.net … programming solution. Our primary submission, a simple sentence extractor with an n-gram frequency heuristic, gives results at least as good as any reported on the non-update part of the main task. Our secondary submission … Cited by 29 Related articles All 12 versions

Information-content based sentence extraction for text summarization D Mallett, J Elding… – … Technology: Coding and …, 2004 – ieeexplore.ieee.org … summary evaluation [14]. In order to evaluate how well our sentence-extractor performs, we propose an extrinsic retrieval-oriented evaluation that, unlike most previous work, does not rely on human assessors. Our experimental … Cited by 14 Related articles All 6 versions

Challenges for sentence level opinion detection in blogs MMS Missen, M Boughanem… – … and Information Science …, 2009 – ieeexplore.ieee.org … In this example, three sentences have been merged by sentence extractor to form one complete sentence. There is nothing wrong with sentence extractor but it happened because of lack of necessary punctuations in between sentences. … Cited by 13 Related articles All 16 versions

Topic-based Summarization at DUC 2005 H Saggion – Proceedings of Document Understanding Workshop ( …, 2005 – 83.212.103.151 … Abstract We describe a topic-based multidocument sentence extractor developed for the DUC 2005 competition. The system has been designed for the real task of producing summaries given an information need expressed as a set of questions. … Cited by 17 Related articles All 6 versions

Using dependency parsing and probabilistic inference to extract relationships between genes, proteins and malignancies implicit among multiple biomedical research … B Goertzel, H Pinto, A Heljakka, IF Goertzel… – Proceedings of the HLT …, 2006 – dl.acm.org … knowledge. Protein and Malignancy Tagger → Nominalization Tagger → Sentence Extractor → Dependency Extractor → Relationship Extractor → Semantic Mapper → Probabilistic Reasoning System. Each component … Cited by 29 Related articles All 9 versions

Improved Machine Translation Performance via Parallel Sentence Extraction from Comparable Corpora. DS Munteanu, A Fraser, D Marcu – HLT-NAACL, 2004 – mt-archive.info … We compute the values of the classifier parameters using the YASMET2 implementation of the GIS algorithm. 3 Results We assess the performance of our parallel sentence extractor in the context of an end-to-end Arabic-English MT system. … Cited by 67 Related articles

Context-based hierarchical clustering for the ontology learning L Karoui, MA Aufaure, N Bennacer – Proceedings of the 2006 IEEE/WIC/ …, 2006 – dl.acm.org … to improve the content extraction task. In [11], Kiyota and Kurohashi present a sentence extractor and a sentence summarizer based on a syntactic analysis, the Salton’s tf.idf method and the html … Cited by 19 Related articles All 7 versions

CBSEAS, a summarization system integration of opinion mining techniques to summarize blogs A Bossard, M Généreux, T Poibeau – … of the 12th Conference of the …, 2009 – dl.acm.org … another. Moreover, sentence synonymy is also dependent on the corpus granularity and on the user compression requirement. 3 CBSEAS: A Clustering-Based Sentence Extractor for Automatic Summarization We assume … Cited by 10 Related articles All 19 versions

A new approach to improving multilingual summarization using a genetic algorithm M Litvak, M Last, M Friedman – Proceedings of the 48th Annual Meeting …, 2010 – dl.acm.org … Edges representing the similarity relations can be weighted (Mihalcea, 2005) or unweighted (Erkan and Radev, 2004): two sentences are connected if their similarity is above some predefined threshold value. 3 MUSE – MUltilingual Sentence Extractor … Cited by 28 Related articles All 7 versions

Priberam’s question answering system in a cross-language environment A Cassan, H Figueira, A Martins, A Mendes… – … of Multilingual and Multi …, 2007 – Springer … At this point, the language module for Y has pivots in its own language and question categories which are language independent. So it proceeds as in the Y monolingual environment, with the document retrieval module, the sentence extractor and the answer extractor. … Cited by 27 Related articles All 15 versions

Textual Distraction as a Basis for Evaluating Automatic Summarisers. A Renouf, A Kehoe – LREC, 2004 – hnk.ffzg.hr … Abstract Our summarisation tool, SEAGULL (Summary Extraction Algorithm Generated Using Lexical Links), is a sentence extractor which exploits the patterns of lexical repetition across a text and creates abridgements which express non-trivially the conceptual content and … Cited by 3 Related articles All 8 versions

Corpus refactoring: a feasibility study HL Johnson, WA Baumgartner… – Journal of …, 2007 – archive.biomedcentral.com … In cases where the original corpus text did not span an entire sentence, the automatic sentence extractor expanded the text span to the sentence boundaries, and the curator verified the expansion. … Table 2 describes the performance of the automatic sentence extractor. … Cited by 16 Related articles All 20 versions

Evaluating the generation of domain ontologies in the knowledge puzzle project A Zouaq, R Nkambou – Knowledge and Data Engineering, …, 2009 – ieeexplore.ieee.org … 3.1 Extracting Key Sentences Paragraphs and sentences are obtained from each document through IBM UIMA-based Java annotators [30]. Key sentences are extracted by running a key sentence extractor that collects sentences, which include certain keywords. … Cited by 47 Related articles All 10 versions

Toward unification of source attribution processes and techniques F Khosmood, R Levinson – Machine Learning and Cybernetics, …, 2006 – ieeexplore.ieee.org … In order to fit Link Grammar with our feature extraction process analysis from above, we can consider a simple sentence extractor as an initial pre-processing function. This function simply looks through … Cited by 4 Related articles All 2 versions

Question Answering with QACTIS at TREC 2004. P Schone, T Bassi, A Kulman, GM Ciany… – TREC, 2004 – comminfo.rutgers.edu … Mine tokens ImportantQwords Filter 1 Sentence Extractor … 2.3.2.1 Sentence Extractor Filter (SEF): The SEF identifies sentences of each top IR document that contain a match to the question noun phrase or synset synonyms, and also contain a numeric value. … Cited by 5 Related articles All 5 versions

The Compleat Lexical Tutor, v. 4 T Cobb, P Windows, P Free – 2005 – tesl-ej.org … The first is Texttools, with tools for processing text such as frequency list-makers, HTML-tag strippers (useful for rendering HTML files into text files for further analysis), corpus builders (for those who wish to create their own corpora), and a sentence extractor, which removes end … Cited by 7 Related articles All 4 versions

Creating RSS for News Archives, Beyond. S Debnath – FLAIRS Conference, 2006 – aaai.org … For HeadLine ItemFinder works as a sentence extractor. … Input: HTML Page H, Parameter Set P Output: Training Set to train the Support-Vector Classifier Standard: Word/Phrase Extractor, ISO 8601 standard for date and time, Sentence Extractor Algorithm, etc. … Cited by 3 Related articles All 5 versions

The KNIME Text Processing Plugin K Thiel – 2009 – tech.knime.org … of documents and adds them as string columns. Document vector Transforms documents in a bag of words into document vectors. Sentence Extractor Extracts all sentences of documents and adds them as string column. String to Term Converts strings into terms. … Cited by 5 Related articles All 3 versions

[BOOK] Capturing document semantics for ontology generation and document summarization D Baxter, B Klimt, M Grobelnik, D Schneider… – 2009 – Springer … one level. The system was able to find pages for 45% of the unknown terms in our dataset, yielding 1,228 pages. Once a page has been downloaded, the Sentence Extractor identifies a sentence that defines the term. In the … Cited by 6 Related articles All 5 versions

Multi-document summarization by cluster/profile relevance and redundancy removal H Saggion, R Gaizauskas – Proceedings of the Document …, 2004 – duc.nist.gov … deciding in which order sentences should be presented. In this work we do not use any natural language (re)generation techniques: our system is a sentence extractor. The National Institute of Standards and Technology (NIST … Cited by 52 Related articles All 8 versions

Question Answering System for Entrance Exams in QA4MRE X Li, T Ran, NLT Nguyen, Y Miyao… – Proceedings of CLEF …, 2013 – ims-sites.dei.unipd.it … 3 pronouns. The second component is the Sentence Extractor. … 5 2.2 Sentence Extractor Since the final goal with our system is to select the correct answer based on the output from the Recognizing Textual Entailment component. … Cited by 1 Related articles

PSE: a tool for browsing a large amount of MEDLINE/PubMed abstracts with gene names and common words as the keywords T Yoneya – BMC bioinformatics, 2005 – biomedcentral.com … Results. We developed a web-based software, the PubMed Sentence Extractor (PSE), which parses large number of PubMed abstracts, extracts and displays the co-occurrence sentences of gene names and other keywords, and some information from EntrezGene records. … Cited by 3 Related articles All 13 versions

Unsupervised Knowledge Extraction for Taxonomies of Concepts from Wikipedia. E Barbu, M Poesio – RANLP, 2009 – aclweb.org … module eliminates the content of some heads not used by the system, like: Links, Miscellaneous, See also. The next module, Sentence Extractor and Co-Reference Resolution, extracts from the Wikipedia text of an article all sentences containing references to the title concept. … Cited by 4 Related articles All 3 versions

PEXACC: A Parallel Sentence Mining Algorithm from Comparable Corpora. R Ion – LREC, 2012 – mt-archive.info … size. We will show that the performance of a parallel sentence extractor crucially depends on the degree of comparability such that it is more difficult to process a weakly comparable corpus than a strongly comparable corpus. … Cited by 2 Related articles All 4 versions

Multilingual Single-Document Summarization with MUSE M Litvak, M Last – MultiLing 2013, 2013 – aclweb.org … ac.il Mark Last Department of Information Systems Engineering, Ben Gurion University Beer Sheva, Israel mlast@bgu.ac.il Abstract MUltilingual Sentence Extractor (MUSE) is aimed at multilingual single-document summarization. … Cited by 1 Related articles All 5 versions

Refactoring corpora HL Johnson, WA Baumgartner Jr, M Krallinger… – Proceedings of the …, 2006 – dl.acm.org … (Here and below, “curation” time includes both visual inspection of outputs, and correction of any errors detected.) The source of error was largely due to the fact that the sentence extractor returned the best sentence from the abstract, but the original corpus text was some … Cited by 2 Related articles All 8 versions

NUS at TAC 2008: augmenting timestamped graphs with event information and selectively expanding opinion contexts Z Lin, HH Hoang, L Qiu, S Ye… – Proceedings of TAC 2008 …, 2008 – comp.nus.edu.sg … With this ranked list, the Sentence Extractor uses a modified MMR reranker to extract highly-ranked sentences that are not overlapped with sentences from the summary for previous set and sentences that are just extracted. … Sentence Extractor Sentence Reranker Summary … Cited by 2 Related articles All 22 versions

Engkoo: mining the web for language learning MR Scott, X Liu, M Zhou – Proceedings of the 49th Annual Meeting of …, 2011 – dl.acm.org … By now, about 2 billion pages have been scanned and about 0.1 parallel/bilingual pages have been downloaded. Extractor. A bilingual term/sentence extractor is implemented following Shi et al. (2006) and Jiang et al. (2009b). … Cited by 2 Related articles All 6 versions

Combining a multi-document update summarization system–CBSEAS–with a genetic algorithm A Bossard, C Rodrigues – Combinations of Intelligent Methods and …, 2011 – Springer … 3 CBSEAS: A Clustering-Based Sentence Extractor for Automatic Summarization … CBSEAS –Clustering-Based Sentence Extractor for Automatic Summarization– clusters semantically close sentences. In others terms, it creates different clusters for semantically distant sentences. … Cited by 5 Related articles All 9 versions

Resolving this-issue anaphora V Kolhatkar, G Hirst – Proceedings of the 2012 Joint Conference on …, 2012 – dl.acm.org … 12 Oracle candidate extractor + row 3 79.63 82.26 80.70 58.32 74.65 87.06 80.38 64.71 13 Oracle candidate sentence extractor + row 3 86.67 92.12 89.25 63.72 79.71 91.49 85.20 62.00 Table 3: this-issue resolution results with SVMrank. … Cited by 2 Related articles All 42 versions

A New Extraction Concept Based on Contextual Clustering L Karoui, M Aufaure, N Bennacer – … Intelligence for Modelling, …, 2006 – ieeexplore.ieee.org … Then, these elements are arranged into a hierarchy to improve the content extraction task. In [10], Kiyota and Kurohashi present a sentence extractor and a sentence summarizer based on a syntactic analysis, the Salton’s tf.idf method and the html structure. … Cited by 5 Related articles All 5 versions

Towards multi-lingual summarization: A comparative analysis of sentence extraction methods on English and Hebrew corpora M Litvak, H Lipman, AB Gur, M Last… – Proceedings of the 4th …, 2010 – aclweb.org … COV DEG: Graph-based extension of COV measure. DEG: Average degree for all sentence nodes: \( score(S) = \frac{\sum_{i \in \{words(S)\}} Deg_i}{|S|} \). GRASE (GRaph-based Automated Sentence Extractor): Modification of Salton’s algorithm (Salton et al., 1997) using the graph … Cited by 8 Related articles All 13 versions

Digital learning for summarizing Arabic documents MM Boudabous, MH Maaloul, LH Belguith – Advances in Natural …, 2010 – Springer … The most important systems which are based on the numerical approaches are: LAKHAS system [3] which summarizes Arabic documents in XML format. CBSEAS “Clustering-Based Sentence Extractor for Automatic Summarization” system … Cited by 1 Related articles All 6 versions

Keyphrase based Arabic summarizer (KPAS) T El-Shishtawy, F El-Ghannam – Informatics and Systems ( …, 2012 – ieeexplore.ieee.org … concepts. Therefore, the main objective of our work is to demonstrate, using smaller language constructs (keyphrases), a more flexibility method in directing the proposed sentence extractor towards one or more summarization goals. … Cited by 1 Related articles All 3 versions

Cascaded Information Synthesis for Timeline Construction G Mann – 2006 – DTIC Document … The tenure midpoint is estimated by extracting years for which the CEO was in office and taking a weighted sum over the list of years. To build the sentence extractor to extract the years in which the CEO was in office, these tenure years are marked in the training corpus. … Cited by 1 Related articles All 8 versions

An Automatic Question Generation Tool for Supporting Sourcing and Integration in Students’ Essays M Liu, RA Calvo – ADCS 2009, 2009 – es.csiro.au … There are two major components to perform these tasks: 1 Sentence Extractor, performs citation sentence extraction using the combination of trained Stanford Name Entity Recognizer [5], and a Pronoun Resolver, which is implemented by finding the nearest Name Entity … Cited by 1 Related articles All 5 versions

Dissimilarity algorithm on conceptual graphs to mine text outliers SS Kamaruddin, AR Hamdan… – Data Mining and …, 2009 – ieeexplore.ieee.org … text. The text format files are then Fed into a developed sentence extractor, which performs a multi pass scan, and with a pre-programmed rule based method to extract the desired relevant sentences from the documents. The … Cited by 3 Related articles

Cross-lingual training of summarization systems using annotated corpora in a foreign language M Litvak, M Last – Information retrieval, 2013 – Springer … In this article, we describe cross-lingual methods for training an extractive single-document text summarizer called MUSE (MUltilingual Sentence Extractor) – a supervised approach, based on the linear optimization of a rich set of sentence ranking measures using a Genetic … Cited by 4 Related articles All 7 versions

Automatic Extraction of Kannada Complex Predicates from Corpora S Parameswarappa, VN Narayana – Proceedings of International …, 2012 – Springer … The architecture consists of the following modules based on their functionality. They are Sentence Extractor, Kannada Shallow Parser and Complex Predicates extractor. Fig. 1. Proposed system architecture … The input for sentence extractor module is Kannada raw corpora. … Related articles All 2 versions

Tools for automatisation of voice creation for diphone based speech synthesis J Bachan – bachan.speechlabs.pl … The paper presents two tools designed for automatisation of voice creation for diphone based speech synthesis. The first tool, a phonetically rich sentence extractor selects the smallest number of sentences with the largest number of diphones out of a text corpus. … Related articles

Efficient diphone database creation for MBROLA, a multilingual speech synthesiser J Bachan – mechatronika.polsl.pl … With phonetically rich sentence extractor and diphone extractor tools, building new MBROLA voices for any language is very efficient and encourages work on evaluating speech models which can be later used in more advanced speech synthesis systems or applied in spoken … Related articles All 3 versions

Using unsupervised word sense disambiguation to guess verb subjects on untagged corpora PC Vaz, DM de Matos – researchgate.net … The subject finder is composed by two main modules and SenseClusters, as shown in figure 1. The first module, the feature and sentence extractor (FSE), reads the tagged corpus and extracts the sentences with the referenced verbs and its subjects; SenseClusters then … Related articles All 7 versions

Kannada Word Sense Disambiguation Using Decision List S Parameswarappa, VN Narayana – ijettcs.org … Figure 1 Proposed System Architecture 6.1 Implementation Modules The program uses the following modules for disambiguation task. a) Sentence extractor: This module extracts the sentence from corpora for disambiguation task. … Sentence Extractor Kannada Shallow Parser … Related articles All 3 versions

Combining a Multi-document Summarization System with a Genetic Algorithm A Bossard, C Rodrigues – Combinations of Intelligent Methods and …, 2012 – hal.inria.fr … 3 CBSEAS: A Clustering-Based Sentence Extractor for Automatic Summarization … CBSEAS –Clustering-Based Sentence Extractor for Automatic Summarization– clusters semantically close sentences. In others terms, it creates different clusters for semantically distant sentences. … Related articles All 2 versions

A combined method of text summarization via sentence extraction C Dang, X Luo – Proceedings of the 2007 annual Conference on …, 2007 – wseas.us … granularity, eg, keyword, sentence, or paragraph. MEAD [8], a state of the art sentence-extractor and a top performer at DUC, aims to extracts sentences central to the overall topic of a document. The system employs (1) a centroid … Related articles All 2 versions

A framework for restricted domain Question Answering System P Biswas, A Sharan, N Malik – Issues and Challenges in …, 2014 – ieeexplore.ieee.org … out. Hence the output of Paragraph Extractor will be those paragraphs which contain at least a keyword of the question. These paragraphs will be sent to the next sub module that is the Sentence Extractor only in case of 616 … Related articles

Uncovering the semantics of Wikipedia pagelinks V Presutti, S Consoli, AG Nuzzolese, DRRA Gangemi… – semantic-web-journal.net … of pagelinks. 1a. Sentence extractor. Given a DBpedia … the redirected objects. For example, given the fragment of the page wp:Ron Cobb depicted in Figure 2 the sentence extractor will store the data shown in Table 1. link ID …

Summarize to learn: summarization and visualization of text for ubiquitous learning R Chongtay, M Last, M Verbeke… – Proceedings of the 3rd …, 2013 – findresearcher.sdu.dk … The MUSE (MUltilingual Sentence Extractor) approach to single-document summarization [Litvak & Last, 2012] uses a linear combination of 31 language-independent features from various categories for ranking each sentence in a document. … Related articles All 5 versions

miraQA: Initial experiments in Question Answering P Sánchez, JL Martínez Fernández… – 2004 – oa.upm.es … [Architecture figure; recoverable labels: EFE94/95, Question classifier, IR engine, Sentence extractor, POS + Parsing, Answer Recog., Answer ranking, Anchor searching, Answer model] Figure 2: Question Analysis Example … Related articles All 4 versions

What kind of knowledge is in wikipedia? Unsupervised extraction of properties for similar concepts E Barbu – Journal of the Association for Information Science …, 2014 – Wiley Online Library … on. The next module, Sentence Extractor, extracts from the text of a Wikipedia article all sentences containing references to the title of the article. … territory.”. The Sentence Extractor module extracts all the sentences in the example. … Related articles

Wordnet-based document summarization C Dang, X Luo – Proceeding of the 7th WSEAS International Conference, 2008 – wseas.us … granularity, eg, keyword, sentence, or paragraph. MEAD [8], a state of the art sentence-extractor and a top performer at DUC, aims to extracts sentences central to the overall topic of a document. The system employs (1) a centroid … Cited by 11 Related articles All 2 versions

Forum Summarization Using Topic Models and Content-Metadata Sensitive Clustering J Krishnamani, Y Zhao… – Web Intelligence (WI) and …, 2013 – ieeexplore.ieee.org … distribution. B. Sentence Extraction For our sentence extractor scheme, we model each thread in a given forum as a corpus containing several documents and each message in the thread as a single document in the corpus. The … Related articles All 3 versions

Automatic Lexical Alignment between Syntactically Weak Related Languages. Application for English and Romanian M Colhon – Computational Collective Intelligence. Technologies …, 2013 – Springer … candidates. LEXACC – Lucene-Based Parallel Sentence Extractor from Comparable Corpora [10] was developed to work on comparable corpora and obtained state-of-the-art results in comparison with established approaches [10]. … Cited by 33 Related articles All 2 versions

Anchored speech recognition for question answering S Yaman, G Tur, D Vergyri, D Hakkani-Tur… – Proceedings of Human …, 2009 – dl.acm.org … Our sentence extractor relies on non-stop word n-gram match between the question and the candidate sentence, and returns the sentence with the largest weighted average. Since not all word n-grams have the same importance (eg function vs. … Cited by 1 Related articles All 25 versions

NLP [EMNLP 2013, AKBC-WEKEX 2012, 2013] N Balasubramanian – 2013 – homes.cs.washington.edu … in a coherent fashion. I built a sentence extractor that identifies most typical connection between the topic and its aspect and used simple word-precedence models to organize the retrieved sentences. The resulting topic pages … Related articles

Bag of senses versus bag of words: comparing semantic and lexical approaches on sentence extraction JG Flores, L Gillard, O Ferret, G de Chandelar – TAC 2008 Workshop- …, 2008 – nist.gov … Further work will include the fusion of these criteria into a first-stage sentence extractor that will be connected to reformulation and syntactic compression modules. Other bases of senses are being developed for specialized domains (in particular, for the nuclear energy field). … Cited by 2 Related articles All 3 versions

Developing a system for machine translation from Hindi language to English language S Mall, UC Jaiswal – Computer and Communication …, 2013 – ieeexplore.ieee.org Abstract– Many research organizations in India and abroad have started developing translation systems for the Indian languages recently using conventional approaches like ruled-based or exampled-based or hybrid. … Related articles

QACTIS-based Question Answering at TREC 2005. P Schone, GM Ciany, R Cutts, P McNamee… – TREC, 2005 – comminfo.rutgers.edu … However, for clarity, these filters consist of: (1) a sentence extractor filter, which identifies potential answer sentences from the top N returned IR documents; (2) a template matcher filter, which use regular expressions to find exact or near-exact phrase matches; … Cited by 7 Related articles All 3 versions

Semantic document engineering with WordNet and PageRank P Tarau, R Mihalcea, E Figa – Proceedings of the 2005 ACM symposium …, 2005 – dl.acm.org … While the use of PageRank as a keyword and key sentence extractor requires more work to get close to human performance, we have run some experiments on the Brown Corpus data. The first approach attaches to each word phrase the value of its highest ranked synset. … Cited by 12 Related articles All 4 versions

Acquiring relational patterns from wikipedia: A case study R Mahendra, L Wanzare, R Bernardi, A Lavelli… – Proc. of the 5th …, 2011 – hnk.ffzg.hr … <http://dbpedia.org/resource/Forrest_Gump> <http://dbpedia.org/ontology/writer> <http://dbpedia.org/resource/Winston_Groom> the sentence extractor module would return the following sentence (1), where both the domain and the range of the writer relation are highlighted. … Cited by 8 Related articles All 3 versions

Discrepancy between automatic and manual evaluation of summaries S Mithun, L Kosseim, P Perera – … of Workshop on Evaluation Metrics and …, 2012 – dl.acm.org … We have designed an extractive query-based summarizer called BlogSum. In BlogSum, we have developed our own sentence extractor to retrieve the initial list of candidate sentences (we called it OList) based on question similarity, topic similarity, and subjectivity scores. … Cited by 1 Related articles All 7 versions

An extractive text summarization based on multivariate approach ME Hannah, S Mukherjee… – … Computer Theory and …, 2010 – ieeexplore.ieee.org … MEAD [3], a state of art sentence extractor and a top performer in DUC, aims to extract sentences central to the overall topic of the document. Other approaches for sentence extraction include NLP based methods and machine learning based techniques. … Related articles

A framework for sentiment analysis in turkish: Application to polarity detection of movie reviews in turkish AG Vural, BB Cambazoglu, P Senkul… – Computer and Information …, 2013 – Springer … Sentence extractor: This is a simple module which splits the input text into sentences based on certain sentence separators (ie, “.!? … Cited by 5 Related articles All 3 versions

Multi-Document Summarization of Evaluative Text. G Carenini, RT Ng, A Pauls – EACL, 2006 – cs.ubc.ca … extracted summaries. Because of the widespread and well-developed use of sentence extractors in summarization, we chose to develop our own sentence extractor as a first attempt at summarizing evaluative arguments. To … Cited by 101 Related articles All 16 versions

ACCURAT toolkit for multi-level alignment and information extraction from comparable corpora M Pinnis, R Ion, D Ştefănescu, F Su, I Skadiņa… – Proceedings of the ACL …, 2012 – dl.acm.org … The evaluation on the gold standard shows a strong correlation (between 0.883 and 0.999) between human defined comparability levels and the confidence scores of the metric. 2.2 Parallel Sentence Extractor from Comparable Corpora … Cited by 3 Related articles All 8 versions

Wordnet-based summarization of unstructured document C Dang, X Luo, H Zhang – WSEAS Transactions on Computers, 2008 – wseas.us … granularity, eg, keyword, sentence, or paragraph. MEAD [8], a state of the art sentence-extractor and a top performer at DUC, aims to extracts sentences central to the overall topic of a document. The system employs (1) a centroid … Cited by 7 Related articles All 2 versions

Expanding Queries Using Multiple Resources. E Meij, M de Rijke, M Jansen – TREC, 2006 – staff.science.uva.nl … We experimented with various ways of identifying passages, and decided to consider every sentence as being a passage— which we identify using Lingpipe’s sentence extractor [1]. Every sentence gets indexed as a separate document and we include positional information … Cited by 2 Related articles All 13 versions

Automatic extraction of citation information in japanese patent applications H Nanba, N Anzen, M Okumura – International Journal on Digital Libraries, 2008 – Springer … By repeating the above steps until no more cue phrases were obtained, we finally obtained 14 external cues, 22 internal cues, and two negative cues. We also obtained an SVM-based sentence extractor using these cue phrases. The Appendix shows a list of these cue phrases. … Cited by 17 Related articles All 8 versions

Description of the LIPN System at TAC 2008: Summarizing Information and Opinions A Bossard, M Généreux, T Poibeau – Proceedings of the 2008 Text …, 2008 – hal.inria.fr … We then describe the results obtained for the different tracks. 3.1 CBASES: A Clustering-Based Sentence Extractor for Automatic Summarization We assume that redundant pieces of information are the most important thing in order to produce a good summary. … Cited by 35 Related articles All 15 versions

Generating update summaries: Using an unsupervized clustering algorithm to cluster sentences A Bossard – Multi-source, multilingual information extraction and …, 2013 – Springer … summaries. This article is based on a generic multi-document summarization system, CBSEAS (Clustering-Based Sentence Extractor for Automatic Summarization) [5], which uses unsupervized clustering to detect redundancy. … Cited by 2 Related articles All 4 versions

Data analytics in the cloud with flexible MapReduce workflows C Goncalves, L Assuncao… – … Technology and Science ( …, 2012 – ieeexplore.ieee.org … This function must implement the IRecordExtractor interface. In the text mining application it is a sentence extractor; ii) The user supplied type name of the Map function that produces the key/value pairs from data records. This function must implement the IMap interface. … Related articles

Extracting information networks from the blogosphere Y Merhav, F Mesquita, D Barbosa, WG Yee… – ACM Transactions on …, 2012 – dl.acm.org … 6, No. 3, Article 11, Publication date: September 2012. [Architecture figure; labeled components: blog posts, Server, Post Queue, Sentence Extractor 1 to N, Sentence Queue, Sentence Repository, Relation Extractor, Thread 1 to M] … Cited by 12 Related articles All 4 versions

Multi-Document Summarization Of Evaluative Text G Carenini, JCK Cheung, A Pauls – Computational Intelligence, 2013 – Wiley Online Library … Not surprisingly, they performed well for different but complementary reasons. Although the NLG summarizer appears to provide a more general overview of the source text, the sentence extractor provides a more varied language and level of detail about customer opinions. … Cited by 5 Related articles All 2 versions

Automatic Summarization LH Belguith, M Ellouze, MH Maaloul, M Jaoua… – … Language Processing of …, 2014 – Springer … In: Proceedings of the 4th international workshop on cross lingual information access, Beijing, pp 61–69. http://bib.dbvis.de/uploadedFiles/219.pdf; Litvak M, Last M, Friedman M, Kisilevich S (2011) MUSE – a multilingual sentence extractor. … Related articles

Integrating Document Structure into a Multi-Document Summarizer A Bossard, T Poibeau – Proceedings of Recent Advances …, 2009 – hal.archives-ouvertes.fr … Our goal is to determine the impact of the type and structure of news stories in automatic summarization, since these features have rarely been used. 3 CBSEAS: A Clustering-Based Sentence Extractor for Automatic Summarization … Related articles All 9 versions

Hotminer: Discovering hot topics from dirty text M Castellanos – Survey of Text Mining, 2004 – Springer … Section 6.5 describes the method for obtaining excerpts of relevant sentences from cases. Section 6.6 gives a flavor of the results obtained by applying the sentence extractor to a case document. Section 6.7 briefly describes how hot topics are discovered from case excerpts. … Cited by 22 Related articles All 5 versions

A review of authorship attribution J Smith, I Fujinaga – Retrieved February, 2008 – music.mcgill.ca … sentence extraction (van Halteren 2002): from the hypothesis that the most content-rich sentences of a document might consciously be crafted in a style different from the rest of the document, van Halteren created a document-summarizing sentence extractor that outperformed … Cited by 3 Related articles All 2 versions

miraQA: experiments with learning answer context patterns from the web C de Pablo-Sánchez, JL Martínez-Fernández… – … Information Access for …, 2005 – Springer … [Fig. 1. miraQA architecture; labeled components: Question classifier, QA class model, IR engine, Sentence extractor, POS + Parsing, Term: SemTag, Anchor searching, Answer Recog., Answer ranking, Answer Extraction, EFE94/95] … Cited by 2 Related articles All 5 versions

A Fuzzy Similarity Based Concept Mining Model for Text Classification S Puri – arXiv preprint arXiv:1204.2061, 2012 – arxiv.org … A text document TDi is composed of a set of sentences, so consider TDi = {si1, si2, si3,…,sim} (2) Where i denotes the text document number and m denotes the total number of sentences in TDi. Sentence Extractor (SE) is used to extract the sentence si1 from TDi. … Cited by 4 Related articles All 8 versions

Collecting and Using Comparable Corpora for Statistical Machine Translation I Skadiņa, A Aker, N Mastropavlos… – Proceedings of the …, 2012 – staffwww.dcs.shef.ac.uk … 0.25 84% Table 5: Parallel sentences extracted by LEXACC from the ACCURAT News Comparable corpora. The parallel sentence extractor is a parameterized tool, being tuned for each language pair in the project. The user … Cited by 7 Related articles All 10 versions

Domain adaptation in statistical machine translation using comparable corpora: case study for English Latvian IT localisation M Pinnis, I Skadiņa, A Vasiļjevs – Computational Linguistics and Intelligent …, 2013 – Springer … acquired from the Web. 3.2 Extraction of Semi-parallel Sentence Pairs The parallel sentence extractor LEXACC [23] was used to extract semi-parallel sentences from the comparable corpus. Before extraction, texts were pre … Cited by 1 Related articles All 2 versions

Paraphrase acquisition via crowdsourcing and machine learning S Burrows, M Potthast, B Stein – ACM Transactions on Intelligent …, 2013 – dl.acm.org … For each original and paraphrase pair, the number of sentences was counted using the OpenNLP maximum entropy sentence extractor for plotting as shown in Figure 4. From this experiment, we observed that 38.7% of the paraphrases have the same number of sentences as … Cited by 19 Related articles All 4 versions

A Logic Programming Framework for Semantic Interpretation with WordNet and PageRank P Tarau, R Mihalcea, E Figa – Proceedings of CICLOPS, 2004 – digital.library.unt.edu … 8.9 Keyword and Sentence Extraction Experiments While the use of PageRank as a keyword and key sentence extractor requires more work to get close to human performance, we have run some experiments on the Brown Corpus data. … Cited by 1 Related articles

Multi-document summarization of scientific corpora O Yeloglu, E Milios, N Zincir-Heywood – … of the 2011 ACM Symposium on …, 2011 – dl.acm.org … value, NC-value). The last component of W3SS is the key sentence extractor where the top N most significant sentences are retrieved from all narrative paragraphs based on the presence density of keyphrases. 3.5 Data Two … Cited by 4 Related articles All 9 versions

Language-independent Techniques for Automated Text Summarization. M Last, M Litvak – 2010 – cs.bgu.ac.il … research. In this chapter, we introduce MUSE (MUltilingual Sentence Extractor) – a new approach to multilingual single-document extractive summarization, considering summarization as an optimization or a search problem. … Cited by 1 Related articles All 5 versions

A K-mixture connective-strength-based approach to automatic text summarisation TM Chang, WF Hsiao – … Journal of Intelligent Systems Technologies and …, 2011 – Inderscience … The lexical chainer component employed WordNet to associate the nouns with their synonyms, hyponyms and hyponyms. Finally, sentence extractor component identified lexical chains in each segment and generated summary based on segments with significant chains. … Related articles All 3 versions

A hybrid model to improve relevance in document retrieval TJ Siddiqui, US Tiwary – Journal of Digital Information Management, 2006 – dirf.org … 3. Tagged and processed document is input to a sentence extractor which extracts sentences. Each extracted sentence of the tagged text is then passed through four modules. Each of these modules is devoted to identify certain types of relationships between concepts. … Cited by 4 Related articles All 5 versions

Enhancing Citation Context based Information Services through Sentence Context Identification AA Mandya – 2012 – otago.ourarchive.ac.nz … Enhancing Citation Context based Information Services through Sentence Context Identification. MA Angrosh. A thesis submitted for the degree of Doctor of Philosophy at the University of Otago, Dunedin, New Zealand, October 2012 … Related articles

Key Phrase Extraction Based Multi-Document Summarization N Chaudhary, S Kapoor – ijettjournal.org … [Figure; labeled components: Sentence Repository, File Parser, Sentence Extractor] … International Journal of Engineering Trends and Technology (IJETT) – Volume 13 Number 4 – Jul 2014 ISSN: 2231-5381 http://www.ijettjournal.org …

Survey of Text Mining of Biomedical Corpora T Polajnar – 2009-08-20]. http://www. brc. des. gla. ac. uk/tamara/ …, 2006 – researchgate.net … If applying the sentence extractor before NLP one may simply choose only the sentences which contain the query terms or one can train a machine learning engine such as SVM to categorise sentences as relevant or not, as PreBIND does. … Cited by 4 Related articles All 3 versions

[BOOK] Building Search Applications: Lucene, LingPipe, and Gate M Konchady – 2008 – books.google.com … Building Search Applications: Lucene, LingPipe, and Gate. A Practical Guide to Building Search Applications Using Open Source Software. Manu Konchady … Cited by 25 Related articles All 4 versions

Ontology-Based System for the Marketing Information Management E BARBU – 2006 – clic.cimec.unitn.it … instances of the concepts. The ontology concepts and instances are used in the summarization task. The summarization component of the AMI-SME system is a language independent sentence extractor. From the point of view … Cited by 1 Related articles

Cognitive lab evaluation of innovative items in mathematics and English/language arts assessment of elementary, middle, and high school students RP Dolan, J Goodman, E Strain-Seymour… – 2011 – images.pearsonassessments.com … Cognitive Lab Evaluation of Innovative Items in Mathematics and English Language Arts Assessment of Elementary, Middle, and High School Students. Research Report. Robert P. Dolan, Joshua Goodman, Ellen Strain-Seymour, Jeremy Adams, Sheela Sethuraman … Cited by 3 Related articles All 2 versions

COMPENDIUM: A text summarization system for generating abstracts of research papers E Lloret, MT Romá-Ferri, M Palomar – Data & Knowledge Engineering, 2013 – Elsevier … That is the case of CBSEAS [4] which generate multi-document extractive sentiment-based summaries, or MUSE (MUltilingual Sentence Extractor) [21], that employs language-independent techniques for generating summaries in English and Hebrew. … Cited by 1 Related articles All 5 versions

[BOOK] Mining Parallel Documents Using Low Bandwidth and High Precision CLIR from the Heterogeneous Web S Shi, P Fung – 2013 – Springer … Related articles All 9 versions

Using genetic algorithms with lexical chains for automatic text summarization M Berker – 2011 – cmpe.boun.edu.tr … Using Genetic Algorithms with Lexical Chains for Automatic Text Summarization, by Mine Berker. BS, Computer Engineering, Boğaziçi University, 2003. Submitted to the Institute for Graduate Studies in Science and Engineering in partial fulfillment … Cited by 4 Related articles All 5 versions

Sentiment Analysis PNBL Varela – 2012 – fenix.tecnico.ulisboa.pt … Sentiment Analysis. Pedro do Nascimento Barata Leal Varela (Licenciado). Dissertation submitted for obtaining the degree of Master in Electrical and Computer Engineering. Examination Committee Chairperson: Prof. Fernando Duarte Nunes Supervisor: Prof. … Related articles All 2 versions

Extraction of Parallel Corpora from Comparable Corpora RC Kulkarni – cfilt.iitb.ac.in … sequence predicted by the model. 4.4.2 CRF Based Parallel Sentence Extraction The parallel sentence extractor based on the CRF model described above uses a subset of the rich set of features, described in the next section. 4.5 Features Functions …

Summarizing technical support documents for search: expert and user studies CG Wolf, SR Alpert, JG Vergo, L Kozakov… – IBM Systems …, 2004 – ieeexplore.ieee.org … and their numerous variations. The heuristics give primacy to selecting specific sections of the documents and then, conditionally, invoking the sentence extractor tool on contents of specific sections. We also decided that … Cited by 13 Related articles All 8 versions

Bayesian nonparametric models for name disambiguation and supervised learning AM Dai – 2013 – era.lib.ed.ac.uk … resolution system. The entity clusters for each document that are found by the within-document coreference system are passed through a sentence extractor that extracts sentences relevant to each of the entities. The vector space model is used by storing the sen- … Related articles All 2 versions

On the use of a schema-based framework to improve relevance and discourse coherence in blog summarization S Mithun, L Kosseim – Document numérique, 2012 – cairn.info All 2 versions

Multi Document Summarization A Hägerstrand – 2011 – test.findwise.com … putting the highest scoring sentence first. D. Bollegala et al. [2] have focused on sentence ordering using a more advanced approach, while not constructing the sentence extractor. Rather they assumed that the extraction had … Related articles All 5 versions

An Epistemological approach to domain-specific multiple biographical document summarization B Tennessy – 2006 – cs.ubc.ca … We have more to say about the MEAD summarizer later, when we extend the basic platform system with features of our own. MEAD serves as our sentence extractor baseline in the evaluation phase. Columbia University/DefScriber … Cited by 2 Related articles All 6 versions

Exploiting Rhetorical Relations in Blog Summarization S Mithun – 2012 – spectrum.library.concordia.ca … Exploiting Rhetorical Relations in Blog Summarization. Shamima Mithun. A thesis in The Department of Computer Science and Software Engineering, Presented in Partial Fulfillment of the Requirements For the Degree of Doctor of Philosophy … Cited by 3 Related articles All 2 versions

Extracting conceptual structures from multiple sources E Barbu – 2009 – clic.cimec.unitn.it … Extracting conceptual structures from multiple sources. Eduard Barbu, Center for Mind/Brain Sciences, Università degli Studi di Trento. Advisor: Prof. Massimo Poesio. A thesis submitted for the degree of Doctor of Philosophy (PhD), 2009, December … Related articles All 2 versions

Analysis of user generated spatio-temporal data: Learning from collections of geotagged photos S Kisilevich – kops.ub.uni-konstanz.de … In 4th International Workshop On Cross Lingual Information Access, 2010. [2] Marina Litvak, Mark Last, Menahem Friedman, and Slava Kisilevich. MUSE – A Multilingual Sentence Extractor. In Computational Linguistics & Applications (CLA 11) (to appear), 2011. … Related articles All 3 versions