Apache Tika


tika.apache.org

Notes: 

content analysis toolkit (text extraction)
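
As a rough illustration of one step such a toolkit performs before extracting text, the sketch below guesses a file's MIME type from its leading "magic bytes". It is not Tika's implementation or API: the signature table and the detect_mime function are hypothetical names made up for this example, and a real detector also weighs file extensions, declared content types and a much larger MIME database.

```python
# Minimal sketch of magic-byte type detection (an assumption-laden illustration,
# not Apache Tika's actual detector). The table below is hypothetical and tiny.
MAGIC_SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),  # also the container for docx, xlsx, epub
]


def detect_mime(data: bytes, default: str = "application/octet-stream") -> str:
    """Return the MIME type whose signature prefixes `data`, else `default`."""
    for prefix, mime in MAGIC_SIGNATURES:
        if data.startswith(prefix):
            return mime
    return default


if __name__ == "__main__":
    # Only the first bytes of a file are needed for this kind of check.
    print(detect_mime(b"%PDF-1.4 example"))
```

Once the type is known, a toolkit can hand the bytes to a format-specific parser; that dispatch step is left out here.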

YouTube: 

Content extraction with Apache Tika (45min)


[BOOK] Tika in Action C Mattmann, J Zitting – 2011 – dl.acm.org … Summary: Tika in Action is a hands-on guide to content mining with Apache Tika. The book’s many … exotic ones. About this Book: Tika in Action is the ultimate guide to content mining using Apache Tika. You … Cited by 9

Formats over Time: Exploring UK Web History AN Jackson – arXiv preprint arXiv:1210.1714, 2012 – arxiv.org … Using the DROID and Apache Tika identification tools, we examined each resource and captured the results as extended MIME types, embedding version, software and hardware identifiers alongside the format information. … Formats identified using Apache Tika. … Cited by 2

Towards the development of an integrated framework for enhancing enterprise search using latent semantic indexing O Alhabashneh, R Iqbal, N Shah, S Amin… – Conceptual Structures for …, 2011 – Springer … 3.2 Apache Tika Apache Solr and its underlying technology can parse or acquire limited types of documents. … Packt Publishing (2009) 28. Apache Hadoop, http://hadoop.apache.org/ 29. Apache Lucene, http://lucene.apache.org/solr/ 30. Apache Tika, http://tika.apache.org/ Cited by 3

US Geological Survey Community for Data Integration: Data Upload, Registry, and Access Tool WI Is, HI Works – 2012 – pubs.usgs.gov … The Apache Software Foundation, 2012a, Apache Tika project: Apache Tika, accessed June 20, 2012, at http://tika.apache.org/. The Apache Software Foundation, 2012b, Apache Lucene project: Apache Lucene, accessed June 20, 2012, at http://lucene.apache.org/. …

Ontology-based classification of unstructured information S Burger, B Stieger – Digital Information Management (ICDIM), …, 2010 – ieeexplore.ieee.org … This paper introduces a JAVA-application, in which algorithms for the extraction of metadata from different sources are implemented which also match this metadata to an ontology with techniques such as Apache TIKA [5], a related project to Apache Lucene [17]. …

Opening OPUS for User Contributions J Tiedemann, M Zumpe… – Proceedings of the …, 2012 – stp.lingfil.uu.se … 2011. TIKA in Action. Manning Publications Co. Apache Tika is available at http://tika.apache.org/. Jörg Tiedemann. 2012. Parallel data, tools and interfaces in opus. In Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC’12). …

Guided text analysis using adaptive visual analytics CA Steed, CT Symons… – Proceedings of the …, 2012 – ebooks.spiedigitallibrary.org … † Apache Tika™ is a content analysis toolkit that is available at http://tika.apache.org ‡ Apache Lucene™ is a text search engine written in Java that is available at http://lucene.apache.org … Cited by 1

A digital library content metadata generator for e-print AAA Hussein – 2011 – eprints.oum.edu.my … Apache Tika – a content analysis toolkit (n.d.) Retrieved 5th November 2010 from http://tika.apache.org/ Apps, A. and Macintyre, R. (2000) Dublin Core Metadata for Electronic Journals – …

Hybrid methodologies to foster ontology-based knowledge management platform V Loia, G Fenza, C De Maio… – Intelligent Agent (IA), …, 2013 – ieeexplore.ieee.org … [Figure labels: Wikipedia Miner, Language Tool, Snowball Analyzer, Apache TIKA, Apache SolR, Apache Lucene; legend: Free Technology, Our Algorithms] …

Computing: A vision for data science CA Mattmann – Nature, 2013 – nature.com … Because they must do more for less, such facilities largely use and generate community-based open-source software 4, 5, 6. Examples include Apache Hadoop 7 and Apache Tika 8, used in Earth science, biomedicine and business. … Cited by 6

A mobile peer-to-peer search and retrieval service for social networks IM Lombera, LE Moser… – Mobile Services (MS …, 2012 – ieeexplore.ieee.org … document identifier. Metadata generation is dependent on the application, and may be manually provided by the end user, or automatically generated by appropriate packages such as Apache Tika, Apache Lucene, etc. Metadata … Cited by 2

Descripción de recursos multimedia georreferenciados AB Fonollosa, CG Canut, JH Guijarro – V JORNADAS DE SIG LIBRE, 2011 – sigte.udg.edu … http://iaaa.cps.unizar.es/software/index.php/CatMDEdit_English_user_manual, last accessed 03.2011. [8] Apache Software Foundation (2010) Apache Tika: a content analysis toolkit. http://tika.apache.org, last accessed 03.2011. [9] Turton I (2008) GeoTools. … Cited by 2

Automatic Plagiarism Detection System for Specialized Corpora FC Buruiana, A Scoica, T Rebedea… – Control Systems and …, 2013 – ieeexplore.ieee.org … Because of the multitude of formats supported, we use Apache Tika (http://tika.apache.org/), a project of the Apache Software Foundation, which extracts meta-data and structured text content from various documents using existing parser libraries. …

Describing heterogeneous resources through Apache Tika and OSGeo FDO A Beltrán Fonollosa, C Granell Canut, J Huerta Guijarro – 2011 – init.uji.es Abstract: Nowadays information technology plays a fundamental role in the society we live in, even to the point of dependence. Therefore, making information available globally and easily reachable for as many people as possible is becoming essential for collaborative …

OSGeo FDO y Apache Tika: construyendo una plataforma para la descripción de recursos multimedia A Beltrán Fonollosa, C Granell Canut, J Huerta Guijarro – 2010 – init.uji.es Abstract: Geographic information plays a fundamental role in today's society, and users' interest in it grows day by day. However, it is still too complicated to find geographic content that is relevant (up to date, of …

Final Report: Guided Text Search Using Adaptive Visual Analytics CA Steed, C Symons, J Senter, F DeNap – 2012 – info.ornl.gov … Apache Tika is a content analysis toolkit that is available at http://tika.apache.org. † Apache Lucene is a text search engine written in Java that is available at http://lucene.apache.org. …

LivingKnowledge: A Platform and Testbed for Fact and Opinion Extraction from Multimodal Data D Dupplaw, M Matthews, R Johansson, P Lewis – Eternal Systems, 2012 – Springer … We have approached this problem with the use of the Apache Tika toolkit2 which supports many common document formats, such as Word documents and PDFs, and converts their content into a consistent format that we subsequently publish as a base … 2 http://tika.apache.org … Cited by 1

Avtomatizirano opremljanje ucnih gradiv z metapodatki M Ramšak – 2011 – eprints.fri.uni-lj.si … Different conversions (Apache Tika, pdftotext, copy & paste, and manual conversion) that prepare original file to form that is acceptable for these tools were used. We have shown that conversions influence on extraction, but not always to improve results. …

Tailored news in the palm of your hand: a multi-perspective transparent approach to news recommendation M Tavakolifard, JA Gulla, KC Almeroth… – Proceedings of the …, 2013 – dl.acm.org … Next, the Apache Tika library1 is used to identify the body part of the HTML document and scrape off the unnecessary information. … 1 http://tika.apache.org/ 2 http://opennlp.apache.org/ … Cited by 2

A modular open-source focused crawler for mining monolingual and bilingual corpora from the web V Papavassiliou, P Prokopidis, G Thurmair – ACL 2013, 2013 – aclweb.org … 7 http://hadoop.apache.org … 3.2 Normalizer: The normalizer module uses the Apache Tika toolkit 8 to parse the structure of each fetched web page and extract its metadata. … 8 http://tika.apache.org 9 http://code.google.com/p/boilerpipe/ 10 http://code.google. …

UNICORE Data Management: Recent Advancements K Benedyczak, T Rekawek, J Rybicki… – … 2011: Proceedings, 7- …, 2011 – books.google.com … 3. Waquas Noor and Bernd Schuller. MMF: A flexible framework for metadata management in UNICORE. In Proceedings of the 2010 UNICORE Summit, volume 5, pages 51–60, May 2010. Apache Tika Project. http://tika.apache.org/. The iRODS project. https://www.irods. …

Constructing a Focused Taxonomy from a Document Collection O Medelyan, S Manion, J Broekstra, A Divoli… – The Semantic Web: …, 2013 – Springer … To federate inputs stored on file systems, servers, databases and document management systems, we use Apache Tika to extract text content from various file formats and Solr for … 7 See http://tika.apache.org/ and http://lucene.apache.org/solr/ 8 See http://apidemo.pingar.com … Cited by 1

Intelligence in the Cloud R Hill, L Hirsch, P Lake, S Moshiri – Guide to Cloud Computing, 2013 – Springer … Generally, the first task of the indexer will be to extract the text data from any of the different formats likely to be encountered. This is a complex task but luckily open source tools such as Apache Tika (http://tika.apache.org/) are freely available. …

Utilización de la plataforma Hadoop para la implementación de un programa que permita determinar mensajes spam G Crespo P, S Véliz M, V Cedeño M – 2013 – dspace.espol.edu.ec … [14] Apache Software Foundation, “Apache Pig”. Available at: http://pig.apache.org/ Last accessed: May 2012. [15] Apache Software Foundation, “Apache Tika”. Available at: http://tika.apache.org/ Last accessed: May 2012. …

Construyendo un sistema de indexación y búsqueda de recursos georreferenciados A Beltrán Fonollosa, L Díaz Sánchez, J Huerta Guijarro – 2012 – dugi-doc.udg.edu … Web. 7 Mar. 2012. doi:10.4018/978-1-4666-0945-7 [8] Apache Software Foundation (2010) Apache Tika: a content analysis toolkit. http://tika.apache.org, 2010. [9] Open Source Geospatial Foundation (2010) Feature Data Objects (FDO) Data Access Technology. …

Insider Threat Control: Using Plagiarism Detection Algorithms to Prevent Data Exfiltration in Near Real Time T Lewellen, GJ Silowash, DL Costa – 2013 – repository.cmu.edu … Apache Tika, a Java library for parsing text out of many different file formats … WebDLPIndexer uses two additional Apache libraries aside from Lucene: Tika and Commons IO. Apache Tika is an effective Java library for parsing text from various kinds of documents. …

Unsupervised discovery and extraction of semi-structured regions in text via self-information E Yeh, J Niekrasz, D Freitag – Proceedings of the 2013 workshop on …, 2013 – dl.acm.org … To normalize non-text files into text, we used Apache Tika.3 An assessment of the documents showed that the labor statistics consisted of paragraph sized text descriptions, followed by tables, encoded in a variety of styles. … 3 http://tika.apache.org …

Knowledge-based Approaches to Information Management Systems in Coalition Environments A Uszok, T Reichherzer, L Bunch, J Bradshaw… – 2012 – ieeexplore.ieee.org … PowerPoint. When the new information is published, it is first parsed and mapped to the ontology. We use Apache Tika (http://tika.apache.org/) as a parser that outputs XHTML (eXtensible Hypertext Markup Language). We …

Trustworthy distribution and retrieval of information over HTTP and the Internet I Michel Lombera, YT Chuang… – INTERNET 2011, The …, 2011 – thinkmind.org … In the case of cURL, the file(s) is fetched directly by the node and written to the local disk. When a resource is entered into the SQLite database, metadata for the resource file is generated using the Apache Tika/Lucene packages. … Cited by 11

Rapid Exploitation and Analysis of Documents DJ Buttler, D Andrzejewski, KD Stevens, D Anastasiu… – 2011 – fas.org … Open source components, like Apache Tika 7 handle converting standard document types into plain text and associated metadata. … The remaining sections are excerpts from published conference pa- … 6 http://netapp.com 7 http://tika.apache.org 8 http://www.promedmail.org 9http …

Reconstructing provenance S Magliacane – The Semantic Web–ISWC 2012, 2012 – Springer … all available versions and metadata using the Dropbox API 1, extracts content (both text and images) and other metadata using Apache Tika2; and … 1 https://www.dropbox.com/developers/reference/sdk 2 http://tika.apache.org/ 3 http://lucene.apache.org/ 4 http://sourceforge.net … Cited by 4

Digital libraries with J-ISIS: a preliminary account of possibilities and performance HH Berhe, E de Smet – Library Hi Tech News, 2012 – emeraldinsight.com … The interface is based on existing data-entry worksheets: when a DL field is recognized, a special icon allowing uploading of the document is shown, the text is extracted (using Apache Tika) and normal storage and indexing is applied while the ISIS-PFT creates the URL to the …

Representing document semantics by means of graphs E Velazquez-Garcia, I Lopez-Arevalo… – … Control (CCE), 2011 …, 2011 – ieeexplore.ieee.org … In this stage we use the Apache Tika toolkit6, it is possible to extract metadata from audio files and other formats, but we are limited to the formats described above. This allows us to work with different types of files. … 6 http://tika.apache.org visited in September 2011. …

Apache airavata: a framework for distributed applications and computational workflows S Marru, L Gunathilake, C Herath… – Proceedings of the …, 2011 – dl.acm.org … 10. REFERENCES [1] Apache. Tika. http://tika.apache.org/. [2] M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, et al. Above the clouds: A Berkeley view of cloud computing. … Cited by 11

Using Big Data and sentiment analysis in product evaluation L Banic, A Mihanovic, M Brakus – Information & Communication …, 2013 – ieeexplore.ieee.org … Language detection of the review text was done by means of the Apache Tika [12]. Apache Tika is a content analysis toolkit, which among … and Knowledge Discovery, Volume 62, Issue 7, October 2011, pp 2779 2792 [11] http://nutch.apache.org/ [12] http://tika.apache.org [13] http …

Tibetan Web Information Collection System G Xu, D Zhong, X Gao, Y Lin, X Zhao… – … and Intelligent Systems …, 2012 – ieeexplore.ieee.org … Apache Nutch is an open source web-search software project. Stemming from Apache Lucene, it now builds on Apache Solr adding web-specifics, such as a crawler, a link-graph database and parsing support handled by Apache Tika for HTML and other document formats. …

Créer un moteur de recherche avec des logiciels libres IR Viseur – 2012 – robertviseur.be … Existence of integrated tools (1/2) • Indexing server: SolR (lucene.apache.org/solr/). • Includes: extractors (Apache Tika), indexer (Lucene), API (JSON or REST). … Text and metadata extraction: • Apache Tika / POI (Java), utilities (xls2csv, catdoc, pdfinfo, …

Towards Reconstructing the Provenance of Clinical Guidelines. S Magliacane, PT Groth – SWAT4LS, 2012 – ceur-ws.org … 2, in which each node represents a file and each edge a dependency of the origin file from the destination file. 1 https://www.dropbox.com/developers/reference/sdk 2 http://tika.apache.org/ 3 http://lucene.apache.org/ 4 https://github.com/lucmoreau/ProvToolbox … Cited by 1

Large-scale content profiling for preservation analysis P Petrov, C Becker – 2012 – publik.tuwien.ac.at … Table 2: The same set with additional metadata (File 1 / File 2 / File 3): Format PDF 1.2 / PDF 1.2 / PDF 1.4; Page count 20 / 20.000 / 40; Encryption Yes / No / Yes; File Size 1 MB / 120 MB / 2 MB; Valid No / Yes / No; Well-formed Yes / Yes / Yes … such as Apache Tika, JHove and … Cited by 4

A framework for bridging the gap between open source search tools M Khabsa, S Carman… – … 2012 Workshop on …, 2012 – opensearchlab.otago.ac.nz … For rich media formats such as Word, PDF, Power Point, YouSeer converts the document into text using Apache TIKA. The output of the middleware is an XML file containing the fields extracted from the documents. … http://tika.apache.org/. [20] Webglimpse homepage. … Cited by 3

EPUB for archival preservation J Van der Knijff – KB/National Library of The …, 2012 – openplanetsfoundation.org … Apache Tika … Apache Tika … Other feature extraction … Cited by 1

ADAM: automated data management for research datasets M Woodbridge, CD Tomlinson, SA Butcher – Bioinformatics, 2013 – Oxford Univ Press … their location changes. Data type classification, metadata extraction and format conversion are performed using the Apache Tika, LOCI Bio-Formats (Linkert et al., 2010) and Apache PDFBox libraries. The system additionally …

News Media Analysis Using Focused Crawl and Natural Language Processing: Case of Lithuanian News Websites T Krilavicius, Ž Medelis, J Kapociute-Dzikiene… – Information and …, 2012 – Springer … engineering approach. Journal of Universal Computer Science 3(8), 955–987 (1997) 7. Apache Foundation: Apache Tika. Web page (2011), http://tika.apache.org (last visited: December 10, 2011) 8. LingPipe: Lingpipe. Web …

A multi-classifier based guideline sentence classification system MH Song, SH Kim, DK Park… – Healthcare informatics …, 2011 – synapse.koreamed.org … Available from: http://nutch.apache.org/. 20. The Apache Software Foundation. The Apache Software Foundation; c2011 [cited at 2011 Sep 14]. Apache Tika: a content analysis toolkit [Internet]. Available from: http://tika.apache.org/. 21. OpenNLP. … Cited by 4

Sentimatrix: multilingual sentiment analysis service AL Gînsca, E Boros, A Iftene, D TrandabAt… – Proceedings of the 2nd …, 2011 – dl.acm.org … official grammar. … identifying the language: N-grams detection, strictly 3-grams detection and lemma correction. The 3-grams classification method uses corpus from Apache Tika for several languages. The Romanian … Cited by 6

Black swan: augmenting statistics with event data J Lorey, F Naumann, B Forchhammer… – Proceedings of the 20th …, 2011 – dl.acm.org … We use Apache Tika8 to parse … 1 http://www.gapminder.org 2 http://www.correlatesofwar.org 3 http://dbpedia.org/About 4 http://www.emdat.be 5 http://www.ngdc.noaa.gov 6 http://www.freebase.com 7 http://news.bbc.co.uk/2/hi/europe/country_profiles/ 8 http://tika.apache.org/ … Cited by 1

CLEAR: a credible method to evaluate website archivability V Banos, Y Kim, S Ross, Y Manolopoulos – 2013 – delab.csd.auth.gr … 20 http://www.robotstxt.org/ 21 http://validator.w3.org/ 22 http://jigsaw.w3.org/css-validator/ 23 http://www.jhove2.org 24 http://tika.apache.org/ 25http … response header to indicate this; in cases where it is missing, this process might be refined to use JHOVE2 or Apache Tika to identify …

Notes on Operations L Ivanovic, D Ivanovic, D Surla – Library Resources & Technical Services, 2012 – ALA … The file server component also extracts textual content from uploaded files using open-source Apache Tika library (http://tika.apache.org). After extraction, text goes through a Cyrillic to Latin transliteration algorithm and then is indexed using the Apache Lucene library. … Cited by 4

David Jones/blog D Jones – Hand, 2011 – i-proving.ca … In addition to finding a document it also has to be broken down into a sequence of terms that can then be indexed. One common library for achieving this seems to be the Apache Tika library that can analyse HTML, Microsoft Office, and many other document formats. …

Fuzzy Conceptual Data Analysis Applied to Knowledge Management C De Maio, G Fenza, V Loia, S Salerno – On Fuzziness, 2013 – Springer … The workflow and mapping on the exploited technological solutions is shown in Fig. 19.1. It involves following phases: • Natural Language Processing, that relies on several activities, such as: language detection (i.e., Apache TIKA), multiformat analysis, stopword removal …

Subject-based semantic document clustering for digital forensic investigations GG Dagher, B Fung – Data & Knowledge Engineering, 2013 – Elsevier … 6.1 Data Sets We use three data sets in our experiments: Classic3, Forensic-1 and Forensic-2. The pre-classification of each document will be used to measure the accuracy of … 4 Apache Tika. http://tika.apache.org/ 5 Apache Lucene. …

Performance comparison study of language identification tools for identification of Farsi web pages H Kordestanchi, H Naderi – Information and Knowledge …, 2013 – ieeexplore.ieee.org … C. Tika Tika is a project of the Apache Software Foundation [27], and was formerly a subproject of Apache Lucene [28]. The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries. …

[BOOK] Apache Solr 4 Cookbook R Kuc – 2013 – books.google.com … How it works… There’s more… Language identification based on Apache Tika Optimizing your primary key field indexing How to do it… How it works… 3. Analyzing Your Text Data Introduction Storing additional information using payloads How to do it… How it works… … Cited by 1

Encryption Domain Text Retrieval T Chiueh, DN Simha, A Saxena, S Bhola… – … ), 2012 IEEE 4th …, 2012 – ieeexplore.ieee.org … 2010. [3] Apache, “Apache Solr,” In the Apache Software Foundation, Available: http://lucene.apache.org/solr/, 2007. [4] Apache, “Apache Tika,” In the Apache Software Foundation, Available: http://tika.apache.org/, 2007. [5] Georgios …

A framework for semantic annotation of digital evidence BWP Hoelz, CG Ralha – Proceedings of the 28th Annual ACM …, 2013 – dl.acm.org … The result, if positive, is often displayed as an additional attribute, with a notable or alert value. This attribute can be enriched by adding contextual information regarding the hash … 3 http://tika.apache.org … Figure 1: A segment of the PROTON ontology, adapted from [16] …

Nástroj na sledovanie zmien obsahu webov D Šimanský – 2012 – is.muni.cz … 3. TECHNOLOGIES 3.5 Apache Tika: It is an integrated toolkit usable for detecting metadata and extracting structured content from a whole range of sources with the help of collected libraries. The biggest …

BrainMap–A Navigation Support System in a Tourism Case Study LFS Teixeira, RA Ribeiro, A Falcão, GP Lopes… – … Innovation for the …, 2013 – Springer … Several libraries are available (for many programming languages) that allow us to transform documents in a given format into a simple text file. For example [18] presents Apache Tika, a Java based toolkit for extracting content from a variety of document formats …

CURATEcamp iPres 2012 M Jordan, C Mumma, N Ruest – 2012 – yorkspace.library.yorku.ca … JHOVE, FITS, FIDO, Apache Tika, and the recently released Unified Digital Format Registry (UDFR), which unifies two other services, PRONOM and the Global Digital Format Registry, are all services or pieces of software that perform functions related to format identification and …

Stabilization of Users Profiling Processed by Metaclustering of Web Pages M Draminski, B Owczarczyk, K Trojanowski… – Language Processing …, 2013 – Springer … status http 403 – Forbidden – status http 404 – Not Found – status http 500 – Internal server error – did not answer within 30 seconds The content of properly downloaded pages has been filtered to remove html tags using Apache Tika and then uploaded to Solr. …

ConnectME: Semantic Tools for Enriching Online Video with Web Content. LJB Nixon, M Bauer, C Bara, T Kurz… – I-SEMANTICS (Posters & …, 2012 – ceur-ws.org … itself is already carried out by LMF Core, but the management of versions will be carried out by this module • LMF Enhancer offers semantic enhancement of content by analysing textual and media content; the LMF Enhancer will build upon UIMA, Apache Tika, and the semantic …

Improving PHENIX search with Solr, Nutch and Drupal. D Morrison, I Sourikova – Journal of Physics: Conference Series, 2012 – iopscience.iop.org … 1245 [12] http://indico.cern.ch/contributionDisplay.py?contribId=114&sessionId=7&confId=149557 [13] http://indico.cern.ch/contributionDisplay.py?contribId=531&sessionId=7&confId=149557 [14] http://hadoop.apache.org/ [15] http://shibboleth.net/ [16] http://tika.apache.org/ …

Trustworthy Decentralized Publication, Search and Retrieval in Heterogeneous Networks IM Lombera – 2013 – itrust.ece.ucsb.edu University of California, Santa Barbara. A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in …

Meeting the Preservation Demand Responsibly = Lowering the Ingest Bar? A Goethals – Archiving Conference, 2009 – ingentaconnect.com … Any documentation or code for FITS will be available on the FITS website [7]. There are a number of tools that will be evaluated for incorporation into FITS in the future: • Apache Tika [2] for document and other formats • JHOVE 2 [3] • Aduna Aperture [1] for document, text, email …

[BOOK] Apache Solr 3.1 Cookbook R Kuc – 2011 – books.google.com Quick answers to common problems. Over 100 recipes to discover new ways to work with Apache’s Enterprise Search Server. Rafał Kuć. … Table of Contents … Cited by 7

SWAN-Scientific Writing AssistaNt: a tool for helping scholars to write reader-friendly manuscripts T Kinnunen, H Leisma, M Machunik… – Proceedings of the …, 2012 – dl.acm.org … files. JFreeChart4 is used in generating graphs and XStream5 in saving and loading inputs and results. … 2 http://pdfbox.apache.org/ 3 http://tika.apache.org/ 4 http://www.jfree.org/jfreechart/ … 5 Initial User Experiences of SWAN Since … Cited by 2

Utilização da computação distribuída para o armazenamento e indexação de dados forenses MA da Silva, RAP Júnior – 2012 – icofcs.org … A. File Interpretation: To interpret the different types of files present in the collected forensic data, the Apache Tika tool [5] was selected. This tool is open source and uses an extensible technique for interpreting …

Canopy 2.1 User Guide ER Burtner – 2012 – pnnl.gov … The first step of Canopy’s extraction and feature identification process is to employ the Apache Tika™ toolkit to detect and extract metadata, structured text, unstructured text, image, and video content from documents (Figure 8). Canopy retains information about the source files …

LAICOS: an open source platform for personalized social web search MR Bouadjenek, H Hacid, M Bouzeghoub – Proceedings of the 19th …, 2013 – dl.acm.org … We setup the crawlers in such a way to remove all the non-English web pages, an operation performed using the Apache Tika toolkit. Note that all the web pages that return an http error code was considered as unavailable. … Cited by 2

An informatics architecture for the virtual pediatric intensive care unit DJ Crichton, CA Mattmann, AF Hart… – … (CBMS), 2011 24th …, 2011 – ieeexplore.ieee.org … Figure 2. Our CHLA/JPL Site Architecture for the Virtual Pediatric Intensive Care Unit. … Apache Tika [13], for crawling large file systems and extracting metadata. Regularly crawling the files in the HDP, and extracting the available metadata creates the HDP catalog. … Cited by 3

DLFP: Solr 1.4 est de sortie B Cénou – mars, 2012 – cleoradar.hypotheses.org … Better integration with DBMSs thanks to the data import handler; * Ability to index external documents (Word, OOo, PDF, HTML, etc.) thanks to the Apache Tika project; * Dynamic clustering of search results with Carrot2; * A ton of …

Exploiting evidence from unstructured data to enhance master data management K Murthy, PM Deshpande, A Dey… – Proceedings of the …, 2012 – dl.acm.org … formats such as PDF, MS Word and HTML (Step 2 in Figure 7). It uses functionality provided by Apache’s Tika project3 to … com/enterprise-content-management/index.htm 2 http://www-01.ibm.com/software/data/content-management/filenet-p8-platform 3 http://tika.apache.org … Cited by 2

[BOOK] Java Coding Guidelines: 75 Recommendations for Reliable and Secure Programs F Long, D Mohindra, RC Seacord, DF Sutherland… – 2013 – books.google.com … “A must-read for all Java developers.” …

Digital preservation ingest can be a “CINCH” A Rudersdorf – Library Hi Tech, 2012 – emeraldinsight.com … Tools such as Apache’s Tika might be utilized to enhance metadata results and include full-text extraction, while OpenNLP could be integrated to test natural language processing to try to extract other metadata based on its location within the file. … Cited by 1

Control de integridad y calidad en repositorios DSpace MR De Giusti, N Oviedo, AJ Lira… – III Conferencia de …, 2013 – sedici.unlp.edu.ar … Similarly, Apache Tika makes it possible to obtain data of this kind from multiple file formats, such as images (JPEG, PNG, etc … 2 http://tika.apache.org/ 3 For more information on the data model see https://wiki.duraspace.org/display/DSDOC3x/Functional …

Razširljiva arhitektura za agregacijo podatkov iz razlicnih spletnih virov A Rijavec – 2012 – eprints.fri.uni-lj.si … we can use tools that make our task somewhat easier. One of these tools is Apache Tika [2, 3]. With this program we can more easily extract data from a variety of documents, among which, for example … correctly use in subsequent queries. Apache Tika can help us …

iTrust: Trustworthy information publication, search and retrieval PM Melliar-Smith, LE Moser, IM Lombera… – Distributed Computing …, 2012 – Springer … The Apache Tika and Lucene packages are used to generate metadata from resources automatically and efficiently, if the user chooses not to generate the metadata manually. The WordNet dictionary provides spell checking and synonym suggestions. 5.3 Public Interface … Cited by 11

Interactive visual comparison of multimedia data through type-specific views R Burtner, S Bohn, D Payne – IS&T/SPIE …, 2013 – proceedings.spiedigitallibrary.org … format. The first step of Canopy’s extraction and feature identification process is to employ the Apache Tika™ toolkit to detect and extract metadata, structured text, unstructured text, image, and video content from documents. … Cited by 1 Related articles All 5 versions Cite

Türkçe metin tabanlı açık arşivlerde kullanılan dizinleme yönteminin değerlendirilmesi Ç Çapkın – 2013 – bbytezarsivi.hacettepe.edu.tr … 14 Tablo 4. Veri Erişim ve Bilgi Erişim Özellikleri 15 Tablo 5. Idf Parametresi Örneği 26 Tablo 6. Apache Tika’nın Metin veya Üstveri Çıkartabildiği Dosya Türleri ve Kullandığı API’lar 38 Tablo 7. Sık Kullanılan Tokenizer Türlerinin Özellikleri 41 … Related articles Cite

Testing Software Tools of Potential Interest for Digital Preservation Activities at the National Library of Australia M Hutchins – National Library of Australia Staff Papers, 2012 – nla.gov.au … The results were analysed from the point of view of comparing the tools to determine the extent of coverage and the level of agreement between them. Five metadata extraction tools were tested: File Investigator Engine, Exiftool, MediaInfo, pdfinfo and Apache Tika. … Cited by 2 Related articles All 5 versions Cite

Indexing and Searching Cross Media Content in a Social Network P Bellini, D Cenni, P Nesi – … for Performing Arts, Media Access and …, 2012 – disit.dsi.unifi.it … ppt, pptx, xls, xlsx, pdf, html, txt) are extracted after detecting magic bytes (ie, a prefix that identifies the file format), file extension, content type and encoding, with the aim of an internal MIME database and parsing libraries provided by Apache Tika [31 … 31] http://tika.apache.org/ [32 … Cited by 2 Related articles All 3 versions Cite

An Approach to Document Warehousing System Lifecycle from Textual ETL to Multidimensional Queries: A Proof-of-Concept Prototype A Cembalo, FM Pisano… – Complex, Intelligent and …, 2012 – ieeexplore.ieee.org … 2011 from http://www.mdxtutorials.net/ [16] Rapid-I, http://rapid-i.com/. [17] Netbeans, http://netbeans.org/. [18] Apache Tika, http://tika.apache.org/. [19] MySql, http://www.mysql.it/. [20] Mondrian, http://mondrian.pentaho.com/. [21] JPivot, http://jpivot.sourceforge.net//. … Related articles All 4 versions Cite

Métadonnées et processus pour l’archivage de données médiatiques M Amar – 2012 – archipel.uqam.ca … 57 4.2.1 Java Content Repository 57 4.2.2 Apache-Jackrabbit 59 4.2.3 Apache-Sling 60 4.2.4 Apache-Tika 61 4.2.5 Choix des outils 63 4.3 IMPLÉMENTATION 64 4.3.1 Environnement de développement 64 4.3.2 Architecture applicative 64 4.4 EXPÉRIMENTATION 66 … Related articles Cite

Cercador documental amb lematització J Boix Requesens – 2013 – 84.88.10.29 Page 1. Cercador documental amb lematització Memòria del projecte d’Enginyeria Tècnica en Informàtica de Sistemes realitzat per Josep Boix Requesens i dirigit per Asier Ibeas Escola d’Enginyeria Sabadell, Juny de 2012 Page 2. 2 Page 3. 3 … Cite

Interest based user profiles for personalization V Singh – 2010 – cse.iitk.ac.in Page 1. Interest based user profiles for personalization by VATSHEEL SINGH A thesis submitted in partial fulfillment of the requirements for the degree of Bachelor-Master of Technology (Dual Degree) to the Department of Computer Science and Engineering … Related articles All 2 versions Cite

Improving search engines with open Web-based SKOS vocabularies FNF Martins – 2012 – run.unl.pt Page 1. Dezembro, 2012 Flávio Nuno Fernandes Martins Licenciado em Engenharia Informática Improving search engines with open Web-based SKOS vocabularies Dissertação para obtenção do Grau de Mestre em Engenharia Informática Orientador: Prof. … Related articles Cite

[BOOK] Guide to Cloud Computing: Principles and Practice L Hirsch, P Lake – 2013 – books.google.com … 181 7.17.1 Task 1: Explore Visualisations ….. 181 7.17.2 Task 2: Extracting Text with Apache Tika ….. 182 7.17.3 Advanced Task 3: Web Crawling with Nutch and Solr ….. … Cited by 4 Related articles All 2 versions Cite

Hacia el desarrollo y utilización de Repositorios de Acceso Abierto para Objetos Digitales Educativos PS San Martín, A Casali – … de Investigadores en Ciencias de la …, 2012 – sedici.unlp.edu.ar … La implementación de este repositorio utiliza la librería WebDAV de Apache Jackrabbit, la base de datos orientada a objetos DB4o, el motor de búsqueda de texto Apache Lucene y extractor de contenido Apache Tika para archivos de formato Word, Power Point, PDF, Zip, etc … Related articles Cite

Time-bound analytic tasks on large datasets through dynamic configuration of workflows Y Gil, V Ratnakar, R Verma, A Hart, P Ramirez… – Proceedings of the 8th …, 2013 – isi.edu … Additional metadata properties to train the performance model can be extracted through OODT using Apache Tika [Mattmann and Zitting 2011], such as the number of distinct words, the language of the file, the file format (html, plain text, etc). … Cite

Mobile Remote LAN: Designing a modular service platform LA Aschim, L Martinsen – 2009 – ntnu.diva-portal.org Page 1. June 2009 Van Thanh Do, ITEM Simone Lupetti, Telenor R&I Master of Science in Communication Technology Submission date: Supervisor: Co-supervisor: Norwegian University of Science and Technology Department of Telematics Mobile Remote LAN … Cite

Building a scalable index and a web search engine for music on the Internet using Open Source software AP Ricardo – 2011 – repositorio-iul.iscte.pt Page 1. Department of Information Science and Technology Building a Scalable Index and a Web Search Engine for Music on the Internet using Open Source software André Parreira Ricardo Thesis submitted in partial fulfillment of the requirements for the degree of … Related articles All 3 versions Cite

Modelado Semántico y Centrado en el Usuario de Servicios Adaptados al Contexto de Uso JC Yelmo Garcia, M García, Y Samuel… – 2011 – oa.upm.es … Realizamos este proceso a través de herramientas externas de fuente abierta como GATE (General Architecture for Text Engineering) y Apache Tika, u otras propias de socios del proyecto, que permiten tanto extraer metadatos incrustados como inferir nuevo conocimiento a … Related articles All 3 versions Cite

Planning for Variation and E-Discovery Costs MA Burke – ICAIL 2011/DESI IV, 2011 – umiacs.umd.edu Page 105. Planning for Variation and E-Discovery Costs By Macyl A. Burke President Eisenhower was fond of the quote,“In preparing for a battle I have found plans are useless, but planning is indispensable”. The same logic … Related articles All 3 versions Cite

RTS-an integrated analytic solution for managing regulation changes and their impact on business compliance D Pasetto, H Franke, W Qian, Z Guo, H Guo… – Proceedings of the …, 2013 – dl.acm.org … Patent Application, 12 2012. US 2012/0323806 A1. [2] Apache Fundation. OpenNLP. http://opennlp.apache.org/, . [3] Apache Fundation. TiKA. http://tika.apache.org/, . [4] DM Blei, AY Ng, and MI Jordan. Latent dirichlet allocation. J. Mach. Learn. Res., 3:993–1022, Mar. 2003. … Related articles All 3 versions Cite

[BOOK] Alfresco 3 Business Solutions M Bergljung – 2011 – books.google.com Page 1. Experience Distilled Alfresco 3 Business Solutions Practical implementation techniques and guidance for delivering business solutions with Alfresco Martin Bergljung Page 2. Alfresco 3 Business Solutions Page 3. Alfresco … All 3 versions Cite

Semantic Document Clustering for Crime Investigation K Daghir – 2011 – spectrum.library.concordia.ca Page 1. SEMANTIC DOCUMENT CLUSTERING FOR CRIME INVESTIGATION KABI G. DAGHIR A THESIS IN THE CONCORDIA INSTITUTE FOR INFORMATION SYSTEMS ENGINEERING PRESENTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS … Related articles All 2 versions Cite

MMET: A Migration Metadata Extraction Tool for Long-Term Preservation Systems F Luan, M Nygård – … Information and Communication Technology and Its …, 2011 – Springer … 41–49. Springer, Heidelberg (2008) 13. ExifTool, http://www.sno.phy.queensu.ca/~phil/exiftool/ 14. Tika, http://tika.apache.org/ 15. DROID, http://sourceforge.net/apps/mediawiki/droid/ index.php?title=Main_Page 16. PRONOM, http://www.nationalarchives.gov.uk/pronom/ 17. … Related articles All 3 versions Cite

Partnervõrgul baseeruva hajusa failijagamissüsteemi loomine kasutades Mobile Host’i P Halapuu – 2013 – dspace.utlib.ee … used. Mobile host uploads services data as services.txt file to this address with SOLR API. Thanks to Apache Tika integration in SOLR 4.2.0, a full text search in txt files can be done. Services file is generated right after program is started and services are published. … Cite

Guidelines for multilingual linked data A Gómez-Pérez, D Vila-Suero… – Proceedings of the 3rd …, 2013 – dl.acm.org Page 1. Guidelines for Multilingual Linked Data Asunción Gómez-Pérez, Daniel Vila-Suero, Elena Montiel-Ponsoda, Jorge Gracia, Guadalupe Aguado-de-Cea Universidad Politécnica de Madrid Facultad de Informática Departamento … Related articles Cite

Architecture for Aggregation, Processing and Provisioning of Data from Heterogeneous Scientific Information Services C Mazurek, M Mielnicki, A Nowak, M Stroinski… – Intelligent Tools for …, 2013 – Springer … As men- tioned above, metadata records from digital libraries are expressed in PLMET and ESE schemas, and encoded in XML. The transformation between XML data and Lucene full text index documents is made by Apache Tika toolkit. … Cited by 2 Related articles Cite

A semantic web based approach to expertise finding at KPMG EA Jansen – … master’s thesis, University of Technology Delft- …, 2010 – repository.tudelft.nl Page 1. Master’s Thesis, August 31 2010 Ernst Alexander Jansen A Semantic Web based approach to expertise finding at KPMG Page 2. ii Page 3. iii THESIS submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in … Cited by 3 Related articles All 2 versions Cite

Mobile News: Design, User Experience and Recommendation KR Haugen – 2013 – diva-portal.org Page 1. Mobile News Design, User Experience and Recommendation Kent Robin Haugen Master of Science in Computer Science Supervisor: Jon Atle Gulla, IDI Department of Computer and Information Science Submission date: June 2013 … Cite

Gestão confiável de dados cifrados, distribuídos em múltiplas nuvens de armazenamento JMCM Rodrigues – 2013 – asc.di.fct.unl.pt … de pesquisa sobre os dados e por consequente a indexação dos mesmos para isso poderá ser efectuada uma simples indexação sobre os dados ou alternativamente uma pesquisa sobre o conteúdo dos mesmos fazendo recursos a bibliotecas como o Apache Tika toolkit [17 … Related articles Cite

Web-Archiving M Pennock – DPC Technology Watch Report, 2013 – basie.exp.sis.pitt.edu Page 1. Web-Archiving DPC Technology Watch Report 13-01 March 2013 DPC Technology Watch Series … Cited by 2 Cite

Semantically Linked Media for Interactive User-Centric Services V Damjanovic, T Kurz, G Güntner… – Media Networks: …, 2012 – books.google.com Page 467. Chapter 20 Semantically Linked Media for Interactive User-Centric Services Violeta Damjanovic, Thomas Kurz, Georg Güntner, Sebastian Schaffert, and Lyndon Nixon Contents 20.1 Introduction….. 446 20.1. … Related articles Cite

Foreign Language Analysis and Recognition (FLARe) Initial Progress RE Slyh, EG Hansen, BM Ore, DM Hoeferlin, SA Thorn – 2012 – stormingmedia.us Page 1. AFRL-RH-WP-TR-2012-0165 FOREIGN LANGUAGE ANALYSIS AND RECOGNITION (FLARE) INITIAL PROGRESS Brian M. Ore Stephen A. Thorn David M. Hoeferlin SRA International 5000 Springfield Street, Suite 200 Dayton, OH 45431 … Related articles Cite

Text mining with Lucene and Hadoop: Document clustering with feature extraction D Shrestha – Wakhok University, 2009 – wakhok.ac.jp Page 1. TEXT MINING WITH LUCENE AND HADOOP: DOCUMENT CLUSTERING WITH FEATURE EXTRACTION BY DIPESH SHRESTHA A THESIS SUBMITTED FOR THE FULFILLMENT OF RESEARCH DEGREE TO WAKHOK UNIVERSITY 2009 Page 2. Page 3. … Related articles All 3 versions Cite

Recommending adaptive changes for framework evolution B Dagenais, MP Robillard – ACM Transactions on Software Engineering …, 2011 – dl.acm.org Page 1. 19 Recommending Adaptive Changes for Framework Evolution BARTHÉLÉMY DAGENAIS and MARTIN P. ROBILLARD, McGill University In the course of a framework’s evolution, changes ranging from a simple refactoring … Cited by 91 Related articles All 16 versions Cite

A multidisciplinary, model-driven, distributed science data system architecture DJ Crichton, CA Mattmann, JS Hughes, SC Kelly… – Guide to e-Science, 2011 – Springer … Software Architecture: Foundations, Theory and Practice. Wiley Press, 2009. 11. Apache Tika. http://lucene.apache.org/tika/, 2010. 12. … Prentice-Hall, 1996. 16. Apache Lucene, http://lucene.apache.org/, 2010. 17. X. Yang, L. Wang, G. von Laszewski. … Cited by 3 Related articles All 4 versions Cite

Introduction to Scientific Writing Assistant (SWAN)–Tool for Evaluating the Quality of Scientific Manuscripts T Turunen – Computer Science, 2013 – epublications.uef.fi Page 1. Introduction to Scientific Writing Assistant (SWAN) – Tool for Evaluating the Quality of Scientific Manuscripts Teemu Turunen Master’s Thesis School of Computing Computer Science May 2013 Page 2. UNIVERSITY OF … Cite

Vyhledávání v českých dokumentech pomocí Apache Solr BR Sikora – 2012 – is.muni.cz Page 1. MASARYKOVA UNIVERZITA FAKULTA INFORMATIKY Vyhledávání v českých dokumentech pomocí Apache Solr DIPLOMOVÁ PRÁCE Bc. Radek Sikora Brno, jaro 2012 Page 2. zadani ii Page 3. Prohlášení Prohlašuji … Related articles All 2 versions Cite

[BOOK] Deliver Who I Mean: Automatische Erstellung von Personenprofilen in großen Unternehmen L Mählmann – 2009 – opus.haw-hamburg.de Page 1. Faculty of Engineering and Computer Science Department of Computer Science Fakultät Technik und Informatik Department Informatik Masterarbeit Lars Mählmann Deliver who I mean, automatische Erstellung von Personenprofilen in großen Unternehmen Page 2. … Related articles Cite

Enterprise Information Architecture for the dissemination of Geothermal Data A Mehta – 2011 – upcommons.upc.edu Page 1. Enterprise Information Architecture for the dissemination of Geothermal Data Anshuman Mehta Barcelona School of Informatics Universitat Politécnica de Catalunya A thesis submitted for the degree of Master in Information Technology 2011 September Page 2. Abstract … Related articles All 3 versions Cite

Contributions for building a Corpora-Flow system AF dos Santos – 2011 – publications.andrefs.com Page 1. Contributions for building a Corpora-Flow system André Fernandes dos Santos (andrefs@cpan.org) Dissertation submitted in partial fulfillment of the requirements for the degree of Master in Informatics Engineering at … Related articles All 2 versions Cite

Um sistema de coleta de dados de fontes heterogêneas baseado em computação distribuída PH Souza – 2013 – repositorio.ufsc.br Page 1. UNIVERSIDADE FEDERAL DE SANTA CATARINA TECNOLOGIAS DA INFORMAÇÃO E COMUNICAÇÃO PEDRO HENRIQUE SOUZA UM SISTEMA DE COLETA DE DADOS DE FONTES HETEROGÊNEAS BASEADO EM COMPUTAÇÃO DISTRIBUÍDA … Related articles Cite

Recomanador de ponències a una conferència N Bassols García – 2011 – upcommons.upc.edu Page 1. Títol: Recomanador de ponències a una conferència Volum: 1/1 Alumne: Núria Bassols Garcia Director/Ponent: Ricard Gavaldà Mestre Departament: LSI Data: 03/01/2011 Page 2. Page 3. DADES DEL PROJECTE … Related articles All 2 versions Cite

PROPOSTA DE ARQUITETURA PARA UM SISTEMA DE DETECÇÃO DE PLÁGIO MULTI-ALGORITMO RM de Abreu – 2011 – objdig.ufrj.br Page 1. PROPOSTA DE ARQUITETURA PARA UM SISTEMA DE DETECÇÃO DE PLÁGIO MULTI-ALGORITMO Rodrigo Mesquita de Abreu Dissertação de Mestrado apresentada ao Programa de Pós-Graduação em Engenharia de Sistemas e Computação, COPPE, da … Related articles Cite

MyOwnTrip: obtenció i localització de punts d’interès I V Casado Sachez – 2013 – recercat.net Page 1. “MyOwnTrip”: Obtenció i localització de punts d’interès I Memòria de projecte 1 “MyOwnTrip”: Obtenció i localització de punts d’interès I Memòria del projecte d’Enginyeria Tècnica en Informàtica de Sistemes realitzat per Victor Casado Sachez i dirigit per … Cite

Unit Test Virtualization with VMVM JS Bell, GE Kaiser – 2013 – academiccommons.columbia.edu Page 1. Unit Test Virtualization with VMVM Jonathan Bell Columbia University 500 West 120th St, MC 0401 New York, NY USA jbell@cs.columbia.edu Gail Kaiser Columbia University 500 West 120th St, MC 0401 New York, NY USA kaiser@cs.columbia.edu … Cite

Content Recommendation in Social Media M BREUSS – 2013 – dare.uva.nl Page 1. UNIVERSITY OF AMSTERDAM FACULTY OF SCIENCE MASTER THESIS Content Recommendation in Social Media A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE … Related articles Cite

[BOOK] Providing semantic links to the invisible geospatial FJL Pellicer – 2012 – books.google.com Page 1. Cuadernos de Investigación en Geoinformática Notes in Geoinformatics Research Francisco J. Lopez-Pellicer Rubén Béjar F. Javier Zarazaga-Soria Providing Semantic Links to the Invisible Geospatial Web 1 Page 2. … Related articles Cite

Exploiting Social Semantics for Multilingual Information Retrieval C Weinhardt – 2011 – aifb.kit.edu Page 1. Exploiting Social Semantics for Multilingual Information Retrieval Zur Erlangung des akademischen Grades eines Doktors der Wirtschaftswissenschaften (Dr. rer. pol.) von der Fakultät für Wirtschaftswissenschaften am Karlsruher Institut für Technologie … Related articles All 3 versions Cite

Speech Processing and Recognition (SPaRe) BM Ore, DM Hoeferlin, SA Thorn, D Snyder – 2011 – stormingmedia.us Page 1. AFRL-RH-WP-TR-2011- SPEECH PROCESSING AND RECOGNITION (SPaRe) David M. Hoeferlin Brian M. Ore Stephen A. Thorn David Snyder SRA International, Inc. 5000 Springfield Street, Suite 200 Dayton OH 45431 JANUARY, 2011 Final Report … Related articles Cite

Vergleich von FACT-Finder und Solr im E-Commerce am Beispiel von SCOOBOX M Glazkov – 2011 – dotsource.de Page 1. Vergleich von FACT-Finder und Solr im E-Commerce am Beispiel von SCOOBOX – Diplomarbeit – zur Erlangung des akademischen Grades Diplom-Informatiker eingereicht von Michael Glazkov michael.glazkov@uni-jena.de Betreuer Dipl.-Inf. Andreas Göbel Prof. … Related articles Cite

IDENTIFICAÇÃO DE REUSO EM DOCUMENTOS DIGITAIS FR Duarte – 2011 – objdig.ufrj.br Page 1. IDENTIFICAÇÃO DE REUSO EM DOCUMENTOS DIGITAIS Fellipe Ribeiro Duarte Dissertação de mestrado apresentada ao Programa de Pós-Graduação em Engenharia de Sistemas e Computação, COPPE, da Universidade Federal do Rio de Janeiro, como … Related articles Cite

Interaksjon og Søk i Dynamic Presentation Generator TR Olsen – 2010 – bora.uib.no Page 1. Interaksjon og Søk i Dynamic Presentation Generator Tobias Rusås Olsen Institutt for informatikk Universitetet i Bergen Norge Lang Masteroppgave 2010 Page 2. Forord Denne masteroppgaven er resultatet av forfatterens masterstudium i programvareutvikling. … Cited by 2 Related articles All 2 versions Cite

Un approccio basato su DBpedia per la sistematizzazione della conoscenza sul Web F Cairo – 2013 – porto.polito.it Page 1. POLITECNICO DI TORINO SCUOLA DI DOTTORATO Dottorato in Beni Culturali – XXV Ciclo Tesi di Dottorato Un approccio basato su DBpedia per la sistematizzazione della conoscenza sul Web Federico Cairo Tutore Coordinatore del corso di dottorato prof. … Related articles All 3 versions Cite