Scene Understanding & Natural Language 2015


See also:

Scene Understanding & Natural Language 2013 | Scene Understanding & Natural Language 2014


VQA: Visual question answering S Antol, A Agrawal, J Lu, M Mitchell… – Proceedings of the …, 2015 – cv-foundation.org … In particular, research in image and video captioning that combines Computer Vision (CV), Natural Language Processing (NLP), and Knowledge Representation & Reasoning (KR) has dramatically increased in the past year [13, 7, 9, 32, 21, 19, 45]. … Cited by 148 Related articles All 18 versions

Ask your neurons: A neural-based approach to answering questions about images M Malinowski, M Rohrbach, M Fritz – Proceedings of the IEEE …, 2015 – cv-foundation.org … 1. Introduction With the advances of natural language processing and image understanding, more complex and demanding tasks have become within reach. … This task unites inference of question intends and visual scene understanding with a word sequence prediction task. … Cited by 69 Related articles All 13 versions

CIDEr: Consensus-based image description evaluation R Vedantam, C Lawrence Zitnick… – Proceedings of the IEEE …, 2015 – cv-foundation.org … 1. Introduction Recent advances in object recognition [15], attribute classification [23], action classification [26, 9] and crowdsourcing [40] have increased the interest in solving higher level scene understanding problems. One … Cited by 95 Related articles All 14 versions

Learning models for following natural language directions in unknown environments S Hemachandra, F Duvallet, TM Howard… – … on Robotics and …, 2015 – ieeexplore.ieee.org … The robot observes transitions between environment regions and the semantic label of its current region. As scene understanding is not the focus of this work, we use AprilTag fiducials [19] placed in each region that denotes its label. … Cited by 13 Related articles All 6 versions

Show, attend and tell: Neural image caption generation with visual attention K Xu, J Ba, R Kiros, K Cho, A Courville… – arXiv preprint arXiv: …, 2015 – jmlr.org … Flickr9k, Flickr30k and MS COCO. 1. Introduction Automatically generating captions for an image is a task close to the heart of scene understanding — one of the primary goals of computer vision. Not only must caption gen … Cited by 372 Related articles All 14 versions

ShapeNet: An information-rich 3D model repository AX Chang, T Funkhouser, L Guibas… – arXiv preprint arXiv: …, 2015 – arxiv.org … Scene understanding from 2D images is a grand challenge in vision that has recently benefited tremendously from 3D CAD models [28 … driven methods from the machine learning community have been exploited by researchers in vision and NLP (natural language processing). … Cited by 28 Related articles All 8 versions

Robobarista: Object part based transfer of manipulation trajectories from crowd-sourcing in 3D pointclouds J Sung, SH Jin, A Saxena – arXiv preprint arXiv:1504.03071, 2015 – arxiv.org … Thus, rather than relying on scene understanding techniques [7, 33, 17], we directly use 3D point-cloud for manipulation planning using machine learning algorithms … Deep learning has made impact in related application areas (e.g., vision [29, 5], natural language processing [47 … Cited by 15 Related articles All 13 versions

Text to 3D scene generation with rich lexical grounding A Chang, W Monroe, M Savva, C Potts… – arXiv preprint arXiv: …, 2015 – arxiv.org Text to 3D Scene Generation with Rich Lexical Grounding Angel Chang, Will Monroe, Manolis Savva, Christopher Potts and Christopher D. Manning Stanford University, Stanford, CA 94305 … Cited by 7 Related articles All 14 versions

Learning to interpret and describe abstract scenes LGM Ortiz, C Wolff, M Lapata – Proceedings of the 2015 Conference of …, 2015 – aclweb.org … We approach this problem within the methodology of Zitnick and Parikh (2013), who proposed the use of abstract scenes generated from clip art to model scene understanding (see Figure 1). The use of abstract scenes offers several advantages over real images. … Cited by 10 Related articles All 6 versions

Knowlywood: Mining activity knowledge from hollywood narratives N Tandon, G de Melo, A De, G Weikum – … of the 24th ACM International on …, 2015 – dl.acm.org … is usually followed by kissing, is a valuable asset for tasks like natural language dialog, scene understanding, or video … Categories and Subject Descriptors I.2.7 [Natural Language Processing]: Text Analysis Keywords Activity Knowledge; Commonsense Knowledge Acquisition … Cited by 11 Related articles All 7 versions

Neural Self Talk: Image Understanding via Continuous Questioning and Answering Y Yang, Y Li, C Fermuller, Y Aloimonos – arXiv preprint arXiv:1512.03460, 2015 – arxiv.org … The benefits of modeling scene understanding task as a revealing of the “self talk” of the intelligent agents are mainly twofold: 1) the … In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 819–826. … Cited by 5 Related articles All 4 versions

Image question answering using convolutional neural network with dynamic parameter prediction H Noh, PH Seo, B Han – arXiv preprint arXiv:1511.05756, 2015 – arxiv.org … Image question answering (ImageQA) [1, 17, 23] aims to solve the holistic scene understanding problem by proposing a task … Malinowski and Fritz [17] propose a Bayesian framework, which exploits recent advances in computer vision and natural language processing. … Cited by 18 Related articles All 3 versions

Grasp type revisited: A modern perspective on a classical feature for vision Y Yang, C Fermüller, Y Li… – 2015 IEEE Conference …, 2015 – ieeexplore.ieee.org … We hope our contributions can help advance the field of static scene understanding and human action fine level analysis, and we … Moreover, we believe that progress in natural language processing, such as mining the relationship between grasp type and actions, can advance … Cited by 11 Related articles All 8 versions

Large-scale deep learning on the YFCC100M dataset K Ni, R Pearce, K Boakye, B Van Essen, D Borth… – arXiv preprint arXiv: …, 2015 – arxiv.org … downstream capabilities such as scene or object classification, additional unsupervised learning (i.e. via topic modeling [11] or natural language processing algorithms [12]). … On the first thrust, we aim for improved high-level summarization and scene understanding. … Cited by 5 Related articles All 5 versions

HC-Search for structured prediction in computer vision M Lam, J Rao Doppa, S Todorovic… – Proceedings of the …, 2015 – cv-foundation.org … evaluates candidate solutions. The recently-developed HC-Search method has been shown to achieve state-of-the-art results in natural language processing, but mixed success when applied to vision problems. This paper … Cited by 8 Related articles All 13 versions

A Distributed Representation Based Query Expansion Approach for Image Captioning S Yagcioglu, E Erdem, A Erdem, R Cakıcı – Annual Meeting of the …, 2015 – aclweb.org … Automatic image captioning is a fast growing area of research which lies at the intersection of computer vision and natural language processing and refers to the problem of generating natural … The SUN Attribute Database: Beyond Categories for Deeper Scene Understanding. … Cited by 4 Related articles All 8 versions

Multimodal person discovery in broadcast tv at mediaeval 2015 J Poignant, H Bredin, C Barras – Proceedings of …, 2015 – pdfs.semanticscholar.org … 4. BASELINE AND METADATA This task targeted researchers from several communities including multimedia, computer vision, speech and natural language processing. … Scene understanding for identifying persons in TV shows: beyond face authentication. In CBMI, 2014. … Cited by 9 Related articles All 4 versions

VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases F Sadeghi, SK Divvala, A Farhadi – 2015 IEEE Conference on …, 2015 – ieeexplore.ieee.org VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases Fereshteh Sadeghi† Santosh K. Divvala‡,† Ali Farhadi†,‡ †University of Washington ‡The Allen Institute for … Cited by 25 Related articles All 11 versions

A Restricted Visual Turing Test for Deep Scene and Event Understanding H Qi, T Wu, MW Lee, SC Zhu – arXiv preprint arXiv:1512.01715, 2015 – arxiv.org … Integrating computer vision and natural language processing, as well as other modal knowledge, has been a hot topic in the recent development of deeper image and scene understanding. Visual Turing Test. Inspired by the generic Turing test principle in AI [36], Geman et al. … Cited by 1 Related articles All 3 versions

Sentiment Analysis Using Social Multimedia J Yuan, Q You, J Luo – Multimedia Data Mining and Analytics, 2015 – Springer … Therefore, research in sentiment analysis not only has an important impact on Natural Language Processing, but may also have a profound … such kind of data and researchers are trying to incorporate techniques such as attribute learning and scene understanding before going … Cited by 2 Related articles All 3 versions

Generating multi-sentence natural language descriptions of indoor scenes D Lin, S Fidler, C Kong… – British Machine Vision …, 2015 – pdfs.semanticscholar.org … to show one’s understanding. The task of automatically generating textual descriptions for images has received increasing attention from both the computer vision and natural language processing communities. This is an important … Cited by 1 Related articles All 5 versions

Yin and yang: Balancing and answering binary visual questions P Zhang, Y Goyal, D Summers-Stay, D Batra… – arXiv preprint arXiv: …, 2015 – arxiv.org Yin and Yang: Balancing and Answering Binary Visual Questions Peng Zhang† Yash Goyal† Douglas Summers-Stay‡ Dhruv Batra† Devi Parikh† †Virginia Tech ‡US Army Research Laboratory †{zhangp, ygoyal, dbatra, parikh}@vt.edu … Cited by 6 Related articles All 6 versions

Rich image description based on regions X Zhang, X Song, X Lv, S Jiang, Q Ye… – Proceedings of the 23rd …, 2015 – dl.acm.org … of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. … And Computer Vision]: Scene Analysis — Object recognition; I.2.10 [Artificial Intelligence]: Vision and Scene Understanding — perceptual rea … Cited by 1 Related articles All 3 versions

Unsupervised domain discovery using latent Dirichlet allocation for acoustic modelling in speech recognition M Doulaty, O Saz, T Hain – arXiv preprint arXiv:1509.02412, 2015 – arxiv.org … LDA is an statistical approach to discover latent topics in a collection of documents in an unsupervised manner [7]. It is mostly used in Natural Language Processing (NLP) for the … [8] S. Kim, S. Sundaram, P. Georgiou, and S. Narayanan, “Audio scene understanding using topic … Cited by 5 Related articles All 8 versions

Integrating mechanisms of visual guidance in naturalistic language production MI Coco, F Keller – Cognitive processing, 2015 – Springer … mechanisms that guide attention. Keywords. Eye movements Language production Scene understanding Cross-modal processing Eye–voice span Structural guidance. Electronic supplementary material. The online version of … Cited by 6 Related articles All 12 versions

Deep learning approaches to problems in speech recognition, computational chemistry, and natural language text processing GE Dahl – 2015 – tspace.library.utoronto.ca … A good model for visual scene understanding would leverage the constraints of the generative process to make sharp and accurate inferences. Similarly, in natural language processing, we can observe sequences of words with very little corruption, but the interactions between … Cited by 3 Related articles All 5 versions

A Deep Learning Model for Structured Outputs with High-order Interaction H Guo, X Zhu, MR Min – arXiv preprint arXiv:1504.08022, 2015 – arxiv.org … of interests in using real-valued, low-dimensional vector to represent a word or a sentence in the natural language processing (NLP … we plan to explore our strategy with a hinge loss for structured label classification with applications in image labeling and scene understanding. … Cited by 3 Related articles All 11 versions

Modelling User Affect and Sentiment in Intelligent User Interfaces: A Tutorial Overview BW Schuller – Proceedings of the 20th International Conference on …, 2015 – dl.acm.org … Methodologies and techniques, Modeling, Signal analysis, synthesis, and processing; I.2.10 Vision and scene understanding: Video anal … In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP-2008) (Edinburgh, UK, 2008), 466–474. … Cited by 1 Related articles All 2 versions

Utilizing Depth Sensors for Analyzing Multimodal Presentations: Hardware, Software and Toolkits CW Leong, L Chen, G Feng, CM Lee… – Proceedings of the 2015 …, 2015 – dl.acm.org … First, while there are depth sensors operating on similar principles that focus on scene understanding and reconstruction, such as the 3D structure sensor or Intel’s R200 3D sensor, we believe understanding human body movements to be key to valuable multimodal … Cited by 2 Related articles All 4 versions

Layered ontological image for intelligent interaction to extend user capabilities on multimedia systems in a folksonomy driven environment M Dal Mas – Intelligent Interactive Multimedia Systems and Services …, 2015 – Springer … To determine, which attributes are most relevant for describing scenes, we use the extensive Scene UNderstanding (SUN) database … Natural Language Processing (NLP) pre-processing to determine attributes not classified in the five groups of semantic attributes considered … Cited by 1 Related articles

Visual word2vec (vis-w2v): Learning visually grounded word embeddings using abstract scenes S Kottur, R Vedantam, JMF Moura, D Parikh – arXiv preprint arXiv: …, 2015 – arxiv.org … That is, AI is inherently multi-modal. Language modeling is an important problem in natural language processing (NLP). … Learning from Visual Abstraction: There is a lot of recent literature on learning from visual abstractions for a variety of high-level scene understanding tasks. … Cited by 3 Related articles All 3 versions

Language & Common Sense JK Hartshorne, JB Tenenbaum – mindmodeling.org … This workshop brings together researchers from across the cognitive sciences – including developmental and cognitive psychology, linguistics, natural language processing, artificial intelligence, and robotics – to … Simulation as an engine of physical scene understanding. … Related articles All 2 versions

Knowledge acquisition for language description from scene understanding P Jain, P Pawar, G Koriya, A Lele… – … and Control (IC4), …, 2015 – ieeexplore.ieee.org … for computer vision applications have attempted several works with the use of language to alleviate image scene understanding. … Natural Language Generation (NLG) is the natural language processing task of generating natural language from a machine representation system … Related articles

Neural-Symbolic Learning and Reasoning (Dagstuhl Seminar 14381) A d’Avila Garcez, M Gori, P Hitzler, LC Lamb – Dagstuhl Reports, 2015 – drops.dagstuhl.de … in machine learning, knowledge representation and reasoning, computer vision and image understanding, natural language processing, and cognitive … I.2.4 Knowledge Representation Formalisms and Methods, I.2.6 Learning, I.2.10 Vision and Scene Understanding, I.2.11 … Related articles All 7 versions

Feedback of robot states for object detection in natural language controlled robotic systems J Bao, Y Jia, Y Cheng, H Tang… – 2015 IEEE International …, 2015 – ieeexplore.ieee.org … Natural Language Processing: Our NLP module [14] is in charge of extracting semantic information from human language inputs. It is mainly composed of an intention recognizer and a semantic processor. … Enhanced visual scene understanding through human-robot dialog. … Related articles

A Semantic Approach to Enhance HITS Algorithm for Extracting Associated Concepts using ConceptNet. M Alsoos, A Kheirbek – Journal of Digital Information Management, 2015 – dline.info … Many researches use common sense knowledge bases in wide different fields like scene understanding [6], search engines [7], natural language processing [8] and ontology matching [9]. Other researches use common sense knowledge base for just a detailed task like finding … Related articles All 2 versions

Topic Network: Topic Model with Deep Learning for Image Classification Z Pan, Y Liu, G Liu, M Guo, Y Li – International Conference on Knowledge …, 2015 – Springer … Similar to natural language processing, topic model also performs well in image processing by regarding local features (e.g. SIFT [9]) as visual words … Li, L.-J., Socher, R., Fei-Fei, L.: Towards total scene understanding: classification, annotation and segmentation in an automatic … Related articles

Automatic Tamil lyric generation based on image sequence and derived tune R Sridhar, N Dharmaraj, J Damodaran… – … (IACC), 2015 IEEE …, 2015 – ieeexplore.ieee.org … Index terms: Natural Language Processing, Lyric Generation, Image Identification, Raga synthesis … RGB combined with texture based scene understanding can result in better identification of the shooting location. Consequently, the lyrics obtained will be still more relevant. … Related articles

Outdoor scene labeling using deep convolutional neural networks W Jun, Z Chaolliang, L Shirong… – … Conference (CCC), 2015 …, 2015 – ieeexplore.ieee.org … [3] Haibing Zhang, Shirong Liu and Chaoliang Zhong, Outdoor Scene Understanding Using SEVI-BOVW Model, Proc. IEEE. … [9] R. Collobert and J. Weston, A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning, Proc. … Cited by 1 Related articles

Reformulation Strategies of Repeated References in the Context of Robot Perception Errors in Situated Dialogue N Schutte, J Kelleher, B Mac Namee – 2015 – arrow.dit.ie … of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pages 292–301. … Combining top-down spatial reasoning and bottom-up object class recognition for scene understanding. … Related articles All 2 versions

SAGE: Semantic Annotation of Georeferenced Environments P Moghadam, B Evans, E Duff – Journal of Intelligent & Robotic Systems, 2015 – Springer … While automatic approaches show promising results for generic object detection and scene understanding in 3D point clouds, they rely on … With the support of Natural Language Processing (NLP) and cloud-based speech recognition applications, users’ speech is converted to … Related articles

Robot audition based Acoustic Event Identification using a Bayesian model considering spectral and temporal uncertainties K Nakamura, K Nakadai – … and Systems (IROS), 2015 IEEE/RSJ …, 2015 – ieeexplore.ieee.org … NPY process [17] has been previously developed for natural language processing, and the problem of this method in sound-word extraction is that the input symbolized streams essentially have spectral and temporal uncertainties, which is normally out of matters in natural … Related articles

Understanding object descriptions in robotics by open-vocabulary object retrieval and detection S Guadarrama, E Rodner, K Saenko… – … International Journal of …, 2015 – ijr.sagepub.com … A given description q is parsed for noun groups using the standard natural language processing tagger and parser provided by the nltk framework (http://nltk.org/). A noun group could be, for example, the brand name “Cap’n Crunch”. … Related articles All 3 versions

A novel knowledge management scheme for IRS AA Khodaskar, SA Ladhake – Computer Communication and …, 2015 – ieeexplore.ieee.org … User interface convert and presents the rules in the user understandable form and it design using natural language processing techniques. II. … Human inductive learning and reasoning is very important in high-level scene understanding and content extraction. … Related articles

Use of a Large Image Repository to Enhance Domain Dataset for Flyer Classification P Pourashraf, N Tomuro – International Symposium on Visual Computing, 2015 – Springer … In: Empirical Methods in Natural Language Processing (EMNLP) (2014). 8. Pourashraf, P., Tomuro, N., Apostolova, E.: Genre-based image classification … Processing (ICDIP) (2015). 9. Li, C., Parikh, D., Chen, T.: Automatic discovery of groups of objects for scene understanding. … Related articles

PERCOLATTE: A Multimodal Person Discovery System in TV Broadcast for the MediaEval 2015 Evaluation Campaign M Bendris, D Charlet, G Senay, MY Kim, B Favre… – 2015 – ceur-ws.org … Our PERCOLATOR system based on scene understanding features ranked first on the main task in 2014 [2]. The Mediaeval “Multi … In Proceedings of 5th International Joint Conference on Natural Language Processing, pages 1269–1278, Chiang Mai, Thailand, November 2011 … Related articles All 6 versions

Towards an automated intelligence product generation capability AM Smith, TW Hawes, JJ Nolan – SPIE Sensing …, 2015 – proceedings.spiedigitallibrary.org … Images pass through an image processing pipeline, which consists of image classification, text extraction, scene understanding, and object detection. … of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the … Related articles All 3 versions

Topic Modeling for Large-Scale Multimedia Analysis and Retrieval. J Hu, Y Fang, N Ling, L Song – 2015 – books.google.com … In the next paragraph, we are going to introduce LDA in the environment of natural language processing, where we are dealing with text … A traditional decay function such as exponential decay only depends on time, while for the scene-understanding problem, the probability … Related articles

Affordance and k-TR Augmented Alphabet based Neuro-Symbolic language—Af-kTRAANS—A Human-Robot Interaction meta-language KM Varadarajan, M Vincze – Methods and Models in …, 2015 – ieeexplore.ieee.org … Embodiment based scene understanding using RACER logical ontology base and proto-object definitions have been studied in [1]. The most … to’ or ‘Bring from’ or commands in other human languages: all of which are handled by the Natural Language Processing –NLP stage of … Related articles

Cross-Lingual Cross-Media Content Linking: Annotations and Joint Representations (Dagstuhl Seminar 15201) AG Hauptmann, J Hodson, J Li, N Sebe… – Dagstuhl …, 2015 – drops.dagstuhl.de … Seminar May 10–13, 2015 – http://www.dagstuhl.de/15201 1998 ACM Subject Classification I.2.7 Natural Language Processing, I.2.10 Vision and Scene Understanding, H.3.3 Information Search and Retrieval, I.2.4 Knowledge Representation Formalisms and Methods … Related articles All 3 versions

Calibrated Structured Prediction V Kuleshov, PS Liang – Advances in Neural Information Processing …, 2015 – papers.nips.cc … We perform a thorough study of which features yield good calibration, and find that domain-general features are quite good for calibrating MAP and marginal estimates over three tasks—object recognition, optical character recognition, and scene understanding. … Cited by 4 Related articles All 6 versions

Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures R Bernardi, R Cakici, D Elliott, A Erdem, E Erdem… – pdfs.semanticscholar.org … The task has recently attracted the attention of both computer vision and natural language processing communities, leading to a rapidly evolving … give an example, a large number of researchers have tackled the object detection aspect of visual scene understanding, where the … Cited by 1 Related articles All 2 versions

Technical Report: Image Captioning with Semantically Similar Images M Kolář, M Hradiš, P Zemčík – arXiv preprint arXiv:1506.03995, 2015 – arxiv.org … Image Captioning is a challenging problem which requires smart and careful combination of Computer Vision with Natural Language Processing. … to perform well on a given dataset, the goal in image captioning should be the creation of a model of scene understanding. … Cited by 2 Related articles All 3 versions

Learning deep structured models LC Chen, AG Schwing, AL Yuille, R Urtasun – Proc. ICML, 2015 – jmlr.org … 1998; Hinton & Salakhutdinov, 2006; Bengio et al., 2007; Salakhutdinov & Hinton, 2012; Zeiler & Fergus, 2014) and shown to be extremely successful in a wide variety of applications including computer vision, speech recognition as well as natural language processing (Lee et … Cited by 38 Related articles All 18 versions

A new method for traffic density estimation based on topic model R Kaviani, P Ahmadi… – 2015 Signal Processing …, 2015 – ieeexplore.ieee.org … For this purpose, we use topic model that was originally studied in the field of natural language processing for the purpose of finding word correlations in a set of textual documents to find latent topics. … [24] W. Fu, J. Wang, H. Lu, S. Ma, ‘Dynamic scene understanding by improved … Related articles All 2 versions

Hierarchical Bayesian models for unsupervised scene understanding DM Steinberg, O Pizarro, SB Williams – Computer Vision and Image …, 2015 – Elsevier … Vision. Edited By Concetto Spampinato, Benoit Huet and Bas Boom. Hierarchical Bayesian models for unsupervised scene understanding. … 3. Bayesian models for unsupervised scene understanding. In … Cited by 7 Related articles All 5 versions

How? Why? What? Where? When? Who? Grounding Ontology in the Actions of a Situated Social Agent S Lallee, PFMJ Verschure – Robotics, 2015 – mdpi.com … Keywords: knowledge representation; human robot interaction; communication; natural language processing; perception action loop; artificial cognitive architecture. … which is an elegant framework for modeling the knowledge generated by the scene-understanding processes. … Cited by 1 Related articles All 7 versions

An iteratively reweighting algorithm for dynamic video summarization P Dong, Y Xia, S Wang, L Zhuo, DD Feng – Multimedia Tools and …, 2015 – Springer … This algorithm was rooted from natural language processing. … Sparse representation was also explored for video summarization. Cong et al. [12] employed both the CENTRIST descriptor [67] for scene understanding and color moments to represent video frames. … Cited by 1 Related articles All 3 versions

Visual affect around the world: A large-scale multilingual visual sentiment ontology B Jou, T Chen, N Pappas, M Redi, M Topkara… – Proceedings of the 23rd …, 2015 – dl.acm.org … H.5.4 [Information Interfaces and Presentation]: Hypertext/Hypermedia; I.2.10 [Artificial Intelligence]: Vision and Scene Understanding … poses several challenges in lexical, structural and semantic ambiguities, which are well-known problems in natural language processing. … Cited by 14 Related articles All 6 versions

Knowledge Representation for Image Feature Extraction N Karna, I Suwardi, N Maulidevi – International Conference on Soft …, 2015 – Springer … artificial intelligence coded as I.2, then visual semantic is coded as I.2.10 (Vision and Scene Understanding) along with the … of categories of objects and the relations between them 2. Frame, to store knowledge acquisition from NLP (Natural Language Processing), focuses on … Related articles All 3 versions

LEWIS: Latent Embeddings for Word Images and their Semantics A Gordo, J Almazán, N Murray… – Proceedings of the IEEE …, 2015 – cv-foundation.org … Semantics play a very important role in scene understanding and for scene text, particularly in urban scenarios, they will allow one to … There has been a recent resurgence of interest in embedding text in semantic Euclidean spaces in the natural language processing community … Cited by 1 Related articles All 8 versions

Document Informatics for Scientific Learning and Accelerated Discovery V Govindaraju, I Nwogu, S Setlur – Big Data Analytics, 2015 – books.google.com … Reasoning with and understanding the output of the DIA and representing and indexing this knowledge such that it is amenable to more nuanced search will entail new research in Ontologies, Natural Language Processing, and Information Retrieval (IR). … Related articles All 3 versions

A unified spatio-temporal human body region tracking approach to action recognition N Al Harbi, Y Gotoh – Neurocomputing, 2015 – Elsevier There are numerous instances in which, in addition to the direct observation of a human body in motion, the characteristics of related objects can also contribu. Cited by 3 Related articles All 5 versions

FPGA-Accelerated Hadoop Cluster for Deep Learning Computations A Alhamali, N Salha, R Morcel… – … Conference on Data …, 2015 – ieeexplore.ieee.org … than conventional machine learning in many research areas such as speech recognition, image processing and natural language processing. … shown unprecedented and record-breaking results in objects classification and recognition and scene understanding, with intelligence … Related articles All 2 versions

Saliency-guided detection of unknown objects in RGB-D indoor scenes J Bao, Y Jia, Y Cheng, N Xi – Sensors, 2015 – mdpi.com This paper studies the problem of detecting unknown objects within indoor environments in an active and natural manner. The visual saliency scheme utilizing both color and depth cues is proposed to arouse the interests of the machine system for detecting unknown objects at salient … Cited by 1 Related articles All 12 versions

A Rotated Character Recognition Method Based on Geometry Correction S Guo, X Shi, L Bao, L Wang – Information Science and Security …, 2015 – ieeexplore.ieee.org … In the character recognition technology the image, researchers carried out research into the character recognition in the field of natural scene understanding, vehicle license plate recognition, digital image libraries, geographic information system and so on, and so many mature … Related articles All 2 versions

Learning Representation for Scene Understanding: Epitomes, CRFs, and CNNs LC Chen – 2015 – escholarship.org … UNIVERSITY OF CALIFORNIA Los Angeles Learning Representation for Scene Understanding: Epitomes, CRFs, and CNNs … 2015 ABSTRACT OF THE DISSERTATION Learning Representation for Scene Understanding: Epitomes, CRFs, and CNNs by … Related articles

Assistive Robots for Older Adults with Dementia: Challenges in the Design of Collaborative Human-Robot Interaction M Begum, R Huq, R Wang, A Mihailidis – academia.edu … HRI framework may require intelligent modifications to existing machine learning, speech recognition, and natural language processing algorithms. … perceptual awareness 2. For improved awareness in executive functions (e.g. planning, scene understanding, solving complex … Related articles

Going Deeper with Convolutional Neural Network for Intelligent Transportation T Chen – 2015 – wpi.edu … introduce the deep feature into scene understanding. We experiment each task for … Recurrent Neural Networks (RNN) have been used for many vision tasks for decades. Recently, RNN are explosive to be used in natural language processing (NLP), speech … Related articles All 2 versions

Probabilistic Word Selection via Topic Modeling Y Zhuang, H Gao, F Wu, S Tang… – IEEE Transactions on …, 2015 – ieeexplore.ieee.org … the probabilistic “bag-of-topics” in the topic modeling domain or the flat “bag-of-words” in the traditional natural language processing domain to … As the shorthands, we refer “SUM” and “sLDA” to the scene understanding model [10] and multi-class sLDA [30], respectively, in the … Cited by 1 Related articles All 3 versions

Small-variance nonparametric clustering on the hypersphere J Straub, T Campbell, JP How… – 2015 IEEE Conference …, 2015 – ieeexplore.ieee.org … These statistics contain valuable information that can be used for scene understanding, plane segmentation, or to regularize a 3D … other fields, including protein backbone configurations in computational biology, semantic word vectors in natural language processing, and ro … Cited by 3 Related articles All 7 versions

A methodology for extracting standing human bodies from single images A Tsitsoulis, NG Bourbakis – IEEE Transactions on Human- …, 2015 – ieeexplore.ieee.org … Bodies From Single Images Athanasios Tsitsoulis, Member, IEEE, and Nikolaos G. Bourbakis, Fellow, IEEE Abstract—Segmentation of human bodies in images is a challenging task that can facilitate numerous applications, like scene understanding and activity recognition. … Cited by 2 Related articles All 3 versions

Document and natural image applications of deep learning L Kang – 2015 – drum.lib.umd.edu … data. Text in natural images carries important semantic information. Localizing text aids scene understanding and is also relevant to a number of computer vision applications, such as internet image indexing, mobile vision, and low vision aids [5]. … Related articles All 2 versions

Inria @SiliconValley Activity Report 2011-2014 V Issarny, T Castro, H Kirchner, C Morin – 2015 – hal.inria.fr … The same digital models often require mathematical abstractions if high-level scene understanding and structural analysis are required; abstract models such as a city layout, however, require the addition of geometric details to render them more visually realistic. … All 4 versions

Visual chunking: A list prediction framework for region-based object detection N Rhinehart, J Zhou, M Hebert… – 2015 IEEE International …, 2015 – ieeexplore.ieee.org … We call these unions “chunks,” inspired by a well-known task in Natural Language Processing: “chunking,” which involves grouping many … G. Heitz, S. Gould, A. Saxena, and D. Koller, “Cascaded classification models: Combining models for holistic scene understanding,” in NIPS … Related articles All 9 versions

Command and control models of next generation unmanned aircraft systems WD Place, ME Nissen – 2015 – calhoun.nps.edu … NPS-IS-15-002 NAVAL POSTGRADUATE SCHOOL MONTEREY, CALIFORNIA COMMAND AND CONTROL MODELS OF NEXT GENERATION UNMANNED AIRCRAFT SYSTEMS by W. David Place & Dr. Mark E. Nissen October 2015 … Related articles

Recent Advances in Convolutional Neural Networks J Gu, Z Wang, J Kuen, L Ma, A Shahroudy… – arXiv preprint arXiv: …, 2015 – arxiv.org … Wang, Member, IEEE Abstract—In the last few years, deep learning has lead to very good performance on a variety of problems, such as object recognition, speech recognition and natural language processing. Among different … Cited by 5 Related articles All 2 versions

Analysis of Robustness in Lane Detection using Machine Learning Models WA Adams – 2015 – etd.ohiolink.edu … In this document, I will attempt to convey the radical way in which machine learning has changed the approach taken in scene understanding tasks through the … Everything from the natural language processing their search algorithm employs to the optimized performance … Related articles All 2 versions

Natural hand interaction for augmented reality. T Piumsomboon – 2015 – ir.canterbury.ac.nz … Natural Hand Interaction for Augmented Reality … A thesis submitted in partial fulfilment of the requirements for the Degree of Doctor of Philosophy in the University of Canterbury by Thammathip Piumsomboon … Related articles All 3 versions

A probabilistic theory of deep learning AB Patel, T Nguyen, RG Baraniuk – arXiv preprint arXiv:1504.00641, 2015 – arxiv.org … and volume of speech. Indeed, the main challenge in many sensory perception tasks in vision, speech, and natural language processing is a high amount of such nuisance variation. Nuisance variations complicate perception … Cited by 19 Related articles All 7 versions

ANNOR: Efficient image annotation based on combining local and global features E Kuric, M Bielikova – Computers & Graphics, 2015 – Elsevier … The Imagenet Large Scale Visual Recognition Challenge (ILSVRC) 1 is the venue for evaluating the current state-of-the-art for image classification and recognition. To extract the semantics from data, general object recognition and scene understanding is required. … Cited by 2 Related articles All 3 versions

Team activity recognition in Association Football using a Bag-of-Words-based method R Montoliu, R Martín-Félez, J Torres-Sospedra… – Human movement …, 2015 – Elsevier … A first implementation of BoW within the sport science field was performed in (Rodríguez-Pérez & Montoliu, 2013). The BoW model is a representation used in natural language processing, information retrieval and computer vision fields, among others. … Cited by 1 Related articles All 5 versions

Invariance For Perceptual Recognition Through Deep Learning YW Zou – 2015 – ai.stanford.edu … INVARIANCE FOR PERCEPTUAL RECOGNITION THROUGH DEEP LEARNING A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY … Related articles All 2 versions

Probabilistic event calculus for event recognition A Skarlatidis, G Paliouras, A Artikis… – ACM Transactions on …, 2015 – dl.acm.org … Descriptors: I.2.3 [Deduction and Theorem Proving]: Uncertainty, “Fuzzy,” and Probabilistic Reasoning; I.2.4 [Knowledge Representation Formalisms and Methods]: Temporal Logic; I.2.6 [Learning]: Parameter Learning; I.2.10 [Vision and Scene Understanding]: Video Analysis … Cited by 19 Related articles All 13 versions

Leveraging big data for grasp planning D Kappler, J Bohg, S Schaal – 2015 IEEE International …, 2015 – ieeexplore.ieee.org … the real world. Furthermore, only [21, 26, 31, 35] provide publicly available datasets. Data-driven methods have successfully been applied to complex problems in vision and natural language processing. Having big datasets … Cited by 21 Related articles All 3 versions

Recovering hard-to-find object instances by sampling context-based object proposals T Tuytelaars – arXiv preprint arXiv:1511.01954, 2015 – arxiv.org … According to the topic model formulation, a document di can cover multiple topics tk and the words w that appear in the document reflect the set of topics tk that it covers. From the perspective of statistical natural language processing, a topic tk can … Cited by 1 Related articles All 5 versions

Graph Learning on K Nearest Neighbours for Automatic Image Annotation F Su, L Xue – Proceedings of the 5th ACM on International …, 2015 – dl.acm.org … quality. Categories and Subject Descriptors H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing; I.2.10 [Artificial Intelligence]: Vision and Scene Understanding General Terms Algorithms, Experimentation … Cited by 1 Related articles

Deep Learning for Computer Vision: A comparison between Convolutional Neural Networks and Hierarchical Temporal Memories on object recognition tasks D Maltoni, II Session – 2015 – amslaurea.unibo.it … In recent years, Deep Learning techniques have shown to perform well on a large variety of problems both in Computer Vision and Natural Language Processing, reaching and … Sub-fields of computer vision include object recognition, scene understanding, video … Related articles

Modeling Cognition with Probabilistic Programs: Representations and Algorithms A Stuhlmüller – 2015 – Citeseer … 7.4.3 Visual scene understanding … 7-7 Inference results for a simple scene understanding model … List of Tables 3.1 Concept types: prototypes, nested prototypes, parts … Cited by 2 Related articles All 4 versions

Learning Visual Attributes from Image and Text S Maharjan, MSK Yamaguchi, NOTOK Inui – 2015 – anlp.jp … In ECCV, 2014. [11] Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. The Stanford CoreNLP natural language processing toolkit. … The sun attribute database: Beyond categories for deeper scene understanding. … Related articles

Learning to recognize egocentric activities using RGB-D data S Wan – 2015 – repositories.lib.utexas.edu … tent Semantic Analysis (pLSA) [14] and Latent Dirichlet Allocation (LDA) [15], which originate from statistical natural language processing, has motivated researchers to apply them to visual recognition tasks. … natural language processing [38], and speech recognition [39, 40]. … Related articles

Similarity Reasoning Over Semantic Context–Graphs A BOTEANU – 2015 – m.wpi.edu … SIMILARITY REASONING OVER SEMANTIC CONTEXT–GRAPHS by ADRIAN BOTEANU A Dissertation Submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUTE in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in … Related articles All 3 versions

Robust and efficient models for action recognition and localization D Oneata – 2015 – tel.archives-ouvertes.fr … That is an idealized computer vision system as portrayed in the Terminator 2 movie, see Figure 1.1. The Terminator view is able to analyze data in real time and accurately detect objects and humans, perform face recognition and scene understanding. … Cited by 1 Related articles All 2 versions

Adjoining Chaos Q Wang – … Methods in Computational Science, Engineering, and …, 2015 – drops.dagstuhl.de … 14371–Adjoint Methods in Computational Science, Engineering, and Finance 3.31 Adjoining Chaos Qiqi Wang (MIT–Cambridge, US) License Creative Commons BY 3.0 Unported license © Qiqi Wang Joint work … Related articles All 8 versions

[BOOK] Robots that talk and listen: technology and social impact J Markowitz – 2015 – books.google.com … 185 D Speech interpretation|187 E Manipulation routines|189 III Demonstration of abilities | 191 A Scene understanding|192 B … For these digital natives, learning systems equipped with networking functionality and real-time, natural language processing can build a greater … Cited by 3 Related articles All 2 versions

Localization of Humans in Images Using Convolutional Networks JJR Tompson – 2015 – Citeseer … ture. ConvNets have been used successfully to solve many difficult machine learning problems: image classification [83, 82, 54, 92], scene understanding [29], video analysis [51] and natural language processing [91, 17]. Likewise, they have recently out- … Related articles All 3 versions

[BOOK] Computer Vision-ECCV 2014 Workshops: Zurich, Switzerland, September 6-7 and 12, 2014, Proceedings L Agapito, MM Bronstein, C Rother – 2015 – books.google.com … Devi Parikh Virginia Tech, USA W14 – Light Fields for Computer Vision Jingyi Yu University of Delaware, USA Bastian Goldluecke Heidelberg University, Germany Rick Szeliski Microsoft Research, USA W15 – Computer Vision for Road Scene Understanding and Autonomous … Related articles All 2 versions

Learning structured prediction models in computer vision F Liu – 2015 – digital.library.adelaide.edu.au … Learning Structured Prediction Models in Computer Vision by Fayao Liu A thesis submitted in fulfillment for the degree of Doctor of Philosophy in the … Related articles

Supervised Topic Classification for Modeling a Hierarchical Conference Structure E Gaussier, V Strijov – … 2015, Istanbul, Turkey, November 9-12, …, 2015 – books.google.com … 4, 16–20 (2013) 8. Ramage, D., Hall, D., Nallapati, R., Manning, CD: Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1, pp. 248–256. … Related articles

Visualization-based active learning for the annotation of SAR images M Babaee, S Tsoukalas, G Rigoll… – IEEE Journal of Selected …, 2015 – ieeexplore.ieee.org … IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 8, NO. 10, OCTOBER 2015 Visualization-Based Active Learning for the Annotation of SAR Images … Cited by 6 Related articles All 4 versions