Visual Dialog 2017


Notes:

  • Goal-oriented visual dialog
  • Image-to-text
  • Visual chatbot
  • Visual dialog agent
  • Visual dialog model
  • Visual dialog system
  • Visual question answering

Resources:

  • visualdialog.org .. agents that can hold dialogs with humans about visual content
  • visualqa.org .. dataset containing open-ended questions about images

References:

See also:

Natural Language Image Recognition | Scene Understanding & Natural Language 2016 | Text-to-Image & Natural Language 2017 | Text-to-Image Systems | Text-to-Scene 2017 | Visual Question Answering


Visual dialog
A Das, S Kottur, K Gupta, A Singh… – Proceedings of the …, 2017 – openaccess.thecvf.com
… in Visual Dialog is the following – given an image I, a history of a dialog consisting of a sequence of question-answer pairs (Q1: ‘How many people are in wheelchairs?’, A1: ‘Two’, Q2: ‘What are their genders?’, A2: ‘One male and one female’), and a natural language follow-up …
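
The task formulation quoted above (an image, a dialog history of question-answer pairs, and a follow-up question) maps naturally onto a small data structure. The sketch below is only illustrative; the class and field names are assumptions, not the schema of the VisDial dataset.

    # Minimal sketch of a Visual Dialog instance (illustrative names, not the VisDial schema)
    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class VisualDialogInstance:
        image_id: str                  # reference to the image I
        caption: str                   # caption grounding the dialog
        history: List[Tuple[str, str]] = field(default_factory=list)  # (Q1, A1), ..., (Qt-1, At-1)
        question: str = ""             # natural language follow-up question Qt

        def context(self) -> str:
            """Flatten caption and dialog history into a single conditioning string."""
            turns = " ".join(f"Q: {q} A: {a}" for q, a in self.history)
            return f"{self.caption} {turns} Q: {self.question}".strip()

A model would then answer the follow-up question conditioned on context() plus the image features.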

Learning cooperative visual dialog agents with deep reinforcement learning
A Das, S Kottur, JMF Moura, S Lee… – arXiv preprint arXiv …, 2017 – openaccess.thecvf.com
… First, discrete symbols and natural language are interpretable … After the two bots are trained, we can pair a human questioner with A-BOT to accomplish the goals of visual dialog (aiding visually/situationally impaired users), and pair a human answerer with Q-BOT to play a …
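
The excerpt describes two cooperatively trained agents: a questioner (Q-BOT) that cannot see the image and an answerer (A-BOT) that can. A rough sketch of one game episode, assuming hypothetical qbot/abot policy objects with ask/answer/guess methods and a simple squared-error reward rather than the paper's exact formulation:

    # Rough sketch of one cooperative image-guessing episode (hypothetical interfaces).
    # qbot and abot stand in for trained policies; the reward is a generic similarity score.
    def play_episode(qbot, abot, image_feat, image_pixels, caption, rounds=10):
        history = [caption]                                      # Q-BOT sees only the dialog so far
        for _ in range(rounds):
            question = qbot.ask(history)
            answer = abot.answer(image_pixels, question, history)  # A-BOT also sees the image
            history += [f"Q: {question}", f"A: {answer}"]
        guess = qbot.guess(history)                              # Q-BOT regresses an image feature vector
        reward = -sum((g - f) ** 2 for g, f in zip(guess, image_feat))
        return history, reward                                   # shared reward would drive e.g. REINFORCE updates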

Natural Language Does Not Emerge ‘Naturally’ in Multi-Agent Dialog
S Kottur, JMF Moura, S Lee, D Batra – arXiv preprint arXiv:1706.08502, 2017 – arxiv.org
Natural Language Does Not Emerge ‘Naturally’ in Multi-Agent Dialog … In essence, we find that natural language does not emerge ‘naturally’, despite the semblance of ease of natural-language-emergence that one may gather from recent literature …

Human attention in visual question answering: Do humans and deep networks look at the same regions?
A Das, H Agrawal, L Zitnick, D Parikh, D Batra – Computer Vision and …, 2017 – Elsevier
… Das, Kottur, Moura, Lee, Batra, 2017: A. Das, S. Kottur, J.M. Moura, S. Lee, D. Batra. Learning cooperative visual dialog agents with … EMNLP 2016 Workshop on Natural Language Processing for Social Media (2016). Rensink, 2000: R.A. Rensink. The dynamic representation of scenes …

Inferring and executing programs for visual reasoning
J Johnson, B Hariharan… – arXiv preprint arXiv …, 2017 – openaccess.thecvf.com
… The result is an interpretable mapping of free-form natural language to programs, and a ~9 point improvement in accuracy over the best competing models. 2. Related Work … Semantic parsers attempt to map natural language sentences to logical forms …

Image-grounded conversations: Multimodal context for natural question and response generation
N Mostafazadeh, C Brockett, B Dolan, M Galley… – arXiv preprint arXiv …, 2017 – arxiv.org
… future. 8 Conclusions: We have introduced a new task of multimodal image-grounded conversation, in which, when given an image and a natural language text, the system must generate meaningful conversation turns. To support …

End-to-end optimization of goal-driven and visually grounded dialogue systems
F Strub, H De Vries, J Mary, B Piot, A Courville… – arXiv preprint arXiv …, 2017 – arxiv.org
… Although visually-grounded language models have been studied for a long time [Roy, 2002], important breakthroughs in both visual and natural language understanding have led to a renewed interest in the field … While Visual Dialog considers the chit-chat setting, the GuessWhat …

Modulating early visual processing by language
H De Vries, F Strub, J Mary, H Larochelle… – Advances in Neural …, 2017 – papers.nips.cc
… For instance, signal modulation through batch norm parameters may also be beneficial for reinforcement learning, natural language processing or adversarial training tasks. Acknowledgements … Visual Dialog. In Proc. of CVPR, 2017 …

C-VQA: A compositional split of the visual question answering (VQA) v1.0 dataset
A Agrawal, A Kembhavi, D Batra, D Parikh – arXiv preprint arXiv …, 2017 – arxiv.org
… This model is different from other VQA models in that it uses multiple hops of attention over the image. Given an image and the natural language question, SAN uses the question to obtain an attention map over the image … Visual Dialog …

Deal or no deal? end-to-end learning for negotiation dialogues
M Lewis, D Yarats, YN Dauphin, D Parikh… – arXiv preprint arXiv …, 2017 – arxiv.org
… We gather a large dataset of human-human negotiations on a multi-issue bargaining task, where agents who cannot observe each other’s reward functions must reach an agreement (or a deal) via natural language dialogue …

Emergence of language with multi-agent games: learning to communicate with sequences of symbols
S Havrylov, I Titov – Advances in Neural Information Processing …, 2017 – papers.nips.cc
… Abhishek Das, Satwik Kottur, José MF Moura, Stefan Lee, and Dhruv Batra. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning … In Proceedings of the 2010 conference on empirical methods in natural language processing, pages 410–419 …

Context-aware captions from context-agnostic supervision
R Vedantam, S Bengio, K Murphy… – … Vision and Pattern …, 2017 – openaccess.thecvf.com
… inference (Sec. 3.3). More details are provided in Sec. 3.1. Beyond Image Captioning: Image captioning, the task of generating natural language description for an image, has seen quick progress [10, 11, 36, 40]. Recently, research …

Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
D Teney, P Anderson, X He, A Hengel – arXiv preprint arXiv:1708.02711, 2017 – arxiv.org
… [6]. Even though the task straddles the fields of computer vision and natural language processing, it has primarily … 1). The increasing interest in VQA parallels a similar trend for other tasks involving vision and language, such as image captioning [12, 32] and visual dialog [10] …

Visual reference resolution using attention memory for visual dialog
PH Seo, A Lehrmann, B Han, L Sigal – Advances in neural …, 2017 – papers.nips.cc
… Visual dialog is the task of building an agent capable of answering a sequence of questions presented in the form of a dialog. Formally, we need to predict an answer y_t ∈ Y, where Y is a set of discrete answers or a set of natural language phrases/sentences, at time t given …
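
In the discriminative setting sketched in this excerpt, the model scores each candidate answer given the image, the dialog history, and the current question, then returns the highest-scoring one. A minimal sketch, assuming a hypothetical scorer callable standing in for the paper's attention-memory model:

    # Minimal sketch of discriminative answer selection: y_t = argmax_y score(I, H, q_t, y).
    # `scorer` is a stand-in for a trained model; the lambda in the usage example is a toy.
    from typing import Callable, Sequence

    def predict_answer(image, history, question, candidates: Sequence[str],
                       scorer: Callable[..., float]) -> str:
        return max(candidates, key=lambda y: scorer(image, history, question, y))

    # toy usage: score by word overlap between question and candidate answer
    toy = lambda image, history, question, y: float(len(set(question.split()) & set(y.split())))
    print(predict_answer("img.jpg", [], "what are their genders", ["one male and one female", "two"], toy))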

Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
J Lu, A Kannan, J Yang, D Parikh… – Advances in Neural …, 2017 – papers.nips.cc
… with humans or other agents in natural language. Over the last few years, neural sequence models (e.g. [47, 44, 46]) have emerged as the dominant paradigm across a variety of settings and datasets – from text-only dialog [44, 40, 23, 3] to more recently, visual dialog [7, 9, 8, 33 …

TGIF-QA: Toward spatio-temporal reasoning in visual question answering
Y Jang, Y Song, Y Yu, Y Kim… – IEEE Conference on …, 2017 – openaccess.thecvf.com
… understands visual content at region-level details and finds their associations with pairs of questions and answers in the natural language form [2 … Also, appearing in the same proceedings are CLEVR [16], VQA2.0 [12], and Visual Dialog [8], which all address image-based VQA …

ParlAI: A Dialog Research Software Platform
AH Miller, W Feng, A Fisch, J Lu, D Batra… – arXiv preprint arXiv …, 2017 – arxiv.org
… Visual Dialog: dialog is often grounded in physical objects in the world, so we also include visual dialog tasks, with images as well as text … CommAI is in a RL setting, and contains only synthetic datasets, rather than real natural language datasets as we do here …

Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
P Anderson, Q Wu, D Teney, J Bruce… – arXiv preprint arXiv …, 2017 – arxiv.org
… Within the natural language processing (NLP) community, most existing approaches abstract away the problem of visual perception to a significant … of new benchmark datasets for image captioning [12], visual question answering (VQA) [4, 15] and visual dialog [14] has …

Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
B Peng, X Li, L Li, J Gao… – arXiv preprint arXiv …, 2017 – pdfs.semanticscholar.org
… dialogue system is a key example of such personal assistants that can help people to accomplish certain tasks via natural language exchange … agent (Zhao and Eskénazi, 2016; Li et al., 2017a), and open domain dialogue generation (Li et al., 2016a), Visual Dialogue (Das et …

Teaching machines to describe images via natural language feedback
H Ling, S Fidler – arXiv preprint arXiv:1706.00130, 2017 – papers.nips.cc
… We stress that our work differs from the recent efforts in conversation modeling [19] or visual dialog [4] using Reinforcement … Our framework consists of a new phrase-based captioning model trained with Policy Gradients that incorporates natural language feedback provided by a …

Evaluating visual conversational agents via cooperative human-ai games
P Chattopadhyay, D Yadav, V Prabhu… – arXiv preprint arXiv …, 2017 – arxiv.org
… trained to understand and communicate about the contents of a scene in natural language. For example, in Fig … Specifically, we evaluate two versions of ALICE for GuessWhich: 1. ALICESL which is trained in a supervised manner on the Visual Dialog dataset (Das et al …

Mixed-initiative personal assistants
JW Buck, S Perugini – Proceedings of the 2017 ACM SIGCSE Technical …, 2017 – dl.acm.org
… as Siri, Google Now, Cortana, and Alexa, is driving and expanding progress toward the long-standing, albeit challenging, goal of applying artificial intelligence to build human-computer dialog systems capable of understanding natural language [8]. There … Visual Dialog Client …

Emergent translation in multi-agent communication
J Lee, K Cho, J Weston, D Kiela – arXiv preprint arXiv:1710.06922, 2017 – arxiv.org
… Remarkable successes have been achieved in natural language processing (NLP) via the use of supervised learning approaches on large-scale datasets (Bahdanau et al., 2015; Wu et al., 2016 … Learning cooperative visual dialog agents with deep reinforcement learning …

Obtaining referential word meanings from visual and distributional information: Experiments on object naming
S Zarrieß, D Schlangen – … of 55th annual meeting of the …, 2017 – pub.uni-bielefeld.de
Obtaining referential word meanings from visual and distributional information: Experiments on object naming. Sina Zarrieß and David Schlangen, Dialogue Systems Group // CITEC // Faculty of Linguistics and Literary …

Learning to Disambiguate by Asking Discriminative Questions
Y Li, C Huang, X Tang, CC Loy – arXiv preprint arXiv …, 2017 – openaccess.thecvf.com
… 1. Introduction: Imagine a natural language dialog between a computer and a human (see Fig. 1): Kid: “What sport is the man playing?” … To overcome the challenges, we utilize the Long Short-Term Memory (LSTM) [13] network to generate natural language questions …

Question Part Relevance and Editing for Cooperative and Context-Aware VQA (C2VQA)
AS Toor, H Wechsler, M Nappi – … of the 15th International Workshop on …, 2017 – dl.acm.org
… Separately, advances in natural language processing (NLP) prediction-based techniques (such as word2vec [7] and GloVe [8]) have vastly increased the … are often used as feature inputs for multi-modal tasks, like image captioning [16], VQA [1], and Visual Dialogue [2]. Image …

iVQA: Inverse visual question answering
F Liu, T Xiang, TM Hospedales, W Yang… – arXiv preprint arXiv …, 2017 – arxiv.org
… 2015), visual question answering (Agrawal et al. 2016), natural language object retrieval (Kazemzadeh et al. 2014) and ‘visual Turing tests’ (Geman et al … 2016). It further requires natural language generation capabilities to synthesise open-ended linguistic descriptions …

Deep Learning for Image-to-Text Generation: A Technical Overview
X He, L Deng – IEEE Signal Processing Magazine, 2017 – ieeexplore.ieee.org
… A technical overview. Generating a natural language description from an image is an emerging interdisciplinary problem at the intersection of computer vision, natural language processing, and artificial intelligence (AI) … Natural Language Processing, 2015, Beijing, China, pp …

Multimodal Dialogs (MMD): A large-scale dataset for studying multimodal domain-aware conversations
A Saha, M Khapra, K Sankaranarayanan – arXiv preprint arXiv:1704.00200, 2017 – arxiv.org
… 2, we use a standard recurrent neural network based decoder with GRU cells. Such a decoder has been used successfully for various natural language generation tasks including text conversation systems (Serban et al., 2016b) …
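
The excerpt mentions a standard GRU-based decoder for response generation. A generic PyTorch sketch of such a decoder follows; the vocabulary size, dimensions, and conditioning scheme are illustrative assumptions, not the paper's configuration.

    # Generic GRU response decoder (illustrative sizes; conditioning via an initial hidden state)
    import torch
    import torch.nn as nn

    class GRUDecoder(nn.Module):
        def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
            self.proj = nn.Linear(hidden_dim, vocab_size)

        def forward(self, tokens, context_state):
            # tokens: (batch, seq) ids of previously generated words
            # context_state: (1, batch, hidden_dim) encoding of the multimodal context
            states, _ = self.gru(self.embed(tokens), context_state)
            return self.proj(states)   # per-step vocabulary logits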

Embodied question answering
A Das, S Datta, G Gkioxari, S Lee, D Parikh… – arXiv preprint arXiv …, 2017 – arxiv.org
… Visual Dialog presents a novel problem configuration – single-shot QA about videos captured by goal-driven active agents … Like EmbodiedQA, image and video question answering tasks [11–15] require reasoning about natural language questions posed about visual content …

Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning
Q Sun, S Lee, D Batra – arXiv preprint arXiv:1705.08759, 2017 – openaccess.thecvf.com
… The application of these models has led to significantly improved performance on a variety of tasks – speech recognition [1,2], machine translation [3–5], conversation modeling [6], image captioning [7–11], visual question answering (VQA) [12–16], and visual dialog [17, 18] …

Examining Cooperation in Visual Dialog Models
M Mironenco, D Kianfar, K Tran, E Kanoulas… – arXiv preprint arXiv …, 2017 – arxiv.org
… Grounding natural language is a difficult problem as models which combine modalities must account for the individual impact of each information source … and has been motivated as a natural training paradigm for dialogue models, and has been applied to visual dialog models in …

Teaching Machines to Describe Images with Natural Language Feedback
S Fidler – Advances in Neural Information Processing Systems, 2017 – papers.nips.cc
… We stress that our work differs from the recent efforts in conversation modeling [19] or visual dialog [4] using Reinforcement … Our framework consists of a new phrase-based captioning model trained with Policy Gradients that incorporates natural language feedback provided by a …

Interactive Reinforcement Learning for Object Grounding via Self-Talking
Y Zhu, S Zhang, D Metaxas – arXiv preprint arXiv:1712.00576, 2017 – arxiv.org
… Visual dialog. CVPR, 2017. [3] A Das, S Kottur, J Moura, S Lee, and D Batra. Learning cooperative visual dialog agents with deep rl. ICCV, 2017 … [7] S Kottur, J. Moura, S Lee, and D Batra. Natural language does not emerge naturally in multi-agent dialog. EMNLP, 2017 …

CS 224N: TensorFlow Tutorial
N Khandwala, B Oshri – 2017 – stanford.edu
… Acknowledgments: Jon Gauthier, Natural Language Processing Group, Symbolic Systems; Bharath Ramsundar, PhD Student, Drug Discovery Research; Chip Huyen, Undergraduate, teaching CS20SI: TensorFlow for Deep Learning Research! … Visual …

CoDraw: Visual Dialog for Collaborative Drawing
JH Kim, D Parikh, D Batra, BT Zhang, Y Tian – arXiv preprint arXiv …, 2017 – arxiv.org
… CoDraw: Visual Dialog for Collaborative Drawing … clip arts. The two players communicate via two-way communication using natural language. We collect the …

Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning
Q Wu, P Wang, C Shen, I Reid, A Hengel – arXiv preprint arXiv:1711.07613, 2017 – arxiv.org
… However, in contrast to these classical vision-and-language tasks that only involve at most a single natural language interaction, visual dialog requires the machine to hold a meaningful dialogue in natural language about visual content. Mostafazadeh et al …

Neural Networks As A Tool For Big Data: The State Of The Art
C Tarjano, V Pereira – networkscience.com.br
… 2016. “A Primer on Neural Network Models for Natural Language Processing.” J. Artif. Intell … 2017. “Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model.” In Advances in Neural Information Processing Systems, 313–23 …

Listen, Interact and Talk: Learning to Speak via Interaction
H Zhang, H Yu, W Xu – arXiv preprint arXiv:1705.09906, 2017 – arxiv.org
… Abstract: One of the long-term goals of artificial intelligence is to build an agent that can communicate intelligently with humans in natural language … (a) During training, the teacher interacts in natural language with the learner about objects …

Criteria for Human-Compatible AI in Two-Player Vision-Language Tasks
C Han, SW Lee, Y Heo, W Kang, J Jun, BT Zhang – bi.snu.ac.kr
… 1 Introduction: Recent advances in computer vision and natural language processing have led researchers’ attention to the intersection of these two areas, vision-language tasks … ReferIt game [Kazemzadeh et al., 2014] is an example of visual dialogue …

Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
B Zhuang, Q Wu, C Shen, I Reid, A Hengel – arXiv preprint arXiv …, 2017 – arxiv.org
… Most recently, a new vision-and-language task, Visual Dialog [4–6], demands that an agent participate intelligently in a dialog about an image … In this section, we describe our unified model that takes as input an image and a set of natural language expressions and outputs a …

Asking the Difficult Questions: Goal-Oriented Visual Question Generation via Intermediate Rewards
J Zhang, Q Wu, C Shen, J Zhang, J Lu… – arXiv preprint arXiv …, 2017 – arxiv.org
… Question Generation in NLP: There is a long history of works on grammar question generation from text domain in natural language processing (NLP) [6, 9, 25, 29] … VQA [2, 12, 33], and aforementioned visual dialogue system [8, 18] etc. In [23], Ren et al …

Learning how to learn: an adaptive dialogue agent for incrementally learning visually grounded word meanings
Y Yu, A Eshghi, O Lemon – arXiv preprint arXiv:1709.10423, 2017 – arxiv.org
… Among other competencies, this involves the ability to learn and adapt mappings between words, phrases, and sentences in Natural Language (NL) and perceptual aspects of the external environment – this is widely known as the grounding problem … 2016. Visual dialog …

Emerging trends and novel approaches in interaction design
K Marasek, A Romanowski… – Computer Science and …, 2017 – ieeexplore.ieee.org
… learning, image and video question answering. Visual Dialog is a novel task that requires an AI agent to hold a meaningful dialog with humans in natural language about visual content. Specifically, given an image, a dialog …

The Art of Deep Connection-Towards Natural and Pragmatic Conversational Agent Interactions
A Ray – 2017 – vtechworks.lib.vt.edu
… Question generation. There has also been a lot of effort into generating natural language questions about images ([27] [29]) … Our work is probably most similar to the work on Visual Dialog by [8]. They introduce a task where there are two people talking about an image …

Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
WL Chao, H Hu, F Sha – arXiv preprint arXiv:1704.07121, 2017 – arxiv.org
… Vqa: Visual question answering. In ICCV, 2015. [4] Steven Bird, Ewan Klein, and Edward Loper. Natural language processing with Python: analyzing text with the natural language toolkit. ” O’Reilly Media, Inc.”, 2009 … Visual dialog. arXiv preprint arXiv:1611.08669, 2016 …

Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
M Lapata, P Blunsom, A Koller – Proceedings of the 15th Conference of …, 2017 – aclweb.org
… I will also show a teaser about the next step moving forward: Visual Dialog. Instead of answering individual questions about an image in isolation, can we build machines that can hold a sequential natural language conversation with humans about visual content …

Active Learning for Visual Question Answering: An Empirical Study
X Lin, D Parikh – arXiv preprint arXiv:1711.01732, 2017 – arxiv.org
… 1 Introduction Visual Question Answering (VQA) [1, 8, 9, 11, 23, 28] is the task of taking in an image and a free-form natural language question and automatically answering the question … [4] studies grounded visual dialog [3] between two machines in collaborative image retrieval …

CSE: U: Mixed-initiative Personal Assistant Agents
JW Buck – src.acm.org
… system architecture components: Dialog User Client, Dialog Management Engine, Dialog Generation Engine, Natural Language Processing Unit, high-level textual dialog specification, generated XML specification, Visual Dialog Client, tree & hash table data structures, prompt, Proposition Analysis …

Federated Control with Hierarchical Multi-Agent Deep Reinforcement Learning
S Kumar, P Shah, D Hakkani-Tur, L Heck – arXiv preprint arXiv …, 2017 – arxiv.org
… References [1] A. Das, S. Kottur, J. Moura, and D. Batra. Learning cooperative visual dialog agents with deep reinforcement learning … In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2433–2443, 2017 …

Eigen: A Step Towards Conversational AI
WH Guss, J Bartlett… – Alexa Prize …, 2017 – alexaprize.s3.amazonaws.com
… To train models on a variety of conversation and textual data, it’s necessary for the data to conform to one shared schema, depicted in the next section. To process the above datasets to fit this schema, we make use of Python’s Natural Language Toolkit (NLTK) … Visual dialog …

Visual Question Answering: A Tutorial
D Teney, Q Wu… – IEEE Signal Processing …, 2017 – ieeexplore.ieee.org
… increasing interest from researchers in both the computer vision and natural language processing fields … Textual QA has been studied for a long time in the natural language processing (NLP) community, and VQA is basically its extension to a visual input …

Toward Scalable Social Alt Text: Conversational Crowdsourcing as a Tool for Refining Vision-to-Language Technology for the Blind
E Salisbury, E Kamar, MR Morris – 2017 – www-cs.stanford.edu
… Recently, automated approaches that combine computer vision and natural language processing to describe image content have emerged as a potential solution for improving the accessibility of social media imagery for BVI users …

Eye gaze and viewpoint in multimodal interaction management
G Brône, B Oben, A Jehoul, J Vranjes… – Cognitive …, 2017 – degruyter.com
Abstract: In this paper, we present an embodiment perspective on viewpoint by exploring the role of eye gaze in face-to-face conversation, in relation to and interaction with other expressive modalities. More specifically, we look into gaze patterns, as well as gaze synchronization with …

Bringing Back Hieroglyph
S Dey, A Dutta, J Lladós, U Pal – Document Analysis and …, 2017 – ieeexplore.ieee.org
… The dataset and the network being just a stepping stone to combine computer vision, natural language processing, and knowledge representation for … This can also be used for generating intelligent chat bots to respond to graphics as a visual dialog (hieroglyph) system …

Reasoning about Fine-grained Attribute Phrases using Reference Games
JC Su, C Wu, H Jiang, S Maji – arXiv preprint arXiv …, 2017 – openaccess.thecvf.com
Reasoning about Fine-grained Attribute Phrases using Reference Games. Jong-Chyi Su, Chenyun Wu, Huaizu Jiang, Subhransu Maji. University of Massachusetts, Amherst {jcsu,chenyun,hzjiang,smaji}@cs.umass.edu. Abstract …

Convolutional Image Captioning
J Aneja, A Deshpande, A Schwing – arXiv preprint arXiv:1711.09151, 2017 – arxiv.org
Convolutional Image Captioning. Jyoti Aneja, Aditya Deshpande, Alexander Schwing. University of Illinois at Urbana-Champaign {janeja2, ardeshp2, aschwing}@illinois.edu. Abstract: Image captioning is an important …

Visually grounded interaction and language
F Strub, H de Vries, A Das, S Kottur… – Schedule …, 2017 – pdfs.semanticscholar.org
… Recent concurrent works in machine learning have focused on bridging visual and natural language understanding through visually-grounded language learning tasks, eg through natural images (Visual Question Answering, Visual Dialog), or through interactions with virtual …

Grounding Referring Expressions in Images by Variational Context
H Zhang, Y Niu, SF Chang – arXiv preprint arXiv:1712.01892, 2017 – arxiv.org
… Grounding natural language in visual data is a hallmark of AI, since it establishes a communication channel between humans, machines, and the physical world, underpinning a variety of multimodal AI tasks such as robotic navigation [35], visual Q&A [1], and visual chatbot [6 …

Interpretable and Pedagogical Examples
S Milli, P Abbeel, I Mordatch – arXiv preprint arXiv:1711.00694, 2017 – arxiv.org
… Abhishek Das, Satwik Kottur, José MF Moura, Stefan Lee, and Dhruv Batra. Learning cooperative visual dialog agents with deep reinforcement learning … Natural language does not emerge ‘naturally’ in multi-agent dialog. CoRR, abs/1706.08502, 2017 …

Counterfactual multi-agent policy gradients
J Foerster, G Farquhar, T Afouras, N Nardelli… – arXiv preprint arXiv …, 2017 – arxiv.org
… Das, Abhishek, Kottur, Satwik, Moura, José MF, Lee, Stefan, and Batra, Dhruv. Learning cooperative visual dialog agents with deep reinforcement learning … Multi-agent cooperation and the emergence of (natural) language. arXiv preprint arXiv:1612.07182, 2016 …

Leveraging Multimodal Perspectives to Learn Common Sense for Vision and Language Tasks
X Lin – 2017 – vtechworks.lib.vt.edu
… (VQA). VQA is the task of answering open-ended natural language questions about images … [75] Visual Question Answering (VQA) is the task of taking as input an image and a free-form natural language question about the image, and producing an accurate answer …

Consequentialist conditional cooperation in social dilemmas with imperfect information
A Peysakhovich, A Lerer – arXiv preprint arXiv:1710.06975, 2017 – arxiv.org
… Abhishek Das, Satwik Kottur, José MF Moura, Stefan Lee, and Dhruv Batra. Learning cooperative visual dialog agents with deep reinforcement learning. arXiv preprint arXiv:1703.06585, 2017 … Multi-agent cooperation and the emergence of (natural) language …

Maintaining cooperation in complex social dilemmas using deep reinforcement learning
A Lerer, A Peysakhovich – arXiv preprint arXiv:1707.01068, 2017 – arxiv.org
… cooperative or non-cooperative behavior. There has also been recent interest in natural language bargaining which is far less structured than a PD but also not zero-sum (Lewis et al., 2017). 2 Page 3. use sufficient summary statistics …

Deep Learning Based Chatbot Models
RK Csáky, G Recski – researchgate.net
… 2.1 Modeling Conversations Chatbot models usually take as input natural language sentences uttered by a user and output a response … The latter models are also called visual dialog agents, where the conversation is grounded on both textual and visual input [Das et al., 2017] …

Prosocial learning agents solve generalized Stag Hunts better than selfish ones
A Peysakhovich, A Lerer – arXiv preprint arXiv:1709.02865, 2017 – arxiv.org
… 2017] Das, A.; Kottur, S.; Moura, JM; Lee, S.; and Batra, D. 2017. Learning cooperative visual dialog agents with deep reinforcement learning … Multi-agent cooperation and the emergence of (natural) language. In International Conference on Learning Representations …

Schedule Highlights
P Sturm – Machine Learning, 2017 – pdfs.semanticscholar.org
… 12, 19, 24, 33]. These works have drawn inspiration from and made significant contributions to areas of machine learning as diverse as learning on graphs to models in natural language processing. Recent advances enabled …

Learning with Opponent-Learning Awareness
JN Foerster, RY Chen, M Al-Shedivat… – arXiv preprint arXiv …, 2017 – arxiv.org
Learning with Opponent-Learning Awareness. Jakob N. Foerster (jakob.foerster@cs.ox.ac.uk), Richard Y. Chen (richardchen@openai.com), Maruan Al-Shedivat (alshedivat@cs.cmu.edu), Shimon Whiteson (shimon.whiteson@cs.ox.ac.uk) …

Learning of Coordination Policies for Robotic Swarms
Q Li, X Du, Y Huang, Q Sykora, AP Schoellig – arXiv preprint arXiv …, 2017 – arxiv.org
Learning of Coordination Policies for Robotic Swarms. Qiyang Li, Xintong Du, Yizhou Huang, Quinlan Sykora, and Angela P. Schoellig. Abstract: Inspired by biological swarms, robotic swarms are envisioned to solve …
