Natural Language & Text-to-Image

Notes:

Text-to-image generation is a task at the intersection of natural language processing (NLP) and computer vision: generating an image from a given text description. It is challenging because the system must understand the meaning and context of the description and produce an appropriate visual representation of it.

The state of the art in text-to-image generation has advanced significantly in recent years, thanks to the development of powerful deep learning models and large-scale datasets. A widely used approach pairs a text encoder, such as a recurrent network or a transformer-based language model, with a conditional image generator, most often a generative adversarial network (GAN) in the papers listed on this page.

These models are trained on large datasets of text-image pairs, which allows them to learn the relationships between language and visual concepts. Once trained, these models can be used to generate realistic and coherent images based on a given text description.
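
As a rough illustration of the pipeline described above, the following Python sketch shows only the data flow of a text-conditioned generator: a description is encoded into a vector, concatenated with random noise, and mapped to pixel values. Everything in it is a hypothetical stand-in. `embed_text` is a hashed bag-of-words encoder rather than a learned language model, `generate_image` uses fixed random weights rather than a trained network, and the output is a tiny grayscale grid. In a real system both parts would be neural networks trained on text-image pairs.

```python
import math
import random
import zlib


def embed_text(text, dim=8):
    """Toy text encoder (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # CRC32 is a stable hash, so a word always lands in the same slot.
        slot = zlib.crc32(token.encode("utf-8")) % dim
        vec[slot] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def generate_image(text, size=4, noise_dim=4, seed=0):
    """Toy conditional generator (assumption): untrained linear layer + sigmoid."""
    noise_rng = random.Random(seed)    # different seeds give different samples
    weight_rng = random.Random(1234)   # fixed "weights"; a real model learns these
    # Condition the generator by concatenating noise with the text embedding.
    z = [noise_rng.gauss(0.0, 1.0) for _ in range(noise_dim)] + embed_text(text)
    image = []
    for _ in range(size):
        row = []
        for _ in range(size):
            weights = [weight_rng.uniform(-1.0, 1.0) for _ in z]
            activation = sum(w * x for w, x in zip(weights, z))
            row.append(1.0 / (1.0 + math.exp(-activation)))  # squash into (0, 1)
        image.append(row)
    return image


# A 4x4 grid of grayscale intensities conditioned on the caption.
image = generate_image("a small red bird on a branch")
```

Because the weights are random, the output carries no meaning; the point is only that the caption enters the generator as a conditioning vector alongside the noise.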

A number of open-source libraries also make it easier to implement text-to-image generation systems. Deep learning frameworks such as PyTorch and TensorFlow are used to build and train the models, and image processing libraries such as OpenCV are used to load, transform, and save the images. With these tools, developers can build and train their own text-to-image generation models, or take pre-trained models and fine-tune them for specific tasks.

Imagistic modeling is the process of creating a mental image or representation of a concept or idea. In natural language processing (NLP) and machine learning, it refers to the ability of a system to generate or understand a visual representation of a text description or concept.
A natural language story is a narrative written or spoken in natural language, as opposed to a structured or formal language such as code or mathematics. Natural language stories can be used to inform, entertain, or persuade, and they are a common form of human communication.
A neural storyboard is a sequence of images generated from a natural language description or story. It is typically created using a neural network model trained on a large dataset of text-image pairs, from which it learns the relationships between language and visual concepts.
Scene generation is the process of creating a visual representation of a scene or environment based on a given description or set of constraints. It is a task that is often used in computer graphics and computer vision, and it can be used to create realistic or stylized environments for use in games, films, or other applications.
Scene retrieval is the process of retrieving an existing visual representation of a scene or environment that matches a given description or set of constraints. It is a common task in computer vision and image processing, used to search a large dataset for specific images or video clips, either by matching a query or by finding items similar to a given reference image.
Story visualization is a way of representing a story or narrative using visual elements, such as images, charts, and diagrams. It can be used to illustrate a sequence of events or to highlight key themes and ideas in a story.
Synthetic images are computer-generated images that are created using algorithms and software, rather than being captured by a camera or other physical device. Synthetic images can be used for a variety of purposes, such as in film and video games, for scientific visualization, and for testing and training machine learning algorithms.
Text-graphic generation is the process of automatically creating a visual representation of text-based information, such as a chart or diagram. This can be done using natural language processing techniques to extract relevant information from the text and then creating a visual representation of that information.
Text-to-animation is the process of automatically creating an animated representation of text-based information. This can be done by generating a series of still images or by using computer graphics techniques to create a continuous animation.
Text-to-3D is the process of automatically creating a 3D model or representation of text-based information. This can be done using natural language processing techniques to extract relevant information from the text and then creating a 3D model based on that information. This can be used for a variety of purposes, such as in film and video games, for scientific visualization, and for testing and training machine learning algorithms.
Text-to-image conversion is the process of automatically creating an image representation of text-based information. This can be done using natural language processing techniques to extract relevant information from the text and then creating an image based on that information. This can be used for a variety of purposes, such as in film and video games, for scientific visualization, and for testing and training machine learning algorithms.
A Text-To-Scene Conversion System (TTSCS) is software designed to automatically generate a visual representation of a scene or environment from text-based descriptions or instructions. A TTSCS can be used for a variety of purposes, such as in film and video games, for scientific visualization, and for testing and training machine learning algorithms.
Text-to-video is the process of automatically creating a video representation of text-based information. This can be done by generating a series of still images or by using computer graphics techniques to create a continuous video. Text-to-video can be used for a variety of purposes, such as in film and video games, for scientific visualization, and for testing and training machine learning algorithms.
A visualizer is a software tool used to create visual representations of data or information. Visualizers are used in scientific visualization and data analysis, and they can produce a wide range of output, including 2D and 3D graphics, maps, charts, and diagrams.
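
The scene retrieval entry above can be made concrete with a small text-to-image retrieval sketch in Python. It ranks gallery items by cosine similarity to a text query and scores the ranking with Recall@K (the R@1, R@5, R@10 figures reported in several of the papers listed below). As a loudly stated assumption, each image is represented here by a caption and compared as a bag of words; the image IDs, captions, and function names are hypothetical. A real system would instead compare learned embeddings of the image pixels and the query text.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector for a piece of text (stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, gallery, k=1):
    """Return the IDs of the k gallery items most similar to the query."""
    q = bow(query)
    ranked = sorted(gallery, key=lambda image_id: cosine(q, bow(gallery[image_id])),
                    reverse=True)
    return ranked[:k]


def recall_at_k(queries, gallery, k):
    """Fraction of queries whose target image appears in the top-k results."""
    hits = sum(1 for query, target in queries if target in retrieve(query, gallery, k))
    return hits / len(queries)


# Hypothetical gallery: image ID -> caption standing in for the image content.
gallery = {
    "img_beach": "a sandy beach with waves and palm trees",
    "img_city": "a busy city street with cars at night",
    "img_forest": "a forest path with tall trees in autumn",
}
queries = [
    ("palm trees on a beach", "img_beach"),
    ("cars on a street at night", "img_city"),
    ("tall trees in autumn", "img_forest"),
]

top_match = retrieve("palm trees on a beach", gallery, k=1)
score = recall_at_k(queries, gallery, k=1)
```

Word overlap is a crude similarity measure. It is used here only so that the ranking and the metric can be followed by hand.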

Resources:

muse-project.eu .. machine understanding for interactive storytelling
neuraltalk2 .. efficient image captioning code in torch
visual7w-qa-models .. visual7w visual question answering models
visual7w-toolkit .. toolkit for visual7w visual question answering dataset

References:

See also:

SceneMaker | Text-to-Image Systems

Controllable text-to-image generation
B Li, X Qi, T Lukasiewicz, P Torr – Advances in Neural Information …, 2019 – papers.nips.cc
… Abstract In this paper, we propose a novel controllable text-to-image generative adversarial network (ControlGAN), which can effectively synthesise high-quality images and also control parts of the image generation according to natural language descriptions …

Object-driven text-to-image synthesis via adversarial training
W Li, P Zhang, L Zhang, Q Huang… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Synthesizing images from text descriptions (known as Text-to-Image synthesis) is an important machine learning task, which requires handling ambiguous and incomplete information in natural language descriptions and learning across vision and language modalities …

Localizing natural language in videos
J Chen, L Ma, X Chen, Z Jie, J Luo – … of the AAAI Conference on Artificial …, 2019 – aaai.org
The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19) Localizing Natural Language in Videos … 2017) and DiDeMo datasets (Hendricks et al. 2017), the task of natural language video localization (NLVL) has gained considerable attentions. As shown in Fig …

Mirrorgan: Learning text-to-image generation by redescription
T Qiao, J Zhang, D Xu, D Tao – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… of applications but its challenging nature, T2I generation has become an active research area in both natural language processing … proposed to tackle text to image synthesis problem by finding a visually discriminative representation for the text descriptions and using this …

Semantics disentangling for text-to-image generation
G Yin, B Liu, L Sheng, N Yu… – Proceedings of the …, 2019 – openaccess.thecvf.com
… In our work, we focus on disentangling the semantic-related concepts to maintain the generation consistency from complex and various natural language descriptions as well as the details for text-to-image generation. Conditional Batch Normalization (CBN) …

Semantically consistent hierarchical text to fashion image synthesis with an enhanced-attentional generative adversarial network
K Emir Ak, J Hwee Lim… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Network (e-AttnGAN) with improved training stability for text-to-image synthesis. e-AttnGAN’s integrated attention module utilizes both sentence and word context features and performs feature-wise linear modulation (FiLM) to fuse visual and natural language representations …

Dm-gan: Dynamic memory generative adversarial networks for text-to-image synthesis
M Zhu, P Pan, W Chen, Y Yang – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… Single-stage. The text-to-image synthesis problem is decomposed by Reed et al. [20] into two sub-problems: first, the joint embedding is learned to capture the relations between natural language and real-world images; second, a …

Text guided person image synthesis
X Zhou, S Huang, B Li, Y Li, J Li… – Proceedings of the …, 2019 – openaccess.thecvf.com
… In this paper, we propose a new task of editing a person image according to natural language descriptions … The text-to-image approaches [28, 22, 32, 30, 29] synthesize images with given texts without the reference images, where the semantic features extracted from texts are …

See-through-text grouping for referring image segmentation
DJ Chen, S Jia, YC Lo, HT Chen… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Later, Wang et al. consider the structure-preserving constraints in learning a joint embedding for image-to-text and text-to-image retrieval [39]. Rohrbach et al … 2.2. Referring Expression Comprehension Integrating computer vision and natural language is an active research area …

A neighbor-aware approach for image-text matching
C Liu, Z Mao, W Zang, B Wang – ICASSP 2019-2019 IEEE …, 2019 – ieeexplore.ieee.org
… It associates different modalities and improves the understanding of image and natural language … Method Flickr8K Image-to-Text Text-to-Image R@1 R@5 R@10 R@1 R@5 R@10 m-CNN [15] 24.8 53.7 67.1 20.3 47.6 61.7 m-RNN [16] 14.5 37.2 48.5 11.5 31.0 42.4 …

Position focused attention network for image-text matching
Y Wang, H Yang, X Qian, L Ma, J Lu, B Li… – arXiv preprint arXiv …, 2019 – arxiv.org
… 2015], natural language object retrieval [Hu et al., 2016], image captioning [Xu et al., 2015; Vinyals et al., 2017], and visual question answering … the performances of all methods on Flickr30k dataset, where PFAN ti means only employing the loss of attending text to image to train …

Improving Arabic text to image mapping using a robust machine learning technique
J Zakraoui, S Elloumi, JM Alja’am, SB Yahia – IEEE Access, 2019 – ieeexplore.ieee.org
… Improving Arabic text to image mapping using … First, we apply natural language processing techniques to analyze the text in stories and we extract keywords of all characters and events in each sentence …

Cycle-consistent diverse image synthesis from natural language
Z Chen, Y Luo – … Conference on Multimedia & Expo Workshops …, 2019 – ieeexplore.ieee.org
… performance on the image synthesis, a more complex conditional model of the text-to-image translation[3] produces higher quality images … To address the model collapse issue, we propose a novel cycle model SuperGAN to synthesize diverse images from natural language …

TIGEr: Text-to-Image Grounding for Image Caption Evaluation
M Jiang, Q Huang, L Zhang, X Wang, P Zhang… – arXiv preprint arXiv …, 2019 – arxiv.org
TIGEr: Text-to-Image Grounding for Image Caption Evaluation … text matching between reference captions and machine-generated captions, potentially leading to biased evaluations because references may not fully cover the image content and natural language is inherently …

Mscap: Multi-style image captioning with unpaired stylized text
L Guo, J Liu, P Yao, J Li, H Lu – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… captioning, has emerged as a prominent interdisciplinary research problem at the intersection of computer vision and natural language processing [36 … 20], photo editing [6, 30], domain adaptation [35, 5], image-to-image translation [26, 15, 9] and text-to-image translation [45] …

Challenges and prospects in vision and language research
K Kafle, R Shrestha, C Kanan – arXiv preprint arXiv:1904.09317, 2019 – arxiv.org
… Ideally, these tasks should test a plethora of capabilities that integrate computer vision, reasoning, and natural language understanding … Natural Language Visual Reasoning (NLVR) requires verifying if image descriptions are true (Suhr et al., 2017, 2018) …

Multi-level multimodal common semantic space for image-phrase grounding
H Akbari, S Karaman, S Bhargava… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Phrase grounding [39, 32] is the task of localizing within an image a given natural language input phrase, as illustrated in Figure 1. This ability to link text and image content is a key component of many visual semantic tasks such as image captioning [10, 21, 18], visual …

Interpretable Text-to-Image Synthesis with Hierarchical Semantic Layout Generation
S Hong, D Yang, J Choi, H Lee – Explainable AI: Interpreting, Explaining …, 2019 – Springer
… Allowing users to describe visual concepts in natural language provides a natural and flexible interface for conditioning image … Recently, approaches based on conditional Generative Adversarial Network (GAN) have shown promising results on text-to-image synthesis task [6, 23 …

A comprehensive survey of deep learning for image captioning
MDZ Hossain, F Sohel, MF Shiratuddin… – ACM Computing Surveys …, 2019 – dl.acm.org
… [66] proposed a deep, multimodal model embedding of image and natural language data for … GANs have already been used successfully in a variety of applications, including image captioning [26, 126], image-to-image translation [56], text-to-image synthesis [15, 115], and text …

Visual semantic reasoning for image-text matching
K Li, Y Zhang, K Li, Y Li, Y Fu – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… When people describe what they see in the picture using natural language, it can be observed that the descriptions will not only include the … Table 1. Quantitative evaluation results of the image-to-text (caption) retrieval and text-to-image (image) retrieval on MS-COCO 1K test …

Deep adversarial graph attention convolution network for text-based person search
J Liu, ZJ Zha, R Hong, M Wang, Y Zhang – Proceedings of the 27th ACM …, 2019 – dl.acm.org
… Text-based person search retrieves the pedestrian through natural language description … Method Text-to-Image Rank-1 Rank-5 Rank-10 CNN-RNN [28] 8.07 – 32.47 Neural Talk [34] 13.66 – 41.72 GNA-RNN [19] 19.05 – 53.64 IATVM [18] 25.94 – 60.48 PWM-ATH [5] 27.14 49.45 …

Learning to understand non-categorical physical language for human robot interactions
L Richards, C Matuszek – From the RSS Workshop on AI and its …, 2019 – iral.cs.umbc.edu
… The intuitive idea to allow robotic agents to comprehend and use natural language in their operations is encapsulated in a multitude of work … Projects such as image caption generation and recognition [14, 24] and text-to-image synthesis [33] showcase the joint interest between …

Language features matter: Effective language representations for vision-language tasks
A Burns, R Tan, K Saenko, S Sclaroff… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Image Captioning. The goal of image captioning is to produce natural language which describes an image scene with a well formed sentence … Visual Question Answering. In VQA [2], the goal is to produce a free-form natural language answer given an image and question …

Dualattn-GAN: Text to Image Synthesis With Dual Attentional Generative Adversarial Network
Y Cai, X Wang, Z Yu, F Li, P Xu, Y Li, L Li – IEEE Access, 2019 – ieeexplore.ieee.org
… Synthesizing image from text description has been a hot topic crossing natural language processing and computer vision. It has significant impact on the applications of content production and advertisement design. The core challenge of text-to-image synthesis lies in gener …

A systematic literature review on image captioning
R Staniūtė, D Šešok – Applied Sciences, 2019 – mdpi.com
… Abstract: Natural language problems have already been investigated for around five years … As long as machines do not think, talk, and behave like humans, natural language descriptions will remain a challenge to be solved …

Hierarchically-fused generative adversarial network for text to realistic image synthesis
X Huang, M Wang, M Gong – 2019 16th Conference on …, 2019 – ieeexplore.ieee.org
… Current research based on Generative Adversarial Network (GAN) has shown promising results for mapping from natural language feature space to image feature space [1], [2], [3], [4]. Reed et al. [1] first proposed a GAN-based text-to-image synthesis approach, but the gener …

Composing text and image for image retrieval-an empirical odyssey
N Vo, L Jiang, C Sun, K Murphy, LJ Li… – Proceedings of the …, 2019 – openaccess.thecvf.com
… geolocalization [13]. Cross-modal image retrieval allows using other types of query, examples include text to image retrieval [52], sketch to image retrieval [42] or cross view image retrieval [26], and event detection [19]. We consider …

Sketchforme: Composing sketched scenes from text descriptions for interactive applications
F Huang, JF Canny – Proceedings of the 32nd Annual ACM Symposium …, 2019 – dl.acm.org
… Neural Text-to-Image Synthesis Generating graphical content from text description is a popular ongoing research problem … This is similar to Sketchforme’s multi-step approach to generate complete sketched scenes from natural language except in the domain of natural images …

Learning visual relation priors for image-text matching and image captioning with neural scene graph generators
KH Lee, H Palangi, X Chen, H Hu, J Gao – arXiv preprint arXiv:1909.09953, 2019 – arxiv.org
… Vision-and-language refers to a range of tasks that bridge vision and natural language, e.g. automatically describing visual content … successful across various tasks including visual question answering, caption generation, image-text matching, and text-to-image synthesis [2, 25 …

A Comprehensive Analysis of Semantic Compositionality in Text-to-Image Generation.
C Fujiyama, I Kobayashi – ViGIL@ NeurIPS, 2019 – vigilworkshop.github.io
… problem how those models interpret natural language. In this study, we attempt to disentangle how a deep neural network model interprets natural language through a text-to-image generation task. A concept behind this is that it …

Mobile App for Text-to-Image Synthesis
R Kang, A Sunil, M Chen – International Conference on Mobile Computing …, 2019 – Springer
… We have developed an educational application to demonstrate the effectiveness of the proposed approach to visualize natural language sentences … As we can see, the main mobile application UI displays the tab bar buttons to switch between the Text-to-Image conversion view …

Attention-guided generative adversarial networks for unsupervised image-to-image translation
H Tang, D Xu, N Sebe, Y Yan – 2019 International Joint …, 2019 – ieeexplore.ieee.org
… Adversarial Networks (GANs) [8] have received considerable attention across many communities, e.g., computer vision, natural language processing, audio and … using a reference images as conditional information have tackled a lot of problems, e.g., text-to-image translation [22 …

A Resnet-based Text to Image Conversion Method Using Word2Vec and Generative Adversarial Networks
??? – 2019 – repository.hanyang.ac.kr
… In this paper, we propose a generative adversarial networks (GAN) based text-to-image generating method. In many natural language processing tasks, which word expressions are determined by their term frequency – inverse document frequency scores …

Learning fragment self-attention embeddings for image-text matching
Y Wu, S Wang, G Song, Q Huang – Proceedings of the 27th ACM …, 2019 – dl.acm.org
… Learning Fragment Self-Attention Embeddings for Image-Text Matching. Yiling Wu, Shuhui Wang, Guoli Song, Qingming Huang. Key Lab of Intell. Info. Process., Inst. of Comput. Tech., Chinese Academy of Sciences, Beijing, 100190, China …

Attention Driven Image Synthesis from Text Descriptions
A Paul, H Gupta, A Jain – cseweb.ucsd.edu
… fine-grained text to image generation. We intend to use latest image generation frameworks like sequence-based Image Transformers by Parmar et al. [2018] in combination with bi-directional LSTM models to generate realistic images from a given natural language description …

Interactive image generation using scene graphs
G Mittal, S Agrawal, A Agarwal, S Mehta… – arXiv preprint arXiv …, 2019 – arxiv.org
… on generating images from natural language descriptions. Conditioned on given text descriptions, conditional-GANs (Reed et al., 2016) are able to generate images that are highly related to the text meanings. Samples generated by existing text-to-image approaches can …

Semantic object accuracy for generative Text-to-Image synthesis
T Hinz, S Heinrich, S Wermter – arXiv preprint arXiv:1910.13321, 2019 – arxiv.org
… Semantic Object Accuracy for Generative Text-to-Image Synthesis … Furthermore, quantitatively evaluating these text-to-image synthesis models is still challenging, as most evaluation metrics only judge image quality but not the conformity between the image and its caption …

Controllable Text-to-Image Generation
T Lukasiewicz – 2019 – cs.ox.ac.uk
… Controllable Text-to-Image Generation … propose a novel visual attributes manipulation method, called controllable generative adversarial network (ControlGAN), which is able to effectively control parts of the image generation according to natural language descriptions, while …

Language-agnostic visual-semantic embeddings
J Wehrmann, DM Souza, MA Lopes… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Bold values indicate current state-of-the-art results. Image to text Text to image Method R@1 R@5 R@10 R@1 R@5 R@10 … One can observe that CLMR outperforms all other approaches in R@1 for both image-to-text (71.8%) and text-to-image (57.9%) …

Project Plan of Generating Comics from Natural Language Description
B Yang, Y Wang – 2019 – i.cs.hku.hk
… been an emergent trend in computer vision combining machine learning related to this possible solution, which is constructing scenes from sentence descriptions based on provided references (known as Text-to-Image synthesis) [5]. It requires a natural language process as …

Multi-mapping image-to-image translation via learning disentanglement
X Yu, Y Chen, S Liu, T Li, G Li – Advances in Neural Information …, 2019 – papers.nips.cc
… [30] Seonghyeon Nam, Yunji Kim, and Seon Joo Kim. Text-adaptive generative adversarial networks: Manipulating images with natural language. In NIPS, 2018 … Attngan: Fine-grained text to image generation with attentional generative adversarial networks. In CVPR, 2018 …

Concepts from unclear textual embeddings for text-to-image synthesis
M Kumar – 2019 – ideals.illinois.edu
… Abstract: Automatically generating images based on a natural language description is a challenging problem with several key applications in the fields of … To this end we propose CuteGAN, our simple text-to-image generation approach that encourages the model to leverage the …

Image Manipulation with Natural Language using Two-sided Attentive Conditional Generative Adversarial Network
D Zhu, A Mogadala, D Klakow – arXiv preprint arXiv:1912.07478, 2019 – arxiv.org
… To the best of our knowledge, none of the previous works propose attention over conditional Generative Adversarial Network (cGAN) in a generator for fine-grained image manipulation with natural language. However, attention has been used for text-to-image generation Xu …

Text to Image Synthesis in Generative Adversarial Networks
A Bhumgara, A Pitale – academia.edu
… generative adversarial networks, learning (artificial intelligence), machine learning, natural language processing, neural nets, object detection, text-to-image translation …

Synthesis of Image from Text using Generative Adversarial Networks
R KR, M Jayasree – papers.ssrn.com
… Abstract—One of the primary applications of recent conditional generative models is the generation of images from natural language. In addition to the testing ability to these model conditional, extremely dimensional distributions, text-to-image synthesis has several exciting …

Trends in integration of vision and language research: A survey of tasks, datasets, and methods
A Mogadala, M Kalimuthu, D Klakow – arXiv preprint arXiv:1907.09358, 2019 – arxiv.org
… Abstract Integration of vision and language tasks has seen a significant growth in the recent times due to surge of interest from multi-disciplinary communities such as deep learning, computer vision, and natural language processing …

An Interactive Scene Generation Using Natural Language
Y Cheng, Y Shi, Z Sun, D Feng… – … Conference on Robotics …, 2019 – ieeexplore.ieee.org
… The NLP module deploys a pipeline of domain-general natural language processing components. The input is parsed … [6] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee, “Generative adversarial text to image synthesis,” arXiv preprint arXiv:1605.05396, 2016 …

Cross-Modal Person Search: A Coarse-to-Fine Framework using Bi-Directional Text-Image Matching
X Yu, T Chen, Y Yang, M Mugo… – Proceedings of the …, 2019 – openaccess.thecvf.com
… by natural language descriptions is an important application instance of cross-modal retrieval. Given a textual description of a specific person, its objective is to find images from gallery which best match the description. Based on existing methods for text-to-image retrieval …

A Discrete Event Approach for Scene Generation conditioned on Natural Language
Y Cheng, Y Shi, Z Sun, D Feng… – 2019 IEEE International …, 2019 – ieeexplore.ieee.org
… text to image synthesis with textual data augmentation,” in Proc. of IEEE International Conference on Image Processing. IEEE, 2017, pp. 2015–2019. [15] J. Dzifcak, M. Scheutz, C. Baral, and P. Schermerhorn, “What to do and how to do it: Translating natural language directives …

MCA-GAN: Text-to-Image Generation Adversarial Network Based on Multi-Channel Attention
J Sun, B Zhang – 2019 IEEE 4th Advanced Information …, 2019 – ieeexplore.ieee.org
… Keywords: generative adversarial networks; multi-channel attention; text-to-image I. INTRODUCTION Text-to-image generation have become an active research area in natural language processing and computer vision communities [22] …

Unified visual-semantic embeddings: Bridging vision and language with structured meaning representations
H Wu, J Mao, Y Zhang, Y Jiang, L Li… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Text-to-Image Retrieval … model does not rely on labelled graphs during training. Researchers have designed various types of representations [5, 32] as well as different models [26, 50] for translating natural language sentences into structured representations …

Generation High resolution 3D model from natural language by Generative Adversarial Network
K Fukamizu, M Kondo, R Sakamoto – arXiv preprint arXiv:1901.07165, 2019 – arxiv.org
… In this paper, we propose a method of generating high resolution 3D shapes from natural language descriptions … Attngan: Fine-grained text to image generation with attentional generative adversarial networks. In CVPR, pages 1316–1324. IEEE Computer Society, 2018 …

MS-GAN: Text to Image Synthesis with Attention-Modulated Generators and Similarity-aware Discriminators.
F Mao, B Ma, H Chang, S Shan, X Chen – BMVC, 2019 – jdl.link
… 1 Introduction Text-to-image synthesis is one of the most important and challenging tasks in computer vision and natural language processing. Given a text description, this task aims to synthesize an image with contents semantically consistent with the text …

CycleMatch: A cycle-consistent embedding network for image-text matching
Y Liu, Y Guo, L Liu, EM Bakker, MS Lew – Pattern Recognition, 2019 – Elsevier
… whole embedding learning. Consequently, our visual-textual embedding method can learn not only inter-modal mappings (i.e. image-to-text and text-to-image), but also intra-modal mappings (i.e. image-to-image and text-to-text) …

Learning to follow directions in street view
KM Hermann, M Malinowski, P Mirowski… – arXiv preprint arXiv …, 2019 – arxiv.org
… We believe the problem of grounding such a language is a sensible step to solve before attempting the same with natural language; similar trends can be seen in the visual question answering community (Hudson and Manning, 2019; Johnson et al., 2017) …

Learning Generative Image Object Manipulations from Language Instructions
M Längkvist, A Persson, A Loutfi – 2019 – openreview.net
… Seonghyeon Nam, Yunji Kim, and Seon Joo Kim. Text-adaptive generative adversarial networks: manipulating images with natural language. In Advances in Neural Information Processing Systems, pp. 42–51, 2018 … Generative adversarial text to image synthesis …

Weakly-supervised spatio-temporally grounding natural sentence in video
Z Chen, L Ma, W Luo, KYK Wong – arXiv preprint arXiv:1906.02549, 2019 – arxiv.org
… Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video. Zhenfang Chen, Lin Ma, Wenhan Luo, Kwan-Yee K. Wong. The University of Hong Kong; Tencent AI Lab …

ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
DZ Chen, AX Chang, M Nießner – arXiv preprint arXiv:1912.08830, 2019 – arxiv.org
… 2.1. Grounding Referring Expressions in Images There has been a plethora of successful work connecting images to natural language descriptions across tasks such as image captioning [25, 24, 55, 60], text-to-image retrieval [56, 22], and visual grounding [20, 35, 67] …

Scene text detection and recognition with advances in deep learning: a survey
X Liu, G Meng, C Pan – International Journal on Document Analysis and …, 2019 – Springer
… recognition. We also give a brief introduction of other related works such as script identification, text/non-text classification and text-to-image retrieval … model. Mishra et al. [97] presented a query-driven search method for text-to-image retrieval …

Visual to text: Survey of image and video captioning
S Li, Z Tao, K Li, Y Fu – IEEE Transactions on Emerging Topics …, 2019 – ieeexplore.ieee.org
… The availability of vast amounts of text gave a huge boost to the Natural Language Processing (NLP) research community, which was critical in order to organize the amount of information that had suddenly become available …

Ckd: Cross-task knowledge distillation for text-to-image synthesis
M Yuan, Y Peng – IEEE Transactions on Multimedia, 2019 – ieeexplore.ieee.org
… the text-to-image synthesis to extract necessary visual information from text descriptions and fit the real image distributions. By contrast, the image captioning holds the ability to produce hierarchical representations from image pixels to natural language description, which can …

Psyncgan for Data Generation Application of Partially Syncronized GAN to Image and Caption Data Generation
K Valery, SK Siahroudi, W Gang – Proceedings of the 2019 3rd …, 2019 – dl.acm.org
… Memory (LSTM) networks, etc., are being used with additional innovations such as the addition of the attention models [1, 2, 3, 4, 5] and the combination with convolutional networks [6]. For generating images from natural language descriptions, many … 1) Text to Image to Text …

A Dynamic Illustration Approach For Arabic Text
J Zakraoui, JM Al Jaam – 2019 IEEE 10th GCC Conference & …, 2019 – ieeexplore.ieee.org
… [27] MM Eunice, automatic conversion of natural language to 3D animation, University of Ulster, 2006 … [40] J. Zakraoui, S. Elloumi, JM Alja’am and SB Yahia, “Improving Arabic Text to Image Mapping Using a Robust Machine Learning Technique,” IEEE Access, vol. 7, pp …

Deep Learning Approaches for Attribute Manipulation and Text-to-Image Synthesis
KE AK – 2019 – researchgate.net
… In order to resolve this issue, natural language descriptions which are more informative and less strict than attributes can be used to design the desired images. Image generation conditioned on the text description problem is defined as text-to-image synthesis in Reed et al. [20] …

Text Conditional Lyric Video Generation
N Frosst, J Kereliuk, G Kid – neurips2019creativity.github.io
… With their method, and making use of the MSCOCO dataset [Lin et al., 2014] they were able to create convincing images that resembled natural language descriptions … Attngan: Fine-grained text to image generation with attentional generative adversarial networks …

Visual Understanding through Natural Language
LAM Hendricks – 2019 – escholarship.org
Visual Understanding through Natural Language, by Lisa Anne Marie Hendricks. A dissertation submitted in partial satisfaction of the … Spring 2019. Copyright 2019 by Lisa Anne Marie Hendricks. Abstract …

Bridging images and natural language with deep learning
J Gu – 2019 – dr.ntu.edu.sg
… language due to the different structures and characteristics between them. In this thesis, I seek to bridge images and natural language with deep learning. Five … image) into the cross-modal feature embedding, through which the proposed …

ResFPA-GAN: Text-to-Image Synthesis with Generative Adversarial Network Based on Residual Block Feature Pyramid Attention
J Sun, Y Zhou, B Zhang – … on Advanced Robotics and its Social …, 2019 – ieeexplore.ieee.org
… I. INTRODUCTION In the field of generating model, the text-to-image synthesis involves two aspects of natural language processing and computer vision [1], which is one of the hot research directions in recent years, especially in the area of robot aided design …

Integrate Image Representation to Text Model on Sentence Level: a Semi-supervised Framework
L Zhang, Q Chen, D Li, B Tang – arXiv preprint arXiv:1912.00336, 2019 – arxiv.org
… In our future work, we plan to seek improvement through better text-to-image retrieval and generation models, and explore to … 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing …

Referring Image Segmentation by Generative Adversarial Learning
S Qiu, Y Zhao, J Jiao, Y Wei… – IEEE Transactions on …, 2019 – ieeexplore.ieee.org
… In this paper, we focus on the problem of image segmentation from natural language referring expressions … Two examples are shown with the ground truth masks in the middle column. paper, we study the problem of using natural language expressions to segment an image …

Adversarial generation of handwritten text images conditioned on sequences
E Alonso, B Moysset, R Messina – … International Conference on …, 2019 – ieeexplore.ieee.org
… To reflect the distribution found in natural language, the words to be generated are sampled from a large list of words … [11] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee, “Generative Adversarial Text to Image Synthesis,” in ICML, 2016 …

Image generation from layout
B Zhao, L Meng, W Yin, L Sigal – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… Of specific relevance are approaches for text-to-image [11, 15, 25, 33, 41, 47] generation. By allowing users to describe visual concepts in natural language, text-to-image generation provides natural and flexible interface for conditioned image generation …

… on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
K Inui, J Jiang, V Ng, X Wan – … on Empirical Methods in Natural Language …, 2019 – aclweb.org
EMNLP-IJCNLP 2019: 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing. Proceedings of the Conference. November 3–7, 2019, Hong Kong, China …

Deliberation Learning for Image-to-Image Translation.
T He, Y Xia, J Lin, X Tan, D He, T Qin, Z Chen – IJCAI, 2019 – pdfs.semanticscholar.org
… Although deliberation learning is not widely studied in image generation, it has been used in many natural language processing tasks … Another work leveraging deliberation learning is [Xu et al., 2018], which attacks the text-to-image problem: the images are generated from low …

A Survey of State-of-the-Art GAN-Based Approaches to Image Synthesis
SN Esfahani, S Latifi – CS & IT-CSCP, 2019 – academia.edu
… The task of text to image generation usually means translating text in the form of single-sentence descriptions directly into prediction of … GAN-CLS), which successfully generated realistic images (64 × 64) for birds and flowers that are described by natural language descriptions …

Learning to predict layout-to-image conditional convolutions for semantic image synthesis
X Liu, G Yin, J Shao, X Wang – Advances in Neural Information …, 2019 – papers.nips.cc
… [19] Zhenyang Li, Ran Tao, Efstratios Gavves, Cees GM Snoek, and Arnold WM Smeulders. Tracking by natural language specification … Attngan: Fine-grained text to image generation with attentional generative adversarial networks. CVPR, 2018 …

Image Tag Core Generation
N Sharonova – 2019 – pdfs.semanticscholar.org
… Firstly, this is the problem of generating text from an image [3]. Secondly, it is the problem of images generating from natural language [4-7]. Analysis of publications shows the relevance of the … (2019). 4. Bodnar, C.: Text to Image Synthesis Using Generative Adversarial Networks …

CommonGen: A constrained text generation dataset towards generative commonsense reasoning
BY Lin, M Shen, Y Xing, P Zhou, X Ren – arXiv preprint arXiv:1911.03705, 2019 – arxiv.org
… Our data and code are publicly available at http://inklab.usc.edu/CommonGen/. 1 Introduction Commonsense reasoning has long been acknowledged as a critical bottleneck of artificial intelligence and especially in natural language processing …

An improved Image Description Method Using Recurrent Neural Network with Gated Recurrent Unit
H Wang, K Li – 2019 IEEE 1st International Conference on Civil …, 2019 – ieeexplore.ieee.org
… Image description means that machine automatically generates a natural language which can describe image … different word embedding methods and recurrent neurons on improving image description, and apply the generated Chinese description text to image description task …

Robust Visual Object Tracking with Natural Language Region Proposal Network
Q Feng, V Ablavsky, Q Bai, S Sclaroff – arXiv preprint arXiv:1912.02048, 2019 – arxiv.org
… are worth a thousand pixels.” We opt for a natural-language (NL) description of the target. However, conditioning a tracker on NL description is not straightforward. First, since tracking implies temporal coherence, applying an algorithm for matching text to image regions [37, 40 …

Joint generation of image and text with GANs
B Shimanuki – 2019 – dspace.mit.edu
… Abstract The computer vision and natural language processing communities have come together on image captioning related problems, but the fields have remained largely disjoint … The computer vision and natural language processing communities have developed …

Investigating Semantic Properties of Images Generated from Natural Language Using Neural Networks
SW Schrader – 2019 – scholarworks.boisestate.edu
… process. If it can only process natural language as binary representations of text, it can … in the images created from textual information, and are encoded in a measurable way. Examination of the results of text to image generative neural networks provide strong …

CanvasGAN: A simple baseline for text to image generation by incrementally patching a canvas
A Singh, S Agrawal – Science and Information Conference, 2019 – Springer
… Most notable recent works in the text to image task are AttnGAN [31] and HDGAN [36]. In AttnGAN, the authors improved StackGAN by using attention to focus on relevant words in the natural language description and proposed a deep attentional multimodal similarity model to …

Focal visual-text attention for memex question answering
J Liang, L Jiang, L Cao, Y Kalantidis… – IEEE transactions on …, 2019 – ieeexplore.ieee.org
… Flickr’s personal photo search traffic. We choose not to include the “show me” questions, as such questions should be addressed by a separate text-to-image/video module [56]. We acknowledge …

Language-based colorization of scene sketches
C Zou, H Mo, C Gao, R Du, H Fu – ACM Transactions on Graphics (TOG), 2019 – dl.acm.org
… https://doi.org/10.1145/3355089.3356561 1 INTRODUCTION In recent years, deep learning techniques have significantly improved the performance of natural language processing [Kim 2014; Lai et al … 2016; Yu et al. 2017] based on a natural language query …

Dual encoding for zero-example video retrieval
J Dong, X Li, C Xu, S Ji, Y He… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Abstract This paper attacks the challenging problem of zero-example video retrieval. In such a retrieval paradigm, an end user searches for unlabeled videos by ad-hoc queries described in natural language text with no visual example provided …

Integrating semantic knowledge to tackle zero-shot text classification
J Zhang, P Lertvittayakumjorn, Y Guo – arXiv preprint arXiv:1903.12626, 2019 – arxiv.org
… $v_{w_j,c}$ shows how the word $w_j$ and the class $c$ are related considering the relations in a general knowledge graph. In this work, we use ConceptNet providing general knowledge of natural language words and phrases (Speer and Havasi, 2013) …

Simgan: Photo-realistic semantic image manipulation using generative adversarial networks
S Yu, H Dong, F Liang, Y Mo, C Wu… – 2019 IEEE International …, 2019 – ieeexplore.ieee.org
… Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee, “Generative adversarial text to image synthesis,” in … Yunji Kim, and Seon Joo Kim, “Text-Adaptive generative adversarial networks: manipulating images with natural language,” in NeurIPS …

Using GAN to Generate Sport News from Live Game Stats
C Li, Y Su, J Qi, M Xiao – International Conference on Cognitive Computing, 2019 – Springer
… arXiv preprint arXiv:1611.00712 (2016). 13. Reed, S., et al.: Generative adversarial text to image synthesis … 8(3–4), 229–256 (1992). 23. Goldberg, E., Driedger, N., Kittredge, RI: Using natural-language processing to produce weather forecasts. IEEE Intell …

Scene graph generation with external knowledge and image reconstruction
J Gu, H Zhao, Z Lin, S Li, J Cai… – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
Scene Graph Generation with External Knowledge and Image Reconstruction. Jiuxiang Gu, Handong Zhao, Zhe Lin, Sheng Li, Jianfei Cai, Mingyang Ling. ROSE Lab, Interdisciplinary Graduate School, Nanyang …

Data2vis: Automatic generation of data visualizations using sequence-to-sequence recurrent neural networks
V Dibia, Ç Demiralp – IEEE computer graphics and applications, 2019 – ieeexplore.ieee.org

Vital: A visual interpretation on text with adversarial learning for image labeling
T Hu, C Long, L Zhang, C Xiao – arXiv preprint arXiv:1907.11811, 2019 – arxiv.org
… In this paper, we propose a novel way to interpret text information by extracting visual feature presentation from multiple high-resolution and photo-realistic synthetic images generated by Text-to-image Generative Adversarial Network (GAN) to improve the performance of …

CPGAN: Full-Spectrum Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis
J Liang, W Pei, F Lu – arXiv preprint arXiv:1912.08562, 2019 – arxiv.org
… and image modalities to obtain a thorough understanding of involved multimodal information, which is beneficial for modeling the text-to-image consistency in … It was then extensively applied in tasks of natural language processing (NLP) [4, 5, 16, 29] and computer vision (CV …

Caption-to-Image Conditional Generative Modeling
J Chen, W Looi – looiwenli.com
… JMLR. org, 2017. [3] Attngan: Fine-grained text to image generation with attentional generative adversarial networks … Supervised learning of universal sentence representations from natural language inference data. arXiv preprint arXiv:1705.02364, 2017 …

Symmetrical Adversarial Training Network: A Novel Model for Text Generation
Y Gao, CJ Wang – International Conference on Artificial Neural Networks, 2019 – Springer
… Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis … In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp …

Style transfer-based image synthesis as an efficient regularization technique in deep learning
A Mikołajczyk, M Grochowski – 2019 24th International …, 2019 – ieeexplore.ieee.org
… DNN) can reach near human-level performance in many types of tasks, such as classification, segmentation, natural language processing, image … GANs are found to be very useful in many different image generation and manipulation problems like text-to-image synthesis [22 …

Hybrid Attention Driven Text-to-Image Synthesis via Generative Adversarial Networks
Q Cheng, X Gu – International Conference on Artificial Neural Networks, 2019 – Springer
… mechanism shows effectiveness in many applications, especially in natural language process and computer vision. More specifically, self-attention mechanism is introduced in image generation [19]. Besides, attention mechanism is also adopted in text to image generative task …

Matching images and text with multi-modal tensor fusion and re-ranking
T Wang, X Xu, Y Yang, A Hanjalic, HT Shen… – Proceedings of the 27th …, 2019 – dl.acm.org
Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking. Tan Wang, Center for Future Media and School of Information and Communication Engineering, University of Electronic Science and Technology of China, China …

Text-to-image Synthesis for Fashion Design
Z Yi – 2019 – diva-portal.org
… It is found that the choice of text encoder can have a large impact on the quality of synthesized images, though it is more related to natural language processing and not the emphasis of research in text-to-image synthesis. A …

Text2scene: Generating compositional scenes from textual descriptions
F Tan, S Feng, V Ordonez – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… text requires a level of language and visual understanding which could lead to applications in image retrieval through natural language queries, representation … images [5, 21, 16, 14, 31, 32, 22, 2]. Recently, there is work in the opposite direction of text-to-image synthesis [25, 26 …

End-to-End Learning Using Cycle Consistency for Image-to-Caption Transformations
K Hagiwara, Y Mukuta, T Harada – arXiv preprint arXiv:1903.10118, 2019 – arxiv.org
… is close to the input image from a generated caption, ie, if it is possible to generate a natural language caption containing … Here, we introduce related research on image-to-text generation, text-to-image generation, and mutual transformations between different domains using …

Fine-Grained Image Classification Combined with Label Description
X Shi, L Xu, P Wang – … on Tools with Artificial Intelligence (ICTAI …, 2019 – ieeexplore.ieee.org
… scores. The model combines image features and label description features to achieve better classification. image information, natural language description features can complement the detailed features of the image. Therefore …

Understanding, categorizing and predicting semantic image-text relations
C Otto, M Springstein, A Anand, R Ewerth – Proceedings of the 2019 on …, 2019 – dl.acm.org
… sensory fusion network. They handle the cognitive and semantic gap by improving the comparability of heterogeneous media features and obtain good results for image-to-text and text-to-image retrieval. Liang et al. [25] propose …

Question-Conditioned Counterfactual Image Generation for VQA
J Pan, Y Goyal, S Lee – arXiv preprint arXiv:1911.06352, 2019 – arxiv.org
… As such, recent work on generating images based on natural language captions [14] or dialogs [12] about the image is closely related … Chatpainter: Improving text to image generation using dialogue. CoRR, abs/1802.08216, 2018 …

Social media based event summarization by user–text–image co-clustering
X Qian, M Li, Y Ren, S Jiang – Knowledge-Based Systems, 2019 – Elsevier
… proposed a bilateral correspondence LDA model to address the problem of association modeling in multimedia microblog data that is to discover both text-to-image and image … In this paper, we adopt existing natural language processing tool FudanNLP 2 to do this for each text …

Cross-Modal Dual Learning for Sentence-to-Video Generation
Y Liu, X Wang, Y Yuan, W Zhu – Proceedings of the 27th ACM …, 2019 – dl.acm.org
… The re-embedding module of CMDL generates the embedded sentence vectors instead of the original words in sentences because each video can be described by various sentences with the same semantic meaning due to the ambiguity of natural language …

Seq-sg2sl: Inferring semantic layout from scene graph through sequence to sequence learning
B Li, B Zhuang, M Li, J Gu – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… Therefore, our goal in this work solves the underlying task, inferring semantic layout from scene graph, for connecting text to image … complemented their work by introducing an automatic approach to create a scene graph from unstructured natural language scene descriptions …

Text synthesis from keywords: a comparison of recurrent-neural-network-based architectures and hybrid approaches
N Kolokas, A Drosou, D Tzovaras – Neural Computing and Applications, 2019 – Springer
… 1 Introduction Keywords-to-text synthesis appertains to the general field of the well-known natural language processing (NLP) … References about text synthesis regard mainly text-to-speech, speech-to-text and text-to-image synthesis …

Story-oriented Image Selection and Placement
SN Chowdhury, S Razniewski, G Weikum – arXiv preprint arXiv …, 2019 – arxiv.org
… Generation of natural language image descriptions is a popular problem at the intersection of computer vision, natural language processing, and artificial intelligence [1]. Alignment of image regions to textual concepts is a prerequisite for generating captions …

Contextual language understanding Thoughts on Machine Learning in Natural Language Processing
B Favre – 2019 – hal-amu.archives-ouvertes.fr
… Contextual language understanding: Thoughts on Machine Learning in Natural Language Processing. Benoit Favre. To cite this version: Benoit Favre … Chapter 1. Introduction. Chapter 2. Modern Natural Language Processing …

A novel text representation which enables image classifiers to perform text classification, applied to name disambiguation
SM Petrie, TMD Julius – arXiv preprint arXiv:1908.07846, 2019 – arxiv.org
… processing algorithms (eg image classification networks), to be applied to text-based natural language processing (NLP) problems involving pairwise comparisons (eg named entity disambiguation). We demonstrate this by combining our text-to-image conversion method …

Recursive visual attention in visual dialog
Y Niu, H Zhang, M Zhang, J Zhang… – Proceedings of the …, 2019 – openaccess.thecvf.com
… 1. Introduction Vision and language understanding has become an attractive and challenging interdisciplinary field in computer vision and natural language processing … It is thus impractical to exhaust all cases using a natural language parser …

A novel text representation which enables image classifiers to perform text classification
SM Petrie, DJ T’Mir – 2019 – openreview.net
… processing algorithms (eg image classification networks), to be applied to text-based natural language processing (NLP) problems involving pairwise comparisons (eg named entity disambiguation). We demonstrate this by combining our text-to-image conversion method with …

Controlling Style and Semantics in Weakly-Supervised Image Generation
D Pavllo, A Lucchi, T Hofmann – arXiv preprint arXiv:1912.03161, 2019 – arxiv.org
… or natural language [46, 47, 41], but again, these methods only show compelling results on single-domain datasets … An approach for controlling the style of the scene and its instances, either using high-level attributes or natural language with an attention mechanism …

Referring expression object segmentation with caption-aware consistency
YW Chen, YH Tsai, T Wang, YY Lin… – arXiv preprint arXiv …, 2019 – arxiv.org
… Abstract Referring expressions are natural language descriptions that identify a particular object within a scene and are widely used in our … We introduce the spatial-aware dynamic filters to transfer knowledge from text to image, and effectively capture the spatial information of …

Text to Artistic Image Generation using GANs
Y Chen, Z Wang – yuxingch.github.io
… synthetic stylish image. 3.1. Text to Image Generation Text Encoder The first challenge is to correctly connect the content of images and the natural language concepts in the corresponding text descriptions. As mentioned in …

Image Generation with GANs-Based Techniques: A Survey
SNES Latifi – researchgate.net
… The task of text to image generation usually means translating text in the form of single-sentence descriptions directly into prediction of … GAN-CLS), which successfully generated realistic images (64 × 64) for birds and flowers that are described by natural language descriptions …

Multimodal Word Discovery and Retrieval with Phone Sequence and Image Concepts.
L Wang, MA Hasegawa-Johnson – INTERSPEECH, 2019 – isca-speech.org
… images that also appeared in Flickr8k are used, so that we can compare our results to other speech-to-image [21] and text-to-image [19] retrieval … [25] P. Brown, VJD Pietra, PV deSouza, JC Lai, and RL Mercer, “Class-based n-gram models of natural language,” Computational …

Variational Conditional GAN for Fine-grained Controllable Image Generation
M Hu, D Zhou, Y He – arXiv preprint arXiv:1909.09979, 2019 – arxiv.org
… on natural language can be classified into two categories: sentence-level image generation and class-conditional image generation. Sentence-level image generation learns to generate related image from one sentence, which is also called text-to-image generation (Reed et …

ParNet: Position-aware Aggregated Relation Network for Image-Text matching
Y Xia, L Huang, W Wang, X Wei – arXiv preprint arXiv:1906.06892, 2019 – arxiv.org
… 2 RELATED WORKS 2.1 Attention Mechanisms Attention mechanism have recently been successfully applied in the Natural Language Processing field [25][2][3]. The attention … Methods Image-to-Text Text-to-Image R@1 R@5 R@10 R@1 R@5 R@10 DVSA [10] 38.4 69.9 80.5 27.4 …

Character profiling in low-resource language documents
T Wong, J Lee – Proceedings of the 24th Australasian Document …, 2019 – dl.acm.org
… [17] Lewis Lancaster. 2010. From Text to Image to Analysis: Visualization of Chinese Buddhist Canon. In Proc. Digital Humanities … In Proc. 10th Conference on Computational Natural Language Learning (CoNLL-X). [21] M. Mintz, S. Bills, R. Snow, and D. Jurafsky. 2009 …

A Research on Generative Adversarial Networks Applied to Text Generation
C Zhang, C Xiong, L Wang – 2019 14th International …, 2019 – ieeexplore.ieee.org
… NIPS workshop on Adversarial Training, 2016. [3] Reed S, Akata Z, Yan X, Logeswaran L, Schiele B, Lee H. Generative adversarial text to image synthesis … 2017. [8] Rajeswar S, Subramanian S, Dutil F, Pal C, Courville A. Adversarial generation of natural language …

Annotation efficient cross-modal retrieval with adversarial attentive alignment
PY Huang, G Kang, W Liu, X Chang… – Proceedings of the 27th …, 2019 – dl.acm.org
… We consider learning under a sparsely annotated parallel corpus with abundant un-annotated images and limited (image, natural language sentence) pairs. (Right) Performance degeneration of state-of-the-art cross-modal retrieval models in the text-to-image retrieval task …

MemeFaceGenerator: Adversarial Synthesis of Chinese Meme-face from Natural Sentences
Y Chen, Z Wang, B Wu, M Li, H Zhang, L Ma… – arXiv preprint arXiv …, 2019 – arxiv.org
… erating images or emojis according to the given natural language descriptions via the End-to-End learning becomes achievable (Reed et al … Based on the current progress on text to image generation models, we propose a GAN architecture with the attention module named …

Multimodal image news article alignment
H Jeyaram, M Hasanuzzaman, I Calixto, A Way – 2019 – doras.dcu.ie
… Euronews Image to Text Text to Image Method Our model Our model Recall@K = 15 13.0 10.1 Recall@K = 25 17.0 14.2 Recall@K = 50 26.0 27.0 … vol. 1 (2010) 5. Calixto, I., Liu, Q.: Sentence-level multilingual multi-modal embedding for natural language processing …

Query is GAN: Scene Retrieval With Attentional Text-to-Image Generative Adversarial Network
R Yanagi, R Togo, T Ogawa, M Haseyama – IEEE Access, 2019 – ieeexplore.ieee.org
… Text-to-Image Generative Adversarial Network … In this paper, we try to solve this problem by utilizing a text-to-image Generative Adversarial Network (GAN), which has become one of the most attractive research topics in recent years …

Generating multiple objects at spatially distinct locations
T Hinz, S Heinrich, S Wermter – arXiv preprint arXiv:1901.00686, 2019 – arxiv.org
… One way to exert control over the image layout is by using natural language descriptions of the image, eg image captions, as shown by … We tested our approach with two commonly used architectures for text-to-image synthesis, namely the StackGAN (Zhang et al., 2017) and the …

High-Resolution Realistic Image Synthesis from Text Using Iterative Generative Adversarial Network
A Ullah, X Yu, A Majid, HU Rahman… – Pacific-Rim Symposium …, 2019 – Springer
… In this way, this research problem links two research topics of Natural language Processing (NLP) and Computer Vision (CV). This technology of the text-to-image synthesis has a huge demand for applications such as removing unlike objects in your photographs, producing …

A food dish image generation framework based on progressive growing gans
S Wang, H Gao, Y Zhu, W Zhang, Y Chen – International Conference on …, 2019 – Springer
… (10). The decoder is a natural language model with the condition on the output $h_{i}$ of the encoder … 3020–3028 (2017). 9. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis …

A Novel Image Captioning Method Based on Generative Adversarial Networks
Y Fan, J Xu, Y Sun, Y Wang – International Conference on Artificial Neural …, 2019 – Springer
… Li, S., Kulkarni, G., Berg, TL, et al.: Composing simple image descriptions using web-scale n-grams. In: Proceedings of the Fifteenth Conference on Computational Natural Language Learning, pp … Reed, S., Akata, Z., Yan, X., et al.: Generative adversarial text to image synthesis …

Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences
S Chen, B Liu, J Fu, R Song, Q Jin, P Lin, X Qi… – Proceedings of the 27th …, 2019 – dl.acm.org
… Most of them focus on single sentence to single image generation [1–3]. Reed et al. [1] propose to use conditional GAN with adversarial training of a generator and a discriminator to improve text-to-image generation ability. Zhang et al …

A New End-to-End Long-Time Speech Synthesis System Based on Tacotron2
R Liu, J Yang, M Liu – Proceedings of the 2019 International Symposium …, 2019 – dl.acm.org
… CCS Concepts • Computing methodologies → Natural language processing → Speech Recognition and Synthesis … design three discriminator with reference to the GAN structure in SPSS (Statistical Parameter Speech Synthesis)[13][14] and the GAN structure in text to image[15][16 …

From knowledge map to mind map: Artificial imagination
R Liu, B Chen, X Guo, Y Dai, M Chen… – … IEEE Conference on …, 2019 – ieeexplore.ieee.org
… 7] T. Xu, P. Zhang, Q. Huang, H. Zhang, Z. Gan, X. Huang, and X. He, “Attngan: Fine-grained text to image generation with … C. Manning, “Glove: Global vectors for word representation,” in Proceedings of the 2014 conference on empirical methods in natural language processing …

Incremental Alignment of Metaphoric Language Model for Poetry Composition
M Oita – Intelligent Computing-Proceedings of the Computing …, 2019 – Springer
… In: Proceedings of the 10th International Conference on Natural Language Generation, pp. 11–20 (2017). 29 … Sharma, S., Suhubdy, D., Michalski, V., Kahou, SE, Bengio, Y.: Chatpainter: improving text to image generation using dialogue …

Is This an Example Image?–Predicting the Relative Abstractness Level of Image and Text
C Otto, S Holzki, R Ewerth – European Conference on Information Retrieval, 2019 – Springer
… a multi-sensory fusion network, which improves the comparability of heterogeneous media features and is therefore well suited for image-to-text and text-to-image retrieval … In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp …

Investigating GAN and VAE to Train DCNN
S Ezekiel, L Pearlstein, AA Alshehri, A Lutz… – International Journal of …, 2019 – ijmlc.org
… CNNs have shown remarkable performance in machine vision tasks such as image classification, natural language processing and speech recognition. There is evidence that the depth of a CNN plays an important role in performance of CNNs …

Jointly Learning of Visual and Auditory: A New Approach for RS Image and Audio Cross-Modal Retrieval
M Guo, C Zhou, J Liu – IEEE Journal of Selected Topics in …, 2019 – ieeexplore.ieee.org
… basically categorized into two types, text-to-image retrieval [1]–[4] and image-to-image retrieval [5]–[8]. The text-to-image based approaches highly depend on the availability and the quality of … [22] introduced a multimodal embedding of visual and natural language data to …

Image Generation and Recognition (Emotions)
H Carlsson, D Kollias – arXiv preprint arXiv:1910.05774, 2019 – arxiv.org
… [15] in 2014. GANs are applicable for both semi-supervised and unsupervised learning tasks [11], and they have achieved impressive results in various image generation tasks, such as: image-to-image synthesis, text-to-image synthesis, and image super-resolution [66] …

Learning to Learn Words from Narrated Video
D Surís, D Epstein, H Ji, SF Chang… – arXiv preprint arXiv …, 2019 – arxiv.org
… Visual language modeling: Recent advances in natural language processing have yielded neural language models able to learn from large amounts of text, such as BERT [10], ELMo [35] and GPT [37], which have achieved state-of-the art results on a variety of linguistic tasks …

Convolutional Neural Networks in Predicting Missing Text in Arabic
A Souri, A Zbakh, M Alachhab, B Elmohajir – 2019 – hal.archives-ouvertes.fr
… Abstract—Missing text prediction is one of the major concerns of Natural Language Processing deep learning community’s attention … Keywords—Natural Language Processing; Convolutional Neural Networks; deep learning; Arabic language; text prediction; text generation …

Robust change captioning
DH Park, T Darrell, A Rohrbach – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… Besides, our task is not only to detect the changes, but also to describe them in natural language, going beyond the discussed … for various vision and language tasks, eg visual question answering [26], referring expression comprehension [22, 34], text-to-image generation [13 …

Obj-glove: Scene-based contextual object embedding
C Xu, Z Chen, C Li – arXiv preprint arXiv:1907.01478, 2019 – arxiv.org
… The text-to-image synthesis problem is split by Reed et al. [21] into two sub-problems: learning a joint embedding between natural language and images and train a deep convolutional generative adversarial network (GAN) to synthesise realistic images. Dong et al …

Story-oriented Image Selection and Placement
S Nag Chowdhury, S Razniewski… – arXiv preprint arXiv …, 2019 – pure.mpg.de
… Generation of natural language image descriptions is a popular problem at the intersection of computer vision, natural language processing, and artificial intelligence [1]. Alignment of image regions to textual concepts is a prerequisite for generating captions …

Hierarchical Image-to-image Translation with Nested Distributions Modeling
S Qiao, R Wang, S Shan, X Chen – 2019 – openreview.net
… appearance from an image. In natural language processing, Athiwaratkun & Wilson (2018) propose a probabilistic word embedding method to capture the semantics described by the WordNet hierarchy. Our method first introduces …

Progressive Semantic Image Synthesis via Generative Adversarial Network
K Yue, Y Li, H Li – 2019 IEEE Visual Communications and …, 2019 – ieeexplore.ieee.org
… 1316–1324. [7] W. Li, P. Zhang, L. Zhang, Q. Huang, X. He, S. Lyu, and J. Gao, “Object-driven text-to-image synthesis via … 988–993. [11] S. Nam, Y. Kim, and SJ Kim, “Text-adaptive generative adversarial networks: Manipulating images with natural language,” in Advances in …

IMAGETOTEXT: IMAGE CAPTION GENERATION USING HYBRID RECURRENT NEURAL NETWORK
MDA JISHAN – 2019 – academia.edu
… This BNLIT (Bangla Natural Language Image To Text) dataset made for implementing … Zhe Gan, Xiaolei Huang, Xiaodong He, “AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks” [8] and Shikhar …

Generation of Image by Sentence Based on Impression Words in Image.
S Namisato, K Yokota, H Yamaoka, S Ooi, M Sano – JCP, 2019 – pdfs.semanticscholar.org
… Key words: Impression words, image processing, morphological analysis, text to image … Adversarial Network (AttnGAN) that can synthesize fine-grained details at different subregions of the image by paying attentions to the relevant words in the natural language description …

Good, Better, Best: Textual Distractors Generation for Multi-Choice VQA via Policy Gradient
J Lu, X Ye, Y Ren, Y Yang – arXiv preprint arXiv:1910.09134, 2019 – arxiv.org
… al. 2011; Vinyals et al. 2015), text to image synthesis (Reed et al. 2016), visual question answering (Antol et al. 2015) that combines natural language processing and computer vision has dramatically increased. Among …

CONTRIBUTION OF INTERNAL REFLECTION IN LANGUAGE EMERGENCE WITH AN UNDER-RESTRICTED SITUATION
K Todo, M Yamamura – 2019 – openreview.net
… Timothy Niven and Hung-Yu Kao. Probing neural network comprehension of natural language arguments. arXiv preprint arXiv:1907.07355, 2019. Stefano Nolfi and Marco Mirolli … Attngan: Fine-grained text to image generation with attentional generative adversarial networks …

Attention-Based GAN for Single Image Super-Resolution
D Huo, R Wang, J Ding – Chinese Conference on Image and Graphics …, 2019 – Springer
… performance in image-to-image translation [19, 20, 21, 22], image super-resolution [3, 4, 23] and text-to-image synthesis [24 … Attention mechanism has been widely used in Natural Language Processing (NLP), Image Recognition, Voice Recognition and any other types of Deep …

Connecting Vision and Language with Localized Narratives
J Pont-Tuset, J Uijlings, S Changpinyo… – arXiv preprint arXiv …, 2019 – arxiv.org
… Image / Text / Speech / Grounding / Task: In / Out / – / – / Image captioning [54, 58, 55]; Out / In / – / – / Text-to-image Generation [45, 50, 59]; In / Out / – / Out / Dense image captioning [23, 60, 27], Dense Relational Captioning [27]; In / Out / – / In / Controllable and Grounded Captioning [11]; In / In / – / Out / …

Learning Based Image and Video Editing
L Karacan – 2019 – openaccess.hacettepe.edu.tr
… Our methods produce competitive or better results against state-of-the-art methods on benchmark datasets quantitatively and qualitatively while providing simple high-level interactions such as natural language and visual attributes …

Recurrent Deconvolutional Generative Adversarial Networks with Application to Video Generation
H Yu, Y Huang, L Pi, L Wang – Chinese Conference on Pattern …, 2019 – Springer
… It can be solved by exploiting recent advances in areas of natural language processing and multimodal learning … In: NeurIPS (2016). 13. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis …

Learning Image Information for eCommerce Queries
U Porwal – Proceedings of the 24th Australasian Document …, 2019 – dl.acm.org
… Scott et al. [8] proposed a text to image synthesis approach using a deep convolutional generative adversarial network (DC-GAN) conditioned on text features. Later, Zhang et al … Queries are not descriptive in a traditional natural language sense …

Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling
H Zhang, B Chen, L Tian, Z Wang, M Zhou – arXiv preprint arXiv …, 2019 – arxiv.org
Published as a conference paper at ICLR 2020. Hao Zhang, Bo Chen, Long Tian, Zhengjue Wang, National Laboratory …

Visual Feature Enhancement Algorithm of Eco-landscape Image Based on Deep Learning
M Guo, M Xiao – Ekoloji, 2019 – ekolojidergisi.com
… Firstly, in terms of natural language processing, the analysis of user search terms failed to reach the semantic level … (Sudhakaran et al., 2018) This search not only supports text-to-image search, but also supports searching between images, because the same visual features of …

Generating Descriptive and Accurate Image Captions with Neural Networks
L Wu – 2019 – opus.lib.uts.edu.au
… Abstract: Image captioning is to automatically describe an image with a sentence, which is a topic connecting computer vision and natural language processing. Research on … Generally, the communication between humans is through the natural language. The …

Secure Multi-Modal Summarization using Machine Learning
G Mundada, P Nimonkar, R Kabra, R Sudke… – ijresm.com
… Whereas, abstractive methods build an internal semantic representation and then use natural language generation techniques to create a summary … [5] Amal Joshy, Amitha Baby KX, Padma S. and Fasila KA “Text to Image Encryption Technique using RGB Substitution and AES …

Adversarial inference for multi-sentence video description
JS Park, M Rohrbach, T Darrell… – Proceedings of the …, 2019 – openaccess.thecvf.com
… the popular ActivityNet Captions dataset. 1. Introduction: Being able to automatically generate a natural language description for a video has fascinated researchers since the early 2000s [27]. Despite the high interest in this …

Cscore: A Novel No-Reference Evaluation Metric for Generated Images
Y Zhang, Z Zhang, W Yu, N Jiang – Proceedings of the 2019 8th …, 2019 – dl.acm.org
… 1. INTRODUCTION In recent years, deep learning, which has been applied to image segmentation, image filling, natural language processing and many other fields, has made great progress. As a branch of image processing, text-to-image synthesis has made more and more …

From Intra-Modal to Inter-Modal Space: Multi-Task Learning of Shared Representations for Cross-Modal Retrieval
J Choi, M Larson, G Friedland… – 2019 IEEE Fifth …, 2019 – ieeexplore.ieee.org
… trieved ranked list. We report the mAP performance of both retrieval directions, image-to-text (I → T) and text-to-image (T → I). For video-text retrieval, we adopt rank-based metric, R@K, Median Rank and Mean Rank. R@K (Recall …

The Design Patent Images Classification Based on Image Caption Model
H Liu, Q Dai, Y Li, C Zhang, S Yi, T Yuan – International Conference on …, 2019 – Springer
… The image captioning is intended to automatically generate a natural language description of an image, which is one of the hot research areas … of which are reported in Table 2. The results of the high-level semantic classification are presented using the recall rate (text to image) …

Designovel’s system description for Fashion-IQ challenge 2019
J Li, J Lee, W Song, K Shin, B Go – arXiv preprint arXiv:1910.11119, 2019 – arxiv.org
… built the systems by combining methods from recent work on deep metric learning, multi-modal retrieval and natural language processing … We participated in the Fashion IQ Challenge 2019 by building image+text to image retrieval systems on fashion items in three pre-defined …

Dynamic Scene Creation from Text
H Kadiyala – 2019 – scholar.uwindsor.ca
… List of acronyms: MEL (Maya Embedded Language), NLP (Natural Language Processing), API (Application Programming Interface), VFX (Visual Effects) … PTSD (Post-Traumatic Stress Disorder), XML (Extended Markup Language), NLTK (Natural Language Tool Kit) …

A Food Dish Image Generation Framework Based on Progressive Growing GANs
Y Chen – … : Networking, Applications and Worksharing: 15th EAI …, 2019 – books.google.com
… $+\, C_z h_i)$ (8), $h_t = \tanh(W_d x_{t-1} + U_d (r_t h_{t-1}) + C h_i)$ (9), $h_t^{i+1} = (1 - z_t)\, h_{t-1} + z_t h_t$ (10). The decoder is a natural language model with the … 3020–3028 (2017) 9. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis …

Advances in Deep Generative Modeling With Applications to Image Generation and Neuroscience
G Loaiza Ganem – 2019 – academiccommons.columbia.edu
Gabriel Loaiza Ganem. Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences …

Towards Generating Remote Sensing Images of the Far Past
MB Bejiga, F Melgani – IGARSS 2019-2019 IEEE International …, 2019 – ieeexplore.ieee.org
… The problem of text-to-image synthesis requires to combine two types of data, text and image … objective is utilizing natural language processing (NLP) methods to learn a function that maps a single/multiple sentence text description into a feature vector. Fig …

PororoGAN: An Improved Story Visualization Model on Pororo-SV Dataset
G Zeng, Z Li, Y Zhang – Proceedings of the 2019 3rd International …, 2019 – dl.acm.org
… RELATED WORK The most relevant to Story-Visualization task is conditional text-to-image transformation [1-4], which commonly generates natural scene images from natural language descriptions by combining recurrent neural networks and Generative Adversarial Networks …

Storygan: A sequential conditional gan for story visualization
Y Li, Z Gan, Y Shen, J Liu, Y Cheng… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Learning to generate meaningful and coherent sequences of images from a natural language story is a challenging task that requires understanding and reasoning on both … This task is highly related to text-to-image generation [35, 28, 17, 36, 34], where an image is generated …

Pose-Guided Image Generation Using Generative Adversarial Networks
H Alqahtani, M Kavakli-Thorne, Z Hussain – researchgate.net
… Rajeswar, S., Subramanian, S., Dutil, F., Pal, C., Courville, A.: Adversarial Generation of Natural Language (may 2017), http://arxiv.org/abs/1705.10929 34. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative Adversarial Text to Image Synthesis (may …

Color Extraction from Lyrics
G Hori – Proceedings of the 2019 4th International Conference …, 2019 – dl.acm.org
… has been reported that recently developed methods exploiting word2vec outperform previous methods in many natural language processing tasks … One direction is to employ recently developed text-to-image techniques based on GAN (generative adversarial network) such as …

A System of Associated Intelligent Integration for Human State Estimation
A Matsufuji, WF Hsieh, E Sato-Shimokawara… – davidpublisher.com
… learning. This method converted image to text. In contrast, text2image [20] converted text to image. As application, visual question answering [21] used natural language processing for conversation related to image information. But …

Relationship-Aware Spatial Perception Fusion for Realistic Scene Layout Generation
H Zheng, Y Bai, W Zhang, T Mei – arXiv preprint arXiv:1909.00640, 2019 – arxiv.org
… Abstract: The significant progress on Generative Adversarial Networks (GANs) has made it possible to generate surprisingly realistic images for single object based on natural language descriptions. However, controlled …

Automatic Generation of Photorealistic Image Fillers for Privacy Enabled Urban Basemaps using Generative Adversarial Networks
A Agoub, Y Filippovska, V Schmidt, M Kada – Proceedings of the 29th …, 2019 – d-nb.info
… 1570-1579). Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (almost) from scratch … Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., & Lee, H. (2016). Generative Adversarial Text to Image Synthesis …

Reconstructed similarity for faster GANs-based word translation to mitigate hubness
D Zhang, M Luo, F He – Neurocomputing, 2019 – Elsevier
… classification [2], etc. Then, considerable researches focus on word-level semantic representation [3], [4], which effectively promotes the performances of various Natural Language Processing (NLP) tasks. In word representation …

Visual question answering via Attention-based syntactic structure tree-LSTM
Y Liu, X Zhang, F Huang, X Tang, Z Li – Applied Soft Computing, 2019 – Elsevier
… sentence is issued for the corresponding image [8]. By considering the correlations between the image and the question sentence, a natural-language answer is produced. Similar with other visual-language tasks such as image captioning [9], [10] and text-to-image retrieval [11 …

Mappa mundi: an interactive artistic mind map generator with artificial imagination
R Liu, B Chen, M Chen, Y Wu, Z Qiu, X He – arXiv preprint arXiv …, 2019 – arxiv.org
… Attngan: Fine-grained text to image generation with attentional generative adversarial networks … In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 670–680, 2014.

Bots Work Better than Human Beings: An Online System to Break Google’s Image-based reCaptcha v2
MI Hossen, Y Tu, MF Rabby, MN Islam, H Cao, X Hei – regmedia.co.uk
Md Imran Hossen, Yazhou Tu, Md Fazle Rabby, Md Nazmul Islam (University of Louisiana at Lafayette); Hui Cao (Xi’an Jiaotong University) …

Generative Adversarial Networks for Image Synthesis
H Zhang – 2019 – search.proquest.com
… (Figures reproduced from [6, 7]) … super-resolution [18, 19] and text-to-image synthesis [1, 2, 5]. However, GANs are … a two-stage generative adversarial network architecture, StackGAN-v1, is proposed for text-to-image synthesis. The Stage-I GAN sketches the primitive shape and …

Bootstrapping Disjoint Datasets for Multilingual Multimodal Representation Learning
Á Kádár, G Chrupała, A Alishahi, D Elliott – arXiv preprint arXiv …, 2019 – arxiv.org
… In natural language processing, many approaches have been proposed that integrate visual information in the learning of word and sentence representations … The stopping criterion is the sum of text-to-image (T→I) and image-to-text (I→T) recall scores at ranks 1, 5 and 10 …

A Real-time Global Inference Network for One-stage Referring Expression Comprehension
Y Zhou, R Ji, G Luo, X Sun, J Su, X Ding, C Lin… – arXiv preprint arXiv …, 2019 – arxiv.org
… It aims to locating the target region in an image based on a natural language query, eg, “man in coat and pants … about 3.0 FPS (frame per second) [7], [33], [35], [38], which poses a huge obstacle to a lot of practical applications such as video surveillance and text-to-image retrieval …

An overview of deep learning in medical imaging focusing on MRI
AS Lundervold, A Lundervold – Zeitschrift für Medizinische Physik, 2019 – Elsevier

Baby steps towards few-shot learning with multiple semantics
E Schwartz, L Karlinsky, R Feris, R Giryes… – arXiv preprint arXiv …, 2019 – arxiv.org
… Building upon recent advances in few-shot learning with additional semantic information, we demonstrate that further improvements are possible by combining multiple and richer semantics (category labels, attributes, and natural language descriptions) …

Text2FaceGAN: Face Generation from Fine Grained Textual Descriptions
OR Nasir, SK Jha, MS Grover, Y Yu… – 2019 IEEE Fifth …, 2019 – ieeexplore.ieee.org
… from text. Keywords-Datasets, Generative Adversarial Networks, Text to Image, Facial Attributes, Face Generation I … encoding space. Natural language provides a generic interface to represent information on facial features. Hence …

Cross domain Image Transformation and Generation by Deep Learning
Y Song – 2019 – trace.tennessee.edu
… Many cross domain learning applications are listed here according to their source and target domains, • Audio2Text: For example, speech recognition in natural language processing aims to extract feature in the audio domain, and translate into words. It has been widely …

Sentence simplification from non-parallel corpus with adversarial learning
T Kawashima, T Takagi – 2019 IEEE/WIC/ACM International …, 2019 – ieeexplore.ieee.org
… CCS CONCEPTS • Computing methodologies → Natural language generation … 2.3 Style Transfer in Natural Language The style transfer task in the parallel corpus is often regarded as a monolingual machine translation task …

Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records
J Zhang, X Zhang, K Sun, X Yang… – … on Bioinformatics and …, 2019 – ieeexplore.ieee.org
… Index Terms—Phenotype Annotation, Unsupervised Learning, Natural Language Processing, Deep Learning, Electronic Health Records … expensive and impractical, automatic annotation techniques based on natural language processing (NLP) are demanded …

On Architectures for Including Visual Information in Neural Language Models for Image Description
M Tanti, A Gatt, KP Camilleri – arXiv preprint arXiv:1911.03738, 2019 – arxiv.org
… Text-to-image generation: generating an image based on a textual description (Xu et al., 2017; Mansimov et al., 2016) … The work in chapter 4 has been published in the International Conference on Natural Language Generation (Tanti et al., 2017) and in the Natural Language …

Generative design in Minecraft: chronicle challenge
C Salge, C Guckelsberger, MC Green… – arXiv preprint arXiv …, 2019 – arxiv.org
… To advance this aspect of our challenge we have therefore decided to introduce an optional bonus challenge, the Chronicle Competition, which adds the task of generating an explicit narrative, captured in natural language text … Generative adversarial text to image synthesis …

Referring Expression Comprehension with Semantic Visual Relationship and Word Mapping
C Zhang, W Li, W Ouyang, Q Wang, WS Kim… – Proceedings of the 27th …, 2019 – dl.acm.org
… to text to image task such as referring expression comprehension. In this paper, we design a novel referring expression comprehension network with semantic visual relationship module. 2.3 Word to Vector (word2vec) In early works of Natural Language Processing, researchers …

Multiview Deep Learning
S Sun, L Mao, Z Dong, L Wu – Multiview Machine Learning, 2019 – Springer
… Multiview deep learning has a wide range of applications such as natural language descriptions, multimedia content indexing and retrieval, and understanding human multiview behaviors during social interactions. Recently …

Improving What Cross-Modal Retrieval Models Learn Through Object-Oriented Inter-and Intra-Modal Attention Networks
PY Huang, X Chang, AG Hauptmann – Proceedings of the 2019 on …, 2019 – dl.acm.org
… multimedia retrieval. In text-to-image retrieval task (ie, searching images with natural language queries), a system needs to exploit both fine-grained intra-modal discrepancies and inter-modal dependencies. For instance, the …

Image generation from bounding box-represented semantic labels
C Liu, Z Yang, F Xu, JH Yong – Computers & Graphics, 2019 – Elsevier
… generation process. For example, DCGAN [6] applies GAN with class names as conditions. Reed et al. [25] used GAN as a bridge to generate images from detailed natural language descriptions. Reed et al. [26] also proposed …

Relation-aware graph attention network for visual question answering
L Li, Z Gan, Y Cheng, J Liu – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… Interdisciplinary area between language and vision, such as image captioning, text-to-image synthesis and visual question answering (VQA), has attracted rapidly … to be useful for the VQA task, but there still exists a significant semantic gap between image and natural language …

Hierarchical cross-modal talking face generation with dynamic pixel-wise loss
L Chen, RK Maddox, Z Duan… – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… These works have inspired us to use the facial landmarks to bridge audio with row pixel generation. Attention Mechanism Attention mechanism is an emerging topic in natural language tasks [20] and image/video generation task [26, 37, 22, 36]. Pumarola et al …

Toward Fusing Domain Knowledge with Generative Adversarial Networks to Improve Supervised Learning for Medical Diagnoses
FC Chang, JJ Chang, CN Chou… – 2019 IEEE Conference …, 2019 – ieeexplore.ieee.org
… (Application: text to image.) • Pixel-to-Pixel GAN … 2) Encoding knowledge into GANs: We can convey to GANs knowledge about the information to be modeled via the knowledge layers/structures and/or via the knowledge graph/dictionary using natural language processing …

Generative adversarial networks in computer vision: A survey and taxonomy
Z Wang, Q She, TE Ward – arXiv preprint arXiv:1906.01529, 2019 – arxiv.org
… GANs have been applied to various domains such as computer vision [25, 48, 55, 62, 72, 88, 96, 107], natural language processing [17 … of different applications eg, image to image transfer [108], image super resolution [50], image completion [37], and text-to-image generation [78 …

Narrative Text Generation via Latent Embedding from Visual Stories
??? – 2019 – s-space.snu.ac.kr
… Those agents observe the surroundings, translate them into the story in natural language, and predict the following event or multiple ones sequentially … From the philosophy of end-to-end training, we can describe it using natural language, which is human-readable …

Developing the Bangladeshi National Corpus-a Balanced and Representative Bangla Corpus
KMA Salam, M Rahman… – … Technologies for Industry …, 2019 – ieeexplore.ieee.org
… How to develop universal vocabularies using automatic generation of the meaning of each word. 7th International Conference on Natural Language Processing and Knowledge Engineering. IEEE … Text to image generative model using constrained embedding space mapping …

Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task
A Mohammadshahi, R Lebret, K Aberer – arXiv preprint arXiv:1910.03291, 2019 – arxiv.org
… We show that our approach enables better generalization, achieving state-of-the-art performance in text-to-image and image-to-text retrieval task, and caption-caption similarity task … (table columns: Image to Text R@1, R@5, R@10, Mr; Text to Image R@1, R@5, R@10, Mr) Alignment …

Adversarial Attacks and Defense on Deep Learning Models for Big Data and IoT
N Nami, M Moh – Handbook of Research on Cloud Computing and …, 2019 – igi-global.com
… Artificial assistants like Siri, Bixby, Alexa, and Google assistant extensively use deep learning for effective natural language processing and … learning and are used in style transfer, image transposition, transforming low-resolution images to high resolution, text to image and text …

Unsupervised multi-modal neural machine translation
Y Su, K Fan, N Bach, CCJ Kuo… – Proceedings of the …, 2019 – openaccess.thecvf.com
… many natural language processing (NLP) tasks as well, such as image caption [27] and some task-specific translation, e.g. sign language translation [6]. However, [24] demonstrates that most multi-modal translation algorithms are not significantly better than an off-the-shelf text …

Self-supervised Adversarial Hashing Cross-modal Retrieval with Generative Models Based on Attention Mechanism
J XIAO, Z ZHOU, X ZHOU – DEStech Transactions on …, 2019 – dpi-proceedings.com
… The specific overview is as follows: (1) text-to-image: by giving a natural language query, the given candidate image is sorted; (2) image-to-text: Given an image query, the predefined candidate text sentences are sorted. Implementation Details …

Retro-remote sensing: Generating images from ancient texts
MB Bejiga, F Melgani, A Vascotto – IEEE Journal of Selected …, 2019 – ieeexplore.ieee.org
… Fig. 1. General block diagram of the proposed method. II. METHODOLOGY The problem of text-to-image synthesis (see Fig. 1) combines two heterogeneous data types: text and image … The field of natural language processing provides a wide range of text encoding techniques …

Learning DALTS for cross-modal retrieval
Z Yu, W Wang – CAAI Transactions on Intelligence Technology, 2019 – ieeexplore.ieee.org
… Therefore, natural language is inherently divergent … In this section, we report experimental results for cross-modal retrieval including image-to-text retrieval (Img2Text) and text-to-image retrieval (Text2Img) on the benchmark Flickr8K, Flickr30K and MSCOCO datasets …

Manipulating Attributes of Natural Scenes via Hallucination
L Karacan, Z Akata, A Erdem, E Erdem – ACM Transactions on Graphics …, 2019 – dl.acm.org
… 2019] generates high-quality, high-resolution images conditioned on visual classes in ImageNet. Reed et al. [2016a, 2016b] generate images using natural language descriptions; Antipov et al. [2017] follow similar pipelines to edit a given facial appearance based on age …

Image steganography based on foreground object generation by generative adversarial networks in mobile edge computing with Internet of Things
Q Cui, Z Zhou, Z Fu, R Meng, X Sun, QMJ Wu – IEEE Access, 2019 – ieeexplore.ieee.org
… The architecture of MC-GAN inherits StackGAN [25], whose outputs are improved in images details. As the state-of-the-art work of text to image translation, StackGAN generates realistic images with higher inception score. The …

Tell, draw, and repeat: Generating and modifying images based on continual linguistic instruction
A El-Nouby, S Sharma, H Schulz… – Proceedings of the …, 2019 – openaccess.thecvf.com
… There has also been recent work in performing recurrent image generation outside of text-to-image generation tasks. Yang et al … Corresponding to every scene, there is a conversation between a Teller and a Drawer (both Amazon Mechanical Turk workers) in natural language …

Multi-interactive memory network for aspect based multimodal sentiment analysis
N Xu, W Mao, G Chen – Proceedings of the AAAI Conference on Artificial …, 2019 – aaai.org
… a co-memory attentional mechanism to interactively model the interaction between text and image. Their model considers the influence of one modality to another (ie text to image and image to text) and achieves better performance than other related methods …

Polysemous visual-semantic embedding for cross-modal retrieval
Y Song, M Soleymani – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… [14] learn the embeddings with image-to-text and text-to-image synthesis tasks in the adversarial learning framework … (table header: Method; 1K Test Images: Image-to-Text, Text-to-Image; 5K Test Images: Image-to-Text, Text-to-Image) …

Adversarial training in affective computing and sentiment analysis: Recent advances and perspectives
J Han, Z Zhang, B Schuller – IEEE Computational Intelligence …, 2019 – ieeexplore.ieee.org
… the former mainly relates to instantaneous emotional expressions and is more commonly associated with speech or image/video processing, the later mainly relates to longer-term opinions or attitudes and is more commonly associated with natural language processing …

Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations
PY Huang, X Chang, A Hauptmann – arXiv preprint arXiv:1910.00058, 2019 – arxiv.org
… 4.4 Qualitative Results and Grounding In Figure 2 we samples some qualitative multilingual text-to-image matching results … 2017. Multilingual multi-modal embeddings for natural language processing. arXiv preprint arXiv:1702.01101. David Chen and William Dolan. 2011 …

Reinforced cross-media correlation learning by context-aware bidirectional translation
Y Peng, J Qi – IEEE Transactions on Circuits and Systems for …, 2019 – ieeexplore.ieee.org
… B. Neural Machine Translation Machine translation is a classical research topic in natural language process (NLP), which aims to establish a corresponding relationship … While the other text-to-image pathway translates the text representation st p back to image on the contrary …

Object grounding via iterative context reasoning
L Chen, M Zhai, J He, G Mori – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… 1. Introduction Modern artificial intelligence systems focus heavily on extracting knowledge from visual and textual information captured from the real world. Promising progress has been made within the computer vision and natural language processing research communities …

Survey on Deep Neural Networks in Speech and Vision Systems
M Alam, MD Samad, L Vidyaratne, A Glandon… – arXiv preprint arXiv …, 2019 – arxiv.org
… Index Terms—Vision and speech processing, computational intelligence, deep learning, computer vision, natural language processing, hardware constraints, embedded systems, convolutional neural networks, deep auto-encoders, recurrent neural networks …

Development of CNN and its Application in Education and Learning
Y Zhou, Y Su – 2019 – webofproceedings.org
… 2.2 Application of CNN in the field of education CNN is mostly applied in computer vision and natural language processing … In further development, devices that help blind people see the world could start with this principle. Another area of extension is text-to-image …

Self-Attentional Models Application in Task-Oriented Dialogue Generation Systems
M Saffar Mehrjardi – 2019 – era.library.ualberta.ca
… Abstract: Dialogue generation systems (chatbots) are currently one of the most noted topics in natural language processing, and many companies are investing in … guage understanding, user simulator, and natural language generation com …

Listening While Speaking and Visualizing: Improving ASR Through Multimodal Chain
J Effendi, A Tjandra, S Sakti… – 2019 IEEE Automatic …, 2019 – ieeexplore.ieee.org
… systems. In this research, we constructed the first framework that accommodates triangle modality (speech, text, and image) and addressed the problems of speech-to-text, text-to-speech, text-to-image, and image-to-text. Our …

Pathologist-level interpretable whole-slide cancer diagnosis with deep learning
Z Zhang, P Chen, M McGough, F Xing… – Nature Machine …, 2019 – nature.com
… pathologists. Specifically, the system generates natural language descriptions of microscopic findings (diagnostic tissue cell and nucleus characteristics), whose structures conform to the clinical pathology report standard. The …

Multi-negative samples with Generative Adversarial Networks for image retrieval
R Li, X Zhang, G Chen, Y Mao, X Wang – Neurocomputing, 2019 – Elsevier

iComposer: An Automatic Songwriting System for Chinese Popular Music
HP Lee, JS Fang, WY Ma – Proceedings of the 2019 Conference of the …, 2019 – aclweb.org
… Inspired by work on text-to-image synthesis and image caption generation, we propose iComposer, a simple and effective bi-directional songwriting system that … In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1919–1924 …

Intrusion Detection System Using Deep Learning and its Application to Wi-Fi Network
K KIMt – IEICE TRANS, 2019 – caislab.kaist.ac.kr
… This paper presents the state-of-the-art advances and challenges in IDS using deep learning models, which have been achieved the big performance enhancements in the field of computer vision, natural language processing, and image/audio processing than the traditional …

Generative adversarial network in medical imaging: A review
X Yi, E Walia, P Babyn – Medical image analysis, 2019 – Elsevier

Face Attribute Transformation Based On ConStarGAN
Q Zhang, J Du, J Yu – 2019 6th International Conference on …, 2019 – ieeexplore.ieee.org
… A. Generative Adversarial Networks (GANs) Generative adversarial network (GAN) [7] is a powerful generation model, which has achieved good results in many computer vision tasks and natural language processing tasks … Generative adversarial text to image synthesis …

Compact scene graphs for layout composition and patch retrieval
S Tripathi, S Nittur Sridhar… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Natural language representations such as captions require overcoming their inherently linear ordering to infer relationships … [7] S. Hong, D. Yang, J. Choi, and H. Lee. Inferring semantic layout for hierarchical text-to-image synthesis …

Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss
L Chen, RK Maddox, Z Duan, C Xu – arXiv preprint arXiv:1905.03820, 2019 – arxiv.org
… These works have inspired us to use the facial landmarks to bridge audio with row pixel generation. Attention Mechanism Attention mechanism is an emerging topic in natural language tasks [20] and image/video generation task [26, 37, 22, 36]. Pumarola et al …

CM-GANs: Cross-modal generative adversarial networks for common representation learning
Y Peng, J Qi – ACM Transactions on Multimedia Computing …, 2019 – dl.acm.org
Page 1. 22 CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning YUXIN PENG and JINWEI QI, Peking University, China It is known that the inconsistent distributions and representations …

Visual-textual sentiment classification with bi-directional multi-level attention networks
J Xu, F Huang, X Zhang, S Wang, C Li, Z Li… – Knowledge-Based …, 2019 – Elsevier
… sentiment classification. Specifically, a visual attention network is proposed first to focus on emotional image regions related to the corresponding text description, which is denoted as the text-to-image attention. To excavate …

A New Method of System-level EMC Evaluation Based on Deep Learning
Y Wang, C Li, H Shan, D Zhang… – 2019 Photonics & …, 2019 – ieeexplore.ieee.org
… speech recognition [1,2], the simultaneous interpretation is realized by using deep learning [3], in the field of natural language processing to … Processing Systems, 2672–2680, 2014 7. Reed, S., Z. Akata, X. Yan, et al., “Generative adversarial text to image synthesis,” Proceedings …

An empirical study of spatial attention mechanisms in deep networks
X Zhu, D Cheng, Z Zhang, S Lin… – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… than on irrelevant parts. They were first studied in natural language processing (NLP), where encoder-decoder attention modules were developed to facilitate neural machine translation [2, 28, 14]. In computing the output …

Neural Story Teller using RNN and Generative Algorithm
MS Rathore, S Patel, MP Rao, AP Sathe – 2019 – academia.edu
… For text-to-image alignment, [7] find correspondences between nouns and pronouns in a caption and visual objects using several visual and … MS COCO dataset made it possible to use computer vision and NLP (Natural Language Processing) to carry out the task of image …

HUSE: Hierarchical Universal Semantic Embeddings
P Narayana, A Pednekar, A Krishnamoorthy… – arXiv preprint arXiv …, 2019 – arxiv.org
Page 1. HUSE: Hierarchical Universal Semantic Embeddings Pradyumna Narayana1, Aniket Pednekar1, Abishek Krishnamoorthy?2, Kazoo Sone1, Sugato Basu1 1Google, 2Georgia Institute of Technology {pradyn,aniketvp,sone,sugato}@google.com, akrishna61@gatech.edu …

Coloring with limited data: Few-shot colorization via memory augmented networks
S Yoo, H Bahng, S Chung, J Lee… – Proceedings of the …, 2019 – openaccess.thecvf.com
… Key-value memory networks for directly reading documents. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016. 2 [18] M. Mirza and S. Osindero … Generative adversarial text-to-image synthesis …

Layoutvae: Stochastic scene layout generation from a label set
AA Jyothi, T Durand, J He, L Sigal… – Proceedings of the …, 2019 – openaccess.thecvf.com
Page 1. LayoutVAE: Stochastic Scene Layout Generation From a Label Set Akash Abdu Jyothi1,3, Thibaut Durand1,3, Jiawei He1,3, Leonid Sigal2,3, Greg Mori1,3 1Simon Fraser University 2University of British Columbia 3Borealis AI {aabdujyo, tdurand, jha203}@sfu.ca …

Univse: Robust visual semantic embeddings via structured semantic representations
H Wu, J Mao, Y Zhang, Y Jiang, L Li, W Sun… – arXiv preprint arXiv …, 2019 – arxiv.org
… Researchers have designed various types of representations (Banarescu et al., 2013; Montague, 1970) as well as different models (Liang et al., 2013; Zettlemoyer and Collins, 2005) for translating natural language sentences into structured representations …

User Input Based Style Transfer While Retaining Facial Attributes
S Pai, N Sachdeva, R Shah… – 2019 IEEE Fifth …, 2019 – ieeexplore.ieee.org
… FRGAN on a vast dataset such as CelebA, which contains 40 attributes, because of the nature of natural language queries an … Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee, “Generative adversarial text to image synthesis,” arXiv …

Text2Scene: Generating Compositional Scenes from Textual Descriptions Supplementary Material
F Tan, S Feng, V Ordonez – openaccess.thecvf.com
… Glove: Global vectors for word representation. In Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, 2014 … Attngan: Fine-grained text to image generation with attentional generative adversarial networks …

TAB-VCR: Tags and Attributes based VCR Baselines
J Lin, U Jain, A Schwing – Advances in Neural Information Processing …, 2019 – papers.nips.cc
… New tags for better text to image grounding … The set of unique class labels in ô is assigned to ˆL. Both q and r are modified such that all tags (pointers to detections in the image) are remapped to natural language (class label of the detection). This is done via the remap function …

Panet: A context based predicate association network for scene graph generation
Y Chen, Y Wang, Y Zhang, Y Guo – 2019 IEEE International …, 2019 – ieeexplore.ieee.org
… tion, namely relationship triplets in the images, which is utilized to improve other computer vision tasks such as image retrieval [1], text to image [2] and … In addition, attention mechanism is widely used in natural language processing (NLP) and other computer vision tasks …

TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
J Lin, U Jain, AG Schwing – arXiv, 2019 – papers.neurips.cc
… New tags for better text to image grounding … The set of unique class labels in ô is assigned to ˆL. Both q and r are modified such that all tags (pointers to detections in the image) are remapped to natural language (class label of the detection). This is done via the remap function …

Multimodal Learning with Triplet Ranking Loss for Visual Semantic Embedding Learning
Z Yang, L Li, J He, Z Wei, L Liu, J Liao – International Conference on …, 2019 – Springer
… 2.4 Word Embedding. Word embedding is one of the most breakouts in Natural Language Processing (NLP) domain … 4 Experiments. Consistent with UVSE [15], we perform the following query tasks: image annotation (image to text, i2t), and image search (text to image, t2i) …

Dermgan: Synthetic generation of clinical skin images with pathology
A Ghorbani, V Natarajan, D Coz, Y Liu – arXiv preprint arXiv:1911.08716, 2019 – arxiv.org
… success of supervised deep learning in many domains including computer vision (Mahajan et al., 2018), natural language processing (Devlin … GAN have been effectively used in many applications, including super resolution (Ledig et al., 2017), text-to-image generation (Zhang …

Clevr-dialog: A diagnostic dataset for multi-round reasoning in visual dialog
S Kottur, JMF Moura, D Parikh, D Batra… – arXiv preprint arXiv …, 2019 – arxiv.org
… CLEVR-Dialog goes beyond CLEVR-Ref+, which focuses on grounding objects given a natural language expression, and deals with additional visual and linguistic challenges that require multi-round reasoning in visual dialog …

A Brief Overview on Generative Adversarial Networks
R Patel – Data and Communication Networks, 2019 – Springer
… Whether it would be image recognition, understanding natural language, speech recognition or to learn and play games, computers are capable of performing … are in music and video generation, in image-to-image translation as shown by pix2pix and in text to image translation …

Audio-visual fusion for sentiment classification using cross-modal autoencoder
SH Dumpala, R Chakraborty… – 32nd Conference on …, 2019 – nips2018vigil.github.io
… [15] Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. Generative adversarial text to image synthesis … Text2shape: Generating shapes from natural language by learning joint embeddings. arXiv preprint arXiv:1803.08495, 2018 …

Recent progress on generative adversarial networks (GANs): A survey
Z Pan, W Yu, X Yi, A Khan, F Yuan, Y Zheng – IEEE Access, 2019 – ieeexplore.ieee.org
… For text-to-image translation, [59] and [60] used the textual description to generate images. B. NATURAL LANGUAGE PROCESSING At present, GANs also has some achievements in the field of language and speech processing. Yu et al …

SUN-Spot: An RGB-D Dataset With Spatial Referring Expressions
C Mauceri, M Palmer… – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… In ECCV, 2014. 3 [6] R. Hu et al. Natural Language Object Retrieval. In CVPR, 2016. 3 [7] A. Janoch et al … arXiv, 2014. 2 [11] C. Kong et al. What are you talking about? Text-to-Image Coreference. In CVPR, 2014. 2 [12] E. Krahmer and K. Van Deemter …

A review: Generative adversarial networks
L Gonog, Y Zhou – 2019 14th IEEE Conference on Industrial …, 2019 – ieeexplore.ieee.org
… 2) Text to Image: This field is the result of a collision between NLP (Natural Language Processing) and CV (Computer Vision). The task is described as: to generate a picture that matches the image text from a given textual description …

Improving Dataset Distillation
I Sucholutsky, M Schonlau – arXiv preprint arXiv:1910.02551, 2019 – arxiv.org
… 3 increase over the state-of-the-art (SOTA), but also achieves almost 92% accuracy with just 5 distilled images, which is less than 1 image per class. In addition to soft labels, we also extend dataset distillation to the natural language/sequence modelling domain …

Survey on generative adversarial networks
N Yashwanth, P Navya, M Rukhiya, KSV Prasad… – 2019 – academia.edu
… This GAN is a text to image synthesis GAN … [5] Tim Salimans Ian Goodfellow Vicki Cheung et al. Improved Techniques for Training GANs [6] Jon Gauthier, Symbolic Systems Program, Natural Language Processing Group Stanford University – Conditional generative adversarial …

Switching GAN-Based Image Filters to Improve Perception for Autonomous Driving
Z Masud – 2019 – uwspace.uwaterloo.ca
… This boom in deep learning research has also found application in various domains such as healthcare, natural language processing, finance … been used for a number of tasks including generating human faces [36], image-to-image translation [33], text-to-image translation [77 …

Modality Consistent Generative Adversarial Network for Cross-Modal Retrieval
Z Wu, F Wu, X Luo, X Dong, C Wang… – Chinese Conference on …, 2019 – Springer
… like canonical correlation analysis (CCA)-based methods [2]. Deep learning technology is widely used in image recognition, natural language processing and … In can be seen from the table that in both image to text and text to image retrieval tasks, GAN-based methods such as …

Photo-realistic face age progression/regression using a single generative adversarial network
J Zeng, X Ma, K Zhou – Neurocomputing, 2019 – Elsevier
… representations for high dimensional raw data, win high reputations in Computer Vision (CV) [36], [37], [41], Natural Language Processing (NLP … tasks, including image-to-image translation [25], [26], [27], [29], image super-resolution [45], [46] and text-to-image synthesis [47], [48] …

The Role of Media Conversion for Positive Learning
K Kise – Frontiers and Advances in Positive Learning in the Age …, 2019 – Springer
… However, with the help of generative adversarial networks (GAN) and its successors, it is possible to obtain natural pictures produced from a natural language phrase (Reed et al. 2016) … Generative adversarial text to image synthesis …

Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms
A Srivastava, HW Liu, S Fujita – … of the 28th ACM International Conference …, 2019 – dl.acm.org
… The most prevalent applications of works combining computer vision (CV), natural language processing (NLP), and knowledge representation & reasoning (KR) have been in image captioning and VQA … The second challenge is to correctly ground the text to image relation …

A state-of-the-art survey on deep learning theory and architectures
MZ Alom, TM Taha, C Yakopcic, S Westberg, P Sidike… – Electronics, 2019 – mdpi.com
… to traditional machine learning approaches in the fields of image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, natural language processing, cybersecurity, and …

Synthetic Generation of Clinical Skin Images with Pathology
A Ghorbani, V Natarajan, D Coz, Y Liu – ml4health.github.io
… success of supervised deep learning in many domains including computer vision (Mahajan et al., 2018), natural language processing (Devlin … GAN have been effectively used in many applications, including super resolution (Ledig et al., 2017), text-to-image generation (Zhang …

Beyond supervised learning: A computer vision perspective
L Chum, A Subramanian, VN Balasubramanian… – Journal of the Indian …, 2019 – Springer
… outcomes in machine learning. In the past few years, the advent of deep learning techniques has greatly benefited the areas of computer vision, speech, and Natural Language Processing (NLP). However, supervised deep …

Visual Dialog: Towards Communicative Visual Agents
S Kottur – 2019 – kilthub.cmu.edu
… (AI). Still, we are far from intelligent agents that can visually perceive their surroundings, reason, and interact with humans in natural language, thereby being an integral part of our lives … vision (CV), natural language processing (NLP), and AI in general, across a spectrum …

Fine-grained visual-textual representation learning
X He, Y Peng – IEEE Transactions on Circuits and Systems for …, 2019 – ieeexplore.ieee.org
… box. The textual information comes from [20]. They expand the CUB-200-2011 dataset by collecting fine-grained natural language descriptions. Ten single-sentence descriptions are collected for each image, as shown in Fig …

Deep Bayesian Active Learning for Multiple Correct Outputs
K Jedoui, R Krishna, M Bernstein, L Fei-Fei – arXiv preprint arXiv …, 2019 – arxiv.org
… VQA systems expect an input image and a natural language question and attempt to output the correct answer [3]. VQA has received a considerable amount of attention in recent years with the development of several datasets, proposed as benchmarks [3, 50, 27, 20, 39, 57 …

Finegan: Unsupervised hierarchical disentanglement for fine-grained object generation and discovery
KK Singh, U Ojha, YJ Lee – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
… Visual object discovery and clustering. Early work on unsupervised object discovery [41, 17, 42, 31, 32, 39] use handcrafted features to cluster object categories from unlabeled images. Others explore the use of natural language dialogue for object discovery [10, 59] …

Zero Shot License Plate Re-Identification
M Gupta, A Kumar… – 2019 IEEE Winter …, 2019 – ieeexplore.ieee.org
… 2. Related Work In this section, we briefly review previous work related to our proposed approach, in the domains of Face/Vehicle Re-Identification, Fisher Vectors, Text to Image Retrieval and Zero-Shot Learning. 2.1 … [23], who report on a natural language interface for attributes …

Common Semantic Representation Method Based on Object Attention and Adversarial Learning for Cross-Modal Data in IoV
F Kou, J Du, W Cui, L Shi, P Cheng… – IEEE Transactions on …, 2019 – ieeexplore.ieee.org
… seq2seq [23]. These text features are all successfully used in many natural language processing tasks [23], [26]. Notably, the sequential features perform well in the translation [27] and sentence summarization tasks [28]. The …

Multimodal Generative Models for Compositional Representation Learning
M Wu, N Goodman – arXiv preprint arXiv:1912.05075, 2019 – arxiv.org
… However, previous experiments have been limited to straightforward domains with simple images (eg MNIST) and labels. These models struggle with richer domains such as natural language or more naturalistic images (eg CIFAR) …

Justifying diagnosis decisions by deep neural networks
G Spinks, MF Moens – Journal of biomedical informatics, 2019 – Elsevier
… To evaluate the outcomes of both experiments, the quality of the outputs of the text-to-image GANs is compared by calculating … These results underline the necessity of creating a smooth representation space for natural language, and the ARAE embeddings are therefore used in …

Diverse Video Captioning Through Latent Variable Expansion with Conditional GAN
H Xiao, J Shi – arXiv preprint arXiv:1910.12019, 2019 – arxiv.org
… factors. In this paper we also pay attention to the latent variables, but we try to generate diverse captions from the video-level. GAN for natural language processing Generative adversarial network (GAN) (Goodfellow et al. 2014 …

The research of virtual face based on Deep Convolutional Generative Adversarial Networks using TensorFlow
S Liu, M Yu, M Li, Q Xu – Physica A: Statistical Mechanics and its …, 2019 – Elsevier
… learning framework TensorFlow [3] and it has been applied to image recognition, image segmentation, speech recognition, natural language processing and … used the DCGAN to create a text-to-image application [4], and the DCGAN will generate relevant images in terms of …

The Next Analytics Age in Cyber through Artificial Intelligence
S Kesharwani – CYBERNOMICS, 2019 – cybernomics.in
… Artificial Intelligence Predictive Text to Image Analytics Speech Recognition Deep Speech Machine | Learning to Text vision Machine … at the moment, robotics, image recognition, natural language processing, real-time analytics tools and various associated systems within the …

How generative adversarial networks and their variants work: An overview
Y Hong, U Hwang, J Yoo, S Yoon – ACM Computing Surveys (CSUR), 2019 – dl.acm.org
… SRGAN [65] Object detection SeGAN [28], Perceptual GAN for small object detection [69] Object transfiguration GeneGAN [144], GP-GAN [132] Joint image generation Coupled GAN [74] Video generation VGAN [125], Pose-GAN [126], MoCoGAN [122] Text to image Stack GAN …

Multimodal Sparse Representation Learning and Cross-Modal Synthesis
M Cha – 2019 – dash.harvard.edu
… 59 6.3 Training dataset only captures many-to-one mappings in the text-to-image direction (solid lines) … Specifically, my interest is to directly generate image pixels from natural language. Recently, generative adversarial network (GAN) [18] has been a promising approach for …

Solving Bayesian inverse problems from the perspective of deep generative networks
TY Hou, KC Lam, P Zhang, S Zhang – Computational Mechanics, 2019 – Springer
… Recent advances in machine learning community show that deep generative networks are effective in approximating various complex (even singular) distributions, eg, distributions of natural images [20, 30, 38] and those of natural language [23, 39], by serving as the transport …

Reinforcement learning with attention that works: A self-supervised approach
A Manchin, E Abbasnejad… – … Conference on Neural …, 2019 – Springer
… models were applied with remarkable success to complex visual tasks such as video and scene understanding [6, 10, 19], natural language understanding including … 24. Xu, T., et al.: AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks …

Learning Target-aware Attention for Robust Tracking with Conditional Adversarial Network (Supplementary Material)
X Wang, T Sun, R Yang, B Luo… – bmvc2019.org
… They adopt attention mechanism which is first proposed in natural language processing domain and achieve better tracking performance, but ignore the target object provided in first … Attngan: Fine-grained text to image generation with attentional generative adversarial networks …

Online heterogeneous transfer learning by knowledge transition
H Wu, Y Yan, Y Ye, H Min, MK Ng, Q Wu – ACM Transactions on …, 2019 – dl.acm.org
Page 1. 26 Online Heterogeneous Transfer Learning by Knowledge Transition HANRUI WU, YUGUANG YAN, YUZHONG YE, and HUAQING MIN, South China University of Technology, China MICHAEL K. NG, Hong Kong Baptist …

Brain encoding and decoding in fMRI with bidirectional deep generative models
C Du, J Li, L Huang, H He – Engineering, 2019 – Elsevier
… DNN-based deep learning methods have achieved great success in image recognition, speech recognition, natural language understanding, and … used in various applications, including image generation [53], image-to-image translation [54], and text-to-image synthesis [55], [56 …

Neural network modeling of in situ fluid-filled pore size distributions in subsurface shale reservoirs under data constraints
H Li, S Misra, J He – Neural Computing and Applications, 2019 – Springer
… LSTM applications include machine translation [18, 19], natural language generation [20], and time series prediction. Joint application of LSTM and CNN can accomplish the work of video description [21], image captioning [1], and text to image generation [22] …

Cross-modality personalization for retrieval
N Murrugarra-Llerena… – Proceedings of the IEEE …, 2019 – openaccess.thecvf.com
Page 1. Cross-Modality Personalization for Retrieval Nils Murrugarra-Llerena Adriana Kovashka Department of Computer Science University of Pittsburgh {nineil, kovashka}@cs.pitt.edu Abstract Existing captioning and gaze …

HSME: Hypersphere manifold embedding for visible thermal person re-identification
Y Hao, N Wang, J Li, X Gao – Proceedings of the AAAI Conference on …, 2019 – aaai.org
… (Ye et al. 2015) and Li et al. (Li et al. 2017b)(Li et al. 2017a) proposed a series of text-to-image person retrieval methods. However, these methods cannot be directly applied to VT-REID. In VT-REID, a two-stage framework is proposed in (Ye et al. 2018a), which …

Deep semantic mapping for heterogeneous multimedia transfer learning using co-occurrence data
L Zhao, Z Chen, LT Yang, MJ Deen… – ACM Transactions on …, 2019 – dl.acm.org
Page 1. 9 Deep Semantic Mapping for Heterogeneous Multimedia Transfer Learning Using Co-Occurrence Data LIANG ZHAO and ZHIKUI CHEN, Dalian University of Technology, China LAURENCE T. YANG, St. Francis Xavier …

Web Searching and Mining
D Mukhopadhyay – 2019 – Springer
Cognitive Intelligence and Robotics Web Searching and Mining Debajyoti Mukhopadhyay Editor Page 2. Cognitive Intelligence and Robotics Series editors Amit Konar, Department of Electronics and Tele-communication Engineering …

Revisiting Paraphrase Question Generator using Pairwise Discriminator
BN Patro, D Chauhan, VK Kurmi… – arXiv preprint arXiv …, 2019 – arxiv.org
… AD: Adversarial, CE:Cross Entropy,PA: Pairwise SA: Sentiment Analysis, RL: Reinforcement Learning, KL: KL divergence Loss Learning 2. Related Work Given the flexibility and diversity of natural language, it has been a challenging task to represent text efficiently …

An Exploration of Cross-Modal Retrieval for Unseen Concepts
F Zhong, Z Chen, G Min – … Conference on Database Systems for Advanced …, 2019 – Springer
… Two tasks ie text to image (T2I) and image to text (I2T), are designed to validate the proposed approach in handling “seen” and “unseen” cross-modal retrieval … In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp …

Front End Development Automation Tool: Missing Features?
HH Walpola, G Poravi – 2019 IEEE 5th International …, 2019 – ieeexplore.ieee.org
… [5] (2016) by Ling able to generate the source code from a mixed natural language. They use structured program specification as input … [15] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee. Generative adversarial text to image synthesis …

Attend to the difference: Cross-modality person re-identification via contrastive correlation
S Zhang, Y Yang, P Wang, X Zhang… – arXiv preprint arXiv …, 2019 – arxiv.org
… Cross Modality ReID. For cross modality person ReID, some endeavors are devoted on text-to-image person retrieval [16], [17], [18], whose approaches can not be directly transferred to RGB-IR ReID problem due to the obvious difference between the two tasks …

Stabilizing Generative Adversarial Network Training: A Survey
M Wiatrak, SV Albrecht – arXiv preprint arXiv:1910.00927, 2019 – arxiv.org
… An alternative to maximum likelihood approaches, GANs have been shown to achieve state-of-the-art results in the generation of images [2, 3, 4, 5], natural language [6] time-series synthesis [7, 8] and other domains [9, 10] … Generative adversarial text to image synthesis …

Adaptive appearance rendering
R Deng – 2019 – summit.sfu.ca
… large amount of computing resources. When these studies are combined with natural language processing, approaches have been proposed for text to image synthesis or image to text synthesis [24, 31]. Works [27, 7] have also …

Learning generative models across incomparable spaces
C Bunne, D Alvarez-Melis, A Krause… – arXiv preprint arXiv …, 2019 – arxiv.org
Page 1. Learning Generative Models across Incomparable Spaces Charlotte Bunne 1 David Alvarez-Melis 2 Andreas Krause 1 Stefanie Jegelka 2 Abstract Generative Adversarial Networks have shown remarkable success …

Extracting high-dimensional features from medical images by utilizing deep learning techniques.
E Trivizakis – 2019 – apothesis.lib.teicrete.gr
… seen significant benefits from the application of advanced machine learning techniques including image classification, text-to-image retrieval, object recognition, enhancement, registration, segmentation and generation. Regarding …

Cross-modal retrieval in challenging scenarios using attributes
T Dutta, S Biswas – Pattern Recognition Letters, 2019 – Elsevier

Modal-Dependent Retrieval Based on Mid-Level Semantic Enhancement Space
S Zheng, H Zhang, Y Qi, B Zhang – IEEE Access, 2019 – ieeexplore.ieee.org
… Modal-dependent [17] differs from the previous methods of learning a pair of projections by learning two pairs of mappings that project Image to Text retrieval(I2T) and Text to Image retrieval(T2I) from their original feature space into two public potential subspaces …

Robust and graph regularised non-negative matrix factorisation for heterogeneous co-transfer clustering
Y Ma, Z Chen, X Qiu, L Zhao – International Journal of …, 2019 – inderscienceonline.com
… In this case, heterogeneous transfer learning has been proposed to process heterogeneous data and has been successfully applied to text-to-image transferring and cross language classification problems (Yang et al., 2014; Ng et al., 2012) …

Multi-scale dilated convolution network based depth estimation in intelligent transportation systems
Y Tian, Q Zhang, Z Ren, F Wu, P Hao, J Hu – IEEE Access, 2019 – ieeexplore.ieee.org
Page 1. SPECIAL SECTION ON ARTIFICIAL INTELLIGENCE (AI)-EMPOWERED INTELLIGENT TRANSPORTATION SYSTEMS Received November 21, 2019, accepted December 10, 2019, date of publication December 18, 2019, date of current version December 31, 2019 …

Transfer Neural Trees: Semi-Supervised Heterogeneous Domain Adaptation and Beyond
WY Chen, TMH Hsu, YHH Tsai… – … on Image Processing, 2019 – ieeexplore.ieee.org
Page 1. 4620 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 28, NO. 9, SEPTEMBER 2019 Transfer Neural Trees: Semi-Supervised Heterogeneous Domain Adaptation and Beyond Wei-Yu Chen, Tzu-Ming Harry Hsu …

Multimodal Hate Speech Detection in Memes
B Oriol Sàbat – 2019 – upcommons.upc.edu
Page 1. Page 2. Multimodal Hate Speech Detection in Memes Degree’s Thesis Audiovisual Systems Engineering Author: Benet Oriol Sàbat Advisors: Xavier Giró-i-Nieto Cristian Canton Universitat Politècnica de Catalunya (UPC) 2019 Page 3. Abstract …

Classical Music Generation in Distinct Dastgahs with AlimNet ACGAN
S Malekzadeh, M Samami, S RezazadehAzar… – arXiv preprint arXiv …, 2019 – arxiv.org
… using deep neural networks over the last three years [3]. Recurrent neural networks (RNNs) with long short-term memory (LSTM) cells have illustrated great results both in generating natural language and hand … Reed, S., et al., Generative adversarial text to image synthesis …

Adversarial large-scale root gap inpainting
H Chen, M Valerio Giuffrida… – Proceedings of the …, 2019 – openaccess.thecvf.com
Page 1. Adversarial Large-scale Root Gap Inpainting Hao Chen University of Edinburgh s1786991@ed.ac.uk Mario Valerio Giuffrida University of Edinburgh v.giuffrida@ed.ac.uk Peter Doerner University of Edinburgh Peter.Doerner@ed.ac.uk …

Generative Models for Fashion Industry using Deep Neural Networks
I Lomov, I Makarov – 2019 2nd International Conference on …, 2019 – ieeexplore.ieee.org
… In [40], the authors apply GAN model to synthesize images using its descriptions on natural language … [30] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee. Generative Adversarial Text to Image Synthesis. ArXiv e-prints, May 2016 …

Generative adversarial networks with denoising penalty and sample augmentation
Y Gan, K Liu, M Ye, Y Zhang, Y Qian – Neural Computing and Applications, 2019 – Springer
… These methods can be roughly divided into four categories: noise-to-image [27, 30, 31], which is sampling from a noise distribution and takes the sampling as the input of generative model. Text-to-image [26, 32] encodes a text by an encoder …

Gastric cancer detection from endoscopic images using synthesis by GAN
T Kanayama, Y Kurose, K Tanaka, K Aida… – … Conference on Medical …, 2019 – Springer
… In [19], the authors dealt with the task of generating photographic images which were conditioned on image description expressed in natural language … Zhang, Z., Xie, Y., Yang, L.: Photographic text-to-image synthesis with a hierarchically-nested adversarial network …

Deep Learning–Based Multimedia Analytics: A Review
W Zhang, T Yao, S Zhu, AE Saddik – ACM Transactions on Multimedia …, 2019 – dl.acm.org
… in slots with the most likely labeling. For video captioning, [53] builds a concept hierarchy of actions for natural-language description of human activities. Later, in [106], Rohrbach et al. teach a CRF to model the relationships …

Latent translation: Crossing modalities by bridging generative models
Y Tian, J Engel – arXiv preprint arXiv:1902.08261, 2019 – arxiv.org
… Natural language processing has also recently seen significant progress through transfer learning of very large pretrained models (Devlin et al., 2018). However, easily combining multiple models together in a modular way is still an unsolved problem …

MULTI-VIEW REPRESENTATION LEARNING FOR UNIFYING LANGUAGES, KNOWLEDGE AND VISION
MSA Mogadala – core.ac.uk
Dissertation approved by the KIT Department of Economics and Management of the Karlsruhe Institute of Technology (KIT) for the academic degree of Doktor der Ingenieurwissenschaften (Dr.-Ing.), by M.Sc. Aditya Mogadala …

Stylistic scene enhancement GAN: mixed stylistic enhancement generation for 3D indoor scenes
S Zhang, Z Han, YK Lai, M Zwicker, H Zhang – The Visual Computer, 2019 – Springer
… voxel input to a latent space. Chen et al. [17] proposed a conditional Wasserstein GAN framework [9, 18]-based method to generate colored 3D shapes from natural language. While these methods focus on modeling images …

Tagging: Semantics at the Iconic/Symbolic Interface
G Greenberg – Proceedings of the, 2019 – events.illc.uva.nl
… Here a range of defeasible conventions may be invoked to indicate structural links: 1Such links are a sub-segmental variant of the text-to-image links posited by Alikhani and Stone (2018b) … Iconic pragmatics. Natural Language & Linguistic Theory, 36(3):877–936, 2018a …

Attentive Sequences Recurrent Network for Social Relation Recognition from Video
J Lv, B Wu, Y Zhang, Y Xiao – IEICE TRANSACTIONS on …, 2019 – search.ieice.org
… SUMMARY Recently, social relation analysis receives an increasing amount of attention from text to image data … It has a wide range of applications in Natural Language Processing (NLP) [15], [17] and video description [26], [27] …

Diachronic Cross-modal Embeddings
D Semedo, J Magalhães – Proceedings of the 27th ACM International …, 2019 – dl.acm.org
… 2011 2014 Text to Image over the years grandslam grasscourttennis roger federer tennis wimbledon championship jamiemurray jankovic match mixeddoubles tennis wimbledon 2014 bahrami england london mansour bahrami tennis wimbledon …

Discrete robust supervised hashing for cross-modal retrieval
T Yao, Z Zhang, L Yan, J Yue, Q Tian – IEEE Access, 2019 – ieeexplore.ieee.org

Brain Encoding and Decoding in fMRI with Bidirectional Deep Generative Models
D Changde, L Jinpeng, H Lijie, H Huiguang – 2019 – ir.ia.ac.cn
… recognition, natural language understanding and other aspects … GAN has been widely used in various applications, e.g., image generation [54], image-to-image translation [55] and text-to-image synthesis [56, 57] …

Knowledge Science, Engineering and Management: 12th International Conference, KSEM 2019, Athens, Greece, August 28–30, 2019, Proceedings
C Douligeris, D Karagiannis, D Apostolou – 2019 – books.google.com
… Yuqiang Xie, Yue Hu, Luxi Xing, and Xiangpeng Wei Automated Mining and Checking of Formal Properties in Natural Language Requirements … Contents–Part II Text to Image Synthesis Using Two-Stage Generation and Two-Stage Discrimination …

A review on Generative Adversarial Networks for unsupervised Machine Learning
PJ Paz Carbajo – 2019 – e-archivo.uc3m.es
Bachelor’s Degree in Telecommunication Technologies Engineering, Academic Year 2018/2019. Bachelor Thesis: “A review on Generative Adversarial Networks for unsupervised Machine Learning”. Pablo Javier de Paz Carbajo. Supervised by: Pablo Martínez Olmos …

of retracted article: Realization of Virtual Human Face Based on Deep Convolutional Generative
Z Zhu, X Deng, J Li, E Wei – Journal of Signal and Information Processing …, 2019 – scirp.org
… 3] used deep convolutional generative adversarial networks to create a text-to-image application in 2016 … framework TensorFlow [7], which has been applied to scenes such as image recognition, image segmentation, speech recognition, and natural language processing, and …

Bi-directional center-constrained top-ranking for visible thermal person re-identification
M Ye, X Lan, Z Wang, PC Yuen – IEEE Transactions on …, 2019 – ieeexplore.ieee.org
… For cross-modality person re-identification, several works have studied the text-to-image person retrieval problem [41]–[43] … For text-to-image retrieval, several end-to-end deep learning methods have been introduced on top of dual-path network …

GAN-based semi-supervised learning approach for clinical decision support in health-IoT platform
Y Yang, F Nan, P Yang, Q Meng, Y Xie, D Zhang… – IEEE …, 2019 – ieeexplore.ieee.org
… data. Such approach has been successfully applied to natural language processing [15], [16], disease prediction [17], image classification [18] [19], face recognition [20], action recognition [21] and person re-identification [22] …

GANai: standardizing CT images using generative adversarial network with alternative improvement
G Liang, S Fouladvand, J Zhang… – 2019 IEEE …, 2019 – ieeexplore.ieee.org
… image x by specifying a high-level goal that the image features of x are significantly more similar to that of the target image y than the source image x. Image synthesis algorithms have been successfully used in image conversion and natural language processing, such as the …

Deep neural network architectures to approximate the fluid-filled pore size distributions of subsurface geological formations
S Misra, H Li – Machine Learning for Subsurface Characterization, 2019 – books.google.com
… across various domains and easily adapted to new problems, unlike domain specific methods such as those traditional ones specialized for natural language processing and image … GANs have been successfully applied to image generation [15] and text to image synthesis [16] …

Deep IA-BI and Five Actions in Circling
L Xu – International Conference on Intelligent Science and Big …, 2019 – Springer
… Typical examples include language to language, text to image, text to sketch, sketch to image, image to image, 2D image to 3D … Those languages used in current computers are simplified from a natural language by adding restrictions, such that searching complexity becomes …

Automatic Cluster Analysis of Texts in Simplified German
B Alessia – 2019 – cl.uzh.ch
… LDA Latent Dirichlet Allocation LIX Läsbarhetsindex NER Named Entity Recognition NLP Natural Language Processing NLTK Natural Language Processing Toolkit NMT Neural Machine Translation NNMF Non-Negative Matrix Factorisation OCR Optical Character Recognition …

Semi-supervised Deep Quantization for Cross-modal Search
X Wang, W Zhu, C Liu – Proceedings of the 27th ACM International …, 2019 – dl.acm.org
… Given the success of deep representation in computer vision and natural language processing, we resort to a modified deep neural network structure to achieve a better feature representation for both image and text domain …

Transfer adaptation learning: A decade survey
L Zhang – arXiv preprint arXiv:1903.04687, 2019 – arxiv.org
… Sample Selection K-Means &l21-norm [104], [105] Co-training-Based Double classifiers [106], [107] 3.1 Intuitive Weighting Instance re-weighting based domain adaptation was first proposed for natural language processing (NLP) [98], [99] …

Neural machine translation for multimodal interaction
K Dutta Chowdhury – 2019 – doras.dcu.ie
… Introduction While the holy grail of full natural language understanding still remains a distant dream in artificial intelligence, progress is being made in developing machine learning algorithms to comprehend what humans are talking or writing in natural language …

‘The Future of Data-driven Content’, Data-driven Marketing Content
L Wilson, L Wilson – 2019 – emerald.com
… Natural Language Generation (NLG): another subfield of Computer Science and Artificial Intelligence, NLG is a program that turns data into language … Consumer movement from desktop to mobile and content type migration from text to image, audio and video all help shape the …

On the Impact of Machine Learning
C Belém, L Santos, A Leitão – researchgate.net
… Regarding textual representations, we can explore the ML techniques typically used in natural language processing tasks, such as, automatic … This technique is called text-to-image synthesis and there are examples, particularly in the translation of visual concepts from …

Disentanglement in conceptual space during sensorimotor interaction
J Zhong, T Ogata, A Cangelosi… – Cognitive Computation …, 2019 – ieeexplore.ieee.org
… 4 Discussion 4.1 Learning concepts by affordance learning The grounding theory in psychology suggests that the usage of natural language relies on situational context. To understand the language dependent on the physical environment and capture such Fig …

An advanced deep generative framework for temporal link prediction in dynamic networks
M Yang, J Liu, L Chen, Z Zhao… – IEEE Transactions on …, 2019 – ieeexplore.ieee.org
… Subsequently, the GANs [14] has dominated the research of image generation. Reed et al. [33] introduced a conditional GAN to synthesize images based on the detailed text descriptions, which was the first end-to-end text-to-image generation model. Van den Oord et al …

GANs for Children: A Generative Data Augmentation Strategy for Children Speech Recognition
P Sheng, Z Yang, Y Qian – 2019 IEEE Automatic Speech …, 2019 – ieeexplore.ieee.org
… prediction[17], sketch retrieval[18] etc., and then quickly applied to many other fields such as natural language processing [19 … Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee, “Generative adversarial text to image synthesis,” arXiv …

Machine Learning and Deep Learning frameworks and libraries for large-scale data mining: a survey
G Nguyen, S Dlugolinsky, M Bobák, V Tran… – Artificial Intelligence …, 2019 – Springer
… 2018). Recurrent Neural Networks (RNN) are specially designed to deal with sequential data. They are widely used in Natural Language Processing (NLP) like Neural Machine Translation (Wu et al … 2018), text-to-image synthesis (Zhang et al …

Deep Learning Applications in Chest Radiography and Computed Tomography
SM Lee, JB Seo, J Yun, YH Cho… – Journal of thoracic …, 2019 – ingentaconnect.com
… With the availability of large data sets and increased computing power, CNNs have produced promising results for many tasks including image classification, correct image detection, correct image segmentation, and understanding speech (e.g., natural language processing) …

THE BUILDING BLOCKS FOR OPEN ECOSYSTEMS OF ONLINE RESOURCES SERVING BUDDHIST COMMUNITIES
A Amies – Buddhism and the fourth industrial revolution, 2019 – books.google.com
… on the development of Buddhist resources and digital humanities in general: (1) the improvement of deep learning methods for processing natural language, (2) the … 2010. “From Text to Image to Analysis: Visualization of Chinese Buddhist Canon.” Digital Humanities 2010, 184 …

Applications of generative adversarial networks (gans): An updated review
H Alqahtani, M Kavakli-Thorne, G Kumar – Archives of Computational …, 2019 – Springer
Archives of Computational Methods in Engineering, https://doi.org/10.1007/s11831-019-09388-y. Original paper: Applications of Generative Adversarial Networks (GANs): An Updated Review …

Transferring multiscale map styles using generative adversarial networks
Y Kang, S Gao, RE Roth – International Journal of Cartography, 2019 – Taylor & Francis
… Latest AI technology advancements in the past decade include a range of deep learning methods developed primarily in computer science for image classification, segmentation, objection localization, style transfer, natural language processing, and so forth (Gatys, Ecker, & …

Cross-Modality Retrieval by Joint Correlation Learning
S Wang, D Guo, X Xu, L Zhuo, M Wang – ACM Transactions on …, 2019 – dl.acm.org
… https://doi.org/10.1145/3314577 1 INTRODUCTION Multimodal understanding is a challenging task that has received content from both computer vision and natural language processing [51] … extracted the HOG feature to solve the problem of text-to-image retrieval [36] …

Towards conceptual generalization in the embedding space
L Nenadović, V Prelovac – arXiv preprint arXiv:1906.01873, 2019 – arxiv.org
… states and laws So far, there are several methods developed in natural language processing (NLP) which can learn geometric representation of words, where the meaning of the words follows solely from their co-occurrence. The models are all based on …

Multimodal representation and learning
S Nawaz – 2019 – insubriaspace.cineca.it
… and 4. 2 Visual Word Embedding for Text. 2.1 Introduction. Text classification is a common task in Natural Language Processing. Its goal is to assign a label to a text document from a predefined set of classes. In recent …

Deep learning based unsupervised concept unification in the embedding space
L Nenadovic, V Prelovac – arXiv preprint arXiv:1906.01873, 2019 – researchgate.net
… Phrase-based & neural unsupervised machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 5039–5049, Brussels, Belgium, October-November 2018 … Generative adversarial text to image synthesis …

Zero-Shot Generation of Human-Object Interaction Videos
M Nawhal, M Zhai, A Lehrmann, L Sigal – arXiv preprint arXiv:1912.02401, 2019 – arxiv.org
Zero-Shot Generation of Human-Object Interaction Videos. Megha Nawhal (1,4), Mengyao Zhai (1,4), Andreas Lehrmann (4), Leonid Sigal (2,3,4). (1) Simon Fraser University, (2) University of British Columbia, (3) Vector Institute for AI, (4) Borealis AI. Abstract …

Spatial and temporal representations for multi-modal visual retrieval
N Garcia Docampo – 2019 – publications.aston.ac.uk

Deep multimodal representation learning: A survey
W Guo, J Wang, S Wang – IEEE Access, 2019 – ieeexplore.ieee.org
… 14], and text-to-image synthesis [15]. In recent years, due to the powerful representation ability with multiple levels of abstraction, deep learning has demonstrated outstanding results in various applications involving computer vision, natural language processing, and speech …

A novel two-stage separable deep learning framework for practical blind watermarking
Y Liu, M Guo, J Zhang, Y Zhu, X Xie – Proceedings of the 27th ACM …, 2019 – dl.acm.org
… Many variants of GAN were soaring, such as CGAN [28], WGAN [4], DCGAN [34], InfoGAN [7], which gave birth to the application of GAN in various image tasks. For example, text-to-image [35], image-to-image [21], image caption [9] were based on CGAN …

Multi-modal graph regularization based class center discriminant analysis for cross modal retrieval
M Zhang, H Zhang, J Li, Y Fang, L Wang… – Multimedia Tools and …, 2019 – Springer
… $$ r(U,V) = \frac{1}{2}\left( \left\| U \right\|_{F}^{2} + \left\| V \right\|_{F}^{2} \right) $$ (16). 3.3 Optimization algorithm. Based on (2), (3), (15) and (16), the objective functions of Image to Text (I2T) and Text to Image (T2I) are as follows …

Lemotif: Abstract Visual Depictions of your Emotional States in Life
D Parikh – arXiv preprint arXiv:1903.07766, 2019 – arxiv.org
… Future work also includes incorporating the free-form text in the journal entry while generating the lemotif, and ideally extracting the salient topics and feelings from the free-form text directly using natural language processing, freeing the user from having to explicitly specify …

Mode collapse and regularity of optimal transportation maps
N Lei, Y Guo, D An, X Qi, Z Luo, ST Yau… – arXiv preprint arXiv …, 2019 – arxiv.org
… Optimal transport problem attracted the researchers attentions since it was proposed in 1940s, and there were vast amounts of literature in various kinds of fields like computer vision and natural language processing … Generative adversarial text to image synthesis …

A survey of open-world person re-identification
Q Leng, M Ye, Q Tian – … on Circuits and Systems for Video …, 2019 – ieeexplore.ieee.org
… 3) Text-to-Image Re-ID: In addition to person images captured from various cameras, the natural textual statements of eyewitnesses are regarded as the probe in video investigation applications. Thus, how to match corresponding …

Hands-On Generative Adversarial Networks with Keras: Your guide to implementing next-generation generative adversarial networks
R Valle – 2019 – books.google.com
… The generator GANs Summary Chapter 8: Generation of Discrete Sequences Using GANs Technical requirements Natural language generation with … Generator Inference Model trained on words Model trained on characters Summary Chapter 9: Text-to-Image Synthesis with …

A survey of deep learning methods for cyber security
DS Berman, AL Buczak, JS Chavis, CL Corbett – Information, 2019 – mdpi.com
… Afterward, the output from that is used as an input along with the next step. This type of model has been used for various natural language processing tasks [53–56] and image segmentation … These characteristics of malware communications resemble those of natural language …

Task-Aware Feature Generation for Zero-Shot Compositional Learning
X Wang, F Yu, T Darrell, JE Gonzalez – arXiv preprint arXiv:1906.04854, 2019 – arxiv.org
Task-Aware Feature Generation for Zero-Shot Compositional Learning. Xin Wang, Fisher Yu, Trevor Darrell, and Joseph E. Gonzalez, UC Berkeley. Abstract. Visual concepts (e.g., red apple, big elephant) are often semantically …

Applications of Generative Adversarial Networks (GANs): An Updated
H Alqahtani, M Kavakli-Thorne, G Kumar – 2019 – researchgate.net
Archives of Computational Methods in Engineering, https://doi.org/10.1007/s11831-019-09388-y. Original paper: Applications of Generative Adversarial Networks (GANs): An Updated Review …

An overview and perspectives on bidirectional intelligence: Lmser duality, double IA harmony, and causal computation.
L Xu – IEEE CAA J. Autom. Sinica, 2019 – ieee-jas.org
… Typically, bidirectional deep learning performs various transformations, such as language to language, text to image, text to sketch, sketch to image, image to image, 2D image to 3D image, past to future, image to caption, image to sentence, music to dance, etc., on which …

An efficient approach for geo-multimedia cross-modal retrieval
L Zhu, J Long, C Zhang, W Yu, X Yuan, L Sun – IEEE Access, 2019 – ieeexplore.ieee.org
… T 2I and Ak I2T , which are named kNN geo-multimedia text to image query (kT2IQ) and kNN geo-multimedia image to text query (kI2TQ) respectively … Thus, the cross-modal text to image query can be denoted as AT 2I ?? (A.MT ) …

Pedestrian attribute recognition: A survey
X Wang, S Zheng, R Yang, B Luo, J Tang – arXiv preprint arXiv …, 2019 – arxiv.org
… Over the past several years, deep learning have achieved an impressive performance due to their success on automatic feature extraction using multi-layer nonlinear transformation, especially in computer vision, speech recognition and natural language processing …

Hands-On Generative Adversarial Networks with PyTorch 1. x: Implement next-generation neural networks to build powerful GAN models using Python
J Hany, G Walters – 2019 – books.google.com
… Zero-shot learning GAN architecture and training Generating photo-realistic images with StackGAN++ High-resolution text-to-image synthesis with … You will apply GAN models to areas such as computer vision, multimedia, and natural language processing using a sample …

Network representation learning: Models, methods and applications
A Mohan, KV Pramod – SN Applied Sciences, 2019 – Springer
… With the success of representation learning on image [44, 60, 128, 134], speech [23, 40, 48], and natural language processing [19, 21, 108 … second component represents the loss wrt image to image similarity and the third component represents the loss wrt text to image similarity …

Classification of diabetic retinopathy using pre-trained deep learning models
IMRK Al-Kamachy – 2019 – earsiv.cankaya.edu.tr
Classification of Diabetic Retinopathy Using Pre-Trained Deep Learning Models. Inas Al-Kamachy, October 2019 …

Review of deep learning algorithms and architectures
A Shrestha, A Mahmood – IEEE Access, 2019 – ieeexplore.ieee.org
… Neural Networks can be used in a variety of problems including pattern recognition, classification, clustering, dimensionality reduction, computer vision, natural language processing (NLP), regression, predictive analysis, etc. Here is an example of image recognition …

Hands-On Deep Learning for IoT: Train neural network models to develop intelligent IoT applications
MR Karim – 2019 – books.google.com
Hands-On Deep Learning for IoT: Train neural network models to develop intelligent IoT applications. Mohammad Abdur Razzaque, PhD, and Md. Rezaul Karim …

Locality and compositionality in zero-shot learning
T Sylvain, L Petrini, D Hjelm – arXiv preprint arXiv:1912.12179, 2019 – arxiv.org
… Published as a conference paper at ICLR 2020 … graphs (Kipf & Welling, 2016) and natural language processing (Yu et al., 2018). For supervised image classification, a bag of local features processed independently can …

Survey on multi-output learning
D Xu, Y Shi, IW Tsang, YS Ong… – IEEE transactions on …, 2019 – ieeexplore.ieee.org
… Super-resolution construction means constructing a high-resolution image from a low-resolution image. Other image output applications include text-to-image synthesis [48], which generates images from natural language descriptions, and face generation [49] …

Building generative models over discrete structures: from graphical models to deep learning
GA Gane – 2019 – dspace.mit.edu
… 1.1 Motivation. Structured prediction problems are prevalent in computer vision, natural language … structure is latent, whether modeled via one-shot decoding or sequential approaches, in problems such as natural language modeling via latent tree representations [156] …

Deep materials informatics: Applications of deep learning in materials science
A Agrawal, A Choudhary – MRS Communications, 2019 – cambridge.org
… are trained together, they make each other progressively strong till they achieve the Nash equilibrium.[52] It is not surprising that GANs have found numerous interesting applications in image analysis, such as high-resolution image synthesis,[53] text to image synthesis,[54 …

Multimodal Summarization and Beyond
A Khullar – 2019 – amankhullar.github.io
… In this thesis, the foundations of natural language processing in general and multimodal summarization in specific have been explored … Layer; 6.1.3 Image Embedding Layer; 6.1.4 Encoder Layer; 6.1.5 Attention Flow Layer; 6.1.5.1 Text-to-Image Attention; 6.1.5.2 …

Heterogeneous Transfer Clustering for Partial Co-occurrence Data
X Ye, L Yang, Q Hu, C Shen, L Jing… – 2019 IEEE 31st …, 2019 – ieeexplore.ieee.org
Heterogeneous Transfer Clustering for Partial Co-occurrence Data. 1st Xiangyang Ye, College of Intelligence and Computing, Tianjin University, Tianjin, China, yexiangyang@tju.edu.cn; 2nd Liu Yang, College of Intelligence …

Generating synthetic intestine images
S Ivanov – 2019 – diposit.ub.edu
… ing (repairing an image by filling up the missing part, see Yu et al., 2018) and text to image translation (Zhang … Most importantly, they allowed breakthroughs in areas previously lagging like speech recognition (Hannun et al., 2014), natural language processing (Hochreiter and …

Defacement detection with passive adversaries
F Bergadano, F Carretto, F Cogno, D Ragno – Algorithms, 2019 – mdpi.com
Algorithms, Article. Defacement Detection with Passive Adversaries. Francesco Bergadano (1,*), Fabio Carretto (2), Fabio Cogno (2) and Dario Ragno (2). (1) Dipartimento di Informatica, Università di Torino, Corso Svizzera 185 …

Generating Datasets for Classification Task and Predicting Best Classifiers with Conditional Generative Adversarial Networks
I Kachalsky, A Zabashta, A Filchenkov… – Proceedings of the 2019 …, 2019 – dl.acm.org
… Generative adversarial text to image synthesis. arXiv preprint arXiv:1605.05396, 2016 … In 2015 Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT), pages 11–18, 2015 …

Event Extraction With Imitation Learning and Cross-Media Inference on Streaming Data
T Zhang – 2019 – search.proquest.com
… Figure 2.1 illustrates an example of each aforementioned component extracted from a natural language sentence … Many approaches combine the information, resources, and features from both Computer vision (CV) and natural language processing (NLP) sides …

Video Compression Coding via Colorization: A Generative Adversarial Network (GAN)-Based Approach
Z Pan, F Yuan, J Lei, S Kwong – arXiv preprint arXiv:1912.10653, 2019 – arxiv.org
… In recent years, GANs have been widely used for multimedia processing, and the applications are summarized as that image super-resolution [6], image translation [7], video content analysis [8][9], natural language processing [10][11], unmanned vehicle [12], medicine …

Logic could be learned from images
Q Guo, Y Qian, X Liang, Y She, D Li, J Liang – arXiv preprint arXiv …, 2019 – arxiv.org
… (4) Image captioning is to describe the content of an image by using reasonably formed natural sentences. (5) Visual question answering (VQA) is to automatically answer natural language questions according to related the image content …

The market generator
A Kondratyev, C Schwarz – Available at SSRN 3384948, 2019 – papers.ssrn.com
The Market Generator. Alexei Kondratyev, Christian Schwarz. January 27, 2020. Abstract: We propose to use a special type of generative neural networks, a Restricted Boltzmann Machine (RBM), to build a powerful …

Techniques All Classifiers Can Learn from Deep Networks: Models, Optimizations, and Regularization
A Ghods, DJ Cook – arXiv preprint arXiv:1909.04791, 2019 – arxiv.org
… Many deep learning survey papers have been published that provide a primer on the topic [36], or highlight the many application areas such as object detection [37], medical record analysis [38], activity recognition [39], and natural language processing [40] …

Generative Adversarial Network based machine for fake data generation
E Piacentino – 2019 – upcommons.upc.edu
… MRI Magnetic Resonance Image NLP Natural Language Processing NN Neural Network … Applications for this type of problem are quite diverse, below are listed some examples: Generative Adversarial Text to Image Synthesis by Scott Reed et al. [10] …

A Survey of Techniques All Classifiers Can Learn from Deep Networks: Models, Optimizations, and Regularization
A Ghods, DJ Cook – arXiv preprint arXiv:1909.04791, 2019 – researchgate.net
… Many deep learning survey papers have been published that provide a primer on the topic [36] or highlight diverse applications such as object detection [37], medical record analysis [38], activity recognition [39], and natural language processing [40] …

Speech-driven expressive talking lips with conditional sequential generative adversarial networks
N Sadoughi, C Busso – IEEE Transactions on Affective …, 2019 – ieeexplore.ieee.org
… [34] for text-to-image synthesis, our learning strategy includes two kinds of fake samples during the training of the discriminator: samples generated by the generator, and original samples with lip motion and speech features extracted from different recordings …

An Exploration into Synthetic Data and Generative Adversarial Networks
CM Shorten – 2019 – search.proquest.com
An Exploration into Synthetic Data and Generative Adversarial Networks, by Connor M. Shorten. A Thesis Submitted to the Faculty of The College of Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree …

Smart libraries: an emerging and innovative technological habitat of 21st century
S Gul, S Bano – The Electronic Library, 2019 – emerald.com
… of all kinds must be indexed effectively, from small communities to large disciplines, from formal to informal communications, from text to image and video … and hits a user that can lead a user to a right information track), smart intuitive searching in natural language (what users …

4: The Good, the Bad and the Beauty of ‘Good Enough Data’
M Gutiérrez – Good Data, 2019 – books.google.com
… 34 Ibid. 35 Harvey J. Miller, ‘The Data Avalanche is Here: Shouldn’t we be Digging?’ Wiley Periodicals, Inc., 22 August 2009, 2. 36 Lewis Lancaster, ‘From Text to Image to Analysis: Visualization of Chinese Buddhist Canon’, Digital Humanities, 2010, pp …

Towards improving the network architecture of GANs and their evaluation methods
S Barua – 2019 – minerva-access.unimelb.edu.au
Towards Improving the Network Architecture of GANs and Their Evaluation Methods. Author: Sukarna Barua. ORCID: 0000-0003-3978-4757. Master of Philosophy Thesis, June 2019. School of Computing and Information Systems, The University of Melbourne …

Robust deep softmax regression against label noise for unsupervised domain adaptation
G Wu, D Zhang, W Chen, W Zuo… – International Journal of …, 2019 – World Scientific
Robust Deep Softmax Regression Against Label Noise for Unsupervised Domain Adaptation. Guangbin Wu, David Zhang, Weishan Chen, Wangmeng Zuo and Zhuang Xia. State Key Laboratory of …

Pattern Recognition and Computer Vision: Second Chinese Conference, PRCV 2019, Xi’an, China, November 8–11, 2019, Proceedings, Part III
Z Lin, L Wang, J Yang, G Shi, T Tan, N Zheng, X Chen… – 2019 – books.google.com
Zhouchen Lin · Liang Wang · Jian Yang · Guangming Shi · Tieniu Tan · Nanning Zheng · Xilin Chen · Yanning Zhang (Eds.) Pattern Recognition and Computer Vision: Second Chinese Conference, PRCV 2019, Xi’an, China, November 8–11, 2019, Proceedings, Part III …

Fine-grained Fitting Experience Prediction: A 3D-slicing Attention Approach
S Huang, Z Wang, L Cui, Y Jiang, R Gao – Proceedings of the 27th ACM …, 2019 – dl.acm.org
… parallel slice sequences. With the development of attention mechanisms, a series of works are proposed which has been widely used in both natural language processing [6, 37] and computer vision [7, 35]. Attention mechanisms …

Report of 2017 NSF Workshop on Multimedia Challenges, Opportunities and Research Roadmaps
SF Chang, A Hauptmann, LP Morency, S Antani… – arXiv preprint arXiv …, 2019 – arxiv.org
… More recently, multimodal translation has seen renewed interest due to combined efforts of the computer vision and natural language processing communities [Bernardi et al., 2016], tackling problems such as image description [Vinyals et al., 2014] and video captioning …

Synthesis of Realistic ECG using Generative Adversarial Networks
AM Delaney, E Brophy, TE Ward – arXiv preprint arXiv:1909.09150, 2019 – arxiv.org
… 3.3 Convolutional Neural Networks Convolutional Neural Networks have shown great success in computer vision and natural language processing. In this work, one-dimensional CNNs are used as they are well-suited to time series data. CNNs consist of a convolution layer …

Enhanced Deep Network Designs Using Mitochondrial DNA Based Genetic Algorithm and Importance Sampling
A Shrestha – 2019 – scholarworks.bridgeport.edu
… 68 2.6.10 Very Deep Convolutional Networks for Natural Language Processing ….. 69 … classification, clustering, dimensionality reduction, computer vision, natural language processing (NLP), regression, predictive analysis, etc. Figure 1 is an example of image …

Leveraging long and short-term information in content-aware movie recommendation via adversarial training
W Zhao, B Wang, M Yang, J Ye, Z Zhao… – IEEE transactions on …, 2019 – ieeexplore.ieee.org
… At each time step, the context information assists the inference of the hidden states of LSTM model. 4) LSIC-V4: The last strategy is inspired by the recent success of attention mechanism in natural language processing and computer vision [36], [37] …

Massive Online Data Streams (MODS)
S Dlugolinsky, G Nguyen – 2019 – digital.csic.es
… 2018). – Recurrent Neural Networks (RNN) are specially designed to deal with sequential data. They are widely used in Natural Language Processing (NLP) like Neural Machine Translation (Wu et al … 2018), text-to-image synthesis (Zhang et al …

Multi-modal Deep Analysis for Multimedia
W Zhu, X Wang, H Li – … on Circuits and Systems for Video …, 2019 – ieeexplore.ieee.org
… candidate. Several works on natural language processing (NLP) [53], [54] and computer vision (CV) [55] have gained success through employing stacked denoising autoencoders to learn the domain-invariant deep representations …

Leveraging zero-knowledge succinct arguments of knowledge for efficient verification of outsourced training of artificial neural networks
MJ Zande – 2019 – pdfs.semanticscholar.org
… of common network structures by Salvaris et al. [47] and a generalisation of the works described by Schmidhuber [48] we describe four general constructs that have led to applications in image classification, object detection, speech recognition and natural language processing …

A survey on canonical correlation analysis
X Yang, L Weifeng, W Liu, D Tao – IEEE Transactions on …, 2019 – ieeexplore.ieee.org
Page 1. 1041-4347 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This …

RGB-D-based object recognition using multimodal convolutional neural networks: A survey
M Gao, J Jiang, G Zou, V John, Z Liu – IEEE Access, 2019 – ieeexplore.ieee.org
Page 1. 2169-3536 (c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org …

Founding Editors
G Goos, J Hartmanis, E Bertino, W Gao, B Steffen… – researchgate.net
Page 1. Page 2. Lecture Notes in Computer Science 11731 Founding Editors Gerhard Goos Karlsruhe Institute of Technology, Karlsruhe, Germany Juris Hartmanis Cornell University, Ithaca, NY, USA Editorial Board Members …

INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING. VISUAL DATA ENGINEERING: 9th
Z Cui, J Pan, S Zhang, L Xiao, J Yang – 2019 – books.google.com
Page 1. Zhen Cui · Jinshan Pan · Shanshan Zhang · Liang Xiao · Jian Yang (Eds.) Intelligence Science and Big Data Engineering Visual Data Engineering 9th International Conference, IScIDE 2019 Nanjing, China, October 17–20, 2019 Proceedings, Part I Page 2 …

Artificial Neural Networks and Machine Learning–ICANN 2019
IV Tetko, V Kurková, P Karpov, F Theis – researchgate.net
Page 1. Igor V. Tetko · Věra Kůrková · Pavel Karpov · Fabian Theis (Eds.) LNCS 11731 28th International Conference on Artificial Neural Networks Munich, Germany, September 17–19, 2019 Proceedings Artificial Neural Networks and Machine Learning – ICANN 2019 …

Knowledge Science, Engineering and Management
C Douligeris, D Karagiannis, D Apostolou – Lecture Notes in Computer …, 2019 – Springer
… Text to Image Synthesis Using Two-Stage Generation and Two-Stage Discrimination. Zhiqiang Zhang, Yunye Zhang, Wenxin Yu, Gang He, Ning Jiang, Gang He, Yibo Fan, and Zhuo Yang …

Near-future Prediction in Videos: Applications in Video Annotation and Frame Reconstruction
TB Mahmud – 2019 – search.proquest.com
… Generating description of visual content is an interesting problem in both computer vision and natural language processing community since it exploits the relationship between two of the richest modalities to make semantic representation meaningful …

Machine Learning for Flavor Development
D Xu – 2019 – dash.harvard.edu
… In addition to excelling at generating human faces, GANs have also proven effective at tasks ranging from text-to-image generation [25] to drug development [8]. For example, GANs were used in the promising chemical structure development algorithm described in section 1.2 [8 …

Security and Privacy for Smart Cyber-Physical Systems
L Ma – downloads.hindawi.com
Page 1. Security and Communication Networks Security and Privacy for Smart Cyber-Physical Systems Lead Guest Editor: Liran Ma Guest Editors: Yan Huo, Chunqiang Hu, and Wei Li Page 2. Security and Privacy for Smart Cyber-Physical Systems Page 3 …