Copyright: © the author(s), publisher and licensee Technoscience Academy. This is an open-access article distributed under the
terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use,
distribution, and reproduction in any medium, provided the original work is properly cited
International Journal of Scientific Research in Science, Engineering and Technology
Print ISSN: 2395-1990 | Online ISSN: 2394-4099 (www.ijsrset.com)
DOI: https://doi.org/10.32628/IJSRSET
Deep Learning Based Text to Image Generation
G. Ajay*¹, Ch. Sai Teja², P. Baswaraj³, V. Vasanth⁴, Dr. G. Sreenivasulu⁵
*¹⁻⁴B.Tech. Student, ⁵Professor
CSE Department, JB Institute of Engineering and Technology, Hyderabad, India
ARTICLE INFO
Article History:
Accepted: 05 April 2023
Published: 23 April 2023
ABSTRACT
Text-to-image generation is the task of generating images that match
given textual descriptions. It influences many research areas and has a
diverse set of applications (e.g., photo-searching, photo-editing, art
generation, computer-aided design, image reconstruction, captioning, and
portrait drawing). The most challenging part is to consistently produce
realistic images that satisfy the given conditions. Existing algorithms for
text-to-image generation create pictures that do not properly match the
text. We considered this issue in our study and built a deep
learning-based architecture for semantically consistent image generation:
the recurrent convolutional generative adversarial network (RC-GAN).
RC-GAN bridges advances in text and image modelling, converting visual
concepts from words to pixels. The proposed model was trained on the
Oxford-102 flowers dataset, and its performance was evaluated using the
inception score and PSNR. The experimental results demonstrate that our
model is capable of generating more realistic photos of flowers from given
captions, with an inception score of 4.15 and a PSNR value of 30.12 dB.
Generating images from natural language is one of the primary applications
of conditional generative models. This project uses Generative Adversarial
Networks (GANs) to generate an image given a text description. GANs are
deep neural networks that act as generative models of data: given a set of
training data, a GAN can learn to estimate the underlying probability
distribution of the data. In this project, the model is trained on the
Caltech birds dataset. Recent progress has been made using GANs.
Keywords: PSNR, GAN, Caltech birds dataset, NLP, CNN, RNN
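The abstract reports two evaluation metrics, the inception score and PSNR. As a rough illustration of what these numbers measure, the following minimal sketch computes both from their standard definitions. It is not the authors' evaluation code. The function names `psnr` and `inception_score` are hypothetical, images are assumed to be flat sequences of 8-bit pixel intensities, and the per-image class probabilities are assumed to come from a pretrained Inception classifier that is not included here.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumes both images are given as equal-length sequences of intensities
    in the range [0, max_value] (255 for 8-bit images).
    """
    if not reference or len(reference) != len(generated):
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def inception_score(class_probs):
    """exp(mean over images of KL(p(y|x) || p(y))).

    class_probs holds one row of class probabilities p(y|x) per generated
    image (assumed to come from a pretrained classifier).
    """
    n = len(class_probs)
    k = len(class_probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[j] for row in class_probs) / n for j in range(k)]
    total_kl = 0.0
    for row in class_probs:
        # Terms with p = 0 contribute nothing (0 * log 0 is taken as 0).
        total_kl += sum(
            p * math.log(p / marginal[j]) for j, p in enumerate(row) if p > 0
        )
    return math.exp(total_kl / n)
```

A higher PSNR means the generated image is closer, pixel by pixel, to the reference. A higher inception score means the classifier is confident about each individual image while the set of images covers many classes. In practice both would be computed over image arrays with a numerical library rather than Python lists.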
Publication Issue: Volume 10, Issue 2, March-April 2023
Page Number: 623-628
I. INTRODUCTION
When people listen to or read a narrative, they quickly
create pictures in their mind to visualize the content.
Many cognitive functions, such as memorization,
reasoning ability, and thinking, rely on visual mental
imaging or “seeing with the mind’s eye”. Developing a
technology that recognizes the connection between
vision and words and can produce pictures that
represent the meaning of written descriptions is a big
step toward user intellectual ability. Image-processing
techniques and applications of computer vision (CV)