inis 21(28): e3

Research Article

A Deep Neural Network-Based Approach for Extracting Textual Images from Deteriorate Images

  • @ARTICLE{10.4108/eai.17-9-2021.170961,
        author={Binay Kumar Pandey and Digvijay Pandey and Subodh Wariya and Gaurav Agarwal},
        title={A Deep Neural Network-Based Approach for Extracting  Textual Images from Deteriorate Images},
        journal={EAI Endorsed Transactions on Industrial Networks and Intelligent Systems},
        volume={8},
        number={28},
        publisher={EAI},
        journal_a={INIS},
        year={2021},
        month={9},
        keywords={Deep neural network, Databases, Images, Segments},
        doi={10.4108/eai.17-9-2021.170961}
    }
    
Binay Kumar Pandey1,*, Digvijay Pandey2, Subodh Wariya3, Gaurav Agarwal4
  • 1: Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand, India
  • 2: Department of Electronics Engineering, Institute of Engineering and Technology, Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
  • 3: Department of Electronics Engineering, Institute of Engineering and Technology, Lucknow, India
  • 4: Invertis University, Bareilly, U.P., India
*Contact email: binaydece@gmail.com

Abstract

INTRODUCTION: The quantity of audio and visual data is increasing exponentially due to the internet's rapid growth. The digital information embedded in images and videos can be used for fully automated captioning, indexing, and image structuring. Online image and video collections have grown substantially, and the images and videos in such datasets must be retrieved, explored, and inspected.

OBJECTIVES: Text extraction is crucial for locating critical and important data. Noise is a key factor affecting image quality and is introduced mainly during image acquisition and transmission, and an image can be contaminated by a variety of noise types. Text in a complex image carries information used to distinguish textual from non-textual details, and these details in complex, corrupted images are important to anyone viewing the whole scene. However, text in complex degraded images takes rapidly changing forms under unconstrained conditions, which makes recognising textual data difficult.

METHODS: A weighted naïve Bayes algorithm is used to extract the correct text data from complex image regions. Because images usually carry some noise, filtering is applied in an early pre-processing step. To restore image quality, the input image is processed using gradient and contrast methods, after which the contrast of the source image is enhanced using an adaptive image map. The stroke width transform, Gabor transform, and a weighted naïve Bayes classifier are then applied to the complex degraded images for segmentation, feature extraction, and the detection of textual and non-textual elements, as sketched below.
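The following is a minimal Python sketch, not the authors' implementation, of this kind of text/non-text patch classification: a generic contrast-enhancement step (CLAHE), Gabor-based feature extraction, and a sample-weighted naïve Bayes model. The synthetic patches, Gabor parameters, and class weights are illustrative assumptions, and the gradient step, adaptive image map, and stroke width transform are not reproduced. OpenCV and scikit-learn are assumed to be available.

    # Hedged sketch (not the paper's pipeline): classify synthetic text/non-text
    # patches using contrast enhancement, Gabor features, and weighted naive Bayes.
    import cv2
    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    rng = np.random.default_rng(0)

    def enhance(gray):
        # Generic contrast enhancement (CLAHE) as a stand-in pre-processing step.
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        return clahe.apply(gray)

    def gabor_features(gray):
        # Mean and standard deviation of Gabor responses at four orientations.
        feats = []
        for theta in (0, np.pi / 4, np.pi / 2, 3 * np.pi / 4):
            kernel = cv2.getGaborKernel((21, 21), 4.0, theta, 10.0, 0.5, 0)
            resp = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kernel)
            feats.extend([resp.mean(), resp.std()])
        return np.array(feats)

    def synthetic_patch(textual):
        # Toy 64x64 patch: periodic bright "strokes" stand in for text.
        patch = rng.integers(0, 60, (64, 64), dtype=np.uint8)
        if textual:
            patch[:, ::6] = 220
        return patch

    # Build a small labelled patch set and extract features from each patch.
    X, y = [], []
    for label in (0, 1):
        for _ in range(40):
            X.append(gabor_features(enhance(synthetic_patch(bool(label)))))
            y.append(label)
    X, y = np.array(X), np.array(y)

    # "Weighted" naive Bayes approximated with per-sample weights (assumption).
    weights = np.where(y == 1, 2.0, 1.0)
    clf = GaussianNB().fit(X, y, sample_weight=weights)
    print("training accuracy:", clf.score(X, y))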

RESULTS: Finally, a combination of a deep neural network and particle swarm optimization is used to identify the classified textual data. The IIIT5K dataset is used for the development phase, and the performance of the proposed methodology is assessed using parameters such as accuracy, recall, precision, and F1 score. The approach performs well on record collections such as articles, even when they are significantly distorted, and is therefore suitable for building library information system databases.
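For reference, the reported metrics can be computed in a few lines with scikit-learn; the label arrays below are placeholders rather than results from the IIIT5K experiments.

    # Hedged sketch: computing accuracy, precision, recall, and F1 score with
    # scikit-learn. The labels below are hypothetical, not IIIT5K results.
    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

    y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]  # hypothetical ground truth (1 = textual)
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]  # hypothetical classifier output

    print("accuracy :", accuracy_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred))
    print("recall   :", recall_score(y_true, y_pred))
    print("F1 score :", f1_score(y_true, y_pred))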

CONCLUSION: A combination of a deep neural network and particle swarm optimization is used to recognise the classified text. The IIIT5K dataset is used for the development phase, and although high performance is achieved on parameters such as accuracy, recall, precision, and F1 score, characters may occasionally deviate. Alternatively, the same character is frequently extracted [3] multiple times, which may result in incorrect textual data being extracted from natural images. An efficient technique for avoiding such flaws in the text retrieval process must therefore be implemented in the near future.
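As a rough illustration of how particle swarm optimization can be combined with a neural network (the abstract does not specify the exact coupling), the sketch below uses a bare-bones PSO loop to tune a single hyperparameter of a small scikit-learn MLP on synthetic data; the dataset, network size, and PSO coefficients are all assumptions made for the example.

    # Hedged sketch: PSO tuning the L2 penalty (alpha) of a small MLP.
    # Every setting here is illustrative, not taken from the paper.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(1)
    X, y = make_classification(n_samples=200, n_features=8, random_state=1)

    def fitness(log_alpha):
        # Cross-validated accuracy of an MLP with alpha = 10 ** log_alpha.
        clf = MLPClassifier(hidden_layer_sizes=(16,), alpha=10.0 ** log_alpha,
                            max_iter=300, random_state=1)
        return cross_val_score(clf, X, y, cv=3).mean()

    # One-dimensional PSO over log10(alpha) in [-5, 1].
    n_particles, n_iters = 6, 5
    pos = rng.uniform(-5, 1, n_particles)
    vel = np.zeros(n_particles)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()]

    for _ in range(n_iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -5, 1)
        fit = np.array([fitness(p) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[pbest_fit.argmax()]

    print("best alpha:", 10.0 ** gbest, "cv accuracy:", pbest_fit.max())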