inis 21(28): e3

Research Article

A Deep Neural Network-Based Approach for Extracting Textual Images from Deteriorate Images

  • @ARTICLE{10.4108/eai.17-9-2021.170961,
        author={Binay Kumar Pandey and Digvijay Pandey and Subodh Wariya and Gaurav Agarwal},
        title={A Deep Neural Network-Based Approach for Extracting  Textual Images from Deteriorate Images},
        journal={EAI Endorsed Transactions on Industrial Networks and Intelligent Systems},
        volume={8},
        number={28},
        publisher={EAI},
        journal_a={INIS},
        year={2021},
        month={9},
        keywords={Deep neural network, Databases, Images, Segments},
        doi={10.4108/eai.17-9-2021.170961}
    }
    
Binay Kumar Pandey1,*, Digvijay Pandey2, Subodh Wariya3, Gaurav Agarwal4
  • 1: Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand, India
  • 2: Department of Electronics Engineering, Institute of Engineering and Technology, Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
  • 3: Department of Electronics Engineering, Institute of Engineering and Technology, Lucknow, India
  • 4: Invertis University, Bareilly, U.P., India
*Contact email: binaydece@gmail.com

Abstract

INTRODUCTION: The quantity of audio and visual data is increasing exponentially due to the internet's rapid growth. The digital information embedded in images and videos can be used for fully automated captioning, indexing, and image structuring. Online image and video collections have grown substantially, and the images and videos in such datasets must be retrieved, explored, and inspected.

OBJECTIVES: Text extraction is crucial for locating critical and important data. Noise is a key factor affecting image quality and is introduced mainly during image acquisition and transmission, and an image can be contaminated by a variety of noise types. Text in a complex image carries information used to distinguish textual from non-textual details, and these details in complex, corrupted images are important to anyone viewing the whole scene. However, text in complex degraded images takes rapidly changing forms under unconstrained conditions, which makes recognising textual data difficult.

METHODS: A weighted naïve Bayes algorithm is used to extract the correct text data from complex image regions. Because images usually carry some noise, filtering is applied in an early pre-processing step. To restore image quality, the input image is processed using gradient and contrast methods, after which the contrast of the source image is enhanced using an adaptive image map. The stroke width transform, Gabor transform, and a weighted naïve Bayes classifier are then applied to the complex degraded images for segmentation, feature extraction, and the detection of textual and non-textual elements, as sketched below.
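The following is a minimal Python sketch, not the authors' implementation, of this kind of text/non-text patch classification: a generic contrast-enhancement step (CLAHE), Gabor-based feature extraction, and a sample-weighted naïve Bayes model. The synthetic patches, Gabor parameters, and class weights are illustrative assumptions, and the gradient step, adaptive image map, and stroke width transform are not reproduced. OpenCV and scikit-learn are assumed to be available.

    # Hedged sketch (not the paper's pipeline): classify synthetic text/non-text
    # patches using contrast enhancement, Gabor features, and weighted naive Bayes.
    import cv2
    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    rng = np.random.default_rng(0)

    def enhance(gray):
        # Generic contrast enhancement (CLAHE) as a stand-in pre-processing step.
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        return clahe.apply(gray)

    def gabor_features(gray):
        # Mean and standard deviation of Gabor responses at four orientations.
        feats = []
        for theta in (0, np.pi / 4, np.pi / 2, 3 * np.pi / 4):
            kernel = cv2.getGaborKernel((21, 21), 4.0, theta, 10.0, 0.5, 0)
            resp = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kernel)
            feats.extend([resp.mean(), resp.std()])
        return np.array(feats)

    def synthetic_patch(textual):
        # Toy 64x64 patch: periodic bright "strokes" stand in for text.
        patch = rng.integers(0, 60, (64, 64), dtype=np.uint8)
        if textual:
            patch[:, ::6] = 220
        return patch

    # Build a small labelled patch set and extract features from each patch.
    X, y = [], []
    for label in (0, 1):
        for _ in range(40):
            X.append(gabor_features(enhance(synthetic_patch(bool(label)))))
            y.append(label)
    X, y = np.array(X), np.array(y)

    # "Weighted" naive Bayes approximated with per-sample weights (assumption).
    weights = np.where(y == 1, 2.0, 1.0)
    clf = GaussianNB().fit(X, y, sample_weight=weights)
    print("training accuracy:", clf.score(X, y))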

RESULTS: Finally, a combination of a deep neural network and particle swarm optimization is used to identify the classified textual data. The IIIT5K dataset is used for the development phase, and the performance of the proposed methodology is assessed using parameters such as accuracy, recall, precision, and F1 score. The approach performs well on record collections such as articles, even when they are significantly distorted, and is therefore suitable for building library information system databases.
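For reference, the reported metrics can be computed in a few lines with scikit-learn; the label arrays below are placeholders rather than results from the IIIT5K experiments.

    # Hedged sketch: computing accuracy, precision, recall, and F1 score with
    # scikit-learn. The labels below are hypothetical, not IIIT5K results.
    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

    y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]  # hypothetical ground truth (1 = textual)
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]  # hypothetical classifier output

    print("accuracy :", accuracy_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred))
    print("recall   :", recall_score(y_true, y_pred))
    print("F1 score :", f1_score(y_true, y_pred))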

CONCLUSION: A combination of a deep neural network and particle swarm optimization is used to recognise the classified text. The IIIT5K dataset is used for the development phase, and although high performance is achieved on parameters such as accuracy, recall, precision, and F1 score, characters may occasionally deviate. Alternatively, the same character is frequently extracted [3] multiple times, which may result in incorrect textual data being extracted from natural images. An efficient technique for avoiding such flaws in the text retrieval process must therefore be implemented in the near future.
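As a rough illustration of how particle swarm optimization can be combined with a neural network (the abstract does not specify the exact coupling), the sketch below uses a bare-bones PSO loop to tune a single hyperparameter of a small scikit-learn MLP on synthetic data; the dataset, network size, and PSO coefficients are all assumptions made for the example.

    # Hedged sketch: PSO tuning the L2 penalty (alpha) of a small MLP.
    # Every setting here is illustrative, not taken from the paper.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(1)
    X, y = make_classification(n_samples=200, n_features=8, random_state=1)

    def fitness(log_alpha):
        # Cross-validated accuracy of an MLP with alpha = 10 ** log_alpha.
        clf = MLPClassifier(hidden_layer_sizes=(16,), alpha=10.0 ** log_alpha,
                            max_iter=300, random_state=1)
        return cross_val_score(clf, X, y, cv=3).mean()

    # One-dimensional PSO over log10(alpha) in [-5, 1].
    n_particles, n_iters = 6, 5
    pos = rng.uniform(-5, 1, n_particles)
    vel = np.zeros(n_particles)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()]

    for _ in range(n_iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -5, 1)
        fit = np.array([fitness(p) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[pbest_fit.argmax()]

    print("best alpha:", 10.0 ** gbest, "cv accuracy:", pbest_fit.max())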