21 June 2024 Enhancing text extraction from natural scene images through diffusion models
Baojun Jia, Wenyu Zhang, Yifan Chen, Jin Ma
Proceedings Volume 13167, International Conference on Remote Sensing, Mapping, and Image Processing (RSMIP 2024); 1316722 (2024) https://doi.org/10.1117/12.3029668
Event: International Conference on Remote Sensing, Mapping and Image Processing (RSMIP 2024), 2024, Xiamen, China
Abstract
This paper addresses super-resolution of natural scene text images, a crucial preprocessing step for image recognition whose aim is to extract text information from low-resolution images. The primary challenges in this step are that lighting, occlusion, and distortion degrade text features in natural scenes, and that small pixel values in low-resolution images tend to be overlooked by large models. Previous work has focused primarily on feature-extraction capability. However, the key to the problem lies in restoring the distribution difference between high-resolution and low-resolution images. Diffusion models are widely recognized for learning to generate distributions from datasets. We therefore introduce a conditional diffusion model to learn feature distributions across diverse images, and apply histogram matching to address the issue of n patches. Our experimental results confirm the effectiveness of our approach.
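The abstract names histogram matching but does not describe how the authors apply it. As a rough illustration only, the generic technique can be sketched with NumPy as below: the intensities of one image are remapped so that their empirical cumulative distribution follows that of a reference image. The function name `match_histogram` and the random test arrays are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def match_histogram(source, reference):
    """Remap `source` intensities so their empirical CDF follows `reference`.

    Generic single-channel histogram matching; a hypothetical helper,
    not the procedure used in the paper.
    """
    src = np.asarray(source)
    ref = np.asarray(reference)

    # Distinct intensities, the index of each pixel's intensity, and counts.
    src_values, src_idx, src_counts = np.unique(
        src.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(ref.ravel(), return_counts=True)

    # Empirical CDFs evaluated at each distinct intensity.
    src_cdf = np.cumsum(src_counts) / src.size
    ref_cdf = np.cumsum(ref_counts) / ref.size

    # For each source quantile, look up the reference intensity
    # that sits at the same quantile.
    mapped = np.interp(src_cdf, ref_cdf, ref_values)

    # Scatter the mapped intensities back to the original pixel layout.
    return mapped[src_idx].reshape(src.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dark_patch = rng.integers(0, 64, size=(16, 16))      # made-up data
    bright_patch = rng.integers(128, 256, size=(16, 16))  # made-up data
    matched = match_histogram(dark_patch, bright_patch)
    print(matched.min(), matched.max())
```

In this sketch the output keeps the shape and rank ordering of the source while its values fall within the intensity range of the reference. For color images the same function would typically be applied per channel.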
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Baojun Jia, Wenyu Zhang, Yifan Chen, and Jin Ma "Enhancing text extraction from natural scene images through diffusion models", Proc. SPIE 13167, International Conference on Remote Sensing, Mapping, and Image Processing (RSMIP 2024), 1316722 (21 June 2024); https://doi.org/10.1117/12.3029668
KEYWORDS
Diffusion
Data modeling
Education and training
Feature extraction
Super resolution
Histograms
Visual process modeling