Digital images captured under insufficient lighting suffer from issues such as low contrast and poor visual quality. Transformer-based methods have been applied to low-light enhancement, but transformers treat images as one-dimensional sequences and do not model local visual structure, which limits feature extraction for degraded low-light images. In addition, transformer-based methods require longer training schedules to achieve strong performance. We introduce a novel method named local enhancement transformer (LET). By incorporating convolutions into transformer blocks, we improve the model's ability to extract features from degraded low-light images. Furthermore, we propose a multi-level enhancement block that adaptively fuses features using learnable correlations among different levels. With these two designs, LET extracts more useful features while requiring less training time. Experimental evaluations on the LOL and MIT-5K datasets show that LET outperforms state-of-the-art methods.
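The abstract names two design ideas: convolutions inside transformer blocks to capture local structure, and a fusion of multi-level features with learnable correlations. A minimal NumPy sketch of both mechanisms follows; every function name, shape, and design detail here is an illustrative assumption, not the paper's actual LET architecture:

```python
import numpy as np

def depthwise_conv3x3(x, kernels):
    """Local-structure branch (assumed form): a 3x3 depthwise
    convolution applied to an (H, W, C) feature map, the kind of
    operation a conv-augmented transformer block could use to
    inject locality that plain token attention lacks."""
    H, W, C = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))  # zero-pad spatially
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            patch = padded[i:i + 3, j:j + 3, :]      # (3, 3, C) window
            out[i, j] = np.sum(patch * kernels, axis=(0, 1))
    return out

def fuse_levels(features, logits):
    """Multi-level fusion (assumed form): blend feature maps from
    different levels with softmax-normalized learnable scalars,
    one per level, standing in for the learnable correlations."""
    w = np.exp(logits - logits.max())
    w = w / w.sum()
    return sum(wi * f for wi, f in zip(w, features))
```

With an identity kernel (a single 1 at the center), `depthwise_conv3x3` returns its input unchanged, and with equal logits `fuse_levels` reduces to a plain average of the levels; in a real model both the kernels and the logits would be trained end to end.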
Keywords: Image enhancement, Transformers, Convolution, Education and training, Feature extraction, Visualization, Light sources and illumination