Natural scene text recognition has been one of the most challenging tasks in recent years. Compared with traditional document text, natural scene text varies widely in shape and orientation, so the accuracy of scene text recognition still needs to be improved. In order to locate text regions better and recognize text content more accurately, we present a multi-scale deformable convolution network model for text recognition. The input image is first rectified by a rectification network, and a ResNet with an FPN structure is used as the backbone network to achieve multi-scale feature extraction. In addition, element-wise Add feature fusion is adopted to reduce the loss of feature information and strengthen feature extraction in text areas. A deformable convolution block is introduced into the deeper convolution layers to improve the deformation modeling ability of the convolution and expand the receptive field. The prediction module adopts the Transformer, abandoning the inherent sequential nature of RNNs to enable parallel computation and shorten the path between long-range dependencies. In order to evaluate the effectiveness of the proposed method, we trained our model on a mixture of two synthetic datasets, MJSynth and SynthText, and tested it on several regular and irregular datasets. The experimental results demonstrate that the method performs well on irregular scene text recognition, especially on CUTE80.
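A minimal sketch of two components named in the abstract, a deformable convolution block and FPN-style element-wise Add fusion, using standard PyTorch/torchvision calls. This is an assumed structure for illustration, not the authors' code; module names and channel sizes are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    """3x3 deformable convolution whose sampling offsets are predicted from the input."""
    def __init__(self, channels):
        super().__init__()
        # 2 offsets (x, y) for each position of the 3x3 kernel -> 18 offset channels
        self.offset = nn.Conv2d(channels, 18, kernel_size=3, padding=1)
        self.deform = DeformConv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        return self.deform(x, self.offset(x))

def fpn_add_fuse(coarse, fine, lateral):
    """Upsample the coarser map and fuse it with a lateral feature by element-wise Add.
    `lateral` is typically a 1x1 conv that aligns the channel count of `fine`."""
    up = F.interpolate(coarse, size=fine.shape[-2:], mode="nearest")
    return up + lateral(fine)

# toy usage
feat = torch.randn(1, 64, 8, 32)
print(DeformBlock(64)(feat).shape)   # torch.Size([1, 64, 8, 32])
```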
We propose a remote sensing image semantic segmentation model based on dual attention and multi-scale feature fusion to address the problems of large differences in object scale and missed small objects. The model uses ResNet50 as the encoder to extract features. First, the output features of each ResNet50 stage are fed into a pyramid pooling module, making full use of the multi-scale context information of the image to cope with changes in object scale. Second, dual attention is applied to the final output features of ResNet50 to establish semantic relationships along the spatial and channel dimensions, which enhances feature representation and alleviates the difficulty of segmenting small targets. Finally, starting from the output features of the attention module, the features of all levels are gradually fused during decoding to refine the segmentation edges of targets. Comparative experiments demonstrate the effectiveness of the proposed method.
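An illustrative sketch, under assumed settings, of the pyramid pooling step described in the abstract: pool the encoder feature map at several scales, project, upsample, and concatenate. The bin sizes and channel counts are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    def __init__(self, in_ch, bins=(1, 2, 3, 6)):
        super().__init__()
        out_ch = in_ch // len(bins)
        self.stages = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(in_ch, out_ch, 1, bias=False),
                          nn.BatchNorm2d(out_ch),
                          nn.ReLU(inplace=True))
            for b in bins)

    def forward(self, x):
        h, w = x.shape[-2:]
        # pool at several scales, project, upsample back, and concatenate with the input
        pyramids = [F.interpolate(stage(x), size=(h, w), mode="bilinear",
                                  align_corners=False) for stage in self.stages]
        return torch.cat([x] + pyramids, dim=1)

feat = torch.randn(1, 2048, 16, 16)        # e.g. the last ResNet50 stage output
print(PyramidPooling(2048)(feat).shape)    # torch.Size([1, 4096, 16, 16])
```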
In videos, waves, floating objects on the sea, peaks, and other objects passing by ships may occlude the objects of interest, and ships are often disturbed by background of a similar color, which easily leads to tracking failure. This paper presents a ship tracking algorithm based on deep learning and multiple features. The algorithm uses an improved YOLO and multi-feature ship detection method to detect ships, and establishes the correspondence of the same ship across frames with an improved SIFT matching algorithm to realize tracking. In the proposed detection algorithm, the YOLO network is optimized and combined with HOG and LBP features, which helps to solve the problems of missed detections and inaccurate localization of the YOLO network. The SIFT matching algorithm is improved to address the low accuracy and long runtime of traditional SIFT matching: the SIFT feature dimensionality is reduced by MDS (multi-dimensional scaling), and RANSAC (random sample consensus) is used to optimize SIFT feature matching and effectively eliminate mismatches. The experimental results show that the tracking algorithm has higher accuracy, stronger robustness, and better real-time performance.
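A rough sketch, using standard OpenCV calls rather than the authors' implementation, of SIFT matching between two frames with RANSAC-based mismatch removal; the MDS dimensionality-reduction step described in the abstract is omitted here, and the ratio and reprojection thresholds are illustrative.

```python
import cv2
import numpy as np

def match_ship_regions(img1, img2, ratio=0.75):
    """img1, img2: 8-bit grayscale frames; returns the geometrically consistent matches."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Lowe's ratio test on brute-force matches
    matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    if len(good) < 4:
        return []

    # RANSAC rejects geometrically inconsistent matches
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return [m for m, keep in zip(good, mask.ravel()) if keep]
```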
In multitemporal very-high-resolution urban remote sensing images, buildings, especially high-rise buildings, differ in morphology because of different view angles. In the coregistered images, the pixels of the same building do not correspond to each other, which causes false alarms in change detection. Our objective is to find the matching points located on the roofs of high-rise buildings. When the difference in view angle between the coregistered images is fixed, we observe that there are spatial translation relationships, i.e., a local translation transformation and a fixed angle offset, between the point matches of high-rise building roofs. Using these relationships, a method is proposed that can sift out the point matches of roofs to their correct positions. The experimental results show that most point matches located on the roof of the same building can be quickly and correctly sifted out.
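A toy sketch of the fixed-angle-offset idea, not the paper's exact rule: keep only point matches whose displacement vector has an angle close to the expected offset between the two views. The function name and angular tolerance are assumptions made for illustration.

```python
import numpy as np

def roof_match_mask(pts1, pts2, expected_angle_deg, tol_deg=5.0):
    """pts1, pts2: (N, 2) arrays of matched point coordinates in the two images.
    Returns a boolean mask of matches consistent with the expected angle offset."""
    d = np.asarray(pts2, float) - np.asarray(pts1, float)      # displacement vectors
    angles = np.degrees(np.arctan2(d[:, 1], d[:, 0]))
    # wrap the angular difference into [-180, 180) before thresholding
    diff = (angles - expected_angle_deg + 180.0) % 360.0 - 180.0
    return np.abs(diff) < tol_deg
```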
Detecting changes in multi-temporal remote sensing images is important in many image applications. Because of the influence of climate and illumination, the texture of ground objects is more stable than their gray level in high-resolution remote sensing images, and texture features such as Local Binary Patterns (LBP) and Speeded Up Robust Features (SURF) offer fast extraction and illumination invariance. A change detection method for matched remote sensing image pairs is presented: the image is divided into blocks, and the similarity of corresponding blocks is compared using LBP and SURF to decide whether each block has changed. Region growing is then adopted to process the block edge zones. The experimental results show that the method can tolerate some illumination change and slight texture change of ground objects.
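A minimal sketch of the LBP part of this idea, for illustration only: compare uniform-LBP histograms of corresponding blocks and flag a block as changed when their similarity falls below a threshold. The SURF comparison and the region growing at block edges are omitted, and the block size and threshold are assumptions.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_hist(block, P=8, R=1.0):
    lbp = local_binary_pattern(block, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def changed_blocks(img1, img2, block=32, thresh=0.7):
    """Return a boolean map: True where a block is judged as changed."""
    h, w = img1.shape
    out = np.zeros((h // block, w // block), dtype=bool)
    for i in range(h // block):
        for j in range(w // block):
            b1 = img1[i*block:(i+1)*block, j*block:(j+1)*block]
            b2 = img2[i*block:(i+1)*block, j*block:(j+1)*block]
            # histogram intersection as a simple similarity measure
            sim = np.minimum(lbp_hist(b1), lbp_hist(b2)).sum()
            out[i, j] = sim < thresh
    return out
```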