Real-time image processing and region-of-interest extraction are crucial in non-standard welding processes guided by structured light vision. However, constraints on detection speed, accuracy, and applicability make existing methods difficult to apply directly. To address these issues, we screened various improvement strategies through experiments and propose a practical lightweight algorithm, YOLO-DGB, based on YOLOv5s, a current mainstream object detection algorithm. The proposed algorithm introduces depthwise separable convolution and Ghost modules into the backbone network to reduce the number of parameters and floating-point operations (FLOPs) during detection. To meet the accuracy requirements of the detection network, a bottleneck transformer is introduced after the spatial pyramid pooling-fast (SPPF) module, which improves detection performance while keeping the parameter count and computation reasonable. To address the issue of insufficient datasets, we propose an improved DCGAN to augment the collected images. Compared with the original YOLOv5s network, our proposed algorithm reduces the number of parameters and FLOPs by 35.5% and 44.8%, respectively, while increasing the mAP of the model from 96.5% to 98.1%. Experimental results demonstrate that our algorithm can effectively meet the requirements of actual production processes.
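As an illustration of the lightweight backbone components mentioned above, the following is a minimal PyTorch sketch of a depthwise separable convolution and a Ghost-style module; it is not the authors' YOLO-DGB code, and all class names, channel splits, and activation choices are assumptions made for clarity.

```python
# Illustrative sketch only (not the authors' YOLO-DGB implementation).
# Shows how depthwise separable convolution and a Ghost-style module
# reduce parameters and FLOPs relative to a standard convolution.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv (per-channel) followed by a 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, k=3, s=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, s, k // 2, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class GhostModule(nn.Module):
    """Produces half the output channels with a standard conv and the
    remaining 'ghost' channels with cheap depthwise operations."""
    def __init__(self, in_ch, out_ch, k=1, cheap_k=3):
        super().__init__()
        primary_ch = out_ch // 2
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, k, 1, k // 2, bias=False),
            nn.BatchNorm2d(primary_ch), nn.SiLU())
        self.cheap = nn.Sequential(
            nn.Conv2d(primary_ch, out_ch - primary_ch, cheap_k, 1, cheap_k // 2,
                      groups=primary_ch, bias=False),
            nn.BatchNorm2d(out_ch - primary_ch), nn.SiLU())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

# Example: both blocks map 64 channels to 128 with far fewer parameters
# than a dense 3x3 convolution would require.
x = torch.rand(1, 64, 80, 80)
print(DepthwiseSeparableConv(64, 128)(x).shape, GhostModule(64, 128)(x).shape)
```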
A common problem in object detection is that image features cannot be fully expressed. Another issue is that the static query selection in detection transformer (DETR)-like models cannot adapt well to different datasets because the number of selected object queries is fixed. To solve these problems, hollow attention (HA) and dynamic query selection (DQS) modules are proposed and combined into a network, HA-DQS-Net. HA integrates specially designed masks into self-attention to better combine channel and spatial directional feature information, thereby learning more complex and comprehensive target features. DQS improves on the static query selection of current DETR-like models by dynamically selecting the number of object queries according to the actual number of targets in the image, which enhances model accuracy. HA-DQS-Net, which combines the advantages of HA and DQS, achieves competitive performance in object detection. The detection effectiveness of our approach is validated on PASCAL VOC and a homemade smoking dataset. Notably, AP improves for every DETR-like model to which HA is applied, demonstrating the generality of the HA module.
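To make the dynamic query selection idea concrete, the following is a minimal sketch under stated assumptions; it is not the authors' HA-DQS-Net code, and the function name, confidence threshold, and fallback top-k value are illustrative choices only.

```python
# Minimal sketch of dynamic query selection (assumed interface, not the
# authors' implementation): instead of a fixed top-k, keep only encoder
# proposals whose foreground confidence exceeds a threshold, so the number
# of object queries adapts to how many targets the image actually contains.
import torch

def dynamic_query_selection(enc_logits, enc_boxes, conf_thresh=0.3, min_queries=10):
    """enc_logits: (N, num_classes) per-proposal class logits from the encoder.
       enc_boxes:  (N, 4) corresponding box proposals."""
    scores = enc_logits.sigmoid().max(dim=-1).values   # foreground confidence per proposal
    keep = scores > conf_thresh                        # adaptive, image-dependent selection
    if keep.sum() < min_queries:                       # fall back to a small fixed top-k
        keep = torch.zeros_like(keep)
        keep[scores.topk(min_queries).indices] = True
    return enc_boxes[keep], enc_logits[keep]

# Example: 300 encoder proposals, 20 classes; the number of selected
# queries varies from image to image instead of being fixed.
logits = torch.randn(300, 20)
boxes = torch.rand(300, 4)
sel_boxes, sel_logits = dynamic_query_selection(logits, boxes)
print(sel_boxes.shape)
```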
KEYWORDS: 3D image reconstruction, Natural surfaces, 3D image processing, Tissues, Polarization, Polarimetry, Depth maps, 3D modeling, 3D acquisition, Surgery
By acquiring three-dimensional profiles of biological tissues, interventions can be performed with increased speed and accuracy, driving the development of next-generation image-guided therapy. However, current three-dimensional reconstruction techniques relying on feature detection and matching struggle with tissues lacking distinct features, resulting in relatively sparse reconstruction results. In this paper, we propose a data-driven method for reconstructing three-dimensional surfaces from a single polarimetric image, utilizing physics-based priors. We constructed a calibrated imaging system consisting of a polarization camera and a 3D scanner to collect polarization information and ground-truth 3D data. Using this system, we created a dataset with organ models, capturing polarization images, depth maps, and surface normal maps under different lighting conditions. To achieve our goal, we designed a deep neural network based on the U-Net architecture. This network takes the polarization image and prior physical parameter maps (phase angle, degree of polarization, and unpolarized intensity) as inputs and is trained to output the surface normal map and relative depth map of the organ. Experimental results on the tissue phantom dataset demonstrate the effectiveness of our method in generating dense reconstruction results, even for regions lacking distinct features. Furthermore, we validated the robustness of our method to changes in the light source direction, showcasing its ability to handle variations in lighting conditions. Overall, our proposed data-driven approach provides a promising solution for dense three-dimensional reconstruction from a single polarimetric image, leveraging physics-based priors and deep learning techniques.
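As a rough illustration of the described network design, the sketch below shows a small U-Net-style encoder-decoder that stacks the polarization image with the physics prior maps and predicts a surface normal map plus a relative depth map. This is not the paper's released code; the class name, channel counts, and two-level depth are assumptions chosen to keep the example short.

```python
# Illustrative sketch (assumed architecture, not the authors' code): a compact
# U-Net-style network whose input channels are the polarization image plus the
# physics priors (phase angle, degree of polarization, unpolarized intensity),
# with two output heads for surface normals (3 ch) and relative depth (1 ch).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class PolarUNet(nn.Module):
    def __init__(self, in_ch=4):  # intensity + phase angle + DoP + unpolarized intensity
        super().__init__()
        self.enc1, self.enc2 = conv_block(in_ch, 32), conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2, self.dec2 = nn.ConvTranspose2d(128, 64, 2, stride=2), conv_block(128, 64)
        self.up1, self.dec1 = nn.ConvTranspose2d(64, 32, 2, stride=2), conv_block(64, 32)
        self.normal_head = nn.Conv2d(32, 3, 1)  # surface normal map
        self.depth_head = nn.Conv2d(32, 1, 1)   # relative depth map

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))  # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.normal_head(d1), self.depth_head(d1)

# Example forward pass on a 256x256 input with the four stacked channels.
normals, depth = PolarUNet()(torch.rand(1, 4, 256, 256))
print(normals.shape, depth.shape)
```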