Purpose: Medical technology for minimally invasive surgery has undergone a paradigm shift with the introduction of robot-assisted surgery. However, it is very difficult to track the position of the surgical tools in a surgical scene, so it is crucial to accurately detect and identify surgical tools. This task can be aided by deep learning-based semantic segmentation of surgical video frames. Furthermore, because the working space and field of view are limited when these instruments are used, there is a higher chance of complications from tissue injuries (e.g., tissue scars and tears).
Approach: With the aid of digital inpainting algorithms, we present an application that uses image segmentation to remove surgical instruments from laparoscopic/endoscopic video. We employ a modified U-Net architecture (U-NetPlus) to segment the surgical instruments. It consists of a pre-trained VGG11 or VGG16 encoder and a redesigned decoder. The decoder was modified by replacing the transposed convolution operation with an up-sampling operation based on nearest-neighbor interpolation. This substitution eliminates the artifacts generated by the transposed convolution, and the interpolation weights do not need to be learned. In addition, we use a very fast and adaptable data augmentation technique to further enhance performance. The tool removal algorithms use the previously obtained tool segmentation masks, together with either instrument-free reference frames or previous instrument-containing frames, to fill in (i.e., inpaint) the instrument segmentation mask.
Results: We have shown the effectiveness of the proposed surgical tool segmentation/removal algorithms on robotic instrument datasets from the MICCAI 2015 and 2017 EndoVis Challenges. We report a 90.20% Dice score for binary segmentation, a 76.26% Dice score for instrument part segmentation, and a 46.07% Dice score for instrument type (i.e., all instruments) segmentation on the MICCAI 2017 challenge dataset using our U-NetPlus architecture, outperforming the results of earlier techniques used and tested on these data. In addition, we demonstrated successful tool removal on synthetic videos generated by artificially embedding moving surgical tools into surgical tool-free videos.
Conclusions: Our application successfully segments and removes the surgical tool to reveal a view of the background tissue that was otherwise hidden by the tool, producing results that are visually similar to the actual data.
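The nearest-neighbor up-sampling operation mentioned in this abstract can be illustrated with a minimal pure-Python sketch. The function name `nn_upsample` and the default scale factor of 2 are assumptions made for illustration only; this is not the U-NetPlus decoder itself, just a demonstration that the operation copies existing values and therefore has no weights to learn.

```python
def nn_upsample(feature_map, scale=2):
    """Upsample a 2D list by repeating every value `scale` times along both axes.

    Hypothetical helper: it mimics nearest-neighbor interpolation for an
    integer scale factor and involves no trainable parameters.
    """
    upsampled = []
    for row in feature_map:
        # Repeat each value horizontally.
        wide_row = [value for value in row for _ in range(scale)]
        # Repeat the widened row vertically.
        for _ in range(scale):
            upsampled.append(list(wide_row))
    return upsampled


small = [[1, 2],
         [3, 4]]
large = nn_upsample(small)
for row in large:
    print(row)
```

Each input value becomes a `scale` x `scale` block in the output, so a 2 x 2 map becomes a 4 x 4 map.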
Label noise is inevitable in medical image databases; it can degrade the actual performance of supervised deep learning models and bias their evaluation. Existing literature shows that label noise in one class has minimal impact on a model's performance for another class in natural image classification problems, where different target classes have relatively distinct shapes and share few visual cues for knowledge transfer among the classes. However, it is not clear how class-dependent label noise affects a model's performance on medical images, for which different output classes can be difficult to distinguish even for experts and there is a high possibility of knowledge transfer across classes during training. We hypothesize that, for medical image classification tasks in which different classes share very similar shapes and differ only in texture, noisy labels for one class might affect performance across other classes, and we investigate this hypothesis.
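One way to picture class-dependent label noise is to flip only the labels of a single class with a fixed probability. The sketch below is an assumption about how such noise could be simulated; the abstract does not describe the noise model, and the function name, the integer class labels, and the `noise_rate` parameter are all hypothetical.

```python
import random


def inject_class_dependent_noise(labels, source_class, target_class,
                                 noise_rate, seed=0):
    """Flip labels of `source_class` to `target_class` with probability `noise_rate`.

    Labels of every other class are left untouched, so the noise depends
    on the class. This is an illustrative noise model, not the one used
    in the study.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    noisy = []
    for label in labels:
        if label == source_class and rng.random() < noise_rate:
            noisy.append(target_class)
        else:
            noisy.append(label)
    return noisy


clean = [0, 1, 2, 1, 0, 1, 2, 1]
noisy = inject_class_dependent_noise(clean, source_class=1,
                                     target_class=2, noise_rate=0.5)
print(noisy)
```

Training on `noisy` and then measuring per-class accuracy on clean test labels would show whether corrupting class 1 also changes performance on classes 0 and 2.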
While deep learning has shown potential in solving a variety of medical image analysis problems, including segmentation, registration, and motion estimation, its application in real-world clinical settings remains limited because prediction failures undermine the reliability of deep learning models. Furthermore, deep learning models need large amounts of labeled data. In this work, we propose a novel method that incorporates uncertainty estimation to detect failures in the segmentation masks generated by CNNs. Our study further showcases the potential of our model to evaluate the correlation between the uncertainty and the segmentation errors for a given model. In addition, we introduce a multi-task, cross-task learning consistency approach to enforce the correlation between the pixel-level (segmentation) and the geometric-level (distance map) tasks. Our extensive experimentation with varied quantities of labeled data in the training sets demonstrates the effectiveness of our model for the segmentation and uncertainty estimation of the left ventricle (LV), right ventricle (RV), and myocardium (Myo) at the end-diastole (ED) and end-systole (ES) phases from cine MRI images available through the MICCAI 2017 ACDC Challenge dataset. Our study serves as a proof of concept of how the uncertainty measure correlates with the erroneous segmentations generated by different deep learning models, and it showcases the potential of our model to flag low-quality segmentations from a given model, which we will pursue in a future study.
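The idea of relating uncertainty to segmentation error can be sketched as follows. The abstract does not state which uncertainty measure is used, so this sketch assumes per-pixel predictive entropy of the softmax output and a Pearson correlation with a binary error indicator; the function names and the toy values are hypothetical.

```python
import math


def predictive_entropy(class_probs):
    """Entropy of one pixel's class-probability vector (assumed uncertainty measure)."""
    return -sum(p * math.log(p) for p in class_probs if p > 0.0)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy softmax outputs for four pixels and three classes.
pixel_probs = [
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.90, 0.05, 0.05],
    [0.34, 0.33, 0.33],
]
# 1 where the predicted class is wrong, 0 where it is correct (toy values).
pixel_errors = [0, 1, 0, 1]

uncertainty = [predictive_entropy(p) for p in pixel_probs]
r = pearson(uncertainty, pixel_errors)
print(f"correlation between uncertainty and error: {r:.3f}")
```

A strongly positive correlation on real data would suggest that high-uncertainty regions are useful candidates for flagging segmentation failures.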
Surgical tool segmentation is becoming imperative to provide detailed information during intra-operative execution. These tools can hamper surgeons’ dexterity control due to the narrow working space and limited visual field of view, which increases the risk of complications resulting from tissue injuries (e.g., tissue scars and tears). This paper demonstrates a novel application of segmenting and removing surgical instruments from laparoscopic/endoscopic video using digital inpainting algorithms. To segment the surgical instruments, we use a modified U-Net architecture (U-NetPlus) composed of a pre-trained VGG11 or VGG16 encoder and a redesigned decoder. The decoder is modified by replacing the transposed convolution operation with an up-sampling operation based on nearest-neighbor (NN) interpolation. This modification removes the artifacts generated by the transposed convolution, and, furthermore, the new interpolation weights require no learning for the up-sampling operation. The tool removal algorithms use the tool segmentation mask and either the instrument-free reference frames or previous instrument-containing frames to fill in (i.e., inpaint) the instrument segmentation mask with the background tissue underneath. We have demonstrated the performance of the proposed surgical tool segmentation/removal algorithms on a robotic instrument dataset from the MICCAI 2015 EndoVis Challenge. We also showed successful performance of the tool removal algorithm on synthetically generated instrument-containing videos obtained by embedding a moving surgical tool into surgical tool-free videos. Our application successfully segments and removes the surgical tool to unveil the background tissue view otherwise obstructed by the tool, producing results visually comparable to the ground truth.
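The simplest form of mask-guided tool removal copies pixels from an instrument-free reference frame into the region covered by the tool mask. The sketch below assumes grayscale frames stored as nested lists and a reference frame that is already aligned with the current frame; the function name `remove_tool` is hypothetical, and the inpainting algorithms described in the abstract may be considerably more elaborate.

```python
def remove_tool(frame, tool_mask, reference_frame):
    """Replace masked (tool) pixels with the co-located reference pixels.

    `tool_mask` holds 1 where the segmentation marks the instrument and
    0 elsewhere. Assumes all three inputs have identical dimensions and
    that the reference frame is spatially aligned with `frame`.
    """
    return [
        [ref_px if is_tool else px
         for px, is_tool, ref_px in zip(frame_row, mask_row, ref_row)]
        for frame_row, mask_row, ref_row in zip(frame, tool_mask, reference_frame)
    ]


frame = [[10, 255],
         [255, 40]]          # 255 marks bright tool pixels in this toy frame
tool_mask = [[0, 1],
             [1, 0]]
reference = [[11, 20],
             [30, 41]]       # instrument-free view of the same scene

restored = remove_tool(frame, tool_mask, reference)
print(restored)
```

Pixels outside the mask are kept from the current frame, so only the area hidden by the instrument is filled with background tissue from the reference.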
With the advent of cardiac cine magnetic resonance (CMR) imaging, there has been a paradigm shift in medical technology, thanks to its capability of imaging different structures within the heart without ionizing radiation. However, it is very challenging to conduct pre-operative planning of minimally invasive cardiac procedures without accurate segmentation and identification of the left ventricle (LV) blood-pool, right ventricle (RV) blood-pool, and LV-myocardium. Manual segmentation of those structures is time-consuming and often prone to error and biased outcomes. Hence, automatic and computationally efficient segmentation techniques are paramount. In this work, we propose a novel memory-efficient convolutional neural network (CNN) architecture for ventricular blood-pool segmentation, obtained by modifying both CondenseNet and DenseNet through the introduction of a bottleneck block and an upsampling path. Our experiments on the Automated Cardiac Diagnosis Challenge (ACDC) dataset show that the proposed architecture uses half (50%) of the memory required by DenseNet and one-twelfth (∼8%) of the memory required by U-Net, while still maintaining excellent cardiac segmentation accuracy. We validated the framework on the ACDC dataset, which features one healthy and four pathology groups whose heart images were acquired throughout the cardiac cycle, and achieved mean Dice scores of 96.78% (LV blood-pool), 93.46% (RV blood-pool), and 90.1% (LV-myocardium). These results are promising and promote the proposed method as a competitive tool for cardiac image segmentation and clinical parameter estimation, with the potential to provide fast and accurate results, as needed for pre-procedural planning and/or pre-operative applications.
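The Dice score reported in these abstracts measures the overlap between a predicted mask and a ground-truth mask as twice the intersection divided by the sum of the two mask sizes. A minimal sketch for flattened binary masks follows; the function name and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the papers.

```python
def dice_score(predicted, ground_truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flattened binary masks."""
    intersection = sum(p * g for p, g in zip(predicted, ground_truth))
    total = sum(predicted) + sum(ground_truth)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


predicted = [1, 1, 0, 0, 1]
ground_truth = [1, 0, 0, 1, 1]
print(f"Dice: {dice_score(predicted, ground_truth):.4f}")
```

Multiplying the result by 100 gives the percentage form used when scores such as 96.78% are reported.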