PURPOSE: Accurate preoperative planning is crucial for liver resection surgery due to the complex anatomical structures and variations among patients. Virtual resections defined by deformable surfaces are a promising approach for effective liver surgery planning. However, the range of available surface definitions raises the question of which definition is most appropriate. METHODS: This study compares NURBS and Bézier surfaces for defining virtual resections through a usability study in which 25 participants (19 biomedical researchers and 6 liver surgeons) completed tasks using different surface types and varying numbers of control points driving surface deformations. Specifically, participants performed virtual liver resections using 16 and 9 control points for NURBS and Bézier surfaces. The goal was to assess whether they could attain an optimal resection plan, effectively balancing complete tumor removal with the preservation of enough healthy liver tissue and function to prevent postoperative liver dysfunction, despite working with fewer control points and different surface properties. Accuracy was assessed using the Hausdorff distance and average surface distance. A survey based on the NASA Task Load Index measured user performance and preferences. RESULTS: NURBS surfaces exhibited improved accuracy and consistency over Bézier surfaces, with lower average surface distance and variability of results. The 95th percentile Hausdorff distance indicates the robustness of NURBS surfaces for the task. Task completion time was influenced by control point dimensions, favoring 3x3 NURBS surfaces (vs. 4x4) for a balanced accuracy-efficiency trade-off. Finally, the survey results indicated that participants preferred NURBS surfaces over Bézier surfaces, citing improved performance, easier surface manipulation, and reduced effort. CONCLUSION: The integration of NURBS surfaces into liver resection planning offers a promising advancement. This study demonstrates their superiority in accuracy, efficiency, and user preference compared to Bézier surfaces. The findings underscore the potential of NURBS-based preoperative planning tools to enhance surgical outcomes in liver resection procedures.
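To make the accuracy metrics concrete, a minimal Python sketch is given below, assuming both the user-defined resection surface and the reference plan are densely sampled as N x 3 point clouds; the function names and the random stand-in data are illustrative, not the study's actual implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def surface_distances(points_a, points_b):
    """One-directional nearest-neighbour distances from points_a to points_b."""
    dists, _ = cKDTree(points_b).query(points_a)
    return dists

def resection_accuracy(user_surface, reference_surface):
    """Symmetric average surface distance and 95th percentile Hausdorff distance."""
    d_ab = surface_distances(user_surface, reference_surface)
    d_ba = surface_distances(reference_surface, user_surface)
    asd = (d_ab.mean() + d_ba.mean()) / 2.0
    hd95 = max(np.percentile(d_ab, 95), np.percentile(d_ba, 95))
    return asd, hd95

# Example with random stand-in N x 3 point clouds sampled from each surface
asd, hd95 = resection_accuracy(np.random.rand(5000, 3), np.random.rand(5000, 3))
print(f"ASD = {asd:.3f}, HD95 = {hd95:.3f}")
```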
PURPOSE: Percutaneous nephrostomy is a commonly performed procedure to drain urine to provide relief in patients with hydronephrosis. Conventional percutaneous nephrostomy needle guidance methods can be difficult, expensive, or not portable. We propose an open-source real-time 3D anatomical visualization aid for needle guidance with live ultrasound segmentation and 3D volume reconstruction using free, open-source software. METHODS: Basic hydronephrotic kidney phantoms were created, and recordings of these models were manually segmented and used to train a deep learning model that makes live segmentation predictions to perform live 3D volume reconstruction of the fluid-filled cavity. Participants performed 5 needle insertions with the visualization aid and 5 insertions with ultrasound needle guidance on a kidney phantom in randomized order, and these were recorded. Recordings of the trials were analyzed for needle tip distance to the center of the target calyx, needle insertion time, and success rate. Participants also completed a survey on their experience. RESULTS: Using the visualization aid showed significantly higher accuracy, while differences in needle insertion time and success rate were not statistically significant at our sample size. Participants mostly responded positively to the visualization aid, and 80% found it easier to use than ultrasound needle guidance. CONCLUSION: We found that our visualization aid produced increased accuracy and an overall positive experience. We demonstrated that our system is functional and stable and believe that the workflow with this system can be applied to other procedures. This visualization aid system is effective on phantoms and is ready for translation with clinical data.
KEYWORDS: Object detection, Ultrasonography, Education and training, Reliability, Video, Receivers, Surgery, Image information entropy, Deep learning, Data acquisition
Computer-based skill assessment relies on accurate metrics to provide comprehensive feedback to trainees. Improving the accuracy of video-based metrics computed using object detection is generally done by improving the performance of the object detection network; however, increasing its performance requires resources that cannot always be obtained. This study aims to improve the accuracy of metrics in central venous catheterization without requiring a high-performing object detection network by removing false positive predictions identified using uncertainty quantification. The uncertainty for each bounding box was calculated using an entropy equation. The uncertainties were then compared to an uncertainty threshold computed using the optimal point of a Receiver Operating Characteristic curve. Predictions were removed if the uncertainty fell below the predefined threshold. 50 videos were recorded and annotated with ground truth bounding boxes. These bounding boxes were used to train an object detection network, which was used to produce predicted bounding boxes for the test set. This method was evaluated by computing metrics for the predicted bounding boxes with and without false positives removed and comparing them to ground truth labels using a Pearson correlation. The Pearson correlations for the baseline comparisons and for the comparisons using false positive removal were 0.922 and 0.816 for syringe path lengths, 0.753 and 0.510 for ultrasound path lengths, 0.831 and 0.489 for ultrasound usage times, and 0.857 and 0.805 for syringe usage times. This method consistently reduced inflated metrics, making it promising for improving metric accuracy.
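A minimal sketch of the uncertainty-quantification idea follows, assuming the per-box uncertainty is the binary entropy of the detector's confidence score and that the threshold is taken at the optimal (Youden's J) point of an ROC curve built on a labelled validation set; the study's exact entropy equation and threshold selection may differ.

```python
import numpy as np
from sklearn.metrics import roc_curve

def box_uncertainty(confidence):
    """Binary entropy of each bounding box's confidence score (assumed uncertainty measure)."""
    p = np.clip(np.asarray(confidence, dtype=float), 1e-6, 1 - 1e-6)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def uncertainty_threshold(uncertainties, is_true_positive):
    """Threshold at the ROC optimal point (maximum TPR - FPR) on validation predictions."""
    # Negate so that a higher score corresponds to a more trustworthy prediction.
    fpr, tpr, thresholds = roc_curve(is_true_positive, -np.asarray(uncertainties))
    return -thresholds[np.argmax(tpr - fpr)]

# Test-set boxes are then kept or removed by comparing their uncertainty to this
# threshold before recomputing the path-length and usage-time metrics.
```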
Following the shift from time-based medical education to a competency-based approach, a computer-assisted training platform would help relieve some of the new time burden placed on physicians. A vital component of these platforms is the computation of competency metrics based on surgical tool motion. Recognizing the class and motion of surgical tools is one step in the development of a training platform, and tool recognition can be achieved with object detection. While previous literature has reported on tool recognition in minimally invasive surgeries, open surgeries have not received the same attention. Open Inguinal Hernia Repair (OIHR), a common surgery that general surgery residents must learn, is an example of such surgeries. We present an object detection method to recognize surgical tools in simulated OIHR. Images were extracted from six video recordings of OIHR performed on phantoms, and tools were labelled with bounding boxes. A YOLOv3 object detection model was trained to recognize the tools used in OIHR. The Average Precision score per class and the mean Average Precision (mAP) were reported to benchmark the model's performance. The mAP over the tool classes was 0.61, with individual Average Precision scores reaching up to 0.98. Tools with poor visibility or similar shapes, such as the forceps or scissors, achieved lower precision scores. With an object detection network that can identify tools, research can be done on tissue-tool interactions to achieve workflow recognition. Workflow recognition would allow a training platform to detect the tasks performed in hernia repair surgeries.
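For reference, a minimal sketch of how per-class Average Precision can be computed from predicted and ground-truth boxes (at a 0.5 IoU threshold) is shown below; the box format, greedy matching rule, and trapezoidal interpolation are illustrative simplifications rather than the exact evaluation code used in the study.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def average_precision(preds, gts, iou_thresh=0.5):
    """preds: list of (image_id, score, box) for one class; gts: image_id -> list of boxes."""
    preds = sorted(preds, key=lambda p: -p[1])                   # highest confidence first
    matched = {img: np.zeros(len(b), dtype=bool) for img, b in gts.items()}
    n_gt = sum(len(b) for b in gts.values())
    tp, fp = np.zeros(len(preds)), np.zeros(len(preds))
    for i, (img, _, box) in enumerate(preds):
        ious = [iou(box, g) for g in gts.get(img, [])]
        j = int(np.argmax(ious)) if ious else -1
        if j >= 0 and ious[j] >= iou_thresh and not matched[img][j]:
            matched[img][j] = True
            tp[i] = 1
        else:
            fp[i] = 1
    recall = np.cumsum(tp) / max(n_gt, 1)
    precision = np.cumsum(tp) / np.maximum(np.cumsum(tp) + np.cumsum(fp), 1)
    return float(np.trapz(precision, recall))    # mAP is the mean over tool classes
```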
Treatment for Basal Cell Carcinoma (BCC) includes excisional surgery to remove cancerous tissue, using a cautery tool to make burns along a defined resection margin around the tumor. Margin evaluation occurs post-surgically, requiring repeat surgery if positive margins are detected. Rapid Evaporative Ionization Mass Spectrometry (REIMS) can help distinguish healthy and cancerous tissue but does not provide spatial information about the cautery tool location where the spectra are acquired. We propose using intraoperative surgical video recordings and deep learning to provide surgeons with guidance for locating sites of potential positive margins. Frames from 14 intraoperative videos of BCC surgery were extracted and used to train a sequence of networks. The first network extracts frames showing surgery in progress; an object detection network then localizes the cautery tool and resection margin. Finally, our burn prediction model leverages both a Long Short-Term Memory (LSTM) network and a Receiver Operating Characteristic (ROC) curve to predict when the surgeon is cutting. The cut identifications will be used in the future for synchronization with iKnife data to provide localizations when cuts are predicted. The model was trained with four-fold cross-validation on a patient-wise split between training, validation, and testing sets. Average recall over the four testing folds was 0.80 for the LSTM and 0.73 for the ROC approach. The video-based approach is simple yet effective at identifying tool-to-skin contact instances and may help guide surgeons, enabling them to deliver precise treatments in combination with iKnife data.
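As an illustration of the sequence-model component, a minimal PyTorch sketch of a per-frame cut/no-cut classifier is given below; the feature dimension, hidden size, and the idea of feeding per-frame detection features are assumptions for illustration, and the decision threshold would be chosen from an ROC curve on validation data rather than fixed in advance.

```python
import torch
import torch.nn as nn

class CutLSTM(nn.Module):
    """LSTM over per-frame feature vectors, emitting a per-frame cutting probability."""
    def __init__(self, feature_dim=8, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, x):                          # x: (batch, time, feature_dim)
        out, _ = self.lstm(x)                      # out: (batch, time, hidden_dim)
        return torch.sigmoid(self.head(out)).squeeze(-1)

# Frames whose probability exceeds the ROC-derived threshold are flagged as cuts
# and can later be synchronized with iKnife spectra acquisition times.
model = CutLSTM()
probs = model(torch.randn(1, 120, 8))              # 120-frame clip of stand-in features
```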
As medical education adopts a competency-based training approach, assessment of skills and timely provision of formative feedback are required. Providing such assessment and feedback places a substantial time burden on surgeons. To reduce this burden, we aim to develop a computer-assisted training platform that provides both instruction and feedback to residents learning open Inguinal Hernia Repairs (IHR). To provide feedback on residents' technical skills, we must first find a method of workflow recognition for the IHR. We thus aim to recognize and distinguish between workflow steps of an open IHR based on the presence and frequencies of the different tool-tissue interactions occurring during each step. Based on ground truth tissue segmentations and tool bounding boxes, we identify the visible tissues within each bounding box, providing an estimate of which tissues a tool is interacting with. The presence and frequencies of the interactions during each step are compared to determine whether this information can be used to distinguish between steps. Based on the ground truth tool-tissue interactions, the presence and frequencies of interactions during each step of the IHR show clear, distinguishable patterns. In conclusion, the distinct differences in the presence and frequencies of tool-tissue interactions between steps offer a viable method of step recognition for an open IHR performed on a phantom.
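A minimal sketch of the tool-tissue interaction estimate is shown below, assuming the ground truth segmentation is a per-pixel label map and the tool is given as an (x1, y1, x2, y2) bounding box; the label values and minimum-pixel cutoff are illustrative, not taken from the study.

```python
import numpy as np

# Illustrative label values for the eight phantom tissues (assumed mapping)
TISSUE_LABELS = {1: "skin", 2: "subcutaneous tissue", 3: "Scarpa's fascia",
                 4: "external oblique aponeurosis", 5: "spermatic cord",
                 6: "hernia sac", 7: "nerves", 8: "superficial epigastric vessels"}

def tissues_in_box(segmentation, box, min_pixels=50):
    """Tissues visible inside a tool bounding box (x1, y1, x2, y2) of a label map."""
    x1, y1, x2, y2 = box
    labels, counts = np.unique(segmentation[y1:y2, x1:x2], return_counts=True)
    return [TISSUE_LABELS[l] for l, c in zip(labels, counts)
            if l in TISSUE_LABELS and c >= min_pixels]

# Counting each (tool, tissue) pair per frame and accumulating over a step gives the
# interaction frequencies that are compared between workflow steps.
```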
Purpose: Computer-assisted surgical skill assessment methods have traditionally relied on tracking tool motion with physical sensors. These tracking systems can be expensive, bulky, and impede tool function. Recent advances in object detection networks have made it possible to quantify tool motion using only a camera. These advances open the door for a low-cost alternative to current physical tracking systems for surgical skill assessment. This study determines the feasibility of using metrics computed with object detection by comparing them to widely accepted metrics computed using traditional tracking methods in central venous catheterization. Methods: Both video and tracking data were recorded from participants performing central venous catheterization on a venous access phantom. A Faster Region-Based Convolutional Neural Network was trained to recognize the ultrasound probe and syringe on the video data. Tracking-based metrics were computed using the Perk Tutor extension of 3D Slicer. The path length and usage time for each tool were then computed using both the video and tracking data. The metrics from object detection and tracking were compared using Spearman rank correlation. Results: The path lengths had a rank correlation coefficient of 0.22 for the syringe (p<0.03) and 0.35 (p<0.001) for the ultrasound probe. For the usage times, the correlation coefficient was 0.37 (p<0.001) for the syringe and 0.34 (p<0.001) for the ultrasound probe. Conclusions: The video-based metrics correlated significantly with the tracked metrics, suggesting that object detection could be a feasible skill assessment method for central venous catheterization.
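A minimal sketch of the two video-based metrics follows, assuming the detector returns one bounding box per frame per tool and that tool position is taken as the box centroid; scipy.stats.spearmanr then compares the video-based values against the tracking-based ones. Variable names and the stand-in numbers are illustrative.

```python
import numpy as np
from scipy.stats import spearmanr

def path_length(centers):
    """Sum of frame-to-frame displacements of an N x 2 array of box centroids (pixels)."""
    return float(np.linalg.norm(np.diff(np.asarray(centers, float), axis=0), axis=1).sum())

def usage_time(n_detected_frames, fps):
    """Seconds during which the tool was detected in the video."""
    return n_detected_frames / fps

# Example: rank correlation between video-based and tracking-based path lengths
# (stand-in arrays; one value per participant trial).
video_pl = [120.0, 300.5, 210.2, 95.7]
tracked_pl = [0.8, 2.1, 1.6, 0.7]
rho, p = spearmanr(video_pl, tracked_pl)
```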
Automated skills assessment of ultrasound-guided needle insertions has previously been explored through 3D motion tracking data. The purpose of this study was to determine the viability of 2D motion tracking data in distinguishing between novice and expert subjects. METHODS: Perspective projection was applied to needle and ultrasound probe time series data. The resulting time series data of 2D points were used to calculate various performance metrics. Using these metrics, classifications between novice and expert were performed by random forest. This procedure was repeated with different camera positions all pointing at the reference point to examine systematically the effect of camera position on assessment. RESULTS: For in-plane needle insertions, mean AUC obtained through 3D data and mean AUC obtained through 2D data were well-matched (0.68 vs. 0.69). For out-of-plane insertions, mean AUC values from 3D and 2D data were more distant (0.86 vs. 0.77), but AUC from the optimal camera angle matched up well (0.85). CONCLUSION: 2D data is comparable to 3D data when used to perform skills assessment of ultrasound-guided needle insertions, and camera placement level with the instruments is optimal. We conclude that videos of needle insertions may be feasible for skills assessment.
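The projection step can be sketched as below, assuming a simple pinhole camera placed at a chosen position and aimed at the reference point; the look-at construction and focal length are illustrative stand-ins for the study's actual projection.

```python
import numpy as np

def look_at(camera_pos, target, up=np.array([0.0, 0.0, 1.0])):
    """Rotation whose rows are the camera's x, y, z axes, looking from camera_pos to target."""
    z = target - camera_pos
    z = z / np.linalg.norm(z)
    x = np.cross(z, up)
    x = x / np.linalg.norm(x)
    y = np.cross(z, x)
    return np.stack([x, y, z])

def project(points_3d, camera_pos, target, focal_length=800.0):
    """Project an N x 3 time series of tracked positions to N x 2 image coordinates."""
    camera_pos = np.asarray(camera_pos, dtype=float)
    R = look_at(camera_pos, np.asarray(target, dtype=float))
    cam = (np.asarray(points_3d, dtype=float) - camera_pos) @ R.T
    return focal_length * cam[:, :2] / cam[:, 2:3]

# Repeating this for several camera positions, all aimed at the reference point, yields
# the 2D needle and probe trajectories from which the performance metrics are computed.
```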
Purpose: Computer-assisted skill assessment has traditionally been focused on general metrics related to tool motion and usage time. While these metrics are important for an overall evaluation of skill, they do not address critical errors made during the procedure. This study examines the effectiveness of utilizing object detection to quantify the critical error of making multiple needle insertion attempts in central venous catheterization. Methods: 6860 images were annotated with ground truth bounding boxes around the syringe attached to the needle. The images were registered using the location of the phantom, and the bounding boxes from the training set were used to identify the regions where the needle was most likely inserting the phantom. A Faster region-based convolutional neural network was trained to identify the syringe and produce the bounding box location for images in the test set. A needle insertion attempt began when the location of the predicted bounding box fell within the identified insertion region. To evaluate this method, we compared the computed number of insertions to the number of insertions identified by human reviewers. Results: The object detection network had an overall mean average precision (mAP) of 0.71. This tracking method computed an average of 4.40 insertion attempts per recording compared to a reviewer count of 1.39 attempts per recording. Conclusions: The difference in the number of insertion attempts identified by the computer and reviewers decreases with an increasing mAP, making this method suitable for detecting multiple needle insertions using an object detection network with a high accuracy.
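A minimal sketch of the attempt-counting rule follows, assuming one predicted syringe box (or None) per frame and an insertion region expressed as an (x1, y1, x2, y2) rectangle derived from the training-set boxes; a new attempt is counted each time the box enters the region. Function names are illustrative.

```python
def boxes_overlap(box, region):
    """True if an (x1, y1, x2, y2) box intersects the (x1, y1, x2, y2) insertion region."""
    return not (box[2] < region[0] or box[0] > region[2] or
                box[3] < region[1] or box[1] > region[3])

def count_insertion_attempts(per_frame_boxes, insertion_region):
    """per_frame_boxes: predicted syringe box per frame, or None when nothing is detected."""
    attempts, inside = 0, False
    for box in per_frame_boxes:
        in_region = box is not None and boxes_overlap(box, insertion_region)
        if in_region and not inside:      # box newly entered the region: count an attempt
            attempts += 1
        inside = in_region
    return attempts
```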
Surgical excision for basal cell carcinoma (BCC) is a common treatment to remove the affected areas of skin. Minimizing positive margins around excised tissue is essential for successful treatment. Residual cancer cells may result in repeat surgery; however, detecting remaining cancer can be challenging and time-consuming. Using chemical signal data acquired while tissue is excised with a cautery tool, the iKnife system can discriminate between healthy and cancerous tissue but lacks spatial information, making it difficult to navigate back to suspicious margins. Intraoperative videos of BCC excision allow cautery locations to be tracked, providing the sites of potential positive margins. We propose a deep learning approach using convolutional neural networks to recognize phases in the videos and subsequently track the cautery location, comparing two localization methods (supervised and semi-supervised). Phase recognition was used for preprocessing to classify frames as showing the surgery or the start/stop of iKnife data acquisition. Only frames designated as showing the surgery were used for cautery localization. Fourteen videos were recorded during BCC excisions with iKnife data collection. On unseen testing data (2 videos, 1,832 frames), the phase recognition model showed an overall accuracy of 86%. Tool localization performed with a mean average precision of 0.98 and 0.96 for supervised and semi-supervised methods, respectively, at a 0.5 intersection over union threshold. Incorporating intraoperative phase data with tool tracking provides surgeons with spatial information about the cautery tool location around suspicious regions, potentially improving the surgeon's ability to navigate back to the area of concern.
PURPOSE: As medical education adopts a competency-based training method, experts are spending substantial amounts of time instructing and assessing trainees' competence. In this study, we aim to develop a computer-assisted training platform that can provide instruction and assessment of open inguinal hernia repairs without needing an expert observer. We recognize workflow tasks based on tool-tissue interactions, which means we first need a method of identifying tissues. This study aims to train a neural network to identify tissues in a low-cost phantom as we work towards identifying the tool-tissue interactions needed for task recognition. METHODS: Eight simulated tissues were segmented throughout five videos of experienced surgeons performing open inguinal hernia repairs on phantoms. A U-Net was trained using leave-one-user-out cross-validation. The average F-score, false positive rate, and false negative rate were calculated for each tissue to evaluate the U-Net's performance. RESULTS: Higher F-scores and lower false negative and false positive rates were recorded for the skin, hernia sac, spermatic cord, and nerves, while slightly lower metrics were recorded for the subcutaneous tissue, Scarpa's fascia, external oblique aponeurosis, and superficial epigastric vessels. CONCLUSION: The U-Net performed better in recognizing tissues that were relatively larger and more prevalent, while struggling to recognize smaller tissues that were only briefly visible. Since workflow recognition does not require perfect segmentation, we believe our U-Net is sufficient for recognizing the tissues of an inguinal hernia repair phantom. Future studies will explore combining our segmentation U-Net with tool detection as we work towards workflow recognition.
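A minimal sketch of the per-tissue evaluation is given below, assuming the prediction and ground truth are per-pixel label maps with one integer label per tissue; it computes the F-score, false positive rate, and false negative rate for a single class. Names are illustrative.

```python
import numpy as np

def tissue_metrics(pred, truth, label):
    """F-score, false positive rate, and false negative rate for one tissue label."""
    p, t = (np.asarray(pred) == label), (np.asarray(truth) == label)
    tp = np.logical_and(p, t).sum()
    fp = np.logical_and(p, ~t).sum()
    fn = np.logical_and(~p, t).sum()
    tn = np.logical_and(~p, ~t).sum()
    f_score = 2 * tp / max(2 * tp + fp + fn, 1)
    fpr = fp / max(fp + tn, 1)
    fnr = fn / max(fn + tp, 1)
    return f_score, fpr, fnr

# Averaging these values over all frames held out for one user gives the
# leave-one-user-out scores reported per tissue.
```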
Purpose: As medical schools move toward competency-based medical education, they seek methods of quantifying trainee skill without human expert supervision. This study evaluates the efficacy of using object detection to track performance metrics in ultrasound-guided interventions, specifically central venous catheterization. While several studies have explored methods to automate the evaluation of these interventions, they typically rely on expensive, bulky markers. Therefore, a webcam-based approach is desirable. Methods: We used the Faster Region-Based Convolutional Neural Network object detection network developed by Ren et al. to track the two-dimensional path length and the usage time of seven tools used in central venous catheterization. Object detection relies solely on webcam imagery. Video data were collected from recordings of 20 central venous catheterization trials by four different medical students. Each recording was separated into individual frames, annotated, and input to the object detection network. Mean average precision was calculated for each fold and each tool. Results: The average mean average precision was 0.66. Between trials one and five, the average reduction in tool usage time was 52%, and the average reduction in 2D path length was 29%. Conclusions: The neural network was able to identify each tool with considerable accuracy. Furthermore, the neural network successfully computed differences in performance metrics that emerge as trainees gain experience. Faster Region-Based Convolutional Neural Network is an effective method to assess trainee skill in ultrasound-guided interventions.
PURPOSE: Under ultrasound guidance, procedures that have been traditionally performed using landmark approaches have become safer and more efficient. However, inexperienced trainees struggle with coordinating probe handling and needle insertion. We aimed to establish learning curves to identify the rate of acquisition of in-plane and out-of-plane vascular access skill in novice medical trainees. METHODS: Thirty-eight novice participants were randomly assigned to perform either in-plane or out-of-plane insertions. Participants underwent baseline testing, four practice insertions (with 3D visualization assistance), and final testing; performance metrics were computed for all procedures. Five expert participants performed insertions in both approaches to establish expert performance metric benchmarks. RESULTS: In-plane novices (n=19) demonstrated significant final reductions in needle path inefficiency (45.8 vs. 127.1, p<0.05), needle path length (41.1 mm vs. 58.0 mm, p<0.05), probe path length (11.6 mm vs. 43.8 mm, p<0.01), and maximal distance between needle and ultrasound plane (3.1 mm vs. 5.5 mm, p<0.05) and surpassed expert benchmarks in average and maximal rotational error. Out-of-plane novices (n=19) demonstrated significant final reductions in all performance metrics, including needle path inefficiency (54.4 vs. 1102, p<0.01), maximum distance of needle past plane (0.0 mm vs. 7.3 mm, p<0.01), and total time of needle past plane (0.0 s vs. 3.4 s, p<0.01) and surpassed expert benchmarks in maximum distance and time of needle past plane. CONCLUSION: Our learning curves quantify improvement in in-plane and out-of-plane vascular access skill with 3D visualization over multiple attempts. The training session enables more than half of novices to approach expert performance benchmarks.
KEYWORDS: Video, Ultrasonography, Object recognition, RGB color model, 3D modeling, 3D image processing, Visualization, Sensors, Human-machine interfaces, 3D displays
Purpose: Medical schools are shifting from a time-based approach to a competency-based education approach. A competency-based approach requires continuous observation and evaluation of trainees. The goal of Central Line Tutor is to provide instruction and real-time feedback for trainees learning the procedure of central venous catheterization, without requiring a continuous expert observer. The purpose of this study is to test the accuracy of the workflow detection method of Central Line Tutor. This study also examines the effectiveness of object recognition from webcam video for workflow detection. Methods: Five trials of the procedure were recorded from Central Line Tutor. Five reviewers were asked to identify the timestamp of the transition points in each recording. Reviewer timestamps were compared to those identified by Central Line Tutor, and the differences between these values were used to calculate the average transitional delay. Results: Central Line Tutor was able to identify 100% of transition points in the procedure with an average transitional delay of -1.46 ± 0.81 s. The average transitional delay of the electromagnetically (EM) tracked and webcam-tracked steps was -0.35 ± 2.51 s and -2.46 ± 3.57 s, respectively. Conclusions: Central Line Tutor was able to detect completion of all workflow tasks with minimal delay and may be used to provide trainees with real-time feedback. The results also show that object recognition from webcam video is an effective method for detecting workflow tasks in the procedure of central venous catheterization.
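A minimal sketch of the transitional-delay computation follows, assuming one system-detected timestamp and a list of reviewer timestamps per transition point; negative delays mean the system detected the transition before the reviewers did. The stand-in values are illustrative only.

```python
import numpy as np

def average_transitional_delay(system_times, reviewer_times):
    """system_times: one timestamp (s) per transition; reviewer_times: list of
    reviewer timestamps (s) for the same transitions."""
    delays = [s - np.mean(r) for s, r in zip(system_times, reviewer_times)]
    return float(np.mean(delays)), float(np.std(delays))

# Example with stand-in values for two transition points and three reviewers each
mean_delay, sd_delay = average_transitional_delay(
    [12.0, 45.5], [[13.1, 13.4, 12.9], [46.8, 47.2, 46.5]])
```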