VProtoNet: vision transformer-driven prototypical networks for enhanced interpretability in chest x-ray diagnostics

Haoyu Guo; Lifen Jiang; Fengbo Zheng; Yan Liang; Sichen Bao; Xiu Zhang; Qiantong Zhang; Jiawei Tang; Ran Li

doi:10.1117/12.3026314

Published: 27 March 2024


Proceedings Volume 13105, International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2023); 1310539 (2024) https://doi.org/10.1117/12.3026314
Event: 3rd International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2023), 2023, Qingdao, China

Abstract

Deep learning methods have substantially improved the accuracy of lung disease diagnosis from chest X-rays. However, their black-box nature and lack of interpretability reduce physicians' confidence in machine-generated decisions, which continues to limit their use in clinical practice. In this paper, we propose VProtoNet, a novel interpretable deep learning model that produces heatmaps displaying the image features important for diagnosing lung diseases and reveals how the model makes decisions based on them. VProtoNet generates these heatmaps by comparing the features extracted by a Vision Transformer with prototypes learned within the model, each of which represents a typical part of a chest X-ray image. We then reduce each heatmap to a single similarity score that serves as the basis for the model's classification. To verify the effectiveness of our model, we applied it to the ChestX-ray14 dataset and achieved an accuracy of 72.35%. We also analyzed the feature maps generated by the model during classification and found that they intuitively show the model's recognition and understanding of diseased areas, which helps physicians better comprehend the model's decision-making process.
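The prototype comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or pooling rule VProtoNet uses, so the cosine similarity and max pooling below are assumptions chosen for illustration, and the names `prototype_heatmap` and `similarity_score` are hypothetical. The toy vectors stand in for patch embeddings that a Vision Transformer would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def prototype_heatmap(patch_features, prototype, grid_width):
    """Compare every patch embedding with one prototype.

    Returns the similarities arranged as rows of a patch grid, which is
    the heatmap for that prototype.
    """
    sims = [cosine_similarity(p, prototype) for p in patch_features]
    return [sims[i:i + grid_width] for i in range(0, len(sims), grid_width)]


def similarity_score(heatmap):
    """Reduce a heatmap to one score by taking its maximum (assumed pooling)."""
    return max(max(row) for row in heatmap)


# Toy 2x2 grid of 2-dimensional patch embeddings (hypothetical values).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
prototype = [1.0, 0.0]

heatmap = prototype_heatmap(patches, prototype, grid_width=2)
score = similarity_score(heatmap)
```

In this sketch the top-left patch matches the prototype exactly, so it dominates the heatmap and sets the score. A classifier in this style would feed one such score per prototype into a final linear layer.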

(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.

Citation

Haoyu Guo, Lifen Jiang, Fengbo Zheng, Yan Liang, Sichen Bao, Xiu Zhang, Qiantong Zhang, Jiawei Tang, and Ran Li "VProtoNet: vision transformer-driven prototypical networks for enhanced interpretability in chest x-ray diagnostics", Proc. SPIE 13105, International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2023), 1310539 (27 March 2024); https://doi.org/10.1117/12.3026314
