Objective: To predict right, left, and total lung volume, thoracic cavity volume, and heart volume from subject demographics. Methods and materials: A cohort of 4,610 subjects with chest CT scans and basic demographics (i.e., age, gender, race, smoking status, smoking history, weight, and height) was used in this study. The right and left lungs, thoracic cavity, and heart depicted on chest CT scans were automatically segmented using U-Net, and their volumes were computed. Eight machine learning models (i.e., random forest, multivariate linear regression (MLR), support vector machine (SVM), extreme gradient boosting (XGBoost), multilayer perceptron (MLP), decision tree, k-nearest neighbors (KNN), and Bayesian regression) were developed to predict the volume measures from subject demographics. Ten-fold cross-validation was used to evaluate the performance of the prediction models. R-squared (R²), mean absolute error (MAE), and mean absolute percentage error (MAPE) were used as performance metrics. Results: The MLP model demonstrated the best performance for predicting the thoracic cavity volume (R²: 0.628, MAE: 0.736 liters, MAPE: 10.9%), right lung volume (R²: 0.501, MAE: 0.383 liters, MAPE: 13.9%), and left lung volume (R²: 0.507, MAE: 0.365 liters, MAPE: 15.2%), while the XGBoost model demonstrated the best performance for predicting the total lung volume (R²: 0.514, MAE: 0.728 liters, MAPE: 14.0%) and heart volume (R²: 0.430, MAE: 0.075 liters, MAPE: 13.9%). Conclusion: Our results demonstrate the feasibility of predicting lung, heart, and thoracic cavity volumes from subject demographics.
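The evaluation protocol described above (10-fold cross-validation scored with R², MAE, and MAPE) can be illustrated with a minimal sketch. It is not the study's implementation. Several parts are assumptions made for illustration: the single-predictor least-squares model stands in for the eight models, the data are synthetic, out-of-fold predictions are pooled before scoring (the abstract does not say whether metrics were pooled or averaged per fold), and all function names are hypothetical.

```python
import random
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error, in the units of y (liters here)."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no true value is 0."""
    return 100.0 * mean(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def fit_line(x, y):
    """Ordinary least squares for one predictor: y is modeled as a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def cross_validate(x, y, k=10):
    """Pool out-of-fold predictions (an assumption), then score them once."""
    preds = [0.0] * len(y)
    for train, test in k_fold_indices(len(y), k):
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            preds[i] = a + b * x[i]
    return {"r2": r_squared(y, preds), "mae": mae(y, preds), "mape": mape(y, preds)}


# Synthetic data for illustration only; these values are NOT from the study.
rng = random.Random(42)
height_cm = [rng.uniform(150, 195) for _ in range(200)]
volume_l = [0.05 * h - 3.0 + rng.gauss(0, 0.5) for h in height_cm]
scores = cross_validate(height_cm, volume_l)
print(scores)
```

Because each prediction comes from a model that never saw that subject during fitting, the resulting scores estimate out-of-sample error, which is how the R², MAE, and MAPE values in the Results should be read. MAE keeps the units of the target (liters), while MAPE normalizes by the true volume, which makes errors for small structures such as the heart comparable with those for the lungs.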