Open Access Paper
Classification of skin cancer based on hyperspectral microscopic imaging and machine learning
28 March 2023
Meijie Qi, Yujie Liu, Yanru Li, Lixin Liu, and Zhoufeng Zhang
Proceedings Volume 12601, SPIE-CLP Conference on Advanced Photonics 2022; 1260103 (2023) https://doi.org/10.1117/12.2666425
Event: SPIE-CLP Conference on Advanced Photonics 2022, 2022, Online Only
Abstract
Hyperspectral microscopic imaging (HMI) is a non-contact optical diagnostic method that combines hyperspectral imaging (HSI) with microscopy to provide both spectral and image information about the samples to be measured. In this paper, basal cell carcinoma (BCC), squamous cell carcinoma (SCC) and malignant melanoma (MM) were classified based on synthetic RGB image data from the HMI data cube using four classification methods: extreme learning machine (ELM), support vector machine (SVM), decision tree and random forest (RF). The highest classification accuracy of 0.791±0.060 and a KAPPA value of 0.685±0.095 were obtained when color moment, gray-level co-occurrence matrix (GLCM) and local binary pattern (LBP) features were used for image feature extraction, feature dimensions were reduced by partial least squares (PLS), the sample sets were divided by the hold-out method, and the tissues were classified by the SVM model.

1. INTRODUCTION

Skin cancer is the most common of all cancers, with incidence rates increasing year over year. The diagnosis and treatment of skin cancer have become a major public health problem. Various optical imaging techniques have been used as auxiliary tools in skin cancer diagnosis, offering the advantages of non-invasiveness, high resolution and high sensitivity. Hyperspectral microscopic imaging (HMI) technology [1] combines hyperspectral imaging (HSI) with microscopy to provide a 3D data cube, i.e. spectral and image information, that reflects changes in the physical structure and microenvironment of the samples to be measured. In recent years, HMI has been used as a diagnostic method in the biomedical field for the detection of tissues, cells, microbes, etc. [2-4]. At the same time, the combination of HMI and machine learning can assist doctors in diagnosis and greatly improve diagnostic efficiency and accuracy, and it is expected to find wide application in the future [4-6].

Abdlaty et al. [7] investigated the feasibility of using HSI for quantitative assessment of early skin erythema. HSI and color imaging data were analyzed using linear discriminant analysis (LDA) to perform classification. The classification results, including accuracy and precision, demonstrated that HSI was superior to color imaging in skin erythema assessment. Ortega et al. [8] collected 517 HMI cubes of pathological slides from 13 glioblastoma patients and automatically detected glioblastoma using a convolutional neural network (CNN). The results showed average sensitivity and specificity values of 88% and 77%, respectively, an improvement of 7% and 8% compared with the results obtained using RGB images. Leon et al. [9] used an HMI system to collect 76 images of pigmented skin lesions from 61 patients and classified them into benign and malignant with an automatic recognition and classification framework based on a combination of unsupervised and supervised algorithms. The sensitivity and specificity for the differential diagnosis of benign and malignant pigmented skin lesions were 87.5% and 100%, respectively. Liu et al. [10] combined HMI with machine learning methods for staging identification of squamous cell carcinoma (SCC) based on hyperspectral data and obtained the highest staging accuracy of 0.952±0.014 and a KAPPA value of 0.928±0.022.

In this paper, HMI data cubes of basal cell carcinoma (BCC), squamous cell carcinoma (SCC) and malignant melanoma (MM) were collected by a home-made push-broom HMI system [11]. Based on the image data and various classification models, the classification of BCC, SCC and MM was realized.

2. MATERIALS AND METHODS

2.1 Experimental Materials

The skin tissue samples used in our experiments were purchased from Xi’an Alenabio company (sample No: SK801c) and ZhongkeGuanghua (Xi’an) Intelligent Biotechnology company (sample No: K683501). In total, there were 34 cases of BCC, 63 cases of SCC and 39 cases of MM.

2.2 HMI System

The HMI system used in our experiments [11] had a wavelength range of 465.5-905.1 nm with a total of 151 bands and a spectral resolution of ~3 nm. The system magnification was 28.15×, the field of view was 400.18 μm × 192.47 μm, and the actual spatial resolution ranged from ~1.10 to 1.38 μm depending on the wavelength.

2.3 Classification Methods

2.3.1 Image feature extraction

The image information in the HMI data cube displays the physical structure and spatial distribution of the samples to be measured. A traditional color image contains three channels (RGB), while the data collected by the HMI system contain 151 bands. According to the International Commission on Illumination, the three primary colors of RGB are red light (R) with a wavelength of 700.0 nm, green light (G) with a wavelength of 546.1 nm and blue light (B) with a wavelength of 435.8 nm. Therefore, we selected the single-band images of band 74 (700.1 nm), band 23 (545.3 nm) and band 1 (465.5 nm) for RGB image synthesis.
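As an illustration, a minimal sketch of this band selection is given below, assuming the HMI cube is stored as a NumPy array of shape (rows, columns, 151) with bands ordered from 465.5 nm upwards; the function name and the per-channel normalization are our own additions rather than part of the paper's processing pipeline.

```python
import numpy as np

def synthesize_rgb(cube, band_indices=(73, 22, 0)):
    """Stack three single-band images of an HMI cube into a synthetic RGB image.

    `cube` is assumed to have shape (rows, cols, 151); the default 0-based
    indices correspond to band 74 (~700.1 nm), band 23 (~545.3 nm) and
    band 1 (465.5 nm).
    """
    rgb = np.stack([cube[:, :, i] for i in band_indices], axis=-1).astype(float)
    # Scale each channel to [0, 1] so the image can be displayed and
    # used for feature extraction.
    rgb -= rgb.min(axis=(0, 1), keepdims=True)
    rgb /= np.clip(rgb.max(axis=(0, 1), keepdims=True), 1e-12, None)
    return rgb
```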

In this paper, color features and texture features were extracted from the synthesized RGB images and then used for skin cancer classification. The color features included the color moment and the HSV (hue, saturation, value) color space. The texture features included the local binary pattern (LBP), the gray-level co-occurrence matrix (GLCM) and the histogram of oriented gradients (HOG).
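The paper does not list the exact parameters used for each descriptor, so the following sketch only illustrates how such features could be extracted from a synthesized RGB image with scikit-image; the LBP radius, GLCM distances and angles, and HOG cell sizes are assumptions.

```python
import numpy as np
from skimage.color import rgb2gray, rgb2hsv
from skimage.feature import graycomatrix, graycoprops, hog, local_binary_pattern

def color_moment_features(rgb):
    # First three moments (mean, standard deviation, skewness) per channel: 9 values.
    feats = []
    for c in range(3):
        ch = rgb[:, :, c].ravel()
        mean, std = ch.mean(), ch.std()
        skew = np.cbrt(((ch - mean) ** 3).mean())
        feats += [mean, std, skew]
    return np.asarray(feats)

def hsv_features(rgb):
    # Mean and standard deviation of the H, S and V channels: 6 values.
    hsv = rgb2hsv(rgb)
    return np.hstack([hsv.mean(axis=(0, 1)), hsv.std(axis=(0, 1))])

def texture_features(rgb):
    gray = rgb2gray(rgb)
    gray8 = (gray * 255).astype(np.uint8)

    # LBP histogram: 8 neighbours, radius 1, uniform patterns (assumed settings).
    lbp = local_binary_pattern(gray8, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

    # GLCM statistics at distance 1 over four directions (assumed settings).
    glcm = graycomatrix(gray8, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    glcm_feats = np.hstack([graycoprops(glcm, prop).ravel()
                            for prop in ("contrast", "correlation", "energy", "homogeneity")])

    # HOG descriptor (assumed cell and block sizes).
    hog_feats = hog(gray, orientations=9, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    return lbp_hist, glcm_feats, hog_feats
```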

2.3.2 Dimensionality reduction

Dimensionality reduction is of great importance for high-dimensional data analysis because it eliminates redundancy among data samples while extracting useful features. In this paper, principal component analysis (PCA) and partial least squares (PLS) were used to reduce the dimensions of the HMI datasets in order to improve model accuracy and speed up the algorithms.
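A hedged sketch of both reductions with scikit-learn is shown below. Because PLS is supervised, the class labels are one-hot encoded in a PLS-DA fashion; this, the standardization step and the function names are our assumptions, since the paper does not state these details.

```python
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.preprocessing import LabelBinarizer, StandardScaler

def reduce_with_pca(X, n_components):
    """Unsupervised reduction: project standardized features onto principal components."""
    Xs = StandardScaler().fit_transform(X)
    pca = PCA(n_components=n_components).fit(Xs)
    return pca.transform(Xs), pca

def reduce_with_pls(X, y, n_components):
    """Supervised reduction: PLS scores computed against one-hot class labels."""
    Xs = StandardScaler().fit_transform(X)
    Y = LabelBinarizer().fit_transform(y)   # BCC / SCC / MM -> indicator matrix
    pls = PLSRegression(n_components=n_components).fit(Xs, Y)
    return pls.transform(Xs), pls
```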

2.3.3 Classification model

In our experiments, skin cancer classification models were established based on four classification methods: extreme learning machine (ELM), SVM, decision tree and random forest (RF).
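As a sketch, the SVM, decision tree and RF classifiers are available directly in scikit-learn, while ELM is not, so a minimal random-hidden-layer implementation is outlined below. All hyperparameters (hidden-layer size, kernel, number of trees) are illustrative assumptions rather than the settings used in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelBinarizer
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

class SimpleELM:
    """Minimal extreme learning machine: a random hidden layer followed by
    an analytic least-squares solution for the output weights."""

    def __init__(self, n_hidden=100, random_state=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(random_state)

    def fit(self, X, y):
        self._lb = LabelBinarizer()
        T = self._lb.fit_transform(y)                      # one-hot targets
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)                   # hidden-layer outputs
        self.beta = np.linalg.pinv(H) @ T                  # output weights
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        return self._lb.inverse_transform(H @ self.beta)   # class with the largest score

# Candidate models (hyperparameters are illustrative assumptions).
models = {
    "ELM": SimpleELM(n_hidden=100),
    "SVM": SVC(kernel="rbf", C=1.0, gamma="scale"),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}
```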

3. RESULTS

3.1 Dimensionality reduction

To remove redundant information and reduce computational complexity, dimensionality reduction was applied to the image features. Figure 1 and Figure 2 show the contribution rates of the color moment, HSV color space, HOG, GLCM and LBP features under PCA and PLS dimensionality reduction, respectively. Comparing the contribution rates of each image feature shows that PLS outperformed PCA, achieving higher contribution rates with fewer principal components. Therefore, PLS was selected for dimensionality reduction in the experiments. In the subsequent analysis, the first three principal components were extracted for the color moment, HSV color space and GLCM features, and the first 10 principal components were extracted for the HOG and LBP features.

Figure 1. Contribution rates of different image features with PCA dimensionality reduction. (a) color moment; (b) HSV color space; (c) HOG; (d) GLCM; (e) LBP.

Figure 2. Contribution rates of different image features with PLS dimensionality reduction. (a) color moment; (b) HSV color space; (c) HOG; (d) GLCM; (e) LBP.
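For reference, one way to obtain per-component contribution rates such as those plotted in Figures 1 and 2 is sketched below: PCA provides them directly as explained variance ratios, while for PLS they can be approximated as the fraction of the total X variance reconstructed by each latent component. This PLS definition is our assumption, since the paper does not specify the formula it used.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.preprocessing import LabelBinarizer, StandardScaler

def contribution_rates(X, y, n_components=10):
    Xs = StandardScaler().fit_transform(X)

    # PCA: explained variance ratio of each principal component.
    pca_rates = PCA(n_components=n_components).fit(Xs).explained_variance_ratio_

    # PLS: fraction of the total (standardized) X variance reconstructed by
    # each latent component, computed from the scores and X loadings.
    Y = LabelBinarizer().fit_transform(y)
    pls = PLSRegression(n_components=n_components).fit(Xs, Y)
    T, P = pls.transform(Xs), pls.x_loadings_
    total_ss = np.sum(Xs ** 2)
    pls_rates = np.array([np.sum(np.outer(T[:, a], P[:, a]) ** 2) / total_ss
                          for a in range(n_components)])
    return pca_rates, pls_rates
```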

3.2 Combination of Image Features

In order to extract the optimal image features, we tried different combinations of the HSV color space, color moment, GLCM, LBP and HOG features. Table 1 shows the classification results when PLS was used for dimensionality reduction, the hold-out method for dataset partition and the SVM model for skin cancer classification. The classification accuracies based on each single image feature, from high to low, were: color moment, LBP, GLCM, HSV color space and HOG. Since the HOG feature showed the lowest classification accuracy and KAPPA value (0.454±0.050, 0.164±0.078), it was discarded as unreliable. Among the combinations of the remaining four image features, the combination of color moment, GLCM and LBP had the highest accuracy of 0.791±0.060 and KAPPA value of 0.685±0.095. Therefore, these three features were extracted in the subsequent skin cancer classifications with the various models.

Table 1. Classification results of the SVM model based on image features.

Classification model | Number of features | Feature                         | Accuracy    | KAPPA
SVM                  | 1                  | HSV                             | 0.563±0.038 | 0.337±0.058
SVM                  | 1                  | Color moment                    | 0.645±0.044 | 0.457±0.074
SVM                  | 1                  | HOG                             | 0.454±0.050 | 0.164±0.078
SVM                  | 1                  | GLCM                            | 0.630±0.032 | 0.439±0.049
SVM                  | 1                  | LBP                             | 0.635±0.038 | 0.446±0.059
SVM                  | 2                  | HSV + Color moment              | 0.642±0.044 | 0.459±0.067
SVM                  | 2                  | HSV + GLCM                      | 0.693±0.062 | 0.535±0.092
SVM                  | 2                  | HSV + LBP                       | 0.603±0.038 | 0.397±0.060
SVM                  | 2                  | Color moment + GLCM             | 0.713±0.069 | 0.566±0.092
SVM                  | 2                  | Color moment + LBP              | 0.669±0.069 | 0.500±0.103
SVM                  | 2                  | GLCM + LBP                      | 0.684±0.056 | 0.523±0.086
SVM                  | 3                  | HSV + Color moment + GLCM       | 0.709±0.056 | 0.559±0.084
SVM                  | 3                  | HSV + Color moment + LBP        | 0.694±0.081 | 0.538±0.098
SVM                  | 3                  | Color moment + GLCM + LBP       | 0.791±0.060 | 0.685±0.094
SVM                  | 4                  | HSV + Color moment + GLCM + LBP | 0.745±0.062 | 0.613±0.094
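The combination search summarized in Table 1 could be reproduced with a loop of roughly the following form; the split ratio, SVM settings and function names are assumptions, and the dimension-reduced feature blocks are assumed to be available as a dictionary.

```python
from itertools import combinations

import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def evaluate_feature_combinations(feature_blocks, y, test_size=0.3, random_state=0):
    """feature_blocks: dict mapping a feature name (e.g. 'GLCM') to its
    dimension-reduced feature matrix of shape (n_samples, n_components)."""
    results = {}
    for k in range(1, len(feature_blocks) + 1):
        for names in combinations(feature_blocks, k):
            X = np.hstack([feature_blocks[name] for name in names])
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, test_size=test_size, stratify=y, random_state=random_state)
            y_pred = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr).predict(X_te)
            results[" + ".join(names)] = (accuracy_score(y_te, y_pred),
                                          cohen_kappa_score(y_te, y_pred))
    return results
```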

3.3 Skin cancer classification

According to the results in Sections 3.1 and 3.2, the color moment, GLCM and LBP features were extracted from the images, PLS was used to reduce the dimensions, and the hold-out method was used to divide the training and test datasets. Skin cancer classification models for BCC, SCC and MM were established based on ELM, SVM, decision tree and RF. From the results shown in Table 2, the SVM model performed best overall, followed by RF, decision tree and ELM. Figure 3 shows the skin cancer classification result of the SVM model, for which an accuracy of 0.802 and a KAPPA value of 0.699 were obtained.

Figure 3. Skin cancer classification result of the SVM model based on the color moment, GLCM and LBP features. The "o" and "*" represent the true values and predicted values, respectively.

Table 2. Skin cancer classification results of each model based on the color moment, GLCM and LBP features.

Classification model | Accuracy    | KAPPA
ELM                  | 0.661±0.063 | 0.490±0.095
SVM                  | 0.791±0.060 | 0.685±0.095
Decision tree        | 0.671±0.044 | 0.505±0.068
RF                   | 0.706±0.056 | 0.553±0.085
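Since Table 2 reports mean ± standard deviation, the figures presumably come from repeated random hold-out splits; a sketch of such an evaluation is given below, where the number of repetitions and the split ratio are assumptions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split

def repeated_holdout(model, X, y, n_repeats=20, test_size=0.3):
    """Return (mean, std) of accuracy and Cohen's kappa over repeated hold-out splits."""
    accs, kappas = [], []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed)
        y_pred = model.fit(X_tr, y_tr).predict(X_te)   # refit the model on each split
        accs.append(accuracy_score(y_te, y_pred))
        kappas.append(cohen_kappa_score(y_te, y_pred))
    return (np.mean(accs), np.std(accs)), (np.mean(kappas), np.std(kappas))
```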

4. CONCLUSION

In this paper, we performed skin cancer classification using HMI image data and machine learning. For the classification of BCC, SCC and MM, different image features and their combinations were extracted from the synthetic RGB image data in the HMI cube; the features were dimensionally reduced with PCA and PLS; the training and test sets were divided using the hold-out method; and skin cancer classification models (ELM, SVM, decision tree and RF) were established.

The experimental results showed that PLS was better than PCA and was therefore used for dimensionality reduction; the combination of color moment, GLCM and LBP features was selected as the image feature set for its superiority over the other features and combinations; and the SVM model obtained the highest classification accuracy and KAPPA value (0.791±0.060, 0.685±0.095). In future work, more tissue samples will be used to test and optimize the model performance. We believe that the combination of HMI and machine learning will help doctors diagnose skin cancer with high efficiency and accuracy, and will find extensive applications in the biomedical field in the future.

ACKNOWLEDGEMENTS

This work was supported by the 111 Project and Open Research Fund of CAS Key Laboratory of Spectral Imaging Technology (LSIT202005W).

REFERENCES

[1] Schultz, R.A., Nielsen, T., Zavaleta, J.R., Ruch, R., Wyatt, R. and Garner, H.R., "Hyperspectral imaging: a novel approach for microscopic analysis," Cytometry, 43(4), 239-247 (2001). https://doi.org/10.1002/(ISSN)1097-0320

[2] Siddiqi, A.M., Li, H., Faruque, F., Williams, W., Lai, K., Hughson, M., Bigler, S., Beach, J. and Johnson, W., "Use of hyperspectral imaging to distinguish normal, precancerous and cancerous cells," Cancer Cytopathology, 114(1), 13-21 (2008). https://doi.org/10.1002/cncr.23286

[3] Liu, L.X., Li, M.Z., Zhao, Z.G. and Qu, J.L., "Recent advances of hyperspectral imaging application in biomedicine," Chin. J. Lasers, 45(2), 214-223 (2018).

[4] Yoon, J., "Hyperspectral imaging for clinical applications," BioChip J., 16, 1-12 (2022). https://doi.org/10.1007/s13206-021-00041-0

[5] Khan, U., Sidike, P., Elkin, C. and Devabhaktuni, V., "Trends in deep learning for medical hyperspectral image analysis," IEEE Access, 9, 79534-79548 (2021). https://doi.org/10.1109/ACCESS.2021.3068392

[6] Barberio, M., Benedicenti, S., Pizzicannella, M., Felli, E., Collins, T., Jansen-Winkeln, B., Marescaux, J., Viola, M.G. and Diana, M., "Intraoperative guidance using hyperspectral imaging: a review for surgeons," Diagnostics, 11, 2066 (2021). https://doi.org/10.3390/diagnostics11112066

[7] Abdlaty, R., Doerwald-Munoz, L., Madooei, A., Yeh, S.A., Zerubia, J., Wong, R.K., Hayward, J.E., Farrell, T. and Fang, Q., "Grading of skin erythema with hyperspectral imaging," Frontiers in Physics, 6, 72 (2018). https://doi.org/10.3389/fphy.2018.00072

[8] Ortega, S., Halicek, M., Fabelo, H., Camacho, R., de la Luz Plaza, M., Godtliebsen, F., Callicó, G.M. and Fei, B., "Hyperspectral imaging for the detection of glioblastoma tumor cells in H&E slides using convolutional neural networks," Sensors, 20, 1911 (2020). https://doi.org/10.3390/s20071911

[9] Leon, R., Martinez-Vega, B., Fabelo, H., Ortega, S., Melian, V., Castaño, I., Carretero, G., Almeida, P., Garcia, A., Quevedo, E., Hernandez, J.A., Clavo, B. and Callico, G.M., "Non-invasive skin cancer diagnosis using hyperspectral imaging for in-situ clinical support," J. Clin. Med., 9(6), E1662 (2020). https://doi.org/10.3390/jcm9061662

[10] Liu, L., Qi, M., Li, Y., Liu, Y., Liu, X., Zhang, Z. and Qu, J., "Staging of skin cancer based on hyperspectral microscopic imaging and machine learning," Biosensors, 12(10), 790 (2022). https://doi.org/10.3390/bios12100790

[11] Qi, M., Liu, L., Li, Y., Liu, Y., Zhang, Z. and Qu, J., "Design and experiment of push-broom hyperspectral microscopic imaging system," Chin. J. Lasers, 49(20), 2007105 (2022).
Meijie Qi, Yujie Liu, Yanru Li, Lixin Liu, and Zhoufeng Zhang "Classification of skin cancer based on hyperspectral microscopic imaging and machine learning", Proc. SPIE 12601, SPIE-CLP Conference on Advanced Photonics 2022, 1260103 (28 March 2023); https://doi.org/10.1117/12.2666425
KEYWORDS: RGB color model, Skin cancer, Image classification, Feature extraction, Hyperspectral imaging, Machine learning, Data modeling
