1. Introduction

Hyperspectral sensors [e.g., the Airborne Visible Infrared Imaging Spectrometer (AVIRIS), the Hyperspectral Digital Imagery Collection Experiment (HYDICE), HyMap, and EO-1 Hyperion] record a scene over a wide wavelength range from the visible region to the infrared spectrum, providing detailed spectral information about the objects in numerous continuous spectral bands (from tens to several hundreds) as well as a high spatial resolution.1 Due to this high spectral resolution, hyperspectral images offer very high discrimination capability among similar ground cover objects.2 However, the huge number of bands brings the curse of dimensionality: the discriminating ability of the data decreases as the dimensionality increases while the number of labeled training samples remains small.3,4 This behavior is also referred to as the "Hughes phenomenon."5 Moreover, high-dimensional hyperspectral images also contain redundant and noisy information, which increases the computational burden of data processing. Dimensionality reduction therefore becomes an essential task in hyperspectral image processing. Dimensionality reduction is the process of removing redundant data and extracting meaningful features. In other words, it is a convenient way of reducing the number of spectral bands and transforming the data from a high-dimensional space to a lower dimensional space in which the most significant information is conserved.6,7 Dimensionality reduction can be done through either feature selection or feature extraction.
In feature selection, a few informative bands are selected on the basis of adopted selection criteria, namely, distance measures (Euclidean distance, spectral angle mapping, Bhattacharyya distance, Hausdorff distance, and Jeffreys–Matusita distance), information-theoretic approaches (divergence, transformed divergence, and mutual information), and eigenanalysis [principal component analysis (PCA)], and the original physically significant properties of the bands are preserved.8–15 One popular band selection method is the constrained band selection (CBS) method,9 which minimizes the correlation and dependency in the selection of bands. The CBS method builds on two base approaches, (1) constrained energy minimization (CEM) and (2) linearly constrained minimum variance (LCMV), combined with four specific band selection criteria: band correlation minimization (BCM), band correlation constraint (BCC), band dependence minimization (BDM), and band dependence constraint (BDC). Together these yield four variants: CEM-BCC/BDC, CEM-BCM/BDM, LCMV-BCC/BDC, and LCMV-BCM/BDM. Feature selection provides suitable features for classification but is computationally expensive and often not robust in complex scenes (variation in spectral signatures across scenes). On the other hand, feature extraction methods transform the higher dimensional data into a lower dimensional space. They are computationally superior and more robust to complex scenes. However, extracting efficient and suitable features for the classification of large hyperspectral data remains a crucial task.
Feature extraction methods transform the original high-dimensional feature space into a low-dimensional feature space; the physical meaning of the bands is lost, but the significant discriminative information needed for further analysis is preserved.16–26 PCA is one of the most widely used approaches for feature extraction,16 largely because PCA is an invertible transformation, which facilitates the interpretation of the extracted features. However, PCA imposes a high computational load and operates on global features while losing local information.27 The segmented PCA method,17 an extension of PCA, addresses this issue: to exploit local information, PCA is applied to groups of bands formed using the correlation between bands. Another useful feature extraction method is independent component analysis (ICA),19 which is used to extract class-discriminant features from hyperspectral images, although the complexity of ICA increases the computational load. In general, hyperspectral data are nonlinear in nature, so linear classifiers usually provide unsatisfactory classification performance. In recent times, nonlinear methods such as maximum noise fraction,20 kernel PCA,21 and probabilistic PCA (PPCA) have been proposed as extensions to conventional PCA. PPCA is a constrained Gaussian generative latent variable model. PPCA extracts features using maximum likelihood estimates of the parameters associated with the covariance matrix, which can be efficiently calculated from the data principal components.22 In most situations, labeled samples are limited, and obtaining them is an expensive and time-consuming task. On the other hand, unlabeled samples are available in large quantities at low cost.
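To make the baseline concrete, the following is a minimal sketch (not the authors' code) of PCA-based band reduction on a hyperspectral cube; the function name and toy data are illustrative:

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Project a (rows, cols, bands) hyperspectral cube onto its
    leading principal components via the band covariance matrix."""
    r, c, b = cube.shape
    X = cube.reshape(-1, b).astype(float)
    X -= X.mean(axis=0)                       # center each band
    cov = np.cov(X, rowvar=False)             # (bands, bands) covariance
    vals, vecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]
    return (X @ vecs[:, order]).reshape(r, c, n_components)

cube = np.random.rand(10, 10, 50)             # toy 50-band scene
reduced = pca_reduce(cube, 5)
print(reduced.shape)                          # (10, 10, 5)
```

Note that each output component mixes all original bands, which is exactly the loss of physical band meaning mentioned above.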
Hence, semisupervised PPCA was proposed as an extension of PPCA; it incorporates both labeled and unlabeled information into the projection to overcome the scarcity of labeled samples.18 Apart from PCA, two other well-known feature extraction approaches are discriminant analysis feature extraction28 and linear discriminant analysis (LDA).23 In recent times, several extensions of these two methods have been proposed, namely, regularized LDA,23 nonparametric weighted feature extraction (NWFE),24 and kernel NWFE.25 Another popular feature extraction approach is clustering-based feature extraction (CBFE). Clustering partitions the hyperspectral image into several uncorrelated subband groups, each of which contains contiguous bands. Clustering has received increasing attention in the hyperspectral remote sensing community due to its good performance against the curse of dimensionality.29–35 Clustering removes redundancies and correlated data from the high-dimensional data and provides uncorrelated low-dimensional data. In Ref. 30, CBFE is proposed; it works well in small sample size scenarios using the popular k-means clustering algorithm. A semisupervised k-means clustering method has been proposed to utilize the easily available unlabeled samples.36 It uses multiple classifiers for each cluster of bands, and the final output is the fused result of the multiple classifiers. Clustering methods do not require a priori knowledge for the band grouping process; they cluster the bands according to the distribution of the spectral features of the hyperspectral image. However, clustering methods are sensitive to the randomly initialized cluster centers, and the selected subset of bands may be unstable. Hence, in Ref. 37, an automatic clustering method [fast density peak-based clustering (FDPC)] is proposed, which selects the cluster centers using a fast search method.
However, FDPC is not a fully automatic cluster-center selection method and loses data points. Hence, improvements to FDPC have been proposed, namely, enhanced fast density peak-based clustering (E-FDPC)38 and k-means fast density peak-based clustering.39 Dual clustering-based band selection by context analysis (DCCA)33 performs the clustering by considering the context information in the bands of the hyperspectral image. Recently, alongside algorithm development for hyperspectral image classification, fusion methods, both decision-level and feature-level, have gained great interest,40–43 and these methods have demonstrated the ability of combined selected features to improve classification performance. Considering the above study of feature extraction techniques, the authors of this work identified the following challenges:
The main contributions of the proposed method are summarized as follows.
The remainder of this paper is arranged as follows: in Sec. 2, the proposed architecture of EM clustering and weighted average fusion-based hyperspectral image classification is explained in detail. Mathematical details of EM clustering and weighted average fusion are also discussed. Experimental analysis on four standard datasets is presented in Sec. 3. More precisely, the proposed method is compared with other clustering- and fusion-based methods, for both quantitative accuracy and visual interpretation. Section 4 provides the concluding remarks.

2. Proposed Architecture

This section discusses the proposed feature extraction architecture for hyperspectral image classification in detail. The proposed architecture is presented in Fig. 1, which depicts the approach as comprising three stages, namely, band clustering, fusion of the bands of each cluster, and classification. The following sections provide a detailed explanation of the various stages of the proposed system.

2.1. Band Clustering

Hyperspectral data consist of hundreds of spectral bands, which are highly redundant due to similar sensor responses in adjacent bands. The objective of band clustering is to group highly correlated bands together while keeping dissimilar bands in distinct clusters. Figure 2 shows the workflow of the band clustering procedure. Here the Bhattacharyya distance28 is used as the band separability measure for computing the distance between each pair of spectral bands. The Bhattacharyya distance between bands $i$ and $j$ is defined as

$$d(i,j) = \frac{1}{8}(\mu_i-\mu_j)^T\left[\frac{\Sigma_i+\Sigma_j}{2}\right]^{-1}(\mu_i-\mu_j) + \frac{1}{2}\ln\frac{\left|\tfrac{1}{2}(\Sigma_i+\Sigma_j)\right|}{\sqrt{|\Sigma_i|\,|\Sigma_j|}}. \tag{1}$$

Here, $\mu_i$ and $\mu_j$ are the band means, and $\Sigma_i$ and $\Sigma_j$ are the band covariance matrices. Using this distance information, the bands are clustered using the EM clustering algorithm, which is explained in detail in the following section.
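As a concrete illustration of Eq. (1), the following sketch (not the authors' code) computes the Bhattacharyya distance for the univariate case, where each band is summarized by the mean and variance of its pixel values:

```python
import numpy as np

def bhattacharyya(band_i, band_j):
    """Bhattacharyya distance between two bands, each modeled as a
    univariate Gaussian over its pixel values."""
    mi, mj = band_i.mean(), band_j.mean()
    vi, vj = band_i.var(), band_j.var()
    term1 = 0.25 * (mi - mj) ** 2 / (vi + vj)          # mean separation
    term2 = 0.5 * np.log((vi + vj) / (2.0 * np.sqrt(vi * vj)))  # variance mismatch
    return term1 + term2

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 10_000)   # toy band
b = rng.normal(3.0, 1.0, 10_000)   # shifted band
print(bhattacharyya(a, a) < bhattacharyya(a, b))
```

The distance is zero for identical band statistics and grows with both mean separation and variance mismatch, which is what makes it a useful band separability measure here.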
2.1.1. Band clustering using EM algorithm

Using the generated distances between each pair of spectral bands, all the original bands are grouped into $K$ clusters using the EM algorithm. The EM clustering algorithm allots points partially to different clusters instead of assigning them to the closest cluster center; this is achieved by modeling each cluster with a probabilistic distribution, and each point is finally assigned to the cluster with the highest probability. The k-means clustering algorithm is an incremental heuristic approach, whereas the EM algorithm is a statistical algorithm that assumes a statistical model describing the data. The assumption underlying EM cluster analysis is that the patterns are drawn from one or several distributions, and the goal is to identify the parameters of each distribution; in this case, the parameters of a Gaussian mixture model have to be estimated. The EM algorithm44 is a probabilistic method for finding the maximum likelihood estimates of the parameters from the patterns. To form the clusters of bands, bands belonging to the same cluster are assumed to be drawn from a multivariate Gaussian probability distribution. The EM clustering algorithm converges to an optimal set of clusters; it is considered converged when there is no further change in the assignment of bands to clusters. The EM clustering algorithm is summarized in Algorithm 1.

Algorithm 1 Band clustering using EM algorithm.
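The band grouping step can be sketched with off-the-shelf EM for Gaussian mixtures; this is an illustrative stand-in (scikit-learn's `GaussianMixture`, synthetic bands, correlation profiles as features), not the authors' implementation:

```python
import numpy as np
from sklearn.mixture import GaussianMixture  # EM for Gaussian mixtures

rng = np.random.default_rng(1)
# Toy scene: 30 bands of 500 pixels each, in 3 highly correlated groups
base = rng.normal(size=(3, 500))
bands = np.vstack([base[g] + 0.05 * rng.normal(size=500)
                   for g in range(3) for _ in range(10)])

# Describe each band by its correlation profile against all bands,
# then let EM assign each band to the mixture component of highest
# posterior probability (soft assignment, hardened at the end)
profile = np.corrcoef(bands)
gmm = GaussianMixture(n_components=3, covariance_type="diag",
                      random_state=0)
labels = gmm.fit_predict(profile)
print(len(set(labels)))
```

Bands within a group share nearly identical correlation profiles, so the three groups separate cleanly; real hyperspectral bands are noisier, and the choice of $K$ matters.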
2.2. Weighted Average Fusion

Following the band clustering process, all the bands from each cluster are fused together using the weighted average fusion method. The fused bands should have the following characteristics:
The simple average fusion method proposed in Ref. 29 does not ensure satisfactory removal of redundant information. Hence, here, the weighted average fusion method is used to preserve the discriminative information of the original bands; since the weight factor preserves this discriminative information, it improves the classification results. The bands in the $c$'th cluster are fused as $F_c = \sum_j w_{cj}\, b_{cj}$, where $b_{cj}$ is the $j$'th band in the $c$'th cluster and $w_{cj}$ is the weight factor for the $j$'th band in the $c$'th cluster. An optimal weight value for each band is determined by iteratively updating the weight.45 The sum of the band weights in each cluster is constrained to one, i.e., $\sum_j w_{cj} = 1$. The initial weight of each band is evaluated from the variance of that band, where $\sigma_i^2$ represents the variance of the $i$'th band image and $L$ represents the total number of bands in the hyperspectral image data. The weight updating procedure is iterated $\eta$ times to find the optimal weight value of each band. The weight value is determined using Eq. (10), where $\eta$ is the number of iterations, $\beta$ is the balance factor between the first and second terms of Eq. (10), and $d(i,j)$ is the distance between bands $i$ and $j$ calculated using Eq. (1). In this propagation process, each band's weight is updated in turn using the information of all other bands, based on the distances between them; this continues until all bands in the cluster have been updated once. The weight updating procedure of Eq. (10) ensures two characteristics of the fused bands: the first term measures the compactness within the same cluster, whereas the second term measures the scatteredness among the discriminative clusters. Equation (10) also admits a concise matrix form, with a coefficient matrix built from the pairwise band distances. Following the iterations, the weight value of each band is chosen by maximizing Eq.
(10). Then the weight values in each band cluster are normalized. Calculating the weighted average of the bands in each subgroup removes noise as well as redundant information from each subgroup. Weighted average fusion decorrelates the intercorrelated hyperspectral bands into a set of uncorrelated bands, and the fused bands from each cluster are then taken as the set of extracted features. After fusion of the bands using the weighted average fusion technique, the actual classification is performed with the SVM classifier, trained on the extracted features. Its remarkable benefits in solving complex problems, such as nonlinearity, high dimensionality, and limited training samples, make the SVM classifier the most commonly used classifier in hyperspectral image classification.46

2.3. Computational Cost Analysis

In this section, the theoretical computational cost of the proposed EM-WAF method is discussed. Both arithmetic operations and big-O notation are used to express the computational cost, which depends on four steps, namely, the Bhattacharyya distance-based band distance measure, the EM band clustering, the weighted average fusion, and the SVM classifier. The computational cost of the Bhattacharyya distance measure for all pairs of bands scales as O(L²), where L is the number of spectral bands. The computational cost of the EM clustering method scales with the number of iterations of EM clustering and the number of clusters formed. In the weighted average fusion, the computational cost comes mainly from Eq. (10), which scales with the number of iterations of the weight updating process. For the SVM with the RBF kernel, the computational cost scales with the number of input dimensions.
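The fusion step of Sec. 2.2 can be sketched as follows; this minimal version uses only the variance-proportional initial weights (the iterative update of Eq. (10) from Ref. 45 is omitted), and the function and variable names are illustrative rather than the authors' code:

```python
import numpy as np

def fuse_cluster(cluster_bands):
    """Fuse the bands of one cluster into a single feature band using
    variance-proportional weights that sum to one (initialization only;
    the iterative weight update is not reproduced here)."""
    variances = np.array([b.var() for b in cluster_bands])
    weights = variances / variances.sum()      # constrained to sum to 1
    stacked = np.stack(cluster_bands)           # (n_bands, rows, cols)
    return np.tensordot(weights, stacked, axes=1)

rng = np.random.default_rng(2)
cluster = [rng.random((8, 8)) for _ in range(5)]  # toy cluster of 5 bands
fused = fuse_cluster(cluster)
print(fused.shape)                                # (8, 8)
```

Because the weights are nonnegative and sum to one, the fused band is a convex combination of the cluster's bands, so it stays within the original data range while emphasizing higher-variance bands.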
Hence, the total computational cost of the proposed algorithm is the arithmetic sum of the computational costs of all stages. Although the proposed method shows significant classification performance, its training phase requires the determination of an optimal weight value for each band in the fusion process, which is computationally expensive.

3. Results and Discussion

This section presents the experimental analysis of the proposed method using standard benchmark hyperspectral datasets widely used in the literature.

3.1. Dataset Description

A series of experiments was conducted on four standard benchmark datasets, namely, the Indian Pines, Pavia University, Salinas, and Botswana datasets, available in Ref. 47. Indian Pines, Pavia University, and Salinas are small-size datasets captured by airborne sensors, whereas Botswana is a large-size hyperspectral dataset captured by a spaceborne (satellite) sensor. The detailed description of each dataset is given below:
3.2. Evaluation Measures

The classification performance of the proposed EM-WAF technique is assessed using three commonly used quality metrics: overall accuracy (OA), the percentage of correctly classified pixels in the whole scene; average accuracy (AA), the mean of the percentage of correctly labeled pixels for each class; and the kappa coefficient (κ), a robust measure of the degree of agreement, which integrates the diagonal and off-diagonal entries of the confusion matrix.

3.3. Parameter Settings

For the EM clustering algorithm, the number of iterations is set to 10. For the optimal weight finding procedure, the balance factor β is set to 0.5 and the number of iterations η is set to 100. The SVM classifier with the RBF kernel has two parameters, the penalty parameter C and the RBF kernel parameter γ, which are tuned through fivefold cross validation.

3.4. Experimental Results

In this section, the impact of different proportions of training samples on OA, the classification results obtained for the Indian Pines, Pavia University, Salinas, and Botswana datasets, an analysis of the features extracted by the proposed method, and remarkable findings are discussed. All the experiments are conducted using MATLAB 2018a on a PC with 16 GB RAM and a 2.70 GHz CPU. To begin, to evaluate the effectiveness of the proposed method with a small amount of labeled data, 20% of the samples of each class from each dataset are randomly chosen as training samples, and the remaining samples in each class are used for testing. Section 3.4.1 provides a detailed analysis of the effect of different proportions of training samples on OA. Each experiment is conducted ten times to evaluate the averages of OA, AA, and the kappa coefficient. Four different categories of methods have been considered for comparison to verify the superiority of the proposed method.
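The three measures of Sec. 3.2 can be computed directly from a confusion matrix; a minimal sketch (illustrative names and toy matrix, not the authors' evaluation code):

```python
import numpy as np

def classification_metrics(conf):
    """OA, AA, and Cohen's kappa from a confusion matrix (rows = truth)."""
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    oa = np.trace(conf) / total                          # overall accuracy
    aa = np.mean(np.diag(conf) / conf.sum(axis=1))       # mean per-class accuracy
    pe = (conf.sum(axis=0) @ conf.sum(axis=1)) / total ** 2  # chance agreement
    kappa = (oa - pe) / (1.0 - pe)
    return oa, aa, kappa

conf = [[50, 0],
        [10, 40]]          # toy 2-class confusion matrix
oa, aa, kappa = classification_metrics(conf)
print(round(oa, 2), round(aa, 2), round(kappa, 2))  # 0.9 0.9 0.8
```

Unlike OA, kappa discounts agreement expected by chance, which is why it is reported alongside OA and AA for imbalanced land cover classes.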
3.4.1. Influence of different proportions of training samples on OA obtained by the proposed method for all four hyperspectral datasets

The performance of the proposed method is validated against different proportions of training samples, namely, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, and 50% of the labeled training samples per class. Figure 3 shows the OA obtained using the proposed method for the different proportions of training samples. The proposed method shows good discriminative ability even with the smallest labeled sample size of 5% per class. As the number of training samples increases, the classification performance of the proposed method increases gradually for all four datasets. Sample sizes above 20% do not have much impact on OA, while increasing the computational burden of the training phase. Hence, the proposed method is tested with 20% of the training samples.

3.4.2. Results analysis by comparing the proposed method with different classification methods on the Indian Pines dataset

The ground truth data of the Indian Pines dataset are shown in Fig. 4(a), where the different colors signify the various land cover categories. Figure 4(b) shows the spectral signature, or reflectance, of each category. The classification maps obtained for all the competing methods on the Indian Pines dataset are shown in Fig. 5, and the classification results (i.e., OA, class-wise accuracy, AA, and κ) are reported in Table 1. Figure 5 and Table 1 show that the proposed method achieves the best result among the competing methods in terms of OA, AA, and κ. This is due to the use of the EM clustering algorithm for band partitioning and weighted average fusion for fusing the correlated bands, which increases the interclass separation and decreases the intraclass scatter.
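The per-class sampling protocol of Sec. 3.4.1 (a fixed fraction of each class drawn for training) can be sketched as follows; the function name and toy labels are illustrative:

```python
import numpy as np

def stratified_split(labels, fraction=0.20, seed=0):
    """Randomly pick `fraction` of the samples of each class for
    training; all remaining samples become the test set."""
    rng = np.random.default_rng(seed)
    train_idx = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        train_idx.extend(idx[: max(1, int(round(fraction * idx.size)))])
    train = np.array(sorted(train_idx))
    test = np.setdiff1d(np.arange(labels.size), train)
    return train, test

labels = np.repeat([0, 1, 2], [100, 50, 30])  # toy ground truth
train, test = stratified_split(labels)
print(train.size, test.size)                   # 36 144
```

Stratifying per class keeps rare land cover classes represented in the training set, which a plain random split of the whole scene would not guarantee.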
Table 1 Comparison of classification accuracies (%) obtained by the proposed method and other competing methods for the Indian Pines dataset.
Note: The highest value across the methods is shown in bold font.

Table 1 shows that the EM-WAF method achieves good performance compared to the clustering-based methods, namely, CBFE, DCCA, CEM-BCC/BDC, CEM-BCM/BDM, LCMV-BCC/BDC, LCMV-BCM/BDM, and E-FDPC. The proposed technique shows noticeable performance owing to the larger amount of discriminative information retained by clustering and fusing highly correlated bands. The classification accuracy of the proposed EM-WAF method is much better than that of the simple IF method, highlighting the importance of the weight factor in the fusion process. Clustering-based methods and the IF method consider only the intracluster distance, which limits their discriminative ability, whereas the proposed method considers the intercluster as well as the intracluster distance, leading to better discriminative ability. Hence, the proposed EM-WAF technique preserves the useful as well as the discriminative information of the original data. Compared to the other competing approaches, the proposed EM-WAF approach achieves a substantial improvement in class-wise classification accuracy, as shown in Table 1 (boldface). The classification accuracy of the classes "alfalfa," "corn-no till," "corn-min till," "corn," "grass-pasture-mowed," "hay-windrowed," "oat," "soybean-no till," "soybean-min till," "soybean-clean," and "woods" increases from 54.63% to 94.01%, 41.59% to 81.09%, 47.54% to 87.24%, 51.65% to 91.15%, 44.65% to 84.15%, 59.58% to 99.08%, 43.54% to 83.04%, 53% to 89.06%, 58.4% to 97.9%, 48.87% to 88.37%, and 57.53% to 97.03%, respectively. In particular, for the class "wheat," all pixels are correctly classified by the proposed method.
However, the proposed method achieves slightly lower accuracy for the individual classes "grass-pasture" and "stone-steel-towers" when compared to the IF method (which achieves 100% accuracy for both classes), as shown in Table 1.

3.4.3. Results analysis by comparing the proposed method with different classification methods on the Pavia University dataset

The ground truth data of the Pavia University dataset are shown in Fig. 6(a), where the different colors denote the different categories. Figure 6(b) shows the spectral signature, or reflectance, of each category. The classification maps obtained for all the competing techniques along with the proposed technique on the Pavia University dataset are depicted in Fig. 7, and the classification results (i.e., OA, class-wise accuracy, AA, and κ) are presented in Table 2. Figure 7 and Table 2 show that the proposed EM-WAF technique achieves the best result among all the competing methods in terms of OA, AA, and κ. This is because EM clustering extracts more useful information and increases the separation among the spectral classes. As shown in Table 2, the classification accuracy of the proposed EM-WAF method is much better than that of the IF method, showing the importance of the weight factor in the fusion process. In other words, the proposed method preserves the complementary information of all bands well.

Table 2 Comparison of classification accuracies (%) obtained by the proposed method and other competing methods for the Pavia University dataset.
Note: The highest value across the methods is shown in bold font.

As shown in Fig. 7, the proposed approach eliminates most of the noisy pixels generated by the other methods, and the overall classification accuracy increases by more than 2%. For instance, the misclassified pixels are corrected in the green region at the center of Fig. 7, which becomes very close to the ground truth, and the classification map becomes smoother. Compared to the other competing approaches, the proposed approach shows a significant improvement in class-wise classification accuracy, as shown in Table 2 (boldface). For instance, the classification accuracy of the class "gravel" increases from 7.98% to 85.21%. Moreover, the proposed method correctly classifies the class "painted metal sheets." However, the EM-WAF approach produces lower classification accuracy for the individual class "self-blocking bricks" when compared to the LCMV-BCM/BDM method, as shown in Table 2. The reason is that fusion of the spectral bands eliminates important spectral features of the respective land cover class.

3.4.4. Results analysis by comparing the proposed method with different classification methods on the Salinas dataset

The ground truth data of the Salinas dataset are shown in Fig. 8(a), where the different colors represent the different categories. Figure 8(b) shows the spectral signature, or reflectance, of each category. The classification maps of all the competing techniques on the Salinas dataset are shown in Fig. 9, and the classification results (i.e., OA, class-wise accuracy, AA, and κ) are reported in Table 3. Table 3 and Fig. 9 show that the proposed method achieves the best performance in terms of both quantitative results and visual interpretation.

Table 3 Comparison of classification accuracies (%) obtained by the proposed method and other competing methods for the Salinas dataset.
Note: The highest value across the methods is shown in bold font.

Though all the competing methods are quite useful for dimensionality reduction, the CBFE and DCCA methods attain noticeable performance over E-FDPC and the other CBS methods. However, the proposed method performs significantly better than all the other competing methods, because the clustering and weighted average fusion of highly correlated bands provide more discriminative information; that is, the proposed EM-WAF technique extracts the significant features of the data. The superiority of the EM-WAF approach can thus be explained by the use of a weighted average of useful bands. Compared to the other competing methods, the performance of the proposed method is superior in terms of OA, AA, and κ. For most of the classes, the class-wise accuracy of the proposed method exceeds 90%. However, the proposed method fails to obtain good performance for a few classes. For instance, pixels of the class "grapes-untrained" are misclassified as the class "vinyard-untrained"; this misclassification occurs because the spectral signatures of these two classes are almost the same. Figure 9 shows that the region uniformity of the classes "fallow" and "corn-senesced-green-weeds" (marked by red circles) is improved by the proposed method compared to the other competing methods.

3.4.5. Results analysis by comparing the proposed method with different classification methods on the Botswana dataset

The ground truth information of the Botswana dataset used for experimentation is shown in Fig. 10(a), where the different colors signify the different land cover categories. Figure 10(b) shows the spectral signature, or reflectance, of each category. The classification maps of all the competing techniques on the Botswana dataset are shown in Fig. 11, and the classification results (i.e., OA, class-wise accuracy, AA, and κ) are summarized in Table 4.
Table 4 Comparison of classification accuracies (%) obtained by the proposed method and other competing methods for the Botswana dataset.
Note: The highest value across the methods is shown in bold font.

The results reported in Table 4 show that the proposed EM-WAF method delivers better performance than the other competing methods. The classification results obtained by the proposed clustering- and fusion-based method are very promising, which indicates that large-size datasets can be classified using the proposed method. Table 4 also shows that the E-FDPC method obtains performance superior to that of the other clustering- and constraint-based selection methods, mainly due to the band selection strategy of ranking-based methods. However, the proposed method is better than the E-FDPC method, since the latter considers only the intracluster distance between the data points, whereas the former considers the intercluster as well as the intracluster distance, resulting in good discriminative capability for classification. Table 4 shows that the proposed method achieves better class-wise accuracies for most of the classes. The proposed method classifies all pixels of the class "water" correctly. Compared to the other competing methods, classes such as "hippo grass," "Acacia woodlands," "short mopane," and "mixed mopane" are better distinguished by the proposed method. The performance of the proposed method is also better than that of the other competing methods for the classes "reeds1" and "riparian," though it is not satisfactory; the main reason is that the samples selected from such classes contain more redundant information.

3.4.6. Analysis of the number of selected bands or features for all four hyperspectral datasets

Table 5 shows the number of selected bands or features and the OA for the four hyperspectral datasets. Table 5 shows that the proposed approach achieves better classification accuracy through the selection of an optimal number of features.
In other words, the proposed approach selects features that separate the land cover classes well.

Table 5 Number of selected bands or features and OA (%) for all four hyperspectral datasets.
As shown in Table 5, the features extracted by the proposed method achieve the highest classification accuracy for all datasets. For the Indian Pines dataset, the proposed method provides the maximum OA of 92.19% among all the competing methods with only seven features, which is found to be optimal. For the Pavia University dataset, the proposed method delivers the highest OA of 94.10% among all the competing methods with only 11 optimal features. For the Salinas dataset, the CBFE method provides 85.14% OA with only 12 features, the minimum number of features extracted among all the other competing methods; however, the proposed method achieves the maximum OA of 93.96% among all the competing methods with an optimal number of 13 averaged bands. For the Botswana dataset, the proposed method provides an OA slightly better than that of the E-FDPC method and achieves the maximum OA (84.92%) among all the competing methods with only 20 features, which is found to be optimal. Table 5 shows that the proposed approach extracts meaningful features from the hyperspectral data that are suitable and adequate for hyperspectral image classification. These results indicate that: (a) pairwise distance-based band separability is an important aspect of feature extraction; (b) consideration of intracluster and intercluster distances provides more discriminative information; and (c) an appropriate weighting mechanism for the weighted average fusion improves the performance of feature extraction significantly.

4. Conclusion

In this paper, an EM clustering and weighted average fusion technique-based feature extraction for hyperspectral image classification has been proposed. The proposed method explores the information among the clusters and removes redundancy among the bands. The EM algorithm converges to the best number of clusters, thereby providing an effective way to determine an optimal number of features.
The weight factor of each band is calculated by minimizing the distance within each cluster and maximizing the distance among the different clusters, which reflects the importance of that band in the fusion process. The significance of this technique lies in its highly discriminative ability, which leads to better classification performance. Experimental results and comparisons with existing approaches demonstrate the efficiency of the proposed method for hyperspectral image classification. Compared with the other competing methods on four standard datasets, the proposed method achieves higher classification accuracy and better visual results. For the Botswana dataset, the proposed method provides the best OA among all competing methods, which shows that it can classify a large dataset effectively. Moreover, the proposed method performs equally well on all four hyperspectral datasets, demonstrating its robustness on both small and large datasets. In our future work, we will focus on integrating spatial features with spectral features to further improve the classification performance.

Acknowledgments

The authors would like to thank the anonymous reviewers for their comments and valuable suggestions, which greatly helped us to improve the technical quality and presentation of the manuscript. The authors thank VIT for providing a VIT seed grant for carrying out this research work and the Council of Scientific & Industrial Research (CSIR), New Delhi, India, for the award of a CSIR-SRF.

References

1. H. Ren and C.-I. Chang, "Automatic spectral target recognition in hyperspectral imagery," IEEE Trans. Aerosp. Electron. Syst. 39(4), 1232–1249 (2003). https://doi.org/10.1109/TAES.2003.1261124
2. M. Khodadadzadeh et al., "A new framework for hyperspectral image classification using multiple spectral and spatial features," in IEEE Geoscience and Remote Sensing Symp., 4628–4631 (2014). https://doi.org/10.1109/IGARSS.2014.6947524
3. S. S. Sawant and M. Prabukumar, "Semi-supervised techniques based hyper-spectral image classification: a survey," in Innovations in Power and Advanced Computing Technologies (i-PACT) (2017). https://doi.org/10.1109/IPACT.2017.8244999
4. J. Richards, Remote Sensing Digital Image Analysis, Springer-Verlag, Berlin (1999).
5. G. F. Hughes, "On the mean accuracy of statistical pattern recognizers," IEEE Trans. Inf. Theory 14(1), 55–63 (1968). https://doi.org/10.1109/TIT.1968.1054102
6. C. Burges, "Dimension reduction: a guided tour," Found. Trends Mach. Learn. 2(4), 275–364 (2010). https://doi.org/10.1561/2200000002
7. R. Vaddi and M. Prabukumar, "Comparative study of feature extraction techniques for hyper spectral remote sensing image classification: a survey," in Int. Conf. on Intelligent Computing and Control Systems (ICICCS), 543–548 (2017). https://doi.org/10.1109/ICCONS.2017.8250521
8. C. I. Chang, "A joint band prioritization and band decorrelation approach to band selection for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens. 37(6), 2631–2641 (1999). https://doi.org/10.1109/36.803411
9. C. I. Chang and S. Wang, "Constrained band selection for hyperspectral imagery," IEEE Trans. Geosci. Remote Sens. 44(6), 1575–1585 (2006). https://doi.org/10.1109/TGRS.2006.864389
10. X. Bai et al., "Semisupervised hyperspectral band selection via spectral–spatial hypergraph model," IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 8(6), 2774–2783 (2015). https://doi.org/10.1109/JSTARS.2015.2443047
11. X. Cao et al., "Fast hyperspectral band selection based on spatial feature extraction," J. Real-Time Image Process., 1–10 (2018). https://doi.org/10.1007/s11554-018-0777-9
12. C. Yu, M. Song and C. Chang, "Band subset selection for hyperspectral image classification," Remote Sens. 10, 113 (2018). https://doi.org/10.3390/rs10010113
13. Q. Chen, "Band selection algorithm based on information entropy for hyperspectral image classification," J. Appl. Remote Sens. 11(2), 026018 (2017). https://doi.org/10.1117/1.JRS.11.026018
14. W. Zhang, X. Li and L. Zhao, "Hyperspectral band selection based on triangular factorization," J. Appl. Remote Sens. 11(2), 025007 (2017). https://doi.org/10.1117/1.JRS.11.025007
15. S. Samiappan, S. Prasad and L. Bruce, "Non-uniform random feature selection and kernel density scoring with SVM based ensemble classification for hyperspectral image analysis," IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 6(2), 792–800 (2013). https://doi.org/10.1109/JSTARS.2013.2237757
16. J. C. Davis, "Introduction to statistical pattern recognition: 2nd edition, by Keinosuke Fukunaga, Academic Press, San Diego, 1990, 591 p., ISBN 0-12-269851-7, US$69.95," Comput. Geosci. 22(7), 833–834 (1996). https://doi.org/10.1016/0098-3004(96)00017-9
17. F. Tsai, E.-K. Lin and K. Yoshino, "Spectrally segmented principal component analysis of hyperspectral imagery for mapping invasive plant species," Int. J. Remote Sens. 28(5), 1023–1039 (2007). https://doi.org/10.1080/01431160600887706
18. X. Junshi et al., "(Semi-) supervised probabilistic principal component analysis for hyperspectral remote sensing image classification," IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 7(6), 2224–2236 (2014). https://doi.org/10.1109/JSTARS.2013.2279693
19. A. Villa et al., "Hyperspectral image classification with independent component discriminant analysis," IEEE Trans. Geosci. Remote Sens. 49(12), 4865–4876 (2011). https://doi.org/10.1109/TGRS.2011.2153861
20. X. Liu et al., "A maximum noise fraction transform with improved noise estimation for hyperspectral images," Sci. China Ser. F 52(9), 1578–1587 (2009). https://doi.org/10.1007/s11432-009-0156-z
21. M. Fauvel, J. Chanussot and J. A. Benediktsson, "Kernel principal component analysis for the classification of hyperspectral remote sensing data over urban areas," EURASIP J. Adv. Signal Process. 2009, 783194 (2009). https://doi.org/10.1155/2009/783194
22. M. E. Tipping, "Probabilistic principal component analysis," J. R. Stat. Soc. Ser. B 61(3), 611–622 (1999). https://doi.org/10.1111/rssb.1999.61.issue-3
23. T. V. Bandos, L. Bruzzone and G. Camps-Valls, "Classification of hyperspectral images with regularized linear discriminant analysis," IEEE Trans. Geosci. Remote Sens. 47(3), 862–873 (2009). https://doi.org/10.1109/TGRS.2008.2005729
24. B. C. Kuo and D. A. Landgrebe, "Nonparametric weighted feature extraction for classification," IEEE Trans. Geosci. Remote Sens. 42(5), 1096–1105 (2004). https://doi.org/10.1109/TGRS.2004.825578
25. B. C. Kuo, C. H. Li and J. M. Yang, "Kernel nonparametric weighted feature extraction for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens. 47(4), 1139–1155 (2009). https://doi.org/10.1109/TGRS.2008.2008308
26. M. Prabukumar et al., "Three-dimensional discrete cosine transform-based feature extraction for hyperspectral image classification," J. Appl. Remote Sens. 12(4), 046010 (2018). https://doi.org/10.1117/1.JRS.12.046010
27. I. Makki et al., "A survey of landmine detection using hyperspectral imaging," ISPRS J. Photogramm. Remote Sens. 124, 40–53 (2017). https://doi.org/10.1016/j.isprsjprs.2016.12.009
28. A. R. Webb, Statistical Pattern Recognition, John Wiley & Sons, Ltd., England (2011).
29. S. Sawant and M. Prabukumar, "Band fusion based hyper spectral image classification," Int. J. Pure Appl. Math. 117(17), 71–76 (2017).
30. M. Imani and H. Ghassemian, "Band clustering-based feature extraction for classification of hyperspectral images using limited training samples," IEEE Geosci. Remote Sens. Lett. 11(8), 1325–1329 (2014). https://doi.org/10.1109/LGRS.2013.2292892
31. Q. Yan et al., "Class probability propagation of supervised information based on sparse subspace clustering for hyperspectral images," Remote Sens. 9, 1017 (2017). https://doi.org/10.3390/rs9101017
32. X. Peng et al., "Constructing the L2-graph for subspace learning and subspace clustering," IEEE Trans. Cybern., 1–14 (2016).
33. Y. Yuan, J. Lin and Q. Wang, "Dual-clustering-based hyperspectral band selection by contextual analysis," IEEE Trans. Geosci. Remote Sens. 54(3), 1431–1445 (2016). https://doi.org/10.1109/TGRS.2015.2480866
34. M. Khoder et al., "Multicriteria classification method for dimensionality reduction adapted to hyperspectral images," J. Appl. Remote Sens. 11(2), 025001 (2017). https://doi.org/10.1117/1.JRS.11.025001
35. X. Sun et al., "Hyperspectral image clustering method based on artificial bee colony algorithm," in Sixth Int. Conf. on Advanced Computational Intelligence (ICACI), 106–109 (2013). https://doi.org/10.1109/ICACI.2013.6748483
36. H. Su and P. Du, "Multiple classifier ensembles with band clustering for hyperspectral image classification," Eur. J. Remote Sens. 47(1), 217–227 (2014). https://doi.org/10.5721/EuJRS20144714
37. R. Liu, H. Wang and X. Yu, "Shared-nearest-neighbor-based clustering by fast search and find of density peaks," Inf. Sci. 450, 200–226 (2018). https://doi.org/10.1016/j.ins.2018.03.031
38. S. Jia et al., "A novel ranking-based clustering approach for hyperspectral band selection," IEEE Trans. Geosci. Remote Sens. 54(1), 88–102 (2016). https://doi.org/10.1109/TGRS.2015.2450759
39. H. Xie et al., "Unsupervised hyperspectral remote sensing image clustering based on adaptive density," IEEE Geosci. Remote Sens. Lett. 15(4), 632–636 (2018). https://doi.org/10.1109/LGRS.2017.2786732
40. B. Peng et al., "Weighted-fusion-based representation classifiers for hyperspectral imagery," Remote Sens. 7, 14806–14826 (2015). https://doi.org/10.3390/rs71114806
41. S. Prasad and L. M. Bruce, "Decision fusion with confidence-based weight assignment for hyperspectral target recognition," IEEE Trans. Geosci. Remote Sens. 46(5), 1448–1456 (2008). https://doi.org/10.1109/TGRS.2008.916207
42. T. Lu et al., "From subpixel to superpixel: a novel fusion framework for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens. 55(8), 4398–4411 (2017). https://doi.org/10.1109/TGRS.2017.2691906
43. B. Kumar and O. Dikshit, "Hyperspectral image classification based on morphological profiles and decision fusion," Int. J. Remote Sens. 38(20), 5830–5854 (2017). https://doi.org/10.1080/01431161.2017.1348636
44. A. Dempster, N. Laird and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Stat. Soc. Ser. B 39(1), 1–38 (1977).
45. R. Yang et al., "Representative band selection for hyperspectral image classification," J. Vision Commun. Image Represent. 48, 396–403 (2017). https://doi.org/10.1016/j.jvcir.2017.02.002
46. F. Melgani and L. Bruzzone, "Classification of hyperspectral remote sensing images with support vector machines," IEEE Trans. Geosci. Remote Sens. 42(8), 1778–1790 (2004). https://doi.org/10.1109/TGRS.2004.831865
47. "Hyperspectral remote sensing scenes," http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes (2007).
Biography

Manoharan Prabukumar received his BE degree in electronics and communication engineering from Periyar University, Tamil Nadu, India, in 2002, his MTech degree in computer vision and image processing from Amrita School of Engineering, Coimbatore, India, in 2007, and his PhD in computer graphics from Vellore Institute of Technology (VIT), Tamil Nadu, India, in 2014. Currently, he is working as an associate professor in the School of Information Technology and Engineering, VIT. His research interests include hyperspectral remote sensing, image processing, computer graphics, and machine learning.

Sawant Shrutika received her BE and ME degrees in electronics and telecommunication engineering from Shivaji University, Maharashtra, India, in 2009 and 2012, respectively. Currently, she is pursuing her PhD in hyperspectral image processing at VIT, Vellore, Tamil Nadu, India. She has been awarded a senior research fellowship by the Council of Scientific and Industrial Research, New Delhi, India. Her research interests include hyperspectral remote sensing, image processing, and machine learning.