Open Access
2 November 2018 Band clustering using expectation–maximization algorithm and weighted average fusion-based feature extraction for hyperspectral image classification
Manoharan Prabukumar, Sawant Shrutika
Abstract
Hyperspectral images carry a large amount of information, which makes them suitable for numerous applications. However, extracting suitable and informative features from such high-dimensional data is a tedious task. A feature extraction technique using expectation–maximization (EM) clustering and weighted average fusion is proposed. The Bhattacharyya distance is used to compute the distance between all pairs of spectral bands. With this distance information, the spectral bands are grouped into clusters by the EM clustering method. The EM algorithm automatically converges to an optimum number of clusters, so the number of clusters does not need to be specified in advance. The bands in each cluster are fused using the weighted average fusion method. The weight of each band is calculated on the basis of minimizing the distance inside the cluster and maximizing the distance among different clusters. The fused bands from each cluster are then taken as the extracted features. These features are used to train a support vector machine for classification of the hyperspectral image. The performance of the proposed technique has been validated on three small-size standard benchmark datasets, Indian Pines, Pavia University, and Salinas, and one large-size dataset, Botswana. The proposed method achieves an overall accuracy (OA) of 92.19%, 94.10%, 93.96%, and 84.92% for the Indian Pines, Pavia University, Salinas, and Botswana datasets, respectively. The experimental results show that the proposed technique attains better classification performance in terms of OA, average accuracy, and Cohen’s kappa coefficient (k) than the other competing methods.

1.

Introduction

Hyperspectral sensors [e.g., Airborne Visible Infrared Imaging Spectrometer (AVIRIS), hyperspectral digital imagery collection experiment, HyMap, and EO-1 Hyperion] record a scene over the wide wavelength ranging from the visible region to the infrared spectrum, which provides detailed spectral information about the objects in numerous and continuous spectral bands (from tens to several hundreds) as well as a high spatial resolution.1 Due to the high spectral resolution, hyperspectral images offer very high-discrimination capabilities among similar ground cover objects.2 However, the huge numbers of bands always bring the curse of dimensionality, reducing the discriminating ability of the data as the dimensionality increases with fewer numbers of labeled training samples.3,4 This behavior is also referred to as “Hughes phenomenon.”5 Moreover, the high dimensionality of the hyperspectral image also consists of redundant and noisy information, which increases the computational burden of the data processing. So dimensionality reduction becomes an essential task in the hyperspectral image processing.

Dimensionality reduction is the process of reducing redundant data and extracting meaningful features. In other words, it reduces the number of spectral bands by transforming the data from a high-dimensional space to a lower dimensional space in which the most significant information is conserved.6,7 Dimensionality reduction can be done through feature selection or feature extraction. In feature selection, a few informative bands are selected according to a selection criterion, such as a distance measure (Euclidean distance, spectral angle mapping, Bhattacharyya distance, Hausdorff distance, and Jeffreys–Matusita distance), an information theoretic measure (divergence, transformed divergence, and mutual information), or eigenanalysis [principal component analysis (PCA)], so that the original physical meaning of the bands is preserved.8–15 One popular band selection method is constrained band selection (CBS),9 which minimizes the correlation and dependency among the selected bands. CBS builds on two approaches: (1) constrained energy minimization (CEM) and (2) linearly constrained minimum variance (LCMV). It uses four band selection criteria: band correlation minimization (BCM), band correlation constraint (BCC), band dependence minimization (BDM), and band dependence constraint (BDC). Combining these criteria with the CEM and LCMV approaches yields four methods: CEM-BCC/BDC, CEM-BCM/BDM, LCMV-BCC/BDC, and LCMV-BCM/BDM. Feature selection provides suitable features for classification but is computationally expensive and often not robust in complex scenes (variation in spectral signatures across scenes). Feature extraction methods, on the other hand, transform the higher dimensional data into a lower dimensional space. They are computationally superior and more robust to complex scenes. However, extracting efficient and suitable features for the classification of large hyperspectral data remains a crucial task.

Feature extraction methods transform the original high-dimensional feature space into a low-dimensional feature space; the physical meaning of the bands is lost, but the significant discriminative information needed for further analysis is preserved.16–25,26 PCA is one of the most widely used approaches for feature extraction,16 because it is an invertible transformation, which facilitates the interpretation of the extracted features. However, PCA incurs a high computational load and operates on global features, so it loses local information.27 Segmented PCA,17 an extension of PCA, addresses this issue: to use local information, PCA is applied to groups of bands formed using the correlation between bands. Another useful feature extraction method is independent component analysis (ICA),19 which extracts class discriminant features from hyperspectral images, but its complexity increases the computational load. In general, hyperspectral data are nonlinear in nature, so a linear classifier usually provides unsatisfactory classification performance. In recent times, some nonlinear methods such as maximum noise fraction,20 kernel PCA,21 and probabilistic PCA (PPCA) have been proposed as extensions of conventional PCA. PPCA is a constrained Gaussian generative latent variable model. It extracts features using maximum likelihood estimates of the parameters associated with the covariance matrix, which can be efficiently calculated from the principal components of the data.22 In most situations, labeled samples are limited, and obtaining them is expensive and time-consuming, whereas unlabeled samples are available in large quantities at low cost. Hence, semisupervised PPCA was proposed as an extension of PPCA; it uses both labeled and unlabeled information in the projection to overcome the scarcity of labeled samples.18 Apart from PCA, two other well-known feature extraction approaches are discriminant analysis feature extraction28 and linear discriminant analysis (LDA).23 Many extensions of these two methods have been proposed, namely, regularized LDA,23 nonparametric weighted feature extraction (NWFE),24 and kernel NWFE.25

Another popular feature extraction approach is clustering-based feature extraction (CBFE). Clustering partitions the hyperspectral image into several uncorrelated subband groups, each of which contains contiguous bands. Clustering has received increasing attention in the hyperspectral remote sensing community because it copes well with the curse of dimensionality.29–35 It removes redundant and correlated data from the high-dimensional data and provides uncorrelated low-dimensional data. CBFE, proposed in Ref. 30, works well in a small sample size scenario using the popular k-means clustering algorithm. A semisupervised k-means clustering method has been proposed to exploit the easily available unlabeled samples.36 It uses a separate classifier for each cluster of bands, and the final output is the fused result of the multiple classifiers. Clustering methods do not require a priori knowledge for the band grouping process; they cluster the bands according to the distribution of the spectral features of the hyperspectral image. However, clustering methods are sensitive to the randomly initialized cluster centers, and the selected subset of bands may be unstable. Hence, in Ref. 37, an automatic clustering method [fast density peak-based clustering (FDPC)] is proposed, which selects the cluster centers using a fast search method. However, it is not a fully automatic cluster center selection method, and it loses data points. Improvements to FDPC have therefore been proposed, namely, enhanced fast density peak-based clustering (E-FDPC)38 and k-means fast density peak-based clustering.39 Dual clustering-based band selection by context analysis (DCCA)33 performs clustering by considering the context information in the bands of the hyperspectral image. Recently, along with algorithm development for hyperspectral image classification, fusion methods such as decision level and feature level fusion have gained great interest,40–43 and these methods have demonstrated that combining selected features can improve classification performance. From the above study of feature extraction techniques, the authors of this work identified the following challenges:

  • 1. Though the existing clustering-based feature extraction approaches show a significant performance, the emphasis of these conventional clustering strategies is on raw spectral features rather than exploiting more complementary information from the bands of the hyperspectral cube.

  • 2. The existing clustering-based feature extraction approaches fail to find an optimal number of clusters and are very sensitive to the number of clusters.

  • 3. The existing feature extraction methods work well in small-size data, but fail to show the effectiveness in the case of the large-size data.

The main contributions of the proposed method are summarized as follows.

  • 1. An effective expectation–maximization clustering and weighted average fusion (EM-WAF)-based feature extraction method is proposed for the hyperspectral image classification.

  • 2. The EM algorithm automatically converges to an optimal number of clusters. Therefore, the proposed technique circumvents the necessity to specify the number of clusters by making the use of the EM clustering algorithm.

  • 3. The bands from each cluster are combined by adopting the weighted average fusion method. This process usually improves the classification performance by giving more weight to the particular band, thereby providing more discriminative and complementary information. Calculation of the weight is done on the basis of the criteria of minimizing the intracluster distance and maximizing the intercluster distance. The fused bands obtained from each cluster are then considered as extracted features, which are further used for the hyperspectral image classification.

  • 4. Finally, the experimentation is done on both small and large-size datasets to prove the effectiveness of the proposed method.

The remainder of this paper is arranged as follows: in Sec. 2, the proposed architecture of EM clustering and weighted average fusion-based hyperspectral image classification is explained in detail. Mathematical details of EM clustering and weighted average fusion are also discussed. Experimental analysis of four standard datasets is presented in Sec. 3. More precisely, the proposed method is compared with other clustering and fusion-based methods. Comparison is done for both quantitative accuracy and visual interpretation. Section 4 provides the concluding remarks.

2.

Proposed Architecture

This section discusses the proposed architecture of the feature extraction for hyperspectral image classification in detail. The proposed feature extraction architecture is presented in Fig. 1, which depicts the proposed approach as comprising three stages, namely, band clustering, the fusion of the bands of each cluster, and classification. The following section provides a detailed explanation of the various stages present in the proposed system.

Fig. 1

The architecture of the proposed EM-WAF method for hyperspectral image classification.

JARS_12_4_046015_f001.png

2.1.

Band Clustering

Hyperspectral data consist of hundreds of spectral bands, which are highly redundant because adjacent bands have similar sensor responses. The objective of band clustering is to group highly correlated bands together, so that weakly correlated bands fall into different clusters. Figure 2 shows the workflow of the band clustering procedure. Here, the Bhattacharyya distance28 is used as the band separability measure for computing the distance between each pair of spectral bands. The Bhattacharyya distance between bands $b_i$ and $b_j$ is defined as

Eq. (1)

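$$b_{i,j}=\frac{1}{8}(\mu_i-\mu_j)^{T}\left(\frac{\Sigma_i+\Sigma_j}{2}\right)^{-1}(\mu_i-\mu_j)+\frac{1}{2}\ln\left[\frac{|(\Sigma_i+\Sigma_j)/2|}{|\Sigma_i|^{1/2}\,|\Sigma_j|^{1/2}}\right].$$
Here, $\mu_i$ and $\mu_j$ are the band means, and $\Sigma_i$ and $\Sigma_j$ are the band covariance matrices.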
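As an illustration of Eq. (1), the following minimal Python sketch computes the pairwise distance matrix for a hyperspectral cube. It assumes, for simplicity, that each band is modeled as a univariate Gaussian, so the covariance matrices reduce to scalar variances; the function name `bhattacharyya_matrix`, the `(H, W, n)` array layout, and the small constant added to the variances are choices of this sketch, not details taken from the paper.

```python
import numpy as np

def bhattacharyya_matrix(cube):
    """Pairwise Bhattacharyya distance between the n bands of an (H, W, n) cube.

    Assumption of this sketch: each band is a univariate Gaussian, so the
    covariance matrices in Eq. (1) become scalar variances.
    """
    bands = cube.reshape(-1, cube.shape[-1]).astype(float)
    mu = bands.mean(axis=0)
    var = bands.var(axis=0) + 1e-12  # guard against constant (zero-variance) bands
    avg_var = 0.5 * (var[:, None] + var[None, :])
    # First term: squared mean difference scaled by the averaged variance.
    term1 = 0.125 * (mu[:, None] - mu[None, :]) ** 2 / avg_var
    # Second term: log ratio of averaged variance to geometric-mean variance.
    term2 = 0.5 * np.log(avg_var / np.sqrt(var[:, None] * var[None, :]))
    return term1 + term2
```

The result is a symmetric `n x n` matrix with zeros on the diagonal; it can serve as the distance information passed to the clustering stage.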

Fig. 2

The band clustering procedure. In this procedure, the pairwise band separability information is calculated, and then EM clustering is conducted to generate “d” band clusters.

JARS_12_4_046015_f002.png

Using the distance information, the bands are clustered using the EM clustering algorithm. The band clustering procedure using the EM clustering algorithm is explained in detail in the following section.

2.1.1.

Band clustering using EM algorithm

Using the generated distances between each pair of spectral bands, all the original bands are grouped into “d” clusters. Clustering is done using the EM algorithm.

Unlike methods that assign each point to the closest cluster center, the EM clustering algorithm assigns points to clusters partially. This is achieved by modeling each cluster with a probability distribution; finally, each point is assigned to the cluster with the highest probability. The K-means clustering algorithm is an incremental heuristic approach, whereas the EM algorithm is a statistical algorithm that assumes a statistical model describing the data. When applied to cluster analysis, the EM algorithm assumes that the patterns are drawn from one or several distributions, and the goal is to identify the parameters of each distribution. In this case, the parameters of a Gaussian mixture model have to be estimated. The EM algorithm44 is an iterative method for finding the maximum likelihood estimates of the parameters from the patterns. To form the clusters of bands, the bands belonging to the same cluster are assumed to be drawn from a multivariate Gaussian probability distribution. The EM clustering algorithm converges to an optimal number of clusters. It is considered converged when there is no further change in the assignment of the bands to clusters. The EM clustering algorithm is explained in Algorithm 1.

Algorithm 1

Band clustering using EM algorithm.

Input: $b=\{b_1,b_2,b_3,\ldots,b_n\}$, the set of bands; $C=\{C_1,C_2,\ldots,C_c\}$, the set of cluster centers; maximum number of iterations $k$.
Output: An optimal number of “d” band clusters.
Step 1: Initialization
   i) Initially select $c$ bands randomly from the set $b$ as cluster centers. Each cluster $C_j$ is represented by a Gaussian distribution $N(\mu_j,\Sigma_j)$ and a weight $\alpha_j$, where $\mu_j$ is the mean and $\Sigma_j$ is the covariance matrix.
Step 2: Iteration
 i) While (iteration<k)
 ii) Expectation step (E-step)
    Assign each band to one of the clusters according to the maximum a posteriori probability criteria.
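    The posterior probability of cluster $C_j$ given $b_i$ is computed for each point $b_i$ and each cluster $C_j$:

Eq. (2)

$$p(C_j|b_i)=\frac{p(b_i|C_j)\,p(C_j)}{\sum_{l}p(b_i|C_l)\,p(C_l)}.$$
    The probability density function $p(b_i|C_j)$ for a multivariate Gaussian distribution is given by

Eq. (3)

$$p(b_i|C_j)=\frac{1}{\sqrt{(2\pi)^{d}\,|\Sigma_j|}}\exp\left[-\frac{1}{2}(b_i-\mu_j)^{T}\Sigma_j^{-1}(b_i-\mu_j)\right].$$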
 iii) Maximization step (M-step):
   Recompute the parameter values $\mu_j$, $\Sigma_j$, and $\alpha_j$ for cluster $C_j$ using the probability $p(C_j|b_i)$ obtained in the expectation step.
   The mean $\mu_j$ is computed as

Eq. (4)

$$\mu_j=\frac{\sum_{i}p(C_j|b_i)\,b_i}{\sum_{i}p(C_j|b_i)}.$$
   The covariance matrix $\Sigma_j$ is computed as

Eq. (5)

$$\Sigma_j=\frac{\sum_{i}p(C_j|b_i)\,(b_i-\mu_j)(b_i-\mu_j)^{T}}{\sum_{i}p(C_j|b_i)}.$$
   The weight $\alpha_j$ is given as

Eq. (6)

$$\alpha_j=\frac{\sum_{i}p(C_j|b_i)}{N},$$
where $N$ is the total number of bands.
 iv) Eliminate cluster $C_j$ if $p(C_j|b_i)$ is small. The bands that belonged to the deleted clusters are reassigned to the other clusters in the next iteration.
Step 3: Stopping criteria
 i) If the convergence criterion is not met, repeat Step 2.
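The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration under assumptions that the paper does not state: each band is represented by a feature vector (a row of `X`, for example its row of pairwise distances), a cluster is eliminated when its mixing weight falls below a hypothetical threshold `prune_tol`, and a small ridge term `reg` keeps the covariance matrices invertible. The function name and all parameter defaults are illustrative.

```python
import numpy as np

def em_band_clustering(X, c=4, max_iter=10, prune_tol=1e-3, reg=1e-6, seed=0):
    """Soft-cluster the rows of X with a Gaussian mixture fitted by EM."""
    X = np.asarray(X, dtype=float)
    n, dim = X.shape
    rng = np.random.default_rng(seed)
    # Step 1: c randomly chosen points act as the initial cluster centers.
    mu = X[rng.choice(n, size=c, replace=False)]
    cov = np.stack([np.atleast_2d(np.cov(X.T)) + reg * np.eye(dim)] * c)
    alpha = np.full(c, 1.0 / c)
    labels = np.full(n, -1)
    for _ in range(max_iter):
        # E-step: posterior p(C_j | b_i), evaluated in the log domain.
        log_p = np.empty((n, alpha.size))
        for j in range(alpha.size):
            diff = X - mu[j]
            _, logdet = np.linalg.slogdet(cov[j])
            maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov[j]), diff)
            log_p[:, j] = np.log(alpha[j]) - 0.5 * (dim * np.log(2 * np.pi) + logdet + maha)
        resp = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # Eliminate clusters with a negligible mixing weight (assumed rule).
        keep = resp.mean(axis=0) > prune_tol
        resp = resp[:, keep] / resp[:, keep].sum(axis=1, keepdims=True)
        # M-step: weighted mean, covariance, and mixing weight per cluster.
        nk = resp.sum(axis=0)
        alpha = nk / n
        mu = (resp.T @ X) / nk[:, None]
        cov = np.empty((nk.size, dim, dim))
        for j in range(nk.size):
            diff = X - mu[j]
            cov[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(dim)
        # Stop when the hard assignment of points to clusters no longer changes.
        new_labels = resp.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, alpha, resp
```

The number of surviving clusters is `alpha.size`, which may be smaller than the initial `c`. The threshold should stay well below `1/c` so that at least one cluster always survives.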

2.2.

Weighted Average Fusion

Following the band clustering process, all the bands from each cluster are fused together using the weighted average fusion method. The fused bands should have the following characteristics:

  • 1. Decorrelation. Correlation among the clusters should be greatly reduced.

  • 2. Separability. Discrimination capability of fused bands should be increased.

The simple average fusion method proposed in Ref. 29 does not provide a satisfactory way of removing redundant information. Hence, the weighted average fusion method is used here to preserve the discriminative information of the original bands, which in turn improves the classification results. The $m$ bands in the $d$’th cluster are fused as shown in

Eq. (7)

$$F_d=\frac{\sum_{j}^{m}w_d(j)\,b_j}{m}\quad\forall\,d,$$
where $b_j$ is the $j$’th band in the $d$’th cluster, and $w_d(j)$ is the weight factor for the $j$’th band in the $d$’th cluster. Each band is given a weight value $w$, and the optimal weight of each band is determined by updating $w$.45 Let the sum of the band weights in each cluster be one, i.e.,

Eq. (8)

$$\sum_{j\in d}w_d(j)=1.$$
The initial weight value of each band is evaluated by considering the variance of each band. The initial weight value $w_d^{0}(j)$ is calculated as:

Eq. (9)

$$w_d^{0}(j)=\frac{s_j}{\sum_{i}^{N}s_i},$$
where $s_j$ represents the variance of the $j$’th band image and $N$ represents the total number of bands in the hyperspectral image data.

The weight updating procedure is iterated $t$ times to find the optimal weight value of each band. The weight value $w_d^{t}(j)$ is determined using the following equation:

Eq. (10)

$$w_d^{t}(j)=\alpha\left[w_d^{0}(j)+\sum_{b_i\in d}x(b_i,b_j)\,w_d^{t-1}(i)\right]-\frac{1-\alpha}{d-1}\sum_{d=2,3,\ldots}^{d}\ \sum_{b_i\in d}x(b_i,b_j)\,w_d^{t-1}(i),$$
where $t$ is the number of iterations, $\alpha$ is the balance factor between the first and second terms of Eq. (10), and $x(b_i,b_j)$ is the distance between bands $b_i$ and $b_j$ calculated using Eq. (1).

In this propagation process, the weight of one band is updated at a time using the weights of all the other bands and the distances between them. This process continues until all bands in the cluster have been updated once. The weight updating procedure in Eq. (10) ensures the two characteristics of the fused bands listed above. The first term measures the compactness within the same cluster, whereas the second term measures the scatter among the different clusters. There exists a concise form for Eq. (10):

Eq. (11)

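$$w_d^{t}(j)=w_d^{0}(j)+a(x_i,x_j)\,A_{ij},$$
where

Eq. (12)

$$a(x_i,x_j)=\sum_{b_i\in d}x(b_i,b_j)\,w_d^{t-1}(i).$$

The coefficient matrix A is defined as

Eq. (13)

$$A=\begin{cases}\alpha, & \text{if } b_i,b_j\in d\\[4pt] -\dfrac{1-\alpha}{d-1}\ (d=2,3,\ldots,d), & \text{if } b_j\in d \text{ and } b_i\in c\neq d.\end{cases}$$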

Following the t iterations, the weight value of band bj is chosen by maximizing Eq. (10), i.e.,

Eq. (14)

$$w_d(j)=\underset{w_d^{t}(j),\,j\in d}{\arg\max}\left[w_d^{t}(j)\right]\quad\forall\,t.$$
Then the weight values in each band cluster are normalized as follows:

Eq. (15)

$$w_d(j)=\frac{w_d(j)}{\sum_{b_j\in d}w_d(j)}.$$

Calculating the weighted average of the bands in each subgroup removes noise from the bands as well as the redundant information within each subgroup. Weighted average fusion decorrelates the intercorrelated hyperspectral bands into a set of uncorrelated bands. The fused bands $F_d$ from each cluster are then considered as the set of extracted features. After fusion, the actual classification is performed with an SVM classifier, which is trained on the extracted features. The SVM classifier is the most commonly used classifier in hyperspectral image classification because it handles nonlinear, high-dimensional data and limited training samples well.46
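The fusion step can be illustrated with a simplified Python sketch. It uses only the variance-based initial weights, normalized within each cluster so that they sum to one, and deliberately omits the iterative weight update of Eq. (10). The function name `fuse_clusters`, the `(H, W, n)` cube layout, and the per-band `labels` array are assumptions of this sketch, which also assumes that every cluster contains at least one non-constant band.

```python
import numpy as np

def fuse_clusters(cube, labels):
    """Fuse the bands of each cluster into one feature band.

    cube   : (H, W, n) array of n spectral bands
    labels : length-n array giving the cluster index of each band
    Returns the (H, W, d) fused features and the per-band weights.
    """
    bands = cube.astype(float)
    labels = np.asarray(labels)
    var = bands.reshape(-1, bands.shape[-1]).var(axis=0)
    weights = np.zeros_like(var)
    fused = []
    for d in np.unique(labels):
        idx = np.flatnonzero(labels == d)
        # Variance-based weights, normalized to sum to one inside the cluster.
        w = var[idx] / var[idx].sum()
        weights[idx] = w
        # Weighted combination of the cluster's bands, pixel by pixel.
        fused.append(bands[..., idx] @ w)
    return np.stack(fused, axis=-1), weights
```

Each fused band is a convex combination of the bands in its cluster, so a cluster made of identical bands is returned unchanged, and bands with larger variance contribute more.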

2.3.

Computational Cost Analysis

In this section, the theoretical computational cost of the proposed EM-WAF method is discussed using arithmetic operation counts and big O notation. The cost depends on four steps, namely, the Bhattacharyya distance-based band distance measure, the EM band clustering, the weighted average fusion, and the SVM classifier. The computational cost of the Bhattacharyya distance measure for all pairs of bands scales as $O(n^2)$, where $n$ is the number of spectral bands. The computational cost of the EM clustering method is $O(nkd)$, where $k$ is the number of iterations in EM clustering and $d$ is the number of clusters formed. In the weighted average fusion, the computational cost comes mainly from Eq. (10), which scales as $O(n^2td)$, where $t$ is the number of iterations in the process. For the SVM with RBF kernel, the computational cost is $O(d^2)$, where $d$ is the number of input dimensions. Hence, the total computational cost of the proposed algorithm is the sum of the computational costs of all stages, which is given as:

Eq. (16)

$$O(n^{2})+O(nkd)+O(n^{2}dt)+O(d^{2}).$$

Although the proposed method shows a significant classification performance, its training phase requires the determination of an optimal weight value of the band in the fusion process, which is computationally expensive.
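As a rough worked instance of Eq. (16), offered only as an illustration, take the Indian Pines setting used in the experiments (200 bands after band removal, $k=10$, $t=100$) and leave the number of clusters $d$ symbolic. Constant factors hidden by the big O notation are ignored, so the figures indicate relative scale only:

$$n^{2}=4\times10^{4},\qquad nkd=2\times10^{3}\,d,\qquad n^{2}dt=4\times10^{6}\,d,\qquad d^{2}\le n^{2}=4\times10^{4}.$$

Under these assumptions, the weighted average fusion term $n^{2}dt$ is the largest of the four for any $d\ge 1$, which agrees with the observation that determining the optimal band weights dominates the training cost.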

3.

Results and Discussion

This section presents the experimental analysis of the proposed method using standard benchmark hyperspectral datasets widely used in the literature.

3.1.

Dataset Description

A series of experiments was conducted on four standard benchmark datasets, namely, the Indian Pines, Pavia University, Salinas, and Botswana datasets, available in Ref. 47. Indian Pines, Pavia University, and Salinas are small-size datasets captured by airborne sensors, whereas Botswana is a large-size hyperspectral dataset captured by a spaceborne (satellite) sensor. The detailed description of each dataset is given below:

  • a. Indian Pines dataset. It was acquired by Airborne Visible Infrared Imaging Spectrometer (AVIRIS) over North-Western Indiana region in June 1992. This dataset consists of 16 different classes of agriculture as well as vegetation species, namely, “alfalfa,” “corn-notill,” “corn-mintill,” “corn,” “grass-pasture,” “grass-trees,” “grass-pasture-mowed,” “hay-windrowed,” “oats,” “soybean-notill,” “soybean-mintill,” “soybean-clean,” “wheat,” “woods,” “buildings-grass-trees-drives,” and “stone-steel-towers.” The size of the dataset is 145×145  pixels with 20-m spatial resolution and 10-nm spectral resolution over the range of 400 to 2500 nm. It contains 224 spectral bands where only 200 bands remain for experimentation after the removal of 24 water absorption bands.

  • b. Pavia University dataset. It was captured by the Reflective Optics System Imaging Spectrometer (ROSIS) over Pavia, Northern Italy, in July 2002. This dataset contains nine different classes, namely, “water,” “trees,” “asphalt,” “self-blocking bricks,” “bitumen,” “tiles,” “shadows,” “meadows,” and “bare soil.” The size of the dataset is 610×340 pixels with 1.3-m spatial resolution over the range of 430 to 860 nm. It contains 103 spectral bands.

  • c. Salinas dataset. It was captured by AVIRIS over Salinas Valley, California. This dataset contains 16 different classes, namely, “brocoli-green-weeds1,” “brocoli-green-weeds2,” “fallow,” “fallow-rough-plow,” “fallow-smooth,” “stubble,” “celery,” “grapes-untrained,” “soil-vinyard-develop,” “corn-senesced-green-weeds,” “lettuce-romaine-4wk,” “lettuce-romaine-5wk,” “lettuce-romaine-6wk,” “lettuce-romaine-7wk,” “vinyard-untrained,” and “vinyard-vertical-trellis.” The size of the dataset is 512×217  pixels with 3.7-m spatial resolution over the range of 400 to 2500 nm. It contains 224 spectral bands.

  • d. Botswana dataset. It was captured by the NASA EO-1 satellite over the Okavango Delta, Botswana, from 2001 to 2004. The Hyperion sensor on EO-1 acquires data at 30-m pixel resolution over a 7.7-km strip in 242 bands covering the 400- to 2500-nm portion of the spectrum in 10-nm windows. Only 145 bands remain for experimentation after removal of noisy and water absorption bands. The size of the dataset is 1476×256 pixels. The data contain 14 classes, namely, “water,” “hippo grass,” “floodplain grasses1,” “floodplain grasses2,” “reeds1,” “riparian,” “firescar2,” “island interior,” “Acacia woodlands,” “Acacia shrublands,” “Acacia grasslands,” “short mopane,” “mixed mopane,” and “exposed soils.”

3.2.

Evaluation Measures

The classification performance of the proposed EM-WAF technique is assessed using three commonly used quality metrics, i.e., overall accuracy (OA), average accuracy (AA), and Cohen’s kappa coefficient (k).

  • a. OA

Percentage of the correctly classified pixels in the whole scene:

Eq. (17)

$$\mathrm{OA}=\frac{\text{no. of correctly classified samples}}{\text{no. of test samples}}.$$

  • b. AA

Mean of the percentage of the correctly labeled pixels for each class:

Eq. (18)

$$\mathrm{AA}=\frac{1}{\text{no. of classes }(c)}\sum_{i=1}^{c}(\mathrm{OA})_i.$$

  • c. Kappa coefficient (k)

It is a robust measure of the degree of agreement, which integrates diagonal and off-diagonal entries of a confusion matrix.
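The three measures can be computed from a confusion matrix as in the minimal Python sketch below. It assumes that rows index the true classes and columns the predicted classes, that every class has at least one test sample, and it uses the standard definition of Cohen's kappa, $k=(p_o-p_e)/(1-p_e)$, where $p_o$ is the observed agreement (OA) and $p_e$ the agreement expected by chance. The function name is illustrative.

```python
import numpy as np

def classification_scores(conf):
    """Return (OA, AA, kappa) as fractions from a square confusion matrix.

    Assumption: conf[i, j] counts test samples of true class i predicted as j.
    """
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    oa = np.trace(conf) / total                   # overall accuracy
    per_class = np.diag(conf) / conf.sum(axis=1)  # accuracy of each class
    aa = per_class.mean()                         # average accuracy
    # Chance agreement from the row and column marginals.
    pe = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / total**2
    kappa = (oa - pe) / (1.0 - pe)
    return oa, aa, kappa
```

For example, the 2×2 matrix `[[5, 1], [2, 4]]` gives OA = 0.75, AA = 0.75, and kappa = 0.5.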

3.3.

Parameters Settings

For the EM clustering algorithm, the number of iterations $k$ is set to 10. For the optimal weight finding procedure, the balance factor $\alpha$ is set to 0.5 and the number of iterations $t$ is set to 100. The SVM classifier with RBF kernel has two parameters, the penalty parameter $C$ and the RBF parameter $\gamma$, which are tuned through fivefold cross validation ($\gamma=2^{-8},2^{-7},\ldots,2^{8}$; $C=2^{-8},2^{-7},\ldots,2^{8}$).

3.4.

Experimental Results

In this section, the impact of different proportions of training samples on OA, the classification results obtained for the Indian Pines, Pavia University, Salinas, and Botswana datasets, an analysis of the features extracted by the proposed method, and the main findings are discussed. All the experiments are conducted using MATLAB 2018a on a PC with 16 GB RAM and a 2.70 GHz CPU. To evaluate the effectiveness of the proposed method with a small amount of labeled data, 20% of the samples of each class from the Indian Pines, Pavia University, Salinas, and Botswana datasets are randomly chosen as training samples, and the remaining samples in each class are used for testing. Section 3.4.1 provides a detailed analysis of the effect of different proportions of training samples on OA. Each experiment is repeated ten times, and the average OA, AA, and kappa coefficient are reported. To verify the superiority of the proposed method, four different categories of methods are considered for comparison.

  • a. In the first category, clustering-based feature extraction methods, namely, CBFE30 and DCCA33 are considered.

  • b. In the second category, CBS methods9 considered are, CEM-BCC/BDC, CEM-BCM/BDM, LCMV-BCC/BDC, and LCMV-BCM/BDM.

  • c. In the third category, clustering- and ranking-based band selection method considered is E-FDPC.38

  • d. In the fourth category, a comparison of the proposed method is made with clustering and band fusion method for demonstrating the significance of the weights of the bands,29 where a simple average fusion method is used for fusing the bands from a cluster.
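The per-class random split described at the start of this section (20% of the samples of each class for training, the remainder for testing) can be sketched in Python as follows. The function name, the rounding rule, and the guarantee of at least one training sample per class are assumptions of this sketch; `labels` is assumed to contain only labeled pixels.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.2, seed=0):
    """Randomly pick a fraction of each class for training; the rest is test.

    Returns sorted index arrays (train, test) into `labels`.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train = []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        # Keep at least one training sample for very small classes (assumed rule).
        n_train = max(1, int(round(train_fraction * idx.size)))
        train.extend(idx[:n_train])
    train = np.sort(np.array(train))
    test = np.setdiff1d(np.arange(labels.size), train)
    return train, test
```

Repeating the split with ten different `seed` values and averaging the resulting scores mirrors the ten-run averaging described above.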

3.4.1.

Influence of different proportion of training samples on OA obtained by the proposed method for all four hyperspectral datasets

The performance of the proposed method is evaluated with different proportions of training samples, namely, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, and 50% of the labeled samples per class. Figure 3 shows the OA obtained using the proposed method for the different proportions of training samples. The proposed method shows good discriminative ability even with a small number of labeled samples (5% of the samples per class). As the number of training samples increases, the classification performance of the proposed method increases gradually for all four datasets. Increasing the training proportion beyond 20% does not have much impact on the OA, whereas it increases the computational burden of the training phase. Hence, the proposed method is tested with 20% of the samples used for training.

Fig. 3

Influence of different proportions of training samples on OA for Indian Pines, Pavia University, Salinas, and Botswana dataset.

JARS_12_4_046015_f003.png

3.4.2.

Results analysis by comparing the proposed method with different classification methods on Indian Pines dataset

The ground truth data of the Indian Pines dataset are shown in Fig. 4(a), where the different colors signify the various land cover categories. Figure 4(b) shows the spectral signature (reflectance) of each category. The classification maps obtained by all the competing methods on the Indian Pines dataset are shown in Fig. 5, and the classification results (i.e., OA, class-wise accuracy, AA, and k) are reported in Table 1. Figure 5 and Table 1 show that the proposed method achieves better results than the competing methods in terms of OA, AA, and k. This is attributed to the use of the EM clustering algorithm for band partitioning and of weighted average fusion for fusing the correlated bands, which increases the interclass separation and decreases the intraclass separation.

Fig. 4

Indian Pines dataset information: (a) ground truth data and (b) spectral response of each category

JARS_12_4_046015_f004.png

Fig. 5

Classification map of Indian Pines dataset for all competing methods: (a) CBFE, (b) DCCA, (c) CEM-BCC/BDC, (d) CEM-BCM/BDM, (e) LCMV-BCC/BDC, (f) LCMV-BCM/BDM, (g) E-FDPC, (h) IF, and (i) EM-WAF.

JARS_12_4_046015_f005.png

Table 1

Comparison of classification accuracies (%) obtained by the proposed method with other competing methods for Indian Pines dataset.

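| Class name | Clustering-based methods | | Constrained-based selection methods | | | | Clustering and ranking-based selection method | Clustering and fusion-based methods | |
|---|---|---|---|---|---|---|---|---|---|
| | CBFE30 | DCCA33 | CEM-BCC/BDC9 | CEM-BCM/BDM9 | LCMV-BCC/BDC9 | LCMV-BCM/BDM9 | E-FDPC38 | IF29 | EM-WAF (proposed method) |
| Alfalfa | 85.13 | 84.13 | 69.53 | 54.63 | 69.3 | 67.23 | 57.53 | 89.43 | **94.01** |
| Corn-no till | 72.09 | 71.09 | 56.49 | 41.59 | 56.26 | 54.19 | 44.49 | 76.39 | **81.09** |
| Corn-min till | 59.04 | 58.04 | 62.64 | 47.74 | 62.41 | 60.34 | 50.64 | 63.34 | **87.24** |
| Corn | 59.71 | 58.71 | 66.55 | 51.65 | 66.32 | 64.25 | 54.55 | 64.01 | **91.15** |
| Grass-pasture | 99.01 | 98.01 | 76.1 | 61.2 | 75.87 | 73.8 | 64.1 | **100** | 99.87 |
| Grass-tree | 93.03 | 92.03 | 76.1 | 61.2 | 75.87 | 73.8 | 64.1 | 97.33 | **99.1** |
| Grass-pasture-mowed | 64.93 | 63.93 | 59.55 | 44.65 | 59.32 | 57.25 | 47.55 | 69.23 | **84.15** |
| Hay-windrowed | 90.08 | 89.08 | 74.48 | 59.58 | 74.25 | 72.18 | 62.48 | 94.38 | **99.08** |
| Oat | 68.89 | 67.89 | 58.44 | 43.54 | 58.21 | 56.14 | 46.44 | 73.19 | **83.04** |
| Soybean-no till | 59.93 | 58.93 | 65 | 50.1 | 64.77 | 62.7 | 53 | 64.23 | **89.6** |
| Soybean-min till | 88.9 | 87.9 | 73.3 | 58.4 | 73.07 | 71 | 61.3 | 93.2 | **97.9** |
| Soybean-clean | 57.93 | 56.93 | 63.77 | 48.87 | 63.54 | 61.47 | 51.77 | 62.23 | **88.37** |
| Wheat | 94.02 | 93.02 | 76.1 | 61.2 | 75.87 | 73.8 | 64.1 | 98.32 | **100** |
| Woods | 88.03 | 87.03 | 72.43 | 57.53 | 72.2 | 70.13 | 60.43 | 92.33 | **97.03** |
| Buildings-grass-trees-drives | 61.68 | 60.68 | 55.92 | 41.02 | 55.69 | 53.62 | 43.92 | 65.98 | **80.52** |
| Stone-steel-towers | 99.03 | 98.03 | 76.1 | 61.2 | 75.87 | 73.8 | 64.1 | **100** | 99.25 |
| OA | 79.88 | 78.67 | 69.94 | 53.56 | 69.33 | 67.94 | 57.68 | 83.56 | **92.19** |
| AA | 77.97 | 76.59 | 67.65 | 52.75 | 67.43 | 65.36 | 55.65 | 81.9 | **91.96** |
| K | 0.7751 | 0.7639 | 0.6602 | 0.5276 | 0.6701 | 0.6601 | 0.5701 | 0.817 | **0.9085** |

Note: The highest value across the methods is shown in bold font.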

Table 1 shows that the EM-WAF method performs well compared to the clustering-based methods, namely, CBFE, DCCA, CEM-BCC/BDC, CEM-BCM/BDM, LCMV-BCC/BDC, LCMV-BCM/BDM, and E-FDPC. The proposed technique performs noticeably better because clustering and fusing the highly correlated bands retains more discriminative information. The classification accuracy of the proposed EM-WAF method is much better than that of the simple IF method, which highlights the importance of the weight factor in the fusion process. The clustering-based methods and the IF method consider only the intracluster distance, which limits their discriminative ability, whereas the proposed method considers the intercluster distance as well as the intracluster distance, which leads to a better discriminative ability. Hence, the proposed EM-WAF technique preserves the useful as well as the discriminative information of the original data. When compared to the other competing approaches, the proposed EM-WAF approach achieves a substantial improvement in the class wise classification accuracy, as shown in Table 1 (boldface). The classification accuracy of the classes “alfalfa,” “corn-no till,” “corn-min till,” “corn,” “grass-pasture-mowed,” “hay-windrowed,” “oat,” “soybean-no till,” “soybean-min till,” “soybean-clean,” and “woods” increases from 54.63% to 94.01%, 41.59% to 81.09%, 47.54% to 87.24%, 51.65% to 91.15%, 44.65% to 84.15%, 59.58% to 99.08%, 43.54% to 83.04%, 53% to 89.06%, 58.4% to 97.9%, 48.87% to 88.37%, and 57.53% to 97.03%, respectively. In particular, all the pixels of the class “wheat” are correctly classified by the proposed method. However, the proposed method achieves slightly lower accuracy for the individual classes “grass-pasture” and “stone-steel-towers” than the IF method, which achieves 100% accuracy for both classes, as shown in Table 1.
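The role of the weight factor discussed above can be illustrated with a small sketch. The exact weight equation of the proposed method is not restated in this section, so the ratio used here (mean intercluster distance divided by mean intracluster distance, normalized within each cluster) is only an assumed stand-in that follows the stated criterion of small intracluster and large intercluster distance. The names `fusion_weights` and `fuse_clusters`, the precomputed band-to-band distance matrix `dist`, and the cluster labels are all hypothetical.

```python
def fusion_weights(dist, labels, eps=1e-12):
    """Assumed weighting: bands far from other clusters and close to their own
    cluster receive larger weights. Weights sum to 1 within each cluster.

    dist[a][b] is a precomputed distance between bands a and b;
    labels[a] is the cluster index of band a.
    """
    n = len(labels)
    raw = []
    for b in range(n):
        intra = [dist[b][o] for o in range(n)
                 if o != b and labels[o] == labels[b]]
        inter = [dist[b][o] for o in range(n) if labels[o] != labels[b]]
        mean_intra = sum(intra) / len(intra) if intra else 0.0
        mean_inter = sum(inter) / len(inter) if inter else 1.0
        # eps avoids division by zero for single-band clusters.
        raw.append(mean_inter / (mean_intra + eps))

    weights = []
    for b in range(n):
        cluster_sum = sum(raw[o] for o in range(n) if labels[o] == labels[b])
        weights.append(raw[b] / cluster_sum)
    return weights


def fuse_clusters(bands, labels, weights):
    """Weighted average fusion: one fused band (feature) per cluster.

    bands[a] is band a flattened to a list of pixel values.
    """
    fused = {}
    for band, label, w in zip(bands, labels, weights):
        acc = fused.setdefault(label, [0.0] * len(band))
        for p, value in enumerate(band):
            acc[p] += w * value
    return fused
```

As a usage example, three two-pixel bands `[[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]` with labels `[0, 0, 1]` and distances `[[0, 1, 4], [1, 0, 6], [4, 6, 0]]` receive weights of about 0.4, 0.6, and 1.0, so cluster 0 is fused to approximately `[2.2, 3.2]` rather than the plain average `[2.0, 3.0]`. Setting all weights in a cluster equal reduces the sketch to an unweighted average of the bands.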

3.4.3.

Results analysis by comparing the proposed method with different classification methods on Pavia University dataset

The ground truth data of the Pavia University dataset are shown in Fig. 6(a), where the different colors denote the different categories. Figure 6(b) shows the spectral signature, i.e., the reflectance, of each category. The classification maps obtained by all the competing techniques and the proposed technique on the Pavia University dataset are depicted in Fig. 7, and the classification results (i.e., OA, class wise accuracy, AA, and k) are presented in Table 2. Figure 7 and Table 2 show that the proposed EM-WAF technique achieves the best result among all the competing methods in terms of OA, AA, and k. This is because EM clustering extracts more useful information and increases the separation among the spectral classes. As shown in Table 2, the classification accuracy of the proposed EM-WAF method is much better than that of the IF method, which shows the importance of the weight factor in the fusion process. In other words, the proposed method preserves the complementary information of all bands well.

Fig. 6

Pavia University dataset information: (a) ground truth data and (b) spectral response of each category.


Fig. 7

Classification map of Pavia University dataset for all competing methods: (a) CBFE, (b) DCCA, (c) CEM-BCC/BDC, (d) CEM-BCM/BDM, (e) LCMV-BCC/BDC, (f) LCMV-BCM/BDM, (g) E-FDPC, (h) IF, and (i) EM-WAF.


Table 2

Comparison of classification accuracies (%) obtained by the proposed method with other competing methods for Pavia University dataset.

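| Class name | CBFE [30] | DCCA [33] | CEM-BCC/BDC [9] | CEM-BCM/BDM [9] | LCMV-BCC/BDC [9] | LCMV-BCM/BDM [9] | E-FDPC [38] | IF [29] | EM-WAF (proposed method) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Method category | Clustering-based | Clustering-based | Constrained-based selection | Constrained-based selection | Constrained-based selection | Constrained-based selection | Clustering and ranking-based selection | Clustering and fusion-based | Clustering and fusion-based |
| Asphalt | 89.88 | 93.53 | 76.15 | 72.69 | 88.88 | 90.98 | 91.46 | 92.48 | 95.87 |
| Meadows | 94.47 | 95.61 | 82.55 | 79.09 | 93.47 | 95.57 | 96.62 | 97.04 | 99.85 |
| Gravel | 31.15 | 72.30 | 11.44 | 7.98 | 45.67 | 32.25 | 72.60 | 65.46 | 85.21 |
| Trees | 82.54 | 89.02 | 69 | 65.54 | 81.54 | 83.64 | 90.33 | 89.11 | 91.36 |
| Painted metal sheets | 98.70 | 98.70 | 86.05 | 82.59 | 97.7 | 99.8 | 98.88 | 98.51 | 100 |
| Bare soil | 62.59 | 89.81 | 34.72 | 31.26 | 61.59 | 63.69 | 83.62 | 81.18 | 92.15 |
| Bitumen | 78.95 | 82.89 | 70.05 | 66.59 | 77.95 | 80.05 | 79.14 | 78.01 | 85.23 |
| Self-blocking bricks | 87.71 | 84.04 | 75.18 | 71.72 | 86.71 | 88.81 | 83.33 | 83.02 | 86.38 |
| Shadows | 100 | 99.87 | 87.31 | 83.85 | 90.02 | 98.43 | 99.60 | 100 | 100 |
| OA | 85.50 | 89.92 | 67.23 | 63.21 | 84.52 | 84.52 | 91.11 | 90.67 | 94.10 |
| AA | 80.66 | 89.75 | 65.82 | 62.36 | 80.39 | 81.76 | 88.40 | 87.20 | 92.89 |
| K | 0.8012 | 0.8831 | 0.6690 | 0.6287 | 0.8123 | 0.8102 | 0.8816 | 87.53 | 91.12 |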
Note: Highest value across the method is represented in bold font.

As shown in Fig. 7, the proposed approach eliminates most of the noisy pixels generated by the other methods, and the overall classification accuracy increases by more than 2%. For instance, the misclassified pixels in the green region at the center of Fig. 7 are corrected, so that the classification map becomes smoother and very close to the ground truth. When compared to the other competing approaches, the proposed approach shows a significant improvement in the class wise classification accuracy, as shown in Table 2 (boldface). For instance, the classification accuracy of the class “gravel” increases from 7.98% to 85.21%. Moreover, the proposed method correctly classifies all the pixels of the class “painted metal sheets.” However, the EM-WAF approach produces lower classification accuracy for the individual class “self-blocking bricks” than the LCMV-BCM/BDM method, as shown in Table 2. The reason is that fusion of the spectral bands eliminates the important spectral features of the respective land cover class.

3.4.4.

Results analysis by comparing the proposed method with different classification methods on Salinas dataset

The ground truth data of the Salinas dataset are shown in Fig. 8(a), where the different colors represent the different categories. Figure 8(b) shows the spectral signature or the reflectance of each category. The classification maps of all the competing techniques on Salinas dataset are shown in Fig. 9 and the classification results (i.e., OA, class wise accuracy, AA, and k) are reported in Table 3. Table 3 and Fig. 9 show that the proposed method achieves the best performance in terms of the quantitative results and visual interpretation.

Fig. 8

Salinas dataset information: (a) ground truth data and (b) spectral response of each category.


Fig. 9

Classification map of Salinas dataset for all competing methods: (a) CBFE, (b) DCCA, (c) CEM-BCC/BDC, (d) CEM-BCM/BDM, (e) LCMV-BCC/BDC, (f) LCMV-BCM/BDM, (g) E-FDPC, (h) IF, and (i) EM-WAF.


Table 3

Comparison of classification accuracies (%) obtained by the proposed method with other competing methods for Salinas dataset.

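| Class name | CBFE [30] | DCCA [33] | CEM-BCC/BDC [9] | CEM-BCM/BDM [9] | LCMV-BCC/BDC [9] | LCMV-BCM/BDM [9] | E-FDPC [38] | IF [29] | EM-WAF (proposed method) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Method category | Clustering-based | Clustering-based | Constrained-based selection | Constrained-based selection | Constrained-based selection | Constrained-based selection | Clustering and ranking-based selection | Clustering and fusion-based | Clustering and fusion-based |
| Brocoli-green-weeds1 | 96.33 | 88.03 | 85.03 | 79.41 | 84.41 | 83.41 | 97.76 | 94.83 | 97.83 |
| Brocoli-green-weeds2 | 98.15 | 74.99 | 71.99 | 66.37 | 71.37 | 70.37 | 88.22 | 85.63 | 98.91 |
| Fallow | 85.89 | 81.14 | 78.14 | 72.52 | 77.52 | 76.52 | 52.41 | 94.54 | 97.94 |
| Fallow-rough-plow | 98.39 | 85.05 | 82.05 | 76.43 | 81.43 | 80.43 | 99.55 | 97.13 | 99.16 |
| Fallow-smooth | 93.46 | 94.6 | 91.6 | 85.98 | 90.98 | 89.98 | 90.01 | 98.50 | 95.26 |
| Stubble | 99.02 | 94.6 | 91.6 | 85.98 | 90.98 | 89.98 | 97.82 | 98.64 | 99.34 |
| Celery | 98.81 | 78.05 | 75.05 | 69.43 | 74.43 | 73.43 | 96.23 | 86.65 | 99.57 |
| Grapes-untrained | 83.88 | 92.98 | 89.98 | 84.36 | 89.36 | 88.36 | 84.70 | 85.62 | 88.54 |
| Soil-vinyard-develop | 96.45 | 76.94 | 73.94 | 68.32 | 73.32 | 72.32 | 95.57 | 98.51 | 97.48 |
| Corn-senesced-green-weeds | 80.40 | 83.5 | 80.5 | 74.88 | 79.88 | 78.88 | 80.05 | 89.33 | 90.46 |
| Lettuce-romaine-4 wk | 80.80 | 91.8 | 88.8 | 83.18 | 88.18 | 87.18 | 78.57 | 89.57 | 87.7 |
| Lettuce-romaine-5 wk | 99.09 | 82.27 | 79.27 | 73.65 | 78.65 | 77.65 | 99.22 | 97.92 | 99.56 |
| Lettuce-romaine-6 wk | 98.36 | 94.6 | 91.6 | 85.98 | 90.98 | 89.98 | 99.04 | 96.85 | 97.28 |
| Lettuce-romaine-7 wk | 88.90 | 90.93 | 87.93 | 82.31 | 87.31 | 86.31 | 87.27 | 91.23 | 92.64 |
| Vinyard-untrained | 44.55 | 74.42 | 71.42 | 65.80 | 70.8 | 69.8 | 40.45 | 44.15 | 52.2 |
| Vinyard-vertical-trellis | 84.71 | 94.6 | 91.6 | 85.98 | 90.98 | 89.98 | 61.52 | 94.53 | 98.36 |
| OA | 85.14 | 89.94 | 84.01 | 78.86 | 83.18 | 83.14 | 81.55 | 85.85 | 93.96 |
| AA | 89.20 | 86.15 | 83.15 | 77.54 | 82.53 | 81.54 | 84.28 | 90.68 | 92.45 |
| K | 0.8339 | 0.8789 | 0.8289 | 0.7689 | 0.8145 | 0.8237 | 0.7937 | 0.8418 | 0.9036 |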
Note: Highest value across the method is represented in bold font.

Though all the competing methods are quite useful for dimensionality reduction, the CBFE and DCCA methods perform noticeably better than E-FDPC and the other CBS methods. However, the proposed method significantly outperforms all the other competing methods, because the clustering and weighted average fusion of the highly correlated bands provide more discriminative information. This shows that the proposed EM-WAF technique extracts the significant features of the data. Consequently, the superiority of the EM-WAF approach can be explained by the use of the weighted average of useful bands. When compared to the other competing methods, the performance of the proposed method is superior in terms of OA, AA, and k. For most of the classes, the class wise accuracy of the proposed method exceeds 90%. However, the proposed method fails to obtain a good performance for a few classes. For instance, the pixels of the class “grapes-untrained” are confused with the pixels of the class “vinyard-untrained.” This misclassification occurs because the spectral signatures of these two classes are almost the same. Figure 9 shows that the region uniformity of the classes “fallow” and “corn-senesced-green-weeds” (marked by red circles) is improved by the proposed method when compared to the other competing methods.

3.4.5.

Results analysis by comparing the proposed method with different classification methods on Botswana Dataset

The ground truth information relating to Botswana dataset used for experimentation is shown in Fig. 10(a), where the different colors signify the different land cover categories. Figure 10(b) shows the spectral signature or the reflectance of each category. The classification maps of all the competing techniques on Botswana dataset are shown in Fig. 11 and the classification results (i.e., OA, class wise accuracy, AA, and k) are summarized in Table 4.

Fig. 10

Botswana dataset information: (a) ground truth data and (b) spectral response of each category.


Fig. 11

Classification map of Botswana dataset for all competing methods: (a) CBFE, (b) DCCA, (c) CEM-BCC/BDC, (d) CEM-BCM/BDM, (e) LCMV-BCC/BDC, (f) LCMV-BCM/BDM, (g) E-FDPC, (h) IF, and (i) EM-WAF.


Table 4

Comparison of classification accuracies (%) obtained by the proposed method with other competing methods for Botswana dataset.

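| Class name | CBFE [30] | DCCA [33] | CEM-BCC/BDC [9] | CEM-BCM/BDM [9] | LCMV-BCC/BDC [9] | LCMV-BCM/BDM [9] | E-FDPC [38] | IF [29] | EM-WAF (proposed method) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Method category | Clustering-based | Clustering-based | Constrained-based selection | Constrained-based selection | Constrained-based selection | Constrained-based selection | Clustering and ranking-based selection | Clustering and fusion-based | Clustering and fusion-based |
| Water | 96.53 | 98.14 | 97.68 | 99.53 | 96.04 | 99.53 | 99.50 | 99.00 | 100 |
| Hippo grass | 81.25 | 85.00 | 78.75 | 86.25 | 83.75 | 78.75 | 82.66 | 77.33 | 90.66 |
| Floodplain grasses1 | 77.50 | 86.00 | 82.00 | 92.00 | 89.50 | 85.00 | 91.45 | 90.95 | 91.48 |
| Floodplain grasses2 | 80.81 | 87.20 | 61.62 | 75.00 | 74.41 | 72.67 | 84.47 | 78.88 | 78.26 |
| Reeds1 | 58.60 | 60.46 | 39.06 | 65.11 | 64.65 | 59.53 | 61.69 | 60.17 | 65.67 |
| Riparian | 61.60 | 46.51 | 53.02 | 50.69 | 42.81 | 50.69 | 45.87 | 47.78 | 61.69 |
| Firescar2 | 94.20 | 98.55 | 96.13 | 97.10 | 93.10 | 98.55 | 96.90 | 97.42 | 96.90 |
| Island interior | 90.12 | 88.08 | 86.41 | 86.41 | 86.74 | 90.74 | 93.42 | 80.94 | 88.15 |
| Acacia woodlands | 65.33 | 68.22 | 60.55 | 60.15 | 57.37 | 57.37 | 64.25 | 68.31 | 73.19 |
| Acacia shrublands | 58.08 | 83.85 | 62.62 | 61.11 | 60.06 | 58.58 | 59.67 | 59.91 | 84.40 |
| Acacia grasslands | 88.93 | 89.11 | 93.03 | 92.62 | 86.88 | 93.44 | 87.71 | 90.01 | 94.29 |
| Short mopane | 45.13 | 87.50 | 62.50 | 53.47 | 67.47 | 50.69 | 59.25 | 67.44 | 94.81 |
| Mixed mopane | 75.23 | 89.25 | 70.56 | 61.21 | 76.63 | 73.36 | 60.19 | 78.10 | 91.04 |
| Exposed soils | 82.89 | 88.15 | 85.59 | 80.26 | 82.05 | 82.89 | 90.14 | 87.32 | 78.87 |
| OA | 75.25 | 81.73 | 72.86 | 75.33 | 74.60 | 74.87 | 83.01 | 77.53 | 84.92 |
| AA | 75.25 | 82.70 | 73.54 | 75.78 | 75.89 | 75.13 | 84.9 | 77.37 | 84.96 |
| K | 0.7319 | 0.8020 | 0.7058 | 0.7326 | 0.7249 | 0.7276 | 0.8253 | 0.7563 | 0.8336 |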
Note: Highest value across the method is represented in bold font.

The results reported in Table 4 show that the proposed EM-WAF method delivers better performance than the other competing methods. The classification results obtained by the proposed clustering and fusion-based method are very promising, which indicates that the proposed method can be used to classify a large-size dataset. Table 4 also shows that the E-FDPC method performs better than the other clustering and constrained-based selection methods, mainly because of the band selection strategy of the ranking-based methods. However, the proposed method is better than the E-FDPC method, since the latter considers only the intracluster distance between the data points, whereas the former considers the intercluster as well as the intracluster distance between the data points, resulting in good discriminative capabilities for the classification. Table 4 shows that the proposed method achieves better class wise accuracies for most of the classes. The proposed method classifies all pixels of the class “water” correctly. Compared to the other competing methods, classes such as “hippo grass,” “Acacia woodlands,” “short mopane,” and “mixed mopane” are better distinguished by the proposed method. For the classes “reeds1” and “riparian,” the performance of the proposed method is better than that of the other competing methods, though it is still not satisfactory. The main reason is that the samples selected from such classes contain more redundant information.

3.4.6.

Analysis of number of selected bands or features for all four hyperspectral datasets

Table 5 lists the number of selected bands or features and the OA for the four hyperspectral datasets. It shows that the proposed approach achieves better classification accuracy by selecting an optimal number of features. In other words, the proposed approach selects the features that separate the land cover classes well.

Table 5

Number of selected bands or features and OA (%) for all four hyperspectral datasets.

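| Dataset | Quantity | DCCA | CEM-BCC/BDC | CEM-BCM/BDM | LCMV-BCC/BDC | LCMV-BCM/BDM | CBFE | E-FDPC | IF | EM-WAF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Indian Pines | Number of bands or features | 20 | 20 | 15 | 20 | 20 | 13 | 10 | 25 | 7 |
| Indian Pines | OA (%) | 78.67 | 69.94 | 53.56 | 69.33 | 67.94 | 79.88 | 57.68 | 83.56 | 92.19 |
| Pavia University | Number of bands or features | 20 | 20 | 20 | 20 | 20 | 15 | 14 | 25 | 11 |
| Pavia University | OA (%) | 89.92 | 67.23 | 63.21 | 84.52 | 84.52 | 85.50 | 91.11 | 90.67 | 94.10 |
| Salinas | Number of bands or features | 15 | 15 | 20 | 20 | 20 | 12 | 14 | 25 | 13 |
| Salinas | OA (%) | 89.94 | 84.01 | 78.86 | 83.18 | 83.14 | 85.14 | 81.55 | 85.85 | 93.96 |
| Botswana | Number of bands or features | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 25 | 20 |
| Botswana | OA (%) | 81.73 | 72.86 | 75.33 | 74.60 | 74.87 | 75.25 | 83.01 | 77.53 | 84.92 |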

As shown in Table 5, the features extracted by the proposed method achieve the highest classification accuracy for all datasets. For the Indian Pines dataset, the proposed method provides the maximum OA of 92.19% among all the competing methods with only seven features, which is found to be optimal. For the Pavia University dataset, the proposed method delivers the highest OA of 94.10% among all the competing methods with only 11 features. For the Salinas dataset, the CBFE method provides 85.14% OA with only 12 features, which is the minimum number of features among all the competing methods. However, the proposed method achieves the maximum OA of 93.96% among all the competing methods with an optimal number of 13 averaged bands. For the Botswana dataset, the OA of the proposed method is only slightly better than that of the E-FDPC method; nevertheless, the proposed method achieves the maximum OA (84.92%) among all the competing methods with only 20 features, which is found to be optimal. Table 5 shows that the proposed approach extracts meaningful features from the hyperspectral data. These features are suitable and adequate for hyperspectral image classification. These results indicate that: (a) the pairwise distance-based band separability is an important aspect of feature extraction; (b) consideration of the intracluster and intercluster distances provides more discriminative information; and (c) an appropriate weighting mechanism for the weighted average fusion improves the performance of feature extraction significantly.
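The size of the OA improvement on each dataset can be checked with simple arithmetic. The sketch below copies the OA rows of Tables 1 to 4 and reports, for each dataset, the strongest competing method and the margin of EM-WAF over it in percentage points. The names `METHODS`, `OA`, and `margin_over_best_competitor` are illustrative and not part of the original work.

```python
# Method order follows the columns of Tables 1 to 4.
METHODS = ["CBFE", "DCCA", "CEM-BCC/BDC", "CEM-BCM/BDM",
           "LCMV-BCC/BDC", "LCMV-BCM/BDM", "E-FDPC", "IF", "EM-WAF"]

# OA (%) rows copied from Tables 1 to 4.
OA = {
    "Indian Pines": [79.88, 78.67, 69.94, 53.56, 69.33, 67.94, 57.68, 83.56, 92.19],
    "Pavia University": [85.50, 89.92, 67.23, 63.21, 84.52, 84.52, 91.11, 90.67, 94.10],
    "Salinas": [85.14, 89.94, 84.01, 78.86, 83.18, 83.14, 81.55, 85.85, 93.96],
    "Botswana": [75.25, 81.73, 72.86, 75.33, 74.60, 74.87, 83.01, 77.53, 84.92],
}


def margin_over_best_competitor(values, methods=METHODS, proposed="EM-WAF"):
    """Return (best competing method, OA margin of the proposed method)."""
    scores = dict(zip(methods, values))
    competitors = [(m, v) for m, v in scores.items() if m != proposed]
    best_name, best_oa = max(competitors, key=lambda mv: mv[1])
    # Round to two decimals to match the precision of the tables.
    return best_name, round(scores[proposed] - best_oa, 2)


for dataset, values in OA.items():
    name, gain = margin_over_best_competitor(values)
    print(f"{dataset}: +{gain:.2f} points over {name}")
```

With these values, the margins are 8.63 points over IF (Indian Pines), 2.99 over E-FDPC (Pavia University), 4.02 over DCCA (Salinas), and 1.91 over E-FDPC (Botswana), which matches the observation that the gain on Botswana is the smallest.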

4.

Conclusion

In this paper, a feature extraction technique based on EM clustering and weighted average fusion has been proposed for hyperspectral image classification. The proposed method explores the information among the clusters and removes redundancy among the bands. The EM algorithm converges to the best number of clusters, thereby providing an effective way to determine an optimal number of features. The weight factor of each band is calculated on the basis of the criteria of minimizing the distance inside each cluster and maximizing the distance among the different clusters, which highlights the importance of the particular band in the fusion process. The significance of this technique lies in its highly discriminative ability, which leads to a better classification performance. Experimental results and comparison with the existing approaches demonstrate the efficiency of the proposed method for hyperspectral image classification. When compared with the other competing methods on four standard datasets, the proposed method achieves higher classification accuracy and better visual results. For the Botswana dataset, the proposed method provides the best OA among all the competing methods, which indicates that the proposed method can classify a large-size dataset effectively. Moreover, the proposed method outperforms the competing methods on all four hyperspectral datasets, showing its robustness for both small- and large-size datasets.

In our future work, we will focus on integrating the spatial features with the spectral features to improve the classification performance.

Acknowledgments

The authors would like to thank the anonymous reviewers for their comments and valuable suggestions, which greatly helped us to improve the technical quality and presentation of the manuscript. The authors thank VIT for providing a VIT seed grant for carrying out this research work and the Council of Scientific & Industrial Research (CSIR), New Delhi, India for the award of CSIR-SRF.

References

1. H. Ren and C.-I. Chang, “Automatic spectral target recognition in hyperspectral imagery,” IEEE Trans. Aerosp. Electron. Syst., 39 (4), 1232–1249 (2003). https://doi.org/10.1109/TAES.2003.1261124

2. M. Khodadadzadeh et al., “A new framework for hyperspectral image classification using multiple spectral and spatial features,” in IEEE Geoscience and Remote Sensing Symp., 4628–4631 (2014). https://doi.org/10.1109/IGARSS.2014.6947524

3. S. S. Sawant and M. Prabukumar, “Semi-supervised techniques based hyper-spectral image classification: a survey,” in Innovations in Power and Advanced Computing Technologies (i-PACT) (2017). https://doi.org/10.1109/IPACT.2017.8244999

4. J. Richards, Remote Sensing Digital Image Analysis, Springer-Verlag, Berlin (1999).

5. G. F. Hughes, “On the mean accuracy of statistical pattern recognizers,” IEEE Trans. Inf. Theory, 14 (1), 55–63 (1968). https://doi.org/10.1109/TIT.1968.1054102

6. C. Burges, “Dimension reduction: a guided tour,” Found. Trends Mach. Learn., 2 (4), 275–364 (2010). https://doi.org/10.1561/2200000002

7. R. Vaddi and M. Prabukumar, “Comparative study of feature extraction techniques for hyper spectral remote sensing image classification: a survey,” in Int. Conf. on Intelligent Computing and Control Systems (ICICCS), 543–548 (2017). https://doi.org/10.1109/ICCONS.2017.8250521

8. C. I. Chang, “A joint band prioritization and band decorrelation approach to band selection for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., 37 (6), 2631–2641 (1999). https://doi.org/10.1109/36.803411

9. C. I. Chang and S. Wang, “Constrained band selection for hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., 44 (6), 1575–1585 (2006). https://doi.org/10.1109/TGRS.2006.864389

10. X. Bai et al., “Semisupervised hyperspectral band selection via spectral–spatial hypergraph model,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 8 (6), 2774–2783 (2015). https://doi.org/10.1109/JSTARS.2015.2443047

11. X. Cao et al., “Fast hyperspectral band selection based on spatial feature extraction,” J. Real-Time Image Process., 1–10 (2018). https://doi.org/10.1007/s11554-018-0777-9

12. C. Yu, M. Song and C. Chang, “Band subset selection for hyperspectral image classification,” Remote Sens., 10, 113 (2018). https://doi.org/10.3390/rs10010113

13. Q. Chen, “Band selection algorithm based on information entropy for hyperspectral image classification,” J. Appl. Remote Sens., 11 (2), 026018 (2017). https://doi.org/10.1117/1.JRS.11.026018

14. W. Zhang, X. Li and L. Zhao, “Hyperspectral band selection based on triangular factorization,” J. Appl. Remote Sens., 11 (2), 025007 (2017). https://doi.org/10.1117/1.JRS.11.025007

15. S. Samiappan, S. Prasad and L. Bruce, “Non-uniform random feature selection and kernel density scoring with SVM based ensemble classification for hyperspectral image analysis,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 6 (2), 792–800 (2013). https://doi.org/10.1109/JSTARS.2013.2237757

16. J. C. Davis, “Introduction to statistical pattern recognition: 2nd edition, by Keinosuke Fukunaga, Academic Press, San Diego, 1990, 591 p., ISBN 0-12-269851-7, US$69.95,” Comput. Geosci., 22 (7), 833–834 (1990). https://doi.org/10.1016/0098-3004(96)00017-9

17. F. Tsai, E.-K. Lin and K. Yoshino, “Spectrally segmented principal component analysis of hyperspectral imagery for mapping invasive plant species,” Int. J. Remote Sens., 28 (5), 1023–1039 (2007). https://doi.org/10.1080/01431160600887706

18. X. Junshi et al., “(Semi-) supervised probabilistic principal component analysis for hyperspectral remote sensing image classification,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 7 (6), 2224–2236 (2014). https://doi.org/10.1109/JSTARS.2013.2279693

19. A. Villa et al., “Hyperspectral image classification with independent component discriminant analysis,” IEEE Trans. Geosci. Remote Sens., 49 (12), 4865–4876 (2011). https://doi.org/10.1109/TGRS.2011.2153861

20. X. Liu et al., “A maximum noise fraction transform with improved noise estimation for hyperspectral images,” Sci. China Ser. F, 52 (9), 1578–1587 (2009). https://doi.org/10.1007/s11432-009-0156-z

21. M. Fauvel, J. Chanussot and J. A. Benediktsson, “Kernel principal component analysis for the classification of hyperspectral remote sensing data over urban areas,” EURASIP J. Adv. Signal Process., 2009, 783194 (2009). https://doi.org/10.1155/2009/783194

22. M. E. Tipping, “Probabilistic principal component analysis,” J. R. Stat. Soc. Ser. B, 61 (3), 611–622 (1999). https://doi.org/10.1111/rssb.1999.61.issue-3

23. T. V. Bandos, L. Bruzzone and G. Camps-Valls, “Classification of hyperspectral images with regularized linear discriminant analysis,” IEEE Trans. Geosci. Remote Sens., 47 (3), 862–873 (2009). https://doi.org/10.1109/TGRS.2008.2005729

24. B. C. Kuo and D. A. Landgrebe, “Nonparametric weighted feature extraction for classification,” IEEE Trans. Geosci. Remote Sens., 42 (5), 1096–1105 (2004). https://doi.org/10.1109/TGRS.2004.825578

25. B. C. Kuo, C. H. Li and J. M. Yang, “Kernel nonparametric weighted feature extraction for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., 47 (4), 1139–1155 (2009). https://doi.org/10.1109/TGRS.2008.2008308

26. M. Prabukumar et al., “Three-dimensional discrete cosine transform-based feature extraction for hyperspectral image classification,” J. Appl. Remote Sens., 12 (4), 046010 (2018). https://doi.org/10.1117/1.JRS.12.046010

27. I. Makki et al., “A survey of landmine detection using hyperspectral imaging,” ISPRS J. Photogramm. Remote Sens., 124, 40–53 (2017). https://doi.org/10.1016/j.isprsjprs.2016.12.009

28. A. R. Webb, Statistical Pattern Recognition, 71 (8), John Wiley & Sons, Ltd., England (2011).

29. S. Sawant and M. Prabukumar, “Band fusion based hyper spectral image classification,” Int. J. Pure Appl. Math., 117 (17), 71–76 (2017).

30. M. Imani and H. Ghassemian, “Band clustering-based feature extraction for classification of hyperspectral images using limited training samples,” IEEE Geosci. Remote Sens. Lett., 11 (8), 1325–1329 (2014). https://doi.org/10.1109/LGRS.2013.2292892

31. Q. Yan et al., “Class probability propagation of supervised information based on sparse subspace clustering for hyperspectral images,” Remote Sens., 9, 1017 (2017). https://doi.org/10.3390/rs9101017

32. X. Peng et al., “Constructing the L2-graph for subspace learning and subspace clustering,” IEEE Trans. Cybern., 6, 1–14 (2016).

33. Y. Yuan, J. Lin and Q. Wang, “Dual-clustering-based hyperspectral band selection by contextual analysis,” IEEE Trans. Geosci. Remote Sens., 54 (3), 1431–1445 (2016). https://doi.org/10.1109/TGRS.2015.2480866

34. M. Khoder et al., “Multicriteria classification method for dimensionality reduction adapted to hyperspectral images,” J. Appl. Remote Sens., 11 (2), 025001 (2017). https://doi.org/10.1117/1.JRS.11.025001

35. X. Sun et al., “Hyperspectral image clustering method based on artificial bee colony algorithm,” in Sixth Int. Conf. on Advanced Computational Intelligence (ICACI), 106–109 (2013). https://doi.org/10.1109/ICACI.2013.6748483

36. H. Su and P. Du, “Multiple classifier ensembles with band clustering for hyperspectral image classification,” Eur. J. Remote Sens., 47 (1), 217–227 (2014). https://doi.org/10.5721/EuJRS20144714

37. R. Liu, H. Wang and X. Yu, “Shared-nearest-neighbor-based clustering by fast search and find of density peaks,” Inf. Sci., 450, 200–226 (2018). https://doi.org/10.1016/j.ins.2018.03.031

38. S. Jia et al., “A novel ranking-based clustering approach for hyperspectral band selection,” IEEE Trans. Geosci. Remote Sens., 54 (1), 88–102 (2016). https://doi.org/10.1109/TGRS.2015.2450759

39. H. Xie et al., “Unsupervised hyperspectral remote sensing image clustering based on adaptive density,” IEEE Geosci. Remote Sens. Lett., 15 (4), 632–636 (2018). https://doi.org/10.1109/LGRS.2017.2786732

40. B. Peng et al., “Weighted-fusion-based representation classifiers for hyperspectral imagery,” Remote Sens., 7, 14806–14826 (2015). https://doi.org/10.3390/rs71114806

41. S. Prasad and L. M. Bruce, “Decision fusion with confidence-based weight assignment for hyperspectral target recognition,” IEEE Trans. Geosci. Remote Sens., 46 (5), 1448–1456 (2008). https://doi.org/10.1109/TGRS.2008.916207

42. T. Lu et al., “From subpixel to superpixel: a novel fusion framework for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., 55 (8), 4398–4411 (2017). https://doi.org/10.1109/TGRS.2017.2691906

43. B. Kumar and O. Dikshit, “Hyperspectral image classification based on morphological profiles and decision fusion,” Int. J. Remote Sens., 38 (20), 5830–5854 (2017). https://doi.org/10.1080/01431161.2017.1348636

44. A. Dempster, N. Laird and D. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. R. Stat. Soc. Ser. B, 39 (1), 1–38 (1977).

45. R. Yang et al., “Representative band selection for hyperspectral image classification,” J. Vision Commun. Image Represent., 48, 396–403 (2017). https://doi.org/10.1016/j.jvcir.2017.02.002

46. F. Melgani and L. Bruzzone, “Classification of hyperspectral remote sensing,” IEEE Trans. Geosci. Remote Sens., 42 (8), 1778–1790 (2004). https://doi.org/10.1109/TGRS.2004.831865

Biography

Manoharan Prabukumar received his BE degree in electronics and communication engineering from Periyar University, Tamilnadu, India, in 2002, his MTech degree in computer vision and image processing from Amrita School of Engineering, Coimbatore, India, in 2007, and his PhD in computer graphics from Vellore Institute of Technology (VIT), Tamilnadu, India, in 2014. Currently, he is working as an associate professor in the School of Information Technology and Engineering, VIT. His research interests include hyperspectral remote sensing, image processing, computer graphics, and machine learning.

Sawant Shrutika received her BE and ME degrees in electronics and telecommunication engineering from Shivaji University, Maharashtra, India, in 2009 and 2012, respectively. Currently, she is pursuing her PhD in hyperspectral image processing at VIT, Vellore, Tamilnadu, India. She has been awarded the senior research fellowship of the Council of Scientific and Industrial Research, New Delhi, India. Her research interests include hyperspectral remote sensing, image processing, and machine learning.

© 2018 Society of Photo-Optical Instrumentation Engineers (SPIE) 1931-3195/2018/$25.00 © 2018 SPIE
Manoharan Prabukumar and Sawant Shrutika "Band clustering using expectation–maximization algorithm and weighted average fusion-based feature extraction for hyperspectral image classification," Journal of Applied Remote Sensing 12(4), 046015 (2 November 2018). https://doi.org/10.1117/1.JRS.12.046015
Received: 9 June 2018; Accepted: 11 October 2018; Published: 2 November 2018
KEYWORDS
Feature extraction

Expectation maximization algorithms

Hyperspectral imaging

Image classification

Distance measurement

Principal component analysis

Statistical analysis
