In this paper we present a method for ultrasonic robot localization, without a priori world models, that uses the ideas of distinctive places and open-space attraction. The method was incorporated into a move-to-station behavior, which was demonstrated on the Georgia Tech mobile robot. The key aspect of our approach is the use of Dempster-Shafer theory to overcome the uncertainty in the range measurements returned by the sensors. The state of the world was modeled as a two-element frame of discernment, Θ = {empty, occupied}. The world itself was represented as a grid, with the belief in whether each grid element was empty or occupied initialized to total ignorance (don't know) at the start of the behavior. A belief model of the range readings was used to compute the belief that points in the environment were empty, occupied, or unknown. Belief from repeated measurements updated the world map according to Dempster's rule of combination. After each move of the robot, the current belief in the empty space was used to construct a weighted centroid of the empty space (or station). By moving toward this center of mass and continually adding to the beliefs of the points in the environment, the robot iteratively moved to the center of the open space. Experiments demonstrated that the robot was able to localize itself with a repeatability of 1.5 feet in a 33-foot-square room, regardless of its starting position within the open space. This method is contrasted with a technique that did not explicitly model the belief in the range readings; that technique was unable to consistently converge on the center of the room within ten moves.
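As an illustration of the update step, the following minimal Python sketch applies Dempster's rule of combination to the belief masses of a single grid cell. The frame is {empty, occupied}, with 'T' standing for Θ (total ignorance); the example sensor masses are hypothetical rather than taken from the paper's belief model of range readings.

```python
def combine(m1, m2):
    """Dempster's rule on the frame {empty, occupied}; masses are dicts
    with keys 'E', 'O', 'T' (T = Theta, mass assigned to ignorance)."""
    k = m1['E'] * m2['O'] + m1['O'] * m2['E']   # conflicting mass
    if k >= 1.0:
        raise ValueError("total conflict; combination undefined")
    norm = 1.0 - k
    e = (m1['E'] * m2['E'] + m1['E'] * m2['T'] + m1['T'] * m2['E']) / norm
    o = (m1['O'] * m2['O'] + m1['O'] * m2['T'] + m1['T'] * m2['O']) / norm
    t = (m1['T'] * m2['T']) / norm
    return {'E': e, 'O': o, 'T': t}

# Start a cell at total ignorance, then fuse two (hypothetical) readings.
cell = {'E': 0.0, 'O': 0.0, 'T': 1.0}
cell = combine(cell, {'E': 0.6, 'O': 0.1, 'T': 0.3})
cell = combine(cell, {'E': 0.5, 'O': 0.2, 'T': 0.3})
print(cell)   # belief concentrates on 'E' as evidence accumulates
```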
A system that uses a function-based representation has been implemented and tested, with the object category 'chair' as a case study. The functional description is used to recognize classes and identify subclasses of known object categories, even if the specific object has never been encountered before. The functionality of an object is interpreted through qualitative reasoning about its 3-D shape. During recognition, evidence is gathered on how well the input shape meets the functional requirements. Different types of operators for combining this functional evidence have been investigated: three pairs of conjunctive and disjunctive operators were used in recognizing more than 100 object shapes. The results are compared and the differences are discussed.
The Dempster-Shafer theory of evidence is a generalization of Bayesian reasoning that allows multiple information sources with varying levels of belief to contribute to probabilistic decisions. We present an algorithm that performs pixel-level segmentation based on the Dempster-Shafer theory of evidence. The algorithm fuses image data from the multiple channels of the color spectra, using Dempster-Shafer reasoning to drive the evidence-accumulation process for pixel-level segmentation of color scenes. Experiments are presented that use spectral information from the RGB and HSI color models to segment a color image with Dempster-Shafer reasoning. These experiments begin to point out both the utility and the pitfalls of Dempster-Shafer reasoning for segmenting color images.
Integrating information from multiple sources has been one of the keys to the success of general vision systems. It is also an essential problem in the development of color image understanding algorithms that make full use of multichannel color data for object recognition. This paper presents a feature-integration system characterized by a hybrid combination of a statistics-based reasoning technique and a symbolic, logic-based inference method. A competitive evidence-enhancement scheme is used to fuse information from multiple sources. The scheme extends the Dempster-Shafer rule of combination and improves the reliability of object recognition. When applied to integrating object features extracted from the multiple spectra of color images, the system alleviates the drawbacks of a traditional Bayesian classification system.
We define genetic annealing as simulated annealing applied to a population of several solutions when candidates are generated from more than one (parent) solution at a time. We show that such genetic annealing algorithms can inherit the convergence properties of simulated annealing. We present two examples, one that generates each candidate by crossing pairs of parents and a second that generates each candidate from the entire population. We experimentally apply these two extreme versions of genetic annealing to a problem in vector quantization.
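A minimal sketch of the pairwise-crossover variant follows, assuming Metropolis acceptance of each child against the energy of the parent it would replace; the population-wide variant, the paper's exact replacement scheme, and its convergence-preserving schedule are not reproduced. The toy energy and blend crossover are hypothetical.

```python
import math, random

def genetic_anneal(pop, energy, crossover, temps):
    """Genetic annealing, pairwise-crossover variant: each candidate is
    generated from two parents and accepted with the Metropolis rule."""
    for T in temps:
        i, j = random.sample(range(len(pop)), 2)
        child = crossover(pop[i], pop[j])
        dE = energy(child) - energy(pop[i])
        if dE <= 0 or random.random() < math.exp(-dE / T):
            pop[i] = child                      # replace one parent, SA-style
    return min(pop, key=energy)

f = lambda x: x * x + 10 * math.sin(3 * x)      # toy multimodal energy
blend = lambda a, b: 0.5 * (a + b) + random.gauss(0, 0.1)
temps = [2.0 * 0.999 ** k for k in range(5000)] # geometric cooling schedule
pop = [random.uniform(-5, 5) for _ in range(20)]
print(genetic_anneal(pop, f, blend, temps))
```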
We extend an approach to global convergence for genetic algorithms via a homogeneous Markov chain argument that does not depend on mutation. Our result is a proof of convergence to a set of populations that contain the optimum.
Segmentation of an image refers to partitioning the image into several subimages such that each subimage forms a connected component representing a logical entity in the scene, and all the segments together produce a meaningful interpretation of the scene being studied. The problem is inherently NP-hard; it is as hard as PARTITION, one of the simplest NP-complete problems. The existence of a unique solution (the truly optimal segmentation) and its sensitivity to the sampling process have yet to be studied thoroughly. The formulation and implementation of a randomized search approach to image segmentation, using genetic algorithms, is presented in this paper. A state-space representation of a partially segmented image by binary strings is considered. The dominant substrings are easily explained in terms of chromosomes, and operations such as crossover and mutation are easily abstracted. A modified crossover operator using boundary interaction and a region adjacency graph (BIRAG) has been adopted to improve performance, and a simplified mutation operator called 'switch' has been devised using the BIRAG. In particular, the data structure used in this scheme also provides a means of accommodating pixel-level feedback and model-based bias for model-based segmentation of images. Images from two different scenes are segmented with this approach to illustrate the applicability of such a system.
Analysis of the thermal profile of an electronic circuit card during warm-up can be useful in detecting malfunctioning components on the card. Extracting the thermal profile can require processing 10 to 15 images of 600,000 bytes each. By extracting the heat transient associated with the heat sources on the circuit card, the problem of characterizing the thermal transient can be reduced to one of modeling the peak temperatures associated with a handful of components. The thermal profile of each component can be modeled as a function of four parameters: three are functions of the heat-dissipation characteristics of the circuit card, and the fourth is proportional to the power consumption of the component generating the heat. The parameters were extracted with a modified genetic algorithm, employed after traditional techniques (Newton's method, non-linear regression, gradient search, and binary search) proved slow, unstable, and unreliable. In traditional genetic algorithm implementations, improvement in performance ceases when further improvement requires the simultaneous mutation of two or more variables. We appear to have circumvented this difficulty by expressing the problem in the differential domain and coupling the genetic algorithm with a cooperative 'follow the leader' approach to optimization. The extracted power-consumption parameters are then used to distinguish 'good' cards from 'bad' ones.
A simple self-organizing neural network model, called an EXIN network, that learns to process sensory information in a context-sensitive manner is described. EXIN networks develop efficient representation structures for higher-level visual tasks such as segmentation, grouping, transparency, depth perception, and size perception. Exposure to a perceptual environment during a developmental period serves to configure the network to organize sensory data appropriately. A new anti-Hebbian inhibitory learning rule permits the superposition of multiple simultaneous neural activations (multiple winners), while maintaining contextual consistency constraints, instead of forcing winner-take-all pattern classifications. The activations can represent multiple patterns simultaneously and can represent uncertainty. The network performs parallel parsing, credit attribution, and simultaneous constraint satisfaction. EXIN networks can learn to represent multiple oriented edges even where they intersect, and can learn to represent multiple transparently overlaid surfaces defined by stereo or motion cues. In the case of stereo transparency, the inhibitory learning implements a uniqueness constraint while permitting the coactivation of cells representing multiple disparities at the same image location; thus two or more disparities can be active simultaneously without interference. This behavior is analogous to that of Prazdny's stereo vision algorithm, with the bonus that each binocular point is assigned a unique disparity. In a large implementation, such a network would also be able to represent effectively the disparities of a cloud of points at random depths, like human observers and unlike Prazdny's method.
This paper builds upon earlier work on the wave expansion neural network (WENN), a neural network capable of implementing the wavefront-expansion operations useful for developing potential fields for path planning. The discretized operational space, or configuration space (C-space), is mapped onto the WENN neural field, which then develops the artificial potential field over the C-space. The WENN has been applied to develop a simple attractive potential field and a repulsive potential field over two-dimensional workspaces.
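The neural implementation is not reproduced here; as a sketch of the operation the WENN performs, the following sequential Python code grows an attractive potential field over a binary C-space grid by breadth-first wavefront expansion from the goal cell. The grid and goal are hypothetical examples.

```python
from collections import deque

def wavefront_potential(grid, goal):
    """Wavefront-expansion attractive potential over a binary C-space grid
    (0 = free, 1 = obstacle): each free cell gets its BFS distance to goal."""
    rows, cols = len(grid), len(grid[0])
    pot = [[None] * cols for _ in range(rows)]   # None = unreached/obstacle
    pot[goal[0]][goal[1]] = 0
    q = deque([goal])
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and pot[nr][nc] is None):
                pot[nr][nc] = pot[r][c] + 1      # one wavefront step farther
                q.append((nr, nc))
    return pot

grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
for row in wavefront_potential(grid, (0, 0)):
    print(row)
```

Descending this potential from any free cell yields a shortest obstacle-avoiding path to the goal, which is the role the attractive field plays in path planning.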
The Hebbian learning law plays a very important role in the feedforward learning of neural networks. In multidimensional image spaces, particularly in vision, the asymmetric multidimensional Hebbian learning law can perform principal-component feature extraction, thus providing high-dimensional feature analysis and feature separation. In this paper, we verify this principle by applying modified Hebbian learning to Fukushima's neocognitron visual recognition architecture.
Hierarchically organized neural networks are well suited for visual information processing. These models offer a way to cope with the complexity of vision. We identify strong relationships between hierarchical neural networks and image pyramids. However, we also show that if one has the freedom to choose the input patterns, these neural networks are not intrinsically shift invariant. In order to circumvent this problem we propose a new neural network architecture called 'Neural Networks in Image Pyramids.' We use hierarchical neural networks with local connectivity (image pyramids) as stem networks. These networks generate hypotheses about the expected image content. These hypotheses are checked by small neural network modules which are used selectively on parts of the image. We give an example demonstrating the solution of the shift variance problem. Finally, we outline directions of further research.
In computational perception, 'visual motion analysis' is most commonly identified with the problem of measuring the infinitesimal rate of translation in various local spatial neighborhoods of a time-varying signal. Many problems associated with measuring these motion vectors can be addressed by considering the following simplified one-dimensional case. Given two samples, an original function fo(x) and another sample ft(x) taken a moment later, compute the translation parameter τ that provides a best fit for the transformation model T(τ): fo(x) → ft(x) = fo(x + τ) over some finite local region. The 'goodness' of this fit must be evaluated by a suitable performance metric, since measurement uncertainty and added noise corrupt the estimate of τ. This error can be reduced if the measurement is supported by a wider spatial region; however, the 'pure translation' model is usually valid only within some small local neighborhood. These two competing constraints inherently compromise the measurement process. In this paper, a new technique is developed for estimating the translation parameter using a localized ('wavelet') representation, and it provides a measure of the uncertainty of the resulting estimate. In addition, a trade-off is identified between the width of the local neighborhood and the uncertainty of the translation estimate. It is similar to the well-known Heisenberg uncertainty principle: the product of the variances of the position and translation uncertainties is bounded below by a positive constant.
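For intuition, linearizing the model for small τ gives ft(x) ≈ fo(x) + τ fo′(x), whose weighted least-squares solution over a local window is sketched below. This gradient-based estimator is a stand-in for the paper's wavelet formulation, and the returned variance proxy is only the unscaled inverse of the normal-equation denominator, not the paper's uncertainty measure.

```python
import numpy as np

def estimate_shift(f0, ft, x, w):
    """First-order least-squares estimate of tau under the linearized
    model ft(x) ~= f0(x) + tau * f0'(x), weighted by a local window w."""
    d = np.gradient(f0, x)                 # numerical derivative f0'(x)
    num = np.sum(w * d * (ft - f0))
    den = np.sum(w * d * d)
    return num / den, 1.0 / den            # estimate and crude variance proxy

x = np.linspace(0, 2 * np.pi, 200)
f0 = np.sin(x)
tau_true = 3 * (x[1] - x[0])               # shift of three samples
ft = np.sin(x + tau_true)
w = np.exp(-0.5 * ((x - np.pi) / 0.8) ** 2)  # local Gaussian window
print(estimate_shift(f0, ft, x, w), tau_true)
```

Widening the window w lowers the variance proxy but stretches the region over which pure translation must hold, which is exactly the trade-off the abstract identifies.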
The effect of additional unlabeled samples in improving the supervised learning process is studied in this paper. Three learning processes, supervised, unsupervised, and combined supervised-unsupervised, are compared by studying the asymptotic behavior of the estimates obtained under each process. Upper and lower bounds on the asymptotic covariance matrices are derived. It is shown that, under a normal mixture density assumption for the probability density function of the feature space, combined supervised-unsupervised learning always yields better estimates than supervised learning alone. Experimental results are provided to verify the theoretical concepts.
We have advanced Markov random field (MRF) research by addressing the issue of obtaining a reasonable, non-trivial noise model. We introduce the concept of a double-neighborhood MRF. In the past we estimated MRF probabilities by sampling neighborhood frequencies from images. Now we address the noise model by sampling from pairs of original and noisy images, creating a probability density function for pairs of neighborhoods across both images. This captures the noise within the MRF probability density function without assumptions about its form, and it provides an easy way to generate Markov random fields for annealing or other relaxation methods. We previously applied this technique successfully, combined with a technique of Hancock and Kittler that adds theoretical noise to an MRF density function, to the problem of binary image reconstruction. We now apply it to edge-detection enhancement of artificial images. We train the double-neighborhood MRF on true edge maps and edge maps generated as the output of a Sobel edge detector. Our method improves the generated edge maps visually, as measured by the number of incorrect bits, and by Pratt's figure of merit for edge detectors. We have also successfully improved the output edge maps of some real images.
Feedforward networks are used extensively in practice to learn static mappings between related sets of variables. These networks are difficult to analyze, however, both because of their nonlinearity and their complex interconnection structure. In the absence of the nonlinearity, linear algebra could provide considerable insight into the behavior of these networks, significantly beyond that possible from a detailed analysis of individual neurons. Such insights would be extremely valuable since the power of neural networks arises from their large-scale connectivity, rather than the inherent computational capacity of the individual neurons. This paper proposes algebraic category theory as the basis for obtaining such global insights for feedforward networks in spite of their nonlinearity.
An outstanding problem in the study of adaptive learning is overspecialization of the learning system, and its consequent inability to handle new data correctly. A means of addressing this difficulty is described here. When used in conjunction with standard processes such as backpropagation, it identifies the level of corruption of the training sample, and thus provides a 'best fit' to the entire domain of interest, rather than to the training sample alone. This is accomplished by a combination of simulated annealing, bootstrap estimation, and analysis methods derived from statistical mechanics. Its advantage is that data need not be reserved for an independent test set, and thus all available samples are used. A modified generalization error, defined through a thermalization parameter on the training set, provides a measure of the sample space consistent with the network function. A criterion for optimal match between network and sample set is obtained from the requirement that generalization error and training error be consistent. Numerical results are presented for examples which illustrate several distinct forms of data corruption. A quantity analogous to the specific heat in thermodynamic systems is found to exhibit anomalies at values of training error near the onset of overtraining.
A technique for recoding multidimensional data in a representation of reduced dimensionality is presented. A non-linear encoder-decoder for multidimensional data with compact representations is developed. The technique of training a neural network to learn the identity map through a 'bottleneck' is extended to networks with non-linear representations, and an objective function that penalizes the entropy of the hidden-unit activations is shown to yield low-dimensional encodings. For scalar time-series data, a common technique is phase-space reconstruction by embedding the time-lagged scalar signal in a higher-dimensional space; choosing the proper embedding dimension is difficult. Non-linear dimensionality reduction allows the intrinsic dimensionality of the underlying system to be estimated.
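The abstract does not give the exact objective; one plausible reading, sketched below under that stated assumption, normalizes the hidden-activation magnitudes into a distribution and adds its entropy to the reconstruction error, so that a few dominant hidden units (a low-dimensional encoding) are favored. The weight lam is a hypothetical hyperparameter.

```python
import numpy as np

def entropy_penalty(h, eps=1e-9):
    """Entropy of the normalized hidden-activation magnitudes; small when
    activity is concentrated in a few units, large when spread out."""
    p = np.abs(h) / (np.abs(h).sum() + eps)
    return float(-(p * np.log(p + eps)).sum())

def bottleneck_loss(x, x_hat, h, lam=0.1):
    """Reconstruction error plus the entropy term (one possible objective)."""
    return float(((x - x_hat) ** 2).mean()) + lam * entropy_penalty(h)
```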
It is known that, with a sufficient number of hidden-layer nodes, feedforward neural networks can approximate any continuous function with compact support arbitrarily well using very simple node nonlinearities. We investigate whether network architectures can be found that use more complicated node nonlinearities to achieve better approximation with a restricted number of nodes. Two methods are proposed: one based on modifying standard backpropagation networks, and one based on Kolmogorov's theorem. The feasibility of these networks is evaluated by considering their performance in predicting chaotic time series and memorizing the XOR mapping.
This paper describes a new genetic approach, the structured genetic algorithm (sGA), for automatic registration of digital images. The novelty of this genetic model lies primarily in its redundant genetic material and a gene-activation mechanism that uses a multi-layered chromosome structure. The additional genetic material serves to retain multiple optional solution spaces during parameter optimization. The structured genetic model is applied here to minimize the registration measures in image transformations, as investigated by Fitzpatrick and Grefenstette with the simple GA. The results demonstrate that the sGA is a much faster and more robust search method that is guaranteed to reach a global optimum by adaptively estimating the subspace from the maximum space during the evolutionary process. Preliminary experimental results are reported.
Simulated annealing algorithms for optimization over continuous spaces come in two varieties: Markov chain algorithms and modified gradient algorithms. Unfortunately, there is a gap between the theory and the application of these algorithms: the convergence conditions cannot be practically implemented. In this paper we suggest a practical methodology for implementing the modified gradient annealing algorithms based on their relationship to the Markov chain algorithms.
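As context for the gap the paper addresses, the following minimal Markov-chain annealer on the real line uses Gaussian proposals whose width shrinks with temperature and a geometric cooling schedule; the geometric schedule is the usual practical compromise, not the provably convergent schedule the theory requires, and all parameter values are illustrative.

```python
import math, random

def anneal(f, x0, t0=1.0, alpha=0.999, steps=5000, scale=0.5):
    """Markov-chain simulated annealing on the real line:
    Gaussian proposals, Metropolis acceptance, geometric cooling."""
    x, fx, T = x0, f(x0), t0
    best, fbest = x, fx
    for _ in range(steps):
        y = x + random.gauss(0, scale * T)   # proposal width shrinks with T
        fy = f(y)
        if fy <= fx or random.random() < math.exp(-(fy - fx) / T):
            x, fx = y, fy                    # accept move (maybe uphill)
            if fx < fbest:
                best, fbest = x, fx
        T *= alpha                           # geometric cooling
    return best, fbest

# toy multimodal objective
print(anneal(lambda x: x ** 2 + 10 * math.sin(3 * x), x0=4.0))
```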
Since stochastic simulated annealing was first applied to image processing, there have been many reports on Markov random field (MRF) based image processing. These MRF-based edge-preserving smoothing techniques have shown good results in image restoration, reconstruction, edge detection, and segmentation; however, they share common drawbacks. First, they do not work well for smoothing nonstationary or signal-dependent noise, yet in real-world images the noise is often nonstationary and signal-dependent. Second, these edge-preserving smoothing techniques employ implicit or explicit thresholds to determine the existence of edges, and they use a single fixed threshold throughout the entire image. As a result, small features in areas of low noise variance are lost or blurred in order to restore features in high-variance areas. What is needed is an adaptive edge-preserving smoothing method that handles nonstationary or signal-dependent noise with adaptive thresholding. Adaptive mean field annealing (AMFA) is an adaptive version of mean field annealing that fulfills this purpose by exploiting the local nature of the MRF and the fact that nonstationary or signal-dependent noise can be approximated by locally stationary additive Gaussian noise. In AMFA, a priori information about the noise is unnecessary, and the difficulty of estimating the parameters is therefore greatly reduced.
The use of Gibbs random fields (GRFs) to model images poses the important problem of how the patterns sampled from the Gibbs distribution depend on its parameters. Sudden changes in these patterns as the parameters are varied are known as phase transitions. In this paper, we concentrate on developing a general deterministic theory for the study of phase transitions when a single parameter, the temperature, is varied. This deterministic framework is based on a technique widely used in statistical physics known as the mean-field approximation. Our mean-field theory is general in that it is valid for any number of gray levels, any interaction potential, any neighborhood structure or size, and any set of constraints imposed on the desired images. The mean-field approximation is used to compute closed-form estimates of the critical temperatures at which phase transitions occur for two texture models widely used in the image modeling literature: the Potts model and the autobinomial model. The mean-field model allows us to predict the pattern structure in the neighborhood of these temperatures. These analytical results are verified by computer simulations using a novel mean-field descent algorithm.
This paper describes a multispectral image classification technique that involves two steps. First, we describe the underlying distribution of the pixel intensity vectors for the entire scene as a mixture of multivariate Gaussian distributions. We then use this mixture decomposition and a small number of labeled pixels to estimate the proportion of each mixture component that belongs to each class, which enables us to use a Bayes-type decision rule to classify every pixel in the scene. Results of applying this technique to three-band SPOT data are presented, along with comparisons against a maximum-likelihood classifier.
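A sketch of this two-step rule follows, using scikit-learn's GaussianMixture purely as a convenient mixture fitter (an assumption of this sketch, not the paper's implementation): fit the mixture to all pixels, estimate per-component class proportions from the labeled pixels' responsibilities, then classify by the induced posterior.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def classify(X, Xl, yl, n_components, n_classes):
    """X: all pixel intensity vectors; Xl, yl: a few labeled pixels
    (yl as an integer NumPy array). Returns a class label per pixel."""
    gmm = GaussianMixture(n_components=n_components).fit(X)
    resp_l = gmm.predict_proba(Xl)                 # labeled responsibilities
    prop = np.zeros((n_components, n_classes))
    for c in range(n_classes):
        prop[:, c] = resp_l[yl == c].sum(axis=0)   # soft class counts
    prop /= prop.sum(axis=1, keepdims=True) + 1e-12  # P(class | component)
    post = gmm.predict_proba(X) @ prop             # P(class | pixel)
    return post.argmax(axis=1)                     # Bayes-type decision
```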
We address the automatic classification of active sonar signals using the Wigner-Ville transform (WVT), the wavelet transform (WT), and the scalogram. Features are extracted by integrating over regions of the time-frequency (TF) distribution and are classified by a decision tree. Experimental results show classification and detection rates of up to 92% at an SNR of -4 dB. The WT outperforms the WVT and the scalogram, particularly at high noise levels; this can be partially attributed to the absence of cross terms in the WT.
This paper describes a statistical model for the reconstruction of emission computed tomography (ECT) images. A distinguishing feature of this model is that it is parameterized in terms of quantities of direct physiological significance, rather than only in terms of gray-level voxel values. Specifically, parameters representing regions, region means, and region volumes are included in the model formulation and are estimated directly from projection data. The model is specified hierarchically within the Bayesian paradigm. At the lowest level of the hierarchy, a Gibbs distribution specifies a probability distribution on the space of all possible partitions of the discretized image scene. A novel feature of this distribution is that the number of partitioning elements, or image regions, is not assumed known a priori. In contrast, other segmentation models (e.g., Liang et al., 1991; Amit et al., 1991) require that the number of regions be specified prior to image reconstruction. Since the number of regions in a source distribution is seldom known a priori, allowing it to vary within the model framework is an important practical feature of this model. In the second level of the hierarchy, random variables representing emission intensity are associated with each partitioning element, or region. In the third stage, individual voxel intensities are assumed to be drawn from a gamma distribution with mean equal to the region mean; in the final stage, projection data are assumed to be generated from Poisson distributions with means equal to weighted sums of voxel intensities.
Bayesian estimation of transmission tomographic images presents formidable optimization tasks. Numerical solutions of this problem are limited in speed of convergence by the number of iterations required for the propagation of information across the grid. Edge-preserving prior models for tomographic images inject a nonlinear element into the Bayesian cost function, which limits the effectiveness of algorithms such as conjugate gradient, intended for linear problems. In this paper, we apply nonlinear multigrid optimization to Bayesian reconstruction of a two-dimensional function from integral projections. At each resolution, we apply Gauss-Seidel type iterations, which optimize locally with respect to individual pixel values. If the cost function is differentiable, the algorithm speeds convergence; if it is nonconvex and/or nondifferentiable, multigrid can yield improved estimates.
Segmentation of an image into texturally homogeneous regions is a fundamental problem for an image understanding system. Most region-oriented segmentation approaches suffer from the need to select different thresholds for different images. In this paper an adaptive image segmentation method based on vector quantization is presented; it segments images automatically, without preset thresholds. The approach contains a feature-extraction module and a two-layer hierarchical clustering module, with a vector quantizer (VQ), implemented as a competitive-learning neural network, in the first layer. A near-optimal competitive learning algorithm (NOLA) is employed to train the vector quantizer; NOLA combines the advantages of the Kohonen self-organizing feature map (KSFM) and the K-means clustering algorithm. After the VQ is trained, the weights of the network and the number of input vectors clustered by each neuron form a 3-D topological feature map with separable hills, each aggregating similar vectors. This overcomes the inability of most other clustering algorithms to visualize the geometric properties of data in a high-dimensional space. The second clustering algorithm operates on the feature map instead of the input set itself. Since the number of units in the feature map is much smaller than the number of vectors in the feature set, it is easy to check all peaks and find the 'correct' number of clusters, itself a key problem in current clustering techniques. In our experiments, we compare the algorithm with the K-means clustering method on a variety of images; the results show that our algorithm achieves better performance.
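A plain competitive-learning vector quantizer sketches the first layer; NOLA's near-optimal refinements and the second-layer clustering of the feature map are not reproduced. The win counts returned here play the role of the hill heights described above, and all parameter values are illustrative.

```python
import numpy as np

def competitive_vq(X, n_units, epochs=10, lr0=0.5, seed=0):
    """Competitive-learning VQ: the winning unit moves toward each input;
    win counts per unit accumulate the 'hill height' information."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_units, replace=False)].astype(float)
    counts = np.zeros(n_units, dtype=int)
    for e in range(epochs):
        lr = lr0 * (1 - e / epochs)                 # decaying learning rate
        for x in X[rng.permutation(len(X))]:
            k = np.argmin(((W - x) ** 2).sum(axis=1))   # winner
            W[k] += lr * (x - W[k])
            counts[k] += 1
    return W, counts

X = np.random.default_rng(1).normal(size=(500, 3))  # stand-in feature vectors
W, counts = competitive_vq(X, n_units=16)
```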
Good image segmentation can be achieved by finding the optimum solution of an appropriate energy function. A Hopfield neural network has been shown to solve complex optimization problems quickly, but it only guarantees convergence to a local minimum of the optimization function. Mean field annealing, by contrast, has been shown to reach the global or a nearly global optimum, and a relationship between the Hopfield neural network and mean field annealing has been established. In this paper, we combine the advantages of the two and propose an annealed Hopfield neural network to achieve good image segmentation quickly. We are concerned not only with identifying the segmented regions, but also with finding a good approximation to the average gray level of each segment; a potential application is segmentation-based image coding. This approach is expected to find the global or a nearly global solution quickly by using an annealing schedule for the neural gains. A weak-continuity-constraints approach is used to define the optimization function. The simulation results for segmenting noisy images are very encouraging: smooth regions were accurately maintained and boundaries were detected correctly.
In most instances the boundaries between textured regions are defined by the gray-level contrasts that result from the local interaction between the texture elements in each region. In such cases, the boundaries can be accurately characterized by gray-level edge segments. Using these edge segments to localize the texture boundary directly addresses the major problem in texture segmentation, namely the conflict between localization and classification accuracy. The accuracy of segmentation methods that rely only on spatially distributed properties to characterize the texture is limited by the spatial extent of the property used; gray-level edges, in contrast, are significantly more localized. However, before they can be of any use, the gray-level edge segments defining the texture boundary must be isolated from the edges defining the texture elements. In this paper, we define a set of properties for doing this and incorporate them into a parallel distributed algorithm, which is used to segment a set of sample texture images.
Boolean logic is a good source of solutions for classification problems, an area dominated by neural networks. Although quite a few algorithms exist for training and implementing neural networks, no technique has existed that can guarantee the transformation of an arbitrary Boolean function into a neural network. This paper describes a method that accomplishes exactly that. The algorithm is tested on the classic character recognition problem using translated, rotated, deformed, and noisy patterns, and initial simulation results are presented. The proposed network is compared with several popular existing networks and its advantages are outlined. Future directions of research are also discussed.
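The paper's construction is not given in the abstract, but the standard existence argument it strengthens is easy to demonstrate: any Boolean function written in disjunctive normal form maps to a two-layer threshold network, with one AND unit per term and an OR output unit. The sketch below shows this for XOR; the construction is generic, not the paper's algorithm.

```python
def and_unit(literals):
    """Threshold unit computing the AND of literals.
    literals: list of (input index, is_positive) pairs."""
    w = {i: (1 if pos else -1) for i, pos in literals}
    theta = sum(1 for _, pos in literals if pos)  # fires only on exact match
    return w, theta

def dnf_net(terms, x):
    """Two-layer threshold network for a DNF: hidden AND units, OR output."""
    hidden = []
    for lits in terms:
        w, theta = and_unit(lits)
        s = sum(w[i] * x[i] for i in w)
        hidden.append(1 if s >= theta else 0)
    return 1 if sum(hidden) >= 1 else 0           # OR unit, threshold 1

# XOR = x0 AND NOT x1, OR, NOT x0 AND x1
xor_terms = [[(0, True), (1, False)], [(0, False), (1, True)]]
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, dnf_net(xor_terms, x))               # -> 0, 1, 1, 0
```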
Multi-sensor systems provide a purposeful description of the environment that a single sensor cannot offer. Fusing several types of data enhances the recognition capability of a robotic system and yields more meaningful information that is otherwise unavailable or difficult to acquire through a single sensory modality. Because the observations provided by sensors are uncertain, incomplete, and/or imprecise, we adopted the theory of fuzzy sets as a general framework for combining uncertain measurements. We developed a fusion formula, based on a measure of fuzziness, that satisfies several desirable properties. We established a fuzzification scheme by which different types of input data (images) are modeled; this process is essential for providing suitable predictions and explanations of a set of observations in a given environment. After fusion, a defuzzification scheme recovers crisp data from the combined fuzzy assessments. This approach was implemented and tested with real range and intensity images acquired with an Odetics range finder. The goal is to obtain better scene descriptions through a segmentation process applied to both images. Despite the low resolution of the images and the amount of noise present, the segmented output picture is suitable for recognition purposes.
In remote sensing (RS) image classification, pattern indeterminacy due to inherent data variability is always present, and class mixture is a further serious handicap to conventional classifiers in establishing proper class patterns. Fuzzy classification techniques improve on the information extracted by conventional methods, i.e., statistical classification procedures, because the natural fuzziness present in real-world recognition processes is taken into account both in the design of the classifier and in reporting the classification results. This paper first presents the application of the fuzzy classification algorithm of Kent and Mardia to RS images, along with an analysis of the results and a comparison against 'hard' classifications. Second, we put forward a method to display these results (fuzzy partitions) by coding each pixel's memberships into a pseudocolor representation. This representation is intended to serve as an interface between the fuzzy coefficients resulting from the classification process and a very natural way for humans to perceive such information: color mixtures.
Image clustering aims at obtaining a partition of the image such that each subset of the partition can be taken as an object; most of the time this procedure is iterative. A string of binary decisions generates errors that can be avoided by using fuzzy partitioning. The key remaining point is to choose, for each image pixel, a coefficient of assignment to each fuzzy subset. This makes it necessary to define a fuzzy equivalence relationship carefully, but we consider the existing definitions to be inconsistent, if not contradictory. So we propose a new definition of the fuzzy-set equivalence relationship. In this communication we first give some evidence of the importance of changing the definitions of the reflexivity and transitivity of the relation. Then we show the consistency of the proposed new definition. A simple example is presented to define a fuzzy partition and fuzzy included partitions, leading to an application in the specific case of image processing. A comparison is made between the results obtained with a classical hierarchical region-growing method and with fuzzy-set logic.
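For reference, the classical (Zadeh) definition that the paper revises requires, for a fuzzy relation with membership function μ_R on a set X:

```latex
% Classical fuzzy similarity (equivalence) relation on a set X:
\begin{align*}
\text{reflexivity:}  \quad & \mu_R(x,x) = 1 \quad \forall x \in X,\\
\text{symmetry:}     \quad & \mu_R(x,y) = \mu_R(y,x),\\
\text{max--min transitivity:} \quad & \mu_R(x,z) \ge \sup_{y \in X} \min\bigl(\mu_R(x,y),\, \mu_R(y,z)\bigr).
\end{align*}
```

The paper's modified reflexivity and transitivity axioms are not reproducible from the abstract alone; the above is only the baseline they are measured against.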
This paper examines the applicability of genetic algorithms (GAs) to the complete design of fuzzy logic controllers. While GAs have been used before to develop rule sets or high-performance membership functions, the interdependence between these two components dictates that they be designed together, simultaneously. A GA is fully capable of creating complete fuzzy controllers given the equations of motion of the system, eliminating the need for human input in the design loop. We show the application of this new method to the development of a cart controller.
The hybrid learning rule is a novel learning rule that combines the Hebbian learning rule with the backpropagation algorithm. It was applied to the problem of isolated handwritten character recognition, with the problem domain limited to ten letters that may be rotated or translated. The performance of the hybrid learning rule on this domain was measured and compared with that of the backpropagation algorithm. While the hybrid learning rule failed to outperform backpropagation, it does generate receptive fields similar to those found by other researchers.
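The abstract does not specify how the two rules are mixed; one plausible additive reading, sketched below with hypothetical hyperparameters eta and lam, adds a Hebbian pre-times-post outer-product term to the usual gradient step for a layer with weight matrix W.

```python
import numpy as np

def hybrid_update(W, x, y, grad_W, eta=0.1, lam=0.01):
    """One plausible hybrid step: gradient descent on the backprop
    gradient grad_W plus a Hebbian term, the outer product of the
    post-synaptic activity y and pre-synaptic activity x.
    (An assumed combination, not necessarily the paper's.)"""
    return W - eta * grad_W + lam * np.outer(y, x)
```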
Two reject mechanisms are compared using a massively parallel character recognition system implemented at NIST. The recognition system was designed to study the feasibility of automatically recognizing hand-printed text in a loosely constrained environment. The first method is a simple scalar threshold on the output activation of the winning neurode from the character classifier network. The second method uses an additional neural network trained on all outputs from the character classifier network to accept or reject assigned classifications. The neural network rejection method was expected to perform with greater accuracy than the scalar threshold method, but this was not supported by the test results presented. The scalar threshold method, even though arbitrary, is shown to be a viable reject mechanism for use with neural network character classifiers. Upon studying the performance of the neural network rejection method, analyses show that the two neural networks, the character classifier network and the rejection network, perform very similarly. This can be explained by the strong non-linear function of the character classifier network which effectively removes most of the correlation between character accuracy and all activations other than the winning activation. This suggests that any effective rejection network must receive information from the system which has not been filtered through the non-linear classifier.
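The scalar-threshold mechanism is simple enough to state directly; a minimal sketch (the threshold value shown is arbitrary, as the abstract itself notes):

```python
import numpy as np

def classify_with_reject(activations, threshold):
    """Scalar-threshold reject rule: accept the winning class only if
    its output activation clears the threshold; otherwise reject."""
    k = int(np.argmax(activations))
    return k if activations[k] >= threshold else None   # None = rejected

print(classify_with_reject(np.array([0.10, 0.85, 0.05]), 0.7))  # -> 1
print(classify_with_reject(np.array([0.40, 0.35, 0.25]), 0.7))  # -> None
```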
Iterated transformation theory (ITT), also known as fractal coding, is a relatively new block compression method that removes redundancies between different-scale representations of the uncompressed signal. In ITT coding we look for a piecewise continuous mapping, from the space of all images with the same support onto itself, that has a close approximation of the desired image as its unique fixed point. The mapping is then the code for the image; for decoding, we iterate the mapping on any initial image, which is orders of magnitude faster than encoding. We have reduced the computational load of finding the piecewise continuous transformation by using a self-organizing feature map (SOFM) artificial neural network, which finds similar features in different-resolution representations of the image. The patterns are mapped onto a two-dimensional array of formal neurons, forming a code book similar to that of vector quantization (VQ) coding. We exploit the SOFM ordering properties by searching for mappings not only to the best-matching neuron but also to its neighbors in the network. In this paper we describe the ITT-SOFM algorithm and its software implementation, with application to the coding of still gray-scale images. Computer simulations show compression results comparable to or better than state-of-the-art VQ coders, with computational complexity better than most well-known clustering algorithms.
The study of connectionist models for pattern recognition is motivated mainly by their presumed simultaneous feature selection and classification. Character recognition is a common test case for illustrating the feature-extraction and classification characteristics of neural networks: most of the variability in size and rotation can be handled easily, while acquisition conditions are usually controlled. Many neural character recognition applications have been presented; the most successful results for optical character recognition (OCR) with image inputs were reported with a layered network (LeCun et al., 1990) integrating the feature-selection and invariance notions introduced earlier in neocognitron networks. Previously, we presented a supervised learning algorithm based on Kohonen's self-organizing feature maps and its applications to image and speech processing (Midenet et al., 1991). From a pattern recognition point of view, the first network performs local feature extraction, while the second does global statistical template matching. We describe these models and their comparative results when applied to a common French handwritten zip-code database. We discuss possible cooperation schemes and show that the performance obtained by these networks working in parallel exceeds that of the networks working separately. We conclude with possible extensions of this work to automatic document processing systems.
Stochastic simulated annealing (SSA) is a popular method for solving optimization problems in which the objective function has multiple minima. Not only can SSA find minima, but it has been proven to converge (under certain conditions) to the global minimum. The principal drawback of SSA has been its convergence rate: in order to preserve the conditions of the convergence proof, the algorithm must be run so slowly as to be impractical for many applications. In this paper, an extension of SSA is described that allows the user to provide additional a priori information to the algorithm, which may permit much more rapid convergence. The new method, called 'compensated simulated annealing' (CSA), is also guaranteed to converge. A problem of finding a minimum path through a recurrent multilayer graph is described, and a practical motivating application from medical imaging is presented: the graph structure is used to model the boundary of an artery in an intra-arterial ultrasound image. The optimization problem is posed and solved by both SSA and CSA as a means of comparing the two methods. The CSA approach is shown to converge significantly faster than SSA.
In this work we introduce Markov cross-entropic priors for the Bayesian restoration and fusion of laser range images. These cross-entropic priors are used to model the smoothness of surfaces and the linearity of discontinuities. The priors are defined over a pair of coupled Markov random fields representing the corresponding pixel and line processes. Gibbsian maximum a posteriori estimates are then found using simulated annealing. Range image data are discussed, and results are presented for synthetic range images.
Recent advances in applying neural networks to real-life problems have drawn attention to network optimization. Most known optimization methods rely heavily on the weight-sharing concept for pattern separation and recognition. The shortcoming of weight sharing is the large number of extraneous weights that play a minimal role in pattern separation and recognition. Our experiments have shown that up to 97% of the connections in a network can be eliminated with little or no change in performance. Topological separation should be used when the network is large enough to tackle real-life problems such as fingerprint classification. Our research has focused on the network topology, changing the number of connections as a secondary method of optimization. Our findings so far indicate that for large networks topological separation yields a smaller network size, which is more suitable for VLSI implementation. Topological separation is based on the error surface and the information content of the network; as such, it is an economical way of reducing size that leads to overall optimization. The differential pruning of connections is based on weight content rather than the number of connections. The training error may vary with the topological dynamics, but the correlation between the error surface and the recognition rate decreases to a minimum. Topological separation thus reduces the size of the network by changing its architecture without degrading its performance, yielding a considerably smaller network with better performance.
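A minimal sketch of differential pruning by weight content is given below: connections whose magnitudes fall below a threshold are removed, keeping only a small fraction of the largest weights. The function and the keep_fraction default are assumptions chosen to mirror the 97% elimination figure reported above; the paper's actual criterion also involves the error surface.

    import numpy as np

    def prune_by_magnitude(weights, keep_fraction=0.03):
        # Zero out all but the largest-magnitude fraction of connections.
        w = np.asarray(weights, dtype=float).copy()
        k = max(1, int(round(keep_fraction * w.size)))
        threshold = np.sort(np.abs(w), axis=None)[-k]  # k-th largest magnitude
        w[np.abs(w) < threshold] = 0.0
        return w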
In a nearest neighbor classifier, an input sample is assigned to the class of the nearest prototype. The decision rule is simple and robust. However, implementing a nearest neighbor classifier is expensive in both memory and computer time if every training sample is stored as a prototype and compared with every test sample, while the classifier's performance degrades if only a small number of training samples are used as prototypes. An algorithm is presented in this paper for modifying the prototypes so that the classification rate can be increased. The algorithm makes use of a two-layer perceptron with one second-order input; the perceptron is trained and mapped back to a new nearest neighbor classifier. It is shown that the new classifier, with only a small number of prototypes, can even outperform the classifier that uses all training samples as prototypes.
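The decision rule itself is a one-liner, shown below for Euclidean distance. This is the standard nearest-prototype rule, not the paper's perceptron-based modification, which adjusts the prototype positions before this rule is applied.

    import numpy as np

    def nearest_prototype_classify(x, prototypes, labels):
        # Assign x to the class of the nearest prototype.
        d = np.linalg.norm(np.asarray(prototypes) - np.asarray(x), axis=1)
        return labels[int(np.argmin(d))]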
In this paper some of the commonly used features for texture classification based on co-occurrence statistics are studied. First, the classification capabilities of individual features in classifying among a small and a large number of texture images are evaluated. Then, the capabilities of different combinations of texture features are examined in order to establish a reduced set of features for maximum performance. An artificial neural network is used to test the suitability of promising feature groups for texture classification. It is shown that the features considered may be broadly divided into two groups in terms of their classification performance. It is also shown that with a judicious choice of features and a well-trained neural network classifier, high recognition rates can be achieved.
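For readers unfamiliar with co-occurrence statistics, the sketch below computes a gray-level co-occurrence matrix for one displacement and a few of the classic Haralick-style features drawn from it (energy, contrast, entropy, homogeneity). The quantization level and the particular feature subset are assumptions; the paper evaluates a broader feature set.

    import numpy as np

    def cooccurrence_matrix(img, dx=1, dy=0, levels=8):
        # Normalized co-occurrence counts of quantized gray-level pairs
        # separated by the displacement (dx, dy).
        img = np.asarray(img, dtype=float)
        q = (img * levels / (img.max() + 1.0)).astype(int)
        h, w = q.shape
        C = np.zeros((levels, levels))
        for i in range(h - dy):
            for j in range(w - dx):
                C[q[i, j], q[i + dy, j + dx]] += 1
        return C / C.sum()

    def texture_features(C):
        i, j = np.indices(C.shape)
        return {
            "energy": (C ** 2).sum(),
            "contrast": ((i - j) ** 2 * C).sum(),
            "entropy": -(C[C > 0] * np.log(C[C > 0])).sum(),
            "homogeneity": (C / (1.0 + np.abs(i - j))).sum(),
        }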
In this paper we design an optimal learning rule for the Hopfield associative memory (HAM) based on three well-recognized criteria: all desired attractors must be made not only stable but also asymptotically stable, and the spurious stable states should be as few as possible. These criteria are crucial to constructing a satisfactory associative memory. We first analyze the real cause of the unsatisfactory performance of the Hebb rule and many other existing learning rules designed for HAMs, and then show that the three criteria effectively amount to widely expanding the basin of attraction around each desired attractor. One effective way to widely expand the basins of attraction of all desired attractors is to appropriately dig their respective steep kernel basins of attraction. For this we introduce a concept called Hamming stability. Surprisingly, we find that Hamming stability for all desired attractors can be reduced to a moderately expansive linear separability condition at each neuron, so the well-known Rosenblatt perceptron learning rule is the right one for learning Hamming stability. Extensive experiments were conducted, convincingly showing that the proposed perceptron Hamming-stability learning rule satisfies the three optimality criteria well.
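A sketch of the idea follows: each neuron's incoming weight row is trained as an independent perceptron so that every stored pattern satisfies a margin condition, which is the linear-separability form of Hamming stability described above. The margin, learning rate, and stopping rule here are illustrative assumptions, not the paper's exact settings.

    import numpy as np

    def perceptron_train_ham(patterns, margin=1.0, lr=0.1, epochs=100):
        # patterns: (m, n) array of +/-1 fundamental memories.
        # Enforce x_i * (W[i] @ x) >= margin for every pattern and neuron.
        m, n = patterns.shape
        W = np.zeros((n, n))
        for _ in range(epochs):
            updated = False
            for x in patterns:
                h = W @ x
                for i in range(n):
                    if x[i] * h[i] < margin:      # not yet stably stored
                        W[i] += lr * x[i] * x     # perceptron correction
                        W[i, i] = 0.0             # keep zero self-coupling
                        updated = True
            if not updated:                       # all margins satisfied
                break
        return W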
Performance measures are derived for data-adaptive hypothesis testing by systems trained on stochastic data. The measures consist of the performance of the system averaged over the ensemble of training sets. These training-set-based measures are contrasted with maximum a posteriori probability (MAP) test measures, and it is shown that the training-set-based and MAP test probabilities are equal if the training set is proportioned according to the prior probabilities of the hypotheses. Applications of training-set-based measures are suggested for neural net and training set design.
Neural and stochastic models for signal classification generate output probabilities to indicate whether or not their inputs are members of the modeled class. This paper presents a feature-enhancing neural network, with weights based on the modeled class, which can improve the classification performance of single-output classifiers by increasing output probabilities for members of the modeled class or decreasing output probabilities for non-members. The neural network is demonstrated as a front-end for multi-layer perceptron and semi-continuous hidden Markov model based classifiers for speech recognition applications. It is unique in that the weights and width of the input layer adapt based on characteristics extracted from the input speech signal. The connectionist architecture is motivated by the highly successful time-delay neural network and by the desire to find efficient training procedures for class-dependent, short-time transformations. The weights are determined by a principal component analysis and can be found by applying iterative or conventional algorithms. The neural network reduces false acceptances by more than one-third for a defined mono-syllable keyword spotting application using a semi-continuous hidden Markov model based system. An evaluation of the neural network as a front-end for multi-layer perceptron based classifiers that distinguish a word from confusable words is also presented.
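The principal component analysis step mentioned above can be sketched as a plain eigen-decomposition of the sample covariance of the modeled class's training frames, with the top-k components serving as class-dependent input-layer weights. The function name and interface are assumptions.

    import numpy as np

    def pca_weights(frames, k):
        # frames: (num_frames, dim) feature frames from the modeled class.
        X = frames - frames.mean(axis=0)
        cov = X.T @ X / max(1, len(X) - 1)
        vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
        return vecs[:, ::-1][:, :k]        # top-k principal components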
In this paper an attention module is described which can be used by an active vision system to generate gaze changes. The module is based on a bottom-up, feature-driven analysis of the image; its results are regions of the input image that contain strange features, i.e., locations of the most `interesting' and `important' information. The method proposed for detecting such regions is based on the decomposition of the input image into a set of independent retinotopic feature maps. Each map represents the value of a certain attribute computed on a set of low-level primitives such as contours and regions. Relevant objects can be detected when the corresponding primitives have a feature value strongly different from the neighboring ones. Local comparisons of feature values are used to compute such measures of `difference' for each feature map and give rise to a corresponding set of conspicuity maps. In order to obtain a single measure of interest for each location and to make the process robust to noise, a relaxation algorithm is run on the set of conspicuity maps. A dozen iterations are sufficient to produce a binary mask identifying the attention regions. Results on real scenes are presented.
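A much-simplified conspicuity measure is sketched below: a location is conspicuous where its feature value differs strongly from the local neighborhood mean. The real module compares feature values of low-level primitives (contours, regions) rather than raw pixels and follows with the relaxation step, neither of which is reproduced here.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def conspicuity(feature_map, surround=15):
        # Difference of each value from its local mean, normalized to [0, 1].
        local_mean = uniform_filter(feature_map.astype(float), size=surround)
        c = np.abs(feature_map - local_mean)
        return c / (c.max() + 1e-12)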
In this paper we use neural network algorithms for office layout. A matrix of coarse pixels is used to represent the objects of the room and their spatial relations. For each pixel, the probabilities of the different objects are predicted from the neighboring pixels, assuming that the geometrical structure is determined mainly by local characteristics. Local receptive fields are employed to capture these local interactions using backpropagation networks. The reconstruction of the complete scene is achieved by an iterative process: starting from given marginal constraints (or missing information for specific locations), each feature map performs an association with respect to its central pixel. This corresponds to the simulation of a Markov random field. External constraints on the sums of probabilities are taken into account using the iterative proportional fitting algorithm, as sketched below. The viability of the approach is demonstrated by an example.
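Iterative proportional fitting is the classical algorithm that rescales a nonnegative table until its row and column sums match prescribed marginals (the external constraints). The interface below is an assumption; the paper applies the idea to probability sums over pixel locations.

    import numpy as np

    def ipf(table, row_targets, col_targets, iters=100):
        # Alternately rescale rows and columns toward the target marginals;
        # the two target vectors must have equal totals.
        T = np.array(table, dtype=float)
        for _ in range(iters):
            T *= (np.asarray(row_targets) / np.maximum(T.sum(axis=1), 1e-12))[:, None]
            T *= (np.asarray(col_targets) / np.maximum(T.sum(axis=0), 1e-12))[None, :]
            if (np.allclose(T.sum(axis=1), row_targets)
                    and np.allclose(T.sum(axis=0), col_targets)):
                break
        return T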
We apply an epistemological model of rational decision making developed by Isaac Levi to the estimation problem. The estimators thus developed are optimal with respect to an epistemic utility function that accounts for the cost of error, the decision agent's assessment of informational value, and the decision agent's specified willingness to risk errors in exchange for information acquisition; the estimators are inherently set-valued. We apply Levi's stable acceptance procedure to the estimation process, facilitating the use of unnormalized probability densities. As an example of the theory, we develop algorithms to apply an epistemic utility estimator to the output of a set-valued Kalman filter.
This paper presents preliminary results from an investigation of edge detection with a principal components analysis. In this way the parameters of an edge detector can be derived from data about edges in images. Our preliminary investigations attempt two approaches to deriving an edge detector from the analysis, neither of which is totally successful.
The effectiveness and usefulness of vision lies in its purposive, behavioral characteristics. The Bayesian belief revision theory is examined for effective modeling of integrated knowledge of expectation and evidence in visual activities. Evaluation of decision-making criteria based on distributed message propagation in Bayesian belief networks is examined as a mechanism that brings together interactions between processing modules. Furthermore, by regarding the spatio-temporal regularities in the moving patterns of objects in the scene as a network of temporally dependent belief hypotheses, visual expectations can be represented by the most likely combinations of hypotheses, obtained by updating the network in response to instantaneous visual evidence. Such expectations in turn can be used for visual attention. In particular, we relate the concept of vision as behavior to results from some of our early studies on a visual augmented hidden Markov model for representing `hidden' regularities in object motion and producing dynamic expectations of the moving object in the scene.
Neural network, image-based pattern recognition is generally robust to noise. However, when applied to imaging through the atmosphere, image-based classification performance can be severely reduced by low-contrast atmospheric conditions. In particular, we show that classification performance through spatially fluctuating plumes of smoke and dust is reduced by the changes in path radiance and transmittance across the image. However, by predicting the quantitative effects of propagation losses, we also show that classification performance can be significantly improved by applying a novel training strategy. Improved performance can be obtained by training a neural network on atmospheric propagation effects as an additional class while simultaneously training the network to ignore the atmospheric influence on the target classes. Successful tests of the method in actual field measurements of targets partially obscured by smoke and dust are shown. Effects on both single-layer and multi-layer backpropagation neural networks are considered, and performance improvement is shown for several classification examples.
Present remote sensing systems are capable of producing digital image data at rates which far exceed the exploitation capabilities of existing processing systems. Automated image classification and interpretation tools are necessary to optimize the use of remotely sensed multispectral imagery. We have investigated the use of artificial neural networks (ANN) for spectral pattern recognition in multispectral imagery for both polarimetric synthetic aperture radar (SAR) and Landsat Thematic Mapper (TM) data. We have used ANN to segment SAR and TM scenes into a few broad land use/land cover (LU/LC) types (e.g., vegetation, bare soil, water, etc.). We believe that these broad land use classes can be subclassified further into more refined types (e.g., the vegetation class can be partitioned into different vegetation types) using spectral information, spatial shape indicators, and contextual image information such as texture.
Data integration for land-cover classification requires a nonlinear system to associate satellite imagery with exogenous imagery. In this study we present results of a neural network based methodology for land-cover classification. Two approaches are investigated: (a) monolithic integration, in which all required registered images are the inputs of a single back-error propagation (BEP) network trained to produce the final classification directly; and (b) class-distributed integration, in which a network specific to each class learns that class's characteristics from all the satellite imagery. In both approaches, topographic mapping is taken into account as exogenous data.
In this paper the neural network concept is studied, as a nonlinear dynamic system, for predicting spatiotemporal patterns. The relative behavior of two back-error propagation neural network (BPNN) configurations is investigated in the context of real-world data from geostationary meteorological satellite (GOES) images. One configuration exploits only temporal information; the other also takes spatial-contextual pattern aspects into account. The results demonstrate that neural networks are a useful tool for time-series prediction of spatial patterns: with a certain accuracy, future states of a spatial phenomenon can be generated before the satellite captures them in its next image.
A target detection scheme based on FLIR image sequence segmentation is described. This includes an estimation of the velocity field and a segmentation of FLIR images using a Hopfield-like neural network. Experimental results using a real-world IR image sequence taken by an airborne FLIR are presented.
In this paper, we investigate the performance of image compression and decompression using a counterpropagation neural network. It is noted that by using a few training image patterns (fewer than 20) we can compress unlearned image data without knowing their characteristics. The selection of training image patterns is discussed, and examples are presented to demonstrate the compression method.
A novel weighted outer-product learning (WOPL) scheme for associative memory neural networks (AMNNs) is presented. In the scheme, each fundamental memory is allocated a learning weight to direct its correct recall. Both the Hopfield and multiple-training models are instances of the WOPL model with certain sets of learning weights. A necessary condition on the learning weights for the convergence property of the WOPL model is obtained through neural dynamics, and a criterion for choosing learning weights for correct associative recall of the fundamental memories is proposed. An important parameter called the signal-to-noise ratio gain (SNRG) is devised, and it is found empirically that SNRGs have threshold values: any fundamental memory can be correctly recalled when its corresponding SNRG is greater than or equal to its threshold value. Furthermore, a theorem is given, and theoretical results on the conditions the SNRGs and learning weights must satisfy for good associative recall performance of the WOPL model are obtained. In principle, when all SNRGs or learning weights satisfy the theoretically obtained conditions, the asymptotic storage capacity of the WOPL model grows at the greatest rate known for AMNNs in a certain stochastic sense, and the WOPL model achieves correct recall for all fundamental memories. Representative computer simulations confirm the criterion and the theoretical analysis.
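The WOPL construction itself is compact, as sketched below: each fundamental memory contributes its outer product scaled by its learning weight, with uniform weights recovering the Hopfield/Hebb rule. The synchronous recall loop is included for context; the paper's SNRG-based weight selection is not reproduced here.

    import numpy as np

    def wopl_weights(patterns, learning_weights=None):
        # patterns: (m, n) array of +/-1 fundamental memories.
        m, n = patterns.shape
        a = np.ones(m) if learning_weights is None else np.asarray(learning_weights, float)
        W = sum(a[k] * np.outer(patterns[k], patterns[k]) for k in range(m))
        np.fill_diagonal(W, 0.0)            # no self-connections
        return W

    def recall(W, probe, iters=50):
        # Synchronous dynamics: s <- sign(W s), run to a fixed point.
        s = probe.copy()
        for _ in range(iters):
            s_new = np.where(W @ s >= 0, 1, -1)
            if np.array_equal(s_new, s):
                break
            s = s_new
        return s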
The digital morphological skeleton representation provides a means of improving lossless coding in a communication system, owing to the observation that the entropy of a morphological skeleton is less than that of its original image. One way to improve coding efficiency is to minimize the morphological skeleton representation by choosing a more appropriate structuring element; for an image with a consistent shape distribution, such as a texture pattern, a more efficient and useful skeleton representation can be expected. Analysis of simulated and natural image patterns shows the number of activated points in a morphological skeleton to range between 30 and 327 for different structuring elements. A procedure is proposed which allows the selection of a more effective structuring element from a basis set of structuring elements. The decision process is based on the minimum-distance measurement in a multiprototype pattern classification: the structuring element for morphological skeletonization is taken from the closest match between the chain code edge vector and the basis set of structuring elements. The proposed procedure represents an organized means of choosing a more meaningful structuring element for morphological analysis, and it is shown to yield a significant reduction in the number of activated skeleton points required for morphological skeletonization.
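For reference, Lantuejoul's construction of the morphological skeleton is sketched below; the structuring element argument is exactly the degree of freedom the proposed selection procedure optimizes. This is a sketch of the standard algorithm using scipy's binary morphology operators, not of the paper's selection procedure.

    import numpy as np
    from scipy.ndimage import binary_erosion, binary_opening

    def morphological_skeleton(image, structure):
        # S = union over k of: erode^k(A) minus opening(erode^k(A), B).
        skeleton = np.zeros(image.shape, dtype=bool)
        eroded = image.astype(bool)
        while eroded.any():
            opened = binary_opening(eroded, structure)
            skeleton |= eroded & ~opened    # points removed by the opening
            eroded = binary_erosion(eroded, structure)
        return skeleton

    # Example structuring element: the 3 x 3 square.
    # skel = morphological_skeleton(binary_img, np.ones((3, 3), dtype=bool))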
Data fusion has been widely used in various fields of automation. This paper describes a multisensor integration system, a range and intensity image processing system, which can be used for object recognition and classification. In the data fusion processing, the system uses a new method called the generalized evidence inference method. The method presented here unifies Bayesian theory and Dempster-Shafer evidential reasoning (DSER) for the combination of information from diversified sources, and overcomes the disadvantages of both approaches. We apply all three approaches, Bayesian theory, DSER, and the unified approach, to fuse the reports in the system for object recognition and classification; the results are compared and analyzed.
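For context, Dempster's rule of combination, a common ingredient of the approaches compared above, can be sketched in a few lines for mass functions over frozenset hypotheses; the renormalization by (1 - conflict) is what discards the mass assigned to contradictory intersections. The two-sensor example is hypothetical.

    def dempster_combine(m1, m2):
        # m1, m2: dicts mapping frozenset hypotheses to belief mass.
        combined, conflict = {}, 0.0
        for a, ma in m1.items():
            for b, mb in m2.items():
                inter = a & b
                if inter:
                    combined[inter] = combined.get(inter, 0.0) + ma * mb
                else:
                    conflict += ma * mb
        if conflict >= 1.0:
            raise ValueError("total conflict: sources are incompatible")
        return {h: v / (1.0 - conflict) for h, v in combined.items()}

    # Hypothetical example with two sensors voting on {car, truck}:
    m1 = {frozenset({"car"}): 0.6, frozenset({"car", "truck"}): 0.4}
    m2 = {frozenset({"car"}): 0.5, frozenset({"truck"}): 0.3,
          frozenset({"car", "truck"}): 0.2}
    print(dempster_combine(m1, m2))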
Increasingly huge amounts of digital data from a wide range of sources, such as B-ISDN services, satellite transmission of photographs, and police databases of human face images, are being transmitted and stored, while both transmission channel capacity and disk space are limited. For advanced applications such as multimedia terminals and HDTV, the problem is even more apparent. It is therefore important to use efficient image compression algorithms to reduce the required transmission capacity and storage space. In this paper, a scheme for image data compression with an adaptive BP neural network is presented. It exploits the data compression property of mapping the original image to a feature space of reduced dimensionality. Images are divided into a set of 8 X 8 sub-image blocks which are applied as inputs to a three-layer BP neural network. Computer simulation shows that the results are better than those of Sonehara et al.
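A minimal version of the blocking step and a tiny 64-hidden-64 backpropagation autoencoder are sketched below; the narrow hidden layer is the reduced-dimensionality feature space that provides the compression. The layer size, learning rate, and epoch count are illustrative assumptions, not the paper's settings.

    import numpy as np

    def image_to_blocks(img, b=8):
        # Split a grayscale image into flattened b x b blocks, one per row.
        h, w = img.shape
        img = img[: h - h % b, : w - w % b]
        hb, wb = img.shape
        return (img.reshape(hb // b, b, wb // b, b)
                   .swapaxes(1, 2)
                   .reshape(-1, b * b) / 255.0)

    def train_autoencoder(X, hidden=16, lr=0.5, epochs=500, seed=0):
        # 64 -> hidden -> 64 sigmoid network trained by plain backprop.
        rng = np.random.default_rng(seed)
        n = X.shape[1]
        W1 = rng.normal(0.0, 0.1, (n, hidden)); b1 = np.zeros(hidden)
        W2 = rng.normal(0.0, 0.1, (hidden, n)); b2 = np.zeros(n)
        sig = lambda z: 1.0 / (1.0 + np.exp(-z))
        for _ in range(epochs):
            H = sig(X @ W1 + b1)               # encode: compressed code
            Y = sig(H @ W2 + b2)               # decode: reconstruction
            dY = (Y - X) * Y * (1.0 - Y)       # output delta (squared error)
            dH = (dY @ W2.T) * H * (1.0 - H)   # hidden delta
            W2 -= lr * H.T @ dY / len(X); b2 -= lr * dY.mean(axis=0)
            W1 -= lr * X.T @ dH / len(X); b1 -= lr * dH.mean(axis=0)
        return W1, b1, W2, b2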
In this paper, Hopfield networks, Hamming networks, and neocognitron models and their application to handwritten digit recognition are discussed. The neocognitron is a multilayer network modeling a mechanism of visual pattern recognition; it is self-organized by `learning without a teacher' and acquires the ability to recognize stimulus patterns based on the geometrical similarity of their shapes, unaffected by their positions and distortions, and thus shows a high ability to recognize handwritten digits. We developed a handwritten digit recognition system based on the neocognitron (HDRSBN) and carried out simulation experiments.
This work investigates the application of a stochastic search technique, evolutionary programming, to developing self-organizing neural networks. The chosen stochastic search method is capable of simultaneously evolving both network architecture and weights. The numbers of synapses and neurons are incorporated into an objective function so that network parameters are optimized with respect to computational cost as well as mean pattern error. Experiments are conducted using feedforward networks on simple binary mapping problems.
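The evolutionary-programming loop itself is simple, as sketched below: each parent produces one mutated offspring and the better half of the combined population survives (no crossover, in keeping with EP). Here fitness would fold the synapse and neuron counts into the mean pattern error, as described above; the interface is an assumption.

    def evolve(population, fitness, mutate, generations=100):
        # fitness: cost to minimize (error + complexity penalty);
        # mutate: returns a perturbed copy of a network (weights and topology).
        pop = list(population)
        for _ in range(generations):
            offspring = [mutate(p) for p in pop]
            pop = sorted(pop + offspring, key=fitness)[: len(pop)]
        return pop[0]   # best network found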
A novel method of color human face image recognition is presented in this paper. First, an input color face image is transformed into a monochrome image which contains enough useful information for recognition. This monochrome image is then transformed into a standard image. The face recognition is completed via classification of the projective feature vectors of the standard image by a minimum distance classifier. Experimental results showed that the method is effective.
Liquid metal combustion chambers are under consideration as power sources for propulsion devices used in undersea vehicles. Characteristics of the reactive jet are studied to gain information about the internal combustion phenomena, including temporal and spatial variation of the jet flame, and the effects of phase changes on both the combustion and imaging processes. A ray tracing program which employs simplified Monte Carlo methods has been developed for use as a predictive tool for radiographic imaging of closed liquid metal combustors. A complex focal spot is characterized by either a monochromatic or polychromatic emission spectrum. For the simplest case, the x-ray detection system is modeled by an integrating planar detector having 100% efficiency. Several simple geometrical shapes are used to simulate jet structures contained within the combustor, such as cylinders, paraboloids, and ellipsoids. The results of the simulation and real time radiographic images are presented and discussed.
The higher-order cumulants and their Fourier transforms, the polyspectra, are used to achieve a number of objectives that may not be attainable using second-order statistics. In this paper, we study different approaches to estimating the bispectrum and apply the results to image reconstruction and communication signal identification. One key advantage of using cumulants in the signal processing area is that cumulants are blind to all kinds of Gaussian processes. Thus, when a cumulant-based method is used on non-Gaussian signals corrupted by additive Gaussian noise, it improves the signal-to-noise ratio.
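One standard way to estimate the bispectrum, sketched below, is the direct FFT method: average the triple product X(f1) X(f2) X*(f1 + f2) over data segments. The segment length and the lack of windowing are simplifying assumptions.

    import numpy as np

    def bispectrum(x, nfft=64):
        # Direct bispectrum estimate B(f1, f2) = E[X(f1) X(f2) conj(X(f1+f2))];
        # a Gaussian process has zero bispectrum, hence the noise "blindness".
        f1, f2 = np.meshgrid(np.arange(nfft), np.arange(nfft))
        segs = len(x) // nfft
        B = np.zeros((nfft, nfft), dtype=complex)
        for s in range(segs):
            X = np.fft.fft(x[s * nfft:(s + 1) * nfft])
            B += X[f1] * X[f2] * np.conj(X[(f1 + f2) % nfft])
        return B / max(segs, 1)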
This paper proposes an unsupervised learning algorithm for a linear neural network (LNN); two activity measures are designed to classify image subblocks into four categories. To improve the performance of the LNN, an adaptive scheme is presented. Simulation results show that better reconstructed image quality is achieved than with previous algorithms.
We propose the use of self-organizing maps (SOMs) and learning vector quantization (LVQ) as an initialization method for training continuous observation density hidden Markov models (CDHMMs). We apply CDHMMs to model phonemes in the transcription of speech into phoneme sequences. The Baum-Welch maximum likelihood estimation method is very sensitive to the initial parameter values if the observation densities are represented by mixtures of many Gaussian density functions. We therefore suggest training CDHMMs in two phases: first, vector quantization methods are applied to find suitable placements for the means of the Gaussian density functions to represent the observed training data; maximum likelihood estimation is then used to find the mixture weights and state transition probabilities and to re-estimate the Gaussians to obtain the best possible models. By initializing the means of the distributions with SOMs or LVQ, good recognition results can be achieved using substantially fewer Baum-Welch iterations than are needed with random initial values. Likewise, in the segmental K-means algorithm the number of iterations can be markedly reduced with a suitable initialization. Furthermore, we experiment with enhancing the discriminatory power of the phoneme models by adaptively training the state output distributions using the LVQ algorithm.
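The LVQ placement step can be sketched as the basic LVQ1 update: the codebook vector nearest to a training sample is pulled toward it if their labels agree and pushed away otherwise. The trained vectors would then initialize the Gaussian means before Baum-Welch re-estimation; the learning rate and epoch count below are assumptions.

    import numpy as np

    def lvq1(data, labels, codebook, code_labels, lr=0.05, epochs=10):
        # data: (num, dim) samples; codebook: (k, dim) initial vectors.
        cb = np.array(codebook, dtype=float)
        for _ in range(epochs):
            for x, y in zip(data, labels):
                w = int(np.argmin(np.linalg.norm(cb - x, axis=1)))  # winner
                sign = 1.0 if code_labels[w] == y else -1.0
                cb[w] += sign * lr * (x - cb[w])
        return cb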
A method based on differential geometry is presented for mathematically describing the shape of the facial surface. Three-dimensional data for the face are collected by optical surface scanning. The method allows the segmentation of the face into regions of a particular `surface type,' according to the surface curvature. Eight different surface types are produced, all of which have perceptually meaningful interpretations. The correspondence of the surface type regions to the facial features is easily visualized, allowing a qualitative assessment of the face. A quantitative description of the face in terms of the surface type regions can be produced, and the variation of the description between faces is demonstrated. A set of optical surface scans can be registered together and averaged to produce an average male and an average female face; thus an assessment of how individuals vary from the average can be made, as well as a general statement about the differences between male and female faces. This method will enable an investigation of how reliably faces can be individuated by their surface shape, which, if feasible, may form the basis of an automatic system for recognizing faces. It also has applications in physical anthropology, for classification of the face; in facial reconstructive surgery, to quantify the changes in a face altered by reconstructive surgery and growth; and in visual perception, to assess the recognizability of faces. Examples of some of these applications are presented.
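Segmentation by surface type can be sketched with the classical sign map of mean curvature H and Gaussian curvature K, which yields exactly eight realizable labels (H = 0 with K > 0 cannot occur, since H^2 >= K). The zero-threshold eps is an assumption, and whether the paper uses this exact labeling is not stated in the abstract.

    import numpy as np

    SURFACE_TYPES = {(-1, 1): "peak", (-1, 0): "ridge", (-1, -1): "saddle ridge",
                     (0, 0): "flat", (0, -1): "minimal",
                     (1, 1): "pit", (1, 0): "valley", (1, -1): "saddle valley"}

    def surface_type(H, K, eps=1e-6):
        # Classify one point by the signs of its mean and Gaussian curvature.
        hs = int(np.sign(H)) if abs(H) > eps else 0
        ks = int(np.sign(K)) if abs(K) > eps else 0
        return SURFACE_TYPES.get((hs, ks), "impossible")

    print(surface_type(-0.2, 0.01))   # -> 'peak'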
The paper presented here explores the possibility of applying neural networks to identify authorized users of a computer system. Computer security can be ensured only by restricting access to a computer system. This in turn requires a sure means of identifying authorized users. The related research is based on the fact that every human being is distinguished by many unique physical characteristics. It has been known even before the age of computers that no two individuals sign their names identically. Signature samples collected from a group of individuals are analyzed and a neural network-based system that can recognize these signatures is designed.
Over the years a considerable amount of research has been conducted in the area of passive stereo vision. Usually attempts have been made to solve the stereo correspondence problem in its most general sense and build an all-purpose stereo module, proposing possible matches for all parts or edges of the image. This general approach is not always necessary; indeed, there is evidence that the human vision system only attempts to match a small number of possible edges in a particular scene. In this paper we describe a computationally simple algorithm which takes advantage of the nature of the object being tracked. Disparity measurements are made for the entire edge, and statistics are used to provide subpixel accuracy. This approach reduces the problems caused by quantization noise when attempts are made to rectify the depth information. We show that stereo algorithms can be used and adapted in an application-specific manner to construct viable systems in the areas of alarms and `invisible wall' detection. Results are presented to show the effectiveness of the algorithm in a number of both difficult and simple sequences. In conclusion, we believe our work demonstrates an industrially viable vision system requiring minimal hardware for implementation.
Tomographic reconstruction in two dimensions is concerned with the reconstruction of a positive, bounded function f(x, y) and its compact domain of support Ο from noisy and possibly sparse samples of its Radon-transform projections, g(t, Ο). If the pair (f, Ο) is referred to as an object, a finitely parameterized object is one in which both f(x, y) and Ο are determined uniquely by a finite number of parameters. For instance, a binary N-sided polygonal object in the plane is uniquely specified by exactly 2N parameters, which may be the vertices, normals to the sides, etc. In this work we study the optimal reconstruction of finitely parameterized objects from noisy projections; specifically, we focus on the optimal reconstruction of binary polygonal objects. We show that when the projections are corrupted by Gaussian white noise, the optimal maximum likelihood (ML) solution to the reconstruction problem is the solution to a nonlinear optimization problem formulated over a finite-dimensional Euclidean parameter space. We also demonstrate that, in general, the moments of an object can be estimated directly from the projection data, and that these estimated moments provide a good initial guess for the numerical solution of the nonlinear optimization problem. Finally, we study the performance of the proposed algorithms from both statistical and computational viewpoints.
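The moment-estimation step rests on the identity that the k-th moment of a projection equals a directional moment of the object: int t^k g(t, theta) dt = int int (x cos theta + y sin theta)^k f(x, y) dx dy. The sketch below recovers the mass and centroid from sampled projections by least squares; the discretization and the interface are assumptions.

    import numpy as np

    def moments_from_projections(g, ts, thetas):
        # g: (n_angles, n_bins) sampled projections; ts: bin centers;
        # thetas: projection angles in radians.
        dt = ts[1] - ts[0]
        p0 = (g * dt).sum(axis=1)         # order 0: every angle sees the mass
        p1 = (g * ts * dt).sum(axis=1)    # order-1 projection moments
        m00 = p0.mean()
        # p1[i] = cos(theta_i) * m10 + sin(theta_i) * m01  ->  least squares
        A = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
        (m10, m01), *_ = np.linalg.lstsq(A, p1, rcond=None)
        return m00, m10, m01              # centroid = (m10 / m00, m01 / m00)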
Cloud classification is a key input to global climate models. Cloud spectra are typically mixed, however, and are thus difficult to classify using the maximum likelihood rule. In contrast to maximum likelihood, a densely interconnected, trained neural network can form powerful generalizations that distinguish unique statistical trends among otherwise ambiguous spectral response patterns. Accordingly, cloud classification accuracies produced by a neural network can exceed those produced using the maximum likelihood criterion.