KEYWORDS: COVID-19, Data analysis, Data conversion, Web 2.0 technologies, Visualization, Medicine, Data processing, Data modeling, Data visualization, Data acquisition
In this paper, we examine associations between social media use and beliefs in conspiracy theories and misinformation among African American communities in Tuskegee County. We will study a community facing significant social problems. The primary goal of this work is to visualize how information (both false and accurate) flows through social media, traditional media, and social networks to influence decision-making in rural areas. The second goal is to examine how various other factors moderate this influence. We will examine the impacts of education, age, and other demographics, as well as measure Gigerenzer's concept of "risk literacy," which examines the accuracy of people's perceived notions of risk. We will develop our model based on data collected from in-person meetings and town halls, questionnaires, and other information gathered to measure people's social media use, social networks, and their beliefs about issues such as the efficacy of COVID vaccines, their trust in the health care system, and their beliefs about mental health.
Virtual Reality (VR) has made significant strides, offering users a multitude of ways to interact with virtual environments. Each sensory modality in VR provides distinct inputs and interactions, enhancing the user's immersion and presence. However, the potential of additional sensory modalities, such as haptic feedback and 360° locomotion, to improve decision-making performance has not been thoroughly investigated. This study addresses this gap by evaluating the impact of a haptic feedback, 360° locomotion-integrated VR framework and longitudinal, heterogeneous training on decision-making performance in a complex search-and-shoot simulation. The study involved 32 participants from a defence simulation base in India, who were randomly divided into two groups: experimental (haptic feedback, 360° locomotion-integrated VR framework with longitudinal, heterogeneous training) and placebo control (longitudinal, heterogeneous VR training without extrasensory modalities). The experiment lasted 10 days. On Day 1, all subjects executed a search-and-shoot simulation closely replicating the elements/situations in the real world. From Day 2 to Day 9, the subjects underwent heterogeneous training, imparted by the design of various complexity levels in the simulation using changes in behavioral attributes/artificial intelligence of the enemies. On Day 10, they repeated the search-and-shoot simulation executed on Day 1. The results showed that the experimental group experienced a gradual increase in presence, immersion, and engagement compared to the placebo control group. However, there was no significant difference in decision-making performance between the two groups on Day 10. We intend to use these findings to design multisensory VR training frameworks that enhance engagement levels and decision-making performance.
Hyperspectral imaging records data over a broad range of electromagnetic spectrum wavelengths and presents a viable option for fruit maturity detection when incorporated with deep neural networks. This paper focuses on improving the accuracy on a kiwi and avocado fruit hyperspectral dataset by introducing a modified version of depthwise separable convolution and comparing the results with state-of-the-art models to demonstrate our model's reliability. The research aims to use the proposed model to predict the fruits' ripeness, firmness, and sugar content levels.
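To make the building block concrete, the following is a minimal PyTorch sketch of a standard depthwise separable convolution; the paper's modified variant is not specified here, so the layer sizes and the 224-band input are illustrative assumptions only.

```python
# Sketch of a standard depthwise separable convolution block in PyTorch.
# The paper's "modified" variant is not detailed here; this shows only the
# baseline building block such a modification would start from.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        # Depthwise: one spatial filter per input channel (groups=in_channels).
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   padding=kernel_size // 2, groups=in_channels)
        # Pointwise: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Hyperspectral cubes have many bands; e.g. 224 spectral channels in, 64 out.
x = torch.randn(1, 224, 64, 64)
block = DepthwiseSeparableConv(224, 64)
print(block(x).shape)  # torch.Size([1, 64, 64, 64])
```

The appeal of this factorization for hyperspectral data is that the expensive spatial filtering is done per band, while the cross-band mixing is a cheap 1x1 convolution.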
As the population of the earth grows, the demand for food grows proportionally. Early and cost-effective detection of plant diseases can reduce food loss throughout the world. Current methods for image-based plant disease detection tend to fail in field conditions. Our method uses region proposal networks to localize diseased leaves for detection, as sketched below. We do not discard any prior anchor boxes, which increases the average recall of the network and results in better localization.
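For illustration, the snippet below sketches dense anchor-box generation over a feature-map grid, as used by region proposal networks; the scales, aspect ratios, and stride are assumptions, not the paper's settings.

```python
# Minimal sketch of dense anchor generation for a region proposal network.
# Scales, ratios, and stride are illustrative, not the paper's configuration.
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride  # center in image coords
            for s in scales:
                for r in ratios:
                    w, h = s * np.sqrt(r), s / np.sqrt(r)
                    anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)

# Keeping every anchor (no pruning of boxes before proposal scoring) is what
# raises average recall, at the cost of more candidates to score.
boxes = generate_anchors(38, 50)
print(boxes.shape)  # (38*50*9, 4) = (17100, 4)
```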
Cancer has a tremendous impact on human life due to its extremely high global death rate. Malignant melanoma of the skin accounts for 20 daily deaths in the United States. Malignant melanomas (MEL), basal cell carcinomas (BCC), actinic keratoses and intraepithelial carcinomas (AKIEC), melanocytic nevi (NV), benign keratinocytic lesions (BKL), dermatofibromas (DF), and vascular lesions (VASC) are the seven main types of skin lesions. It can be challenging to recognize and classify different cancer types from biomedical imaging, as there are many sub-cancer types that differ significantly from one another. Several researchers and doctors are currently trying to pinpoint the most effective means of spotting skin cancer in its earliest stages. Using multiple residual and sequential convolutional neural networks, we present a learning strategy for cancer classification in this research. An effort is made here to more precisely categorize MEL, BCC, and BKL cancers. F1 score, precision, recall, and accuracy are used to verify the validity of the proposed model. Results show the reliability and validity of the model.
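As a rough illustration of the ingredients involved, the sketch below combines a residual convolutional block with a sequential classification head for the three targeted classes (MEL, BCC, BKL); the layer sizes are assumptions and this is not the paper's exact architecture.

```python
# Minimal sketch of a residual block feeding a sequential classifier head for
# three lesion classes (MEL, BCC, BKL). Layer sizes are illustrative only.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.act(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.act(out + x)  # skip connection

model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
    ResidualBlock(32),
    nn.MaxPool2d(2),
    ResidualBlock(32),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 3),  # logits for MEL, BCC, BKL
)
print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 3])
```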
Measuring classroom engagement is an important but challenging task in education. In this paper, we present an automated method for assessing the degree of classroom engagement using computer vision techniques that integrate data from multiple sensors placed at the front and back of the students' seating arrangement. The students' engagement is evaluated based on attributes such as facial expression, gesture, head position, and distractions visible from the frontal view of the students. Moreover, using the videos from the back of the classroom, the professor's teaching content, as well as its alignment with student engagement, is assessed. We leverage deep learning methods to extract emotion and behavior features to aid in the evaluation of engagement. These AI methods will quantify the classroom engagement process.
This paper is concerned with the correlation, within the African American community, between social media usage and the degree of COVID vaccine hesitancy and other general health attributes. In the past, various studies have found associations between social media use and beliefs in conspiracy theories and misinformation; however, most of these studies focus on large data sets that lack accuracy or are too general, and they lack sufficient quantitative methodologies such as machine learning techniques. In this work, we conducted a pilot study with a small number of members of the African American community regarding the COVID-19 vaccine and their social beliefs. This pilot study is important for improving the quality and efficiency of the main study in the future. In addition, it helps us understand the patterns of a particular community regarding certain views.
Classroom engagement is one important factor that determines whether students learn from lectures. Most traditional classroom assessment tools are based on summary judgements by students in the form of student surveys filled out in class or online once a semester. Such ratings are often biased and do not capture the real-time teaching of professors. In addition, they fail for the most part to capture the locus of teaching and learning difficulties. They cannot differentiate whether ongoing poor performance of students is a function of the instructor's lack of teaching skill or the students' lack of engagement in the class. So, in order to streamline and improve the evaluation of classroom engagement, we introduce human gestures as an additional tool to improve teaching evaluation along with other techniques. In this paper we report the results of using a novel technique that uses a semi-automatic computer vision based approach to obtain accurate predictions of classroom engagement in classes where students do not have digital devices like laptops and cellphones during lectures. We conducted our experiment in various classroom sizes at different times of the day. We computed the engagement through a semi-automatic process (using Azure and manual observation). We combined our initial computational algorithms with human judgment to build confidence in the validity of the results. Application of the technique in the presence of distractors like laptops and cellphones is also discussed.
Multiquadrics (MQ) are radial basis spline functions that can provide an efficient interpolation of data points located in a high dimensional space. MQ were developed by Hardy to approximate geographical surfaces and for terrain modelling. In this paper we frame the task of interactive image segmentation as a semi-supervised interpolation where an interpolating function learned from the user-provided seed points is used to predict the labels of unlabeled pixels, and the spline function used in the semi-supervised interpolation is MQ. This semi-supervised interpolation framework has a nice closed form solution which, along with the fact that MQ is a radial basis spline function, leads to a very fast interactive image segmentation process. Quantitative and qualitative results on the standard datasets show that MQ outperforms other regression based methods, GEBS, Ridge Regression and Logistic Regression, and popular methods like Graph Cut, Random Walk and Random Forest.
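The sketch below illustrates the general idea of the closed-form MQ interpolation: fit weights on user-labeled seed pixels and then evaluate the interpolant on the remaining pixels. The pixel features and the shape parameter are assumptions for illustration, not the paper's exact setup.

```python
# Sketch of multiquadric (MQ) interpolation for seed-based label propagation.
# Features (e.g. color + pixel position) and the shape parameter c are illustrative.
import numpy as np

def mq_kernel(A, B, c=1.0):
    # Hardy's multiquadric: phi(r) = sqrt(r^2 + c^2)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.sqrt(d2 + c ** 2)

def fit_mq(seed_feats, seed_labels, c=1.0, reg=1e-6):
    # Closed-form solve: (Phi + reg*I) w = y, with y in {-1, +1}
    Phi = mq_kernel(seed_feats, seed_feats, c)
    return np.linalg.solve(Phi + reg * np.eye(len(seed_feats)), seed_labels)

def predict_mq(feats, seed_feats, w, c=1.0):
    return mq_kernel(feats, seed_feats, c) @ w  # sign gives foreground/background

# Toy usage: 5 labeled seed pixels in a 5-D feature space, 3 unlabeled pixels.
rng = np.random.default_rng(0)
seeds, labels = rng.normal(size=(5, 5)), np.array([1., 1., -1., -1., 1.])
w = fit_mq(seeds, labels)
print(np.sign(predict_mq(rng.normal(size=(3, 5)), seeds, w)))
```

Because the fit is a single linear solve over only the seed points, prediction over all remaining pixels is a matrix-vector product, which is what makes the interaction fast.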
A scanning electron microscope (SEM) is a type of electron microscope that produces images of a sample by scanning it with a focused beam of electrons. The electrons interact with the sample atoms, producing various signals that are collected by detectors. The gathered signals contain information about the sample's surface topography and composition. The electron beam is generally scanned in a raster scan pattern, and the beam's position is combined with the detected signal to produce an image. The most common configuration for an SEM produces a single value per pixel, with the results usually rendered as grayscale images. The captured images may suffer from insufficient brightness, anomalous contrast, jagged edges, and poor quality due to a low signal-to-noise ratio, grainy topography, and poor surface details. The segmentation of SEM images is a challenging problem in the presence of the previously mentioned distortions. In this paper, we focus on the clustering of this type of image. In that sense, we evaluate the performance of well-known unsupervised clustering and classification techniques such as connectivity-based clustering (hierarchical clustering), centroid-based clustering, distribution-based clustering, and density-based clustering. Furthermore, we propose a new spatial fuzzy clustering technique that works efficiently on this type of image and compare its results against these standard techniques in terms of clustering validation metrics.
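For reference, the sketch below runs the four baseline clustering families on simple per-pixel SEM features using scikit-learn; the proposed spatial fuzzy clustering is not reproduced, and the feature choice (intensity plus normalized position) is an illustrative assumption.

```python
# Sketch of the baseline clustering families compared in the paper, applied to
# per-pixel SEM features (intensity + position). The proposed spatial fuzzy
# clustering itself is not reproduced here.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.mixture import GaussianMixture
from sklearn.metrics import silhouette_score

def pixel_features(img):
    # Intensity plus normalized (row, col) gives clusters some spatial coherence.
    h, w = img.shape
    rr, cc = np.mgrid[0:h, 0:w]
    return np.stack([img.ravel(), rr.ravel() / h, cc.ravel() / w], axis=1)

img = np.random.rand(32, 32)          # stand-in for a grayscale SEM image
X = pixel_features(img)

methods = {
    "centroid (k-means)": KMeans(n_clusters=3, n_init=10).fit_predict(X),
    "distribution (GMM)": GaussianMixture(n_components=3).fit_predict(X),
    "connectivity (agglomerative)": AgglomerativeClustering(n_clusters=3).fit_predict(X),
    "density (DBSCAN)": DBSCAN(eps=0.05, min_samples=10).fit_predict(X),
}
for name, labels in methods.items():
    if len(set(labels)) > 1:  # silhouette needs at least two clusters
        print(name, "silhouette:", round(silhouette_score(X, labels), 3))
```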
In this paper we present a technique to detect pedestrians. Histogram of gradients (HOG) and Haar wavelets with the aid of support vector machine (SVM) and AdaBoost classifiers show good identification performance on different object classification tasks, including pedestrians. We propose a new shape descriptor derived from the intra-relationship between gradient orientations in a way similar to the HOG. The proposed descriptor consists of two 2-D grids of orientation similarities measured at different offsets. The gradient magnitudes and phases derived from a sliding window with different scales and sizes are used to construct the two 2-D symmetric grids. The first grid measures the co-occurrence of the phases while the other one measures the corresponding percentage of gradient magnitudes for the measured orientation similarity. Since the resultant matrices are symmetric, the feature vector is formed by concatenating the upper diagonal grid coefficients collected in a raster fashion. Classification is done using an SVM classifier with a radial basis kernel. Experimental results show improved performance compared to current state-of-the-art techniques.
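The sketch below illustrates the flavor of such a descriptor: gradient phases are quantized, co-occurring orientation pairs are counted at a single non-negative offset, and the symmetric matrix's upper triangle is flattened and fed to an RBF-kernel SVM. The bin count, single offset, and toy training data are assumptions; the paper's descriptor uses multiple offsets, scales, and the magnitude grid as well.

```python
# Sketch of a gradient-orientation co-occurrence descriptor: quantize phases,
# count co-occurring orientation pairs at one (non-negative) offset, keep the
# upper triangle of the symmetric grid, then classify with an RBF SVM.
import numpy as np
from sklearn.svm import SVC

def orientation_cooccurrence(patch, bins=8, offset=(0, 1)):
    gy, gx = np.gradient(patch.astype(float))
    phase = (np.arctan2(gy, gx) + np.pi) / (2 * np.pi)        # in [0, 1]
    q = np.minimum((phase * bins).astype(int), bins - 1)      # quantized orientation
    dy, dx = offset
    a = q[:q.shape[0] - dy, :q.shape[1] - dx]
    b = q[dy:, dx:]
    M = np.zeros((bins, bins))
    np.add.at(M, (a.ravel(), b.ravel()), 1)
    M = M + M.T                                               # symmetric grid
    iu = np.triu_indices(bins)
    return M[iu] / M.sum()                                    # raster-ordered upper triangle

# Toy training on random windows; in practice these come from sliding windows.
X = np.array([orientation_cooccurrence(np.random.rand(64, 32)) for _ in range(20)])
y = np.r_[np.ones(10), np.zeros(10)]                          # pedestrian / background
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
print(clf.predict(X[:3]))
```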
Registering 2D images is one of the important pre-processing steps in many computer vision applications such as 3D reconstruction and building panoramic images. Contemporary registration algorithms like SIFT (Scale-Invariant Feature Transform) are not very successful in registering images under symmetric conditions and under poor illumination using DoG (Difference of Gaussian) features. In this paper, we introduce a novel approach for registering images under symmetric conditions.
A graph-based approach for modeling and solving the LiDAR filtering problem in urban areas is established. Our method consists of three steps. In the first step, we construct a graph-based representation of the LiDAR data, where either the Delaunay triangulation or the K-nearest neighbors graph is used; given a set of features extracted from the LiDAR data, we introduce an algorithm to label the edges of this graph. In the second step, we define criteria to eliminate some of the graph edges and then use a connected components algorithm to detect the different components in the graph representation. Finally, these components are classified into terrain or objects. Different datasets with different characteristics have been used to analyze the performance of our method. We compared our method against two other methods, and the results show that our method outperforms them in most test cases.
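As a simplified illustration of the pipeline, the sketch below builds the Delaunay-graph variant, prunes edges with a large height jump, and treats the largest connected component as terrain. The height-difference threshold and the single edge criterion are assumptions; the paper's edge-labeling algorithm uses a richer feature set.

```python
# Sketch of the Delaunay-graph variant: connect points by triangulation edges,
# drop edges with a large height jump, and treat the largest connected
# component as terrain. The 0.5 m threshold is illustrative only.
import numpy as np
from scipy.spatial import Delaunay
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def filter_lidar(points, dz_max=0.5):
    xy, z = points[:, :2], points[:, 2]
    tri = Delaunay(xy)
    # Collect unique edges from the triangle list.
    edges = set()
    for a, b, c in tri.simplices:
        edges.update({tuple(sorted(e)) for e in [(a, b), (b, c), (a, c)]})
    # Keep only edges whose endpoints have a small height difference.
    keep = [(i, j) for i, j in edges if abs(z[i] - z[j]) <= dz_max]
    i, j = np.array(keep).T
    n = len(points)
    adj = coo_matrix((np.ones(len(keep)), (i, j)), shape=(n, n))
    _, labels = connected_components(adj, directed=False)
    terrain_label = np.bincount(labels).argmax()   # largest component = ground
    return labels == terrain_label                 # True for terrain points

pts = np.random.rand(500, 3) * [100, 100, 2]       # toy point cloud
pts[:50, 2] += 10                                  # raised "object" points
print(filter_lidar(pts).sum(), "terrain points of", len(pts))
```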
The evaluation and discrimination of similar objects in real versus synthetically generated aerial color images is needed for security and surveillance purposes, among other applications. Identification of appropriate discrimination metrics between real and synthetic images may also help in more robust generation of these synthetic images. In this paper, we investigate the effectiveness of three different metrics based on Gaussian blur, differential operators, and singular value decomposition (SVD) to differentiate between a pair of same objects contained in real and synthetic overhead aerial color images. We use nine pairs of images in our tests. The real images were obtained in the visible aerial color image domain. The proposed metrics are used to discriminate between pairs of real and synthetic objects such as cooling units, industrial buildings, houses, conveyors, stacks, piles, railroads, and ponds in the real and synthetically generated images, respectively. The proposed method successfully discriminates between the real and synthetic objects in aerial color images without any a priori knowledge or extra information such as optical flow. We ranked these metrics according to their effectiveness in discriminating between synthetic and real objects in overhead images.
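The snippet below sketches one plausible reading of an SVD-based discrimination score: comparing how quickly the singular value spectra of the two grayscale patches decay, since smooth synthetic textures often concentrate energy in fewer singular values. This is an illustrative assumption, not necessarily the paper's exact formulation.

```python
# Sketch of an SVD-based discrimination score: compare how quickly the singular
# value spectrum of each grayscale patch decays. One plausible reading of the
# SVD metric, not necessarily the paper's exact formulation.
import numpy as np

def sv_energy_profile(patch, k=20):
    s = np.linalg.svd(patch.astype(float), compute_uv=False)
    s = s / s.sum()                       # normalized spectrum
    return np.cumsum(s)[:k]               # cumulative energy in top-k values

def svd_discrepancy(real_patch, synth_patch, k=20):
    return np.abs(sv_energy_profile(real_patch, k)
                  - sv_energy_profile(synth_patch, k)).mean()

# Toy example: a noisy (real-like) patch vs. a smooth low-rank (synthetic-like) patch.
rng = np.random.default_rng(1)
real = rng.random((64, 64))
synth = np.outer(np.linspace(0, 1, 64), np.linspace(1, 0, 64)) + 0.01 * rng.random((64, 64))
print("discrepancy:", round(svd_discrepancy(real, synth), 4))
```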
A graph-based approach for modeling and solving the LiDAR filtering problem in urban areas is established. Our method consists of three steps. In the first step, we construct a graph-based representation of the LiDAR data; either the Delaunay triangulation or the KNN graph can be used in this step, and an algorithm is introduced to label the edges of this graph. In the second step, we define criteria to eliminate some of the graph edges and then use a connected components algorithm to separate the graph representation into different components. Finally, these components are classified into terrain or objects. Different datasets with different characteristics have been used to analyze the performance of our method.
KEYWORDS: 3D modeling, LIDAR, Cameras, Data modeling, Image segmentation, Visual process modeling, 3D image processing, Visualization, Image registration, 3D acquisition
An automated, fast, robust registration of visual images with three-dimensional data generated from light detection and ranging (LiDAR) sensor is described. The method is intended to provide a coarse registration of the images, suitable for input into other algorithms or where only a rough registration is required. Our approach consists of two steps. In the first step, two-dimensional lines are extracted as features from both the visual and the LiDAR data. In the second step, a novel and efficient search is performed to recover the external camera parameters, and thus the camera matrix (as a calibrated camera is assumed). An error metric involving line matchings is used to guide the search. We demonstrate the performance of our algorithm on both synthetic and real-world data.
Image registration is a basic step in image fusion and in many 2D applications. Registering 2D images with recent robust algorithms like SIFT (Scale-Invariant Feature Transform) works well in most situations. However, registering 2D images under poor illumination is a challenging problem. In several situations, conventional registration algorithms like SIFT fail to register the images. Aside from poor illumination conditions, images involving too much symmetry can also pose registration difficulties for conventional methods. In our approach, we overcome these limitations by using knowledge of the intrinsic camera parameters together with a new registration method to help register the features (lines) between the two overlapping images. Our approach is useful especially for registering images taken by different sensors, or by the same sensor at different times, under poor illumination conditions. Experiments were conducted on real-world environments.
KEYWORDS: Cameras, 3D modeling, Visualization, LIDAR, 3D image processing, 3D image reconstruction, Image visualization, Image segmentation, Visual process modeling, Data modeling
3D image reconstruction is desirable in many applications such as city planning, cartography and many vision applications. The accuracy of the 3D reconstruction plays a vital role in many real-world applications. We introduce a method which uses one LiDAR image and N conventional visual images to reduce the error and to build a robust registration for 3D reconstruction. In this method, we use lines as features in both the LiDAR and visual images. Our proposed system consists of two steps. In the first step, we extract lines from the LiDAR and visual images using the Hough transform. In the second step, we estimate the camera matrices using a search algorithm combined with the fundamental matrices for the visual cameras. We demonstrate our method on a synthetic model which is an idealized representation of an urban environment.
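For the line-extraction step, a minimal sketch using OpenCV's probabilistic Hough transform on an edge map is shown below; for the LiDAR side, the same call would run on an edge map of a rendered DSM/depth image. All thresholds are illustrative assumptions.

```python
# Sketch of the line-extraction step using OpenCV's probabilistic Hough
# transform. For LiDAR, the same call would run on an edge map of the rendered
# DSM/depth image. Thresholds are illustrative.
import cv2
import numpy as np

def extract_lines(gray):
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=40, maxLineGap=5)
    # Each entry is (x1, y1, x2, y2); an empty result returns None.
    return [] if lines is None else [tuple(l[0]) for l in lines]

img = np.zeros((200, 200), dtype=np.uint8)
cv2.rectangle(img, (40, 40), (160, 160), 255, 2)   # a toy "building" outline
print(len(extract_lines(img)), "line segments found")
```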
Accurate analysis of wireless capsule endoscopy (WCE) videos is vital but tedious. Automatic image analysis can expedite this task. Video segmentation of WCE into the four parts of the gastrointestinal tract is one way to assist a physician. The segmentation approach described in this paper integrates pattern recognition with statistical analysis. Initially, a support vector machine is applied to classify video frames into four classes using a combination of multiple color and texture features as the feature vector. A Poisson cumulative distribution, for which the parameter depends on the length of segments, models the prior knowledge. This prior knowledge, together with the inter-frame difference, serves as the global constraint driven by the underlying observation of each WCE video, which is fitted by a Gaussian distribution to constrain the transition probability of a hidden Markov model. Experimental results demonstrate the effectiveness of the approach.
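The sketch below covers only the first, per-frame stage: a color histogram plus a crude texture measure fed to a multi-class SVM over the four gastrointestinal classes. The Poisson-length prior and the HMM smoothing are not reproduced, and the specific features here are assumptions rather than the paper's feature set.

```python
# Sketch of the first (per-frame) stage only: a color histogram plus a crude
# texture measure fed to a multi-class SVM over the four GI-tract classes.
# The Poisson-length prior and HMM smoothing are not reproduced here.
import cv2
import numpy as np
from sklearn.svm import SVC

def frame_features(bgr):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [8, 8], [0, 180, 0, 256]).ravel()
    hist = hist / (hist.sum() + 1e-9)
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    texture = cv2.Laplacian(gray, cv2.CV_64F).var()   # coarse texture proxy
    return np.append(hist, texture)

# Toy training set: random frames with labels 0..3 (esophagus/stomach/SI/colon).
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(40, 64, 64, 3), dtype=np.uint8)
labels = np.repeat(np.arange(4), 10)
X = np.array([frame_features(f) for f in frames])
clf = SVC(kernel="rbf", gamma="scale").fit(X, labels)
print(clf.predict(X[:4]))
```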
Change detection is an important problem which plays a crucial role in many applications like environmental monitoring and city planning. The goal of change detection is to detect changes in specific features within certain time intervals. In this paper, we develop an automated method for detecting changes in urban areas over a period of time using lines and colors as features. Our proposed algorithm consists of two steps. In the first step, we detect corresponding lines between two images taken at different times and match them using our search algorithm; specifically, we first use the Hough transform to detect the lines. In the second step, we use colors to detect the changes over static and dynamic objects. In a test of the method using aerial images over our university campus area, we obtained reasonably good pose recovery and detection of scene changes.
Automatic speech processing systems are widely used in everyday life such as mobile communication, speech and
speaker recognition, and for assisting the hearing impaired. In speech communication systems, the quality and
intelligibility of speech is of utmost importance for ease and accuracy of information exchange. To obtain an
intelligible speech signal that is also more pleasant to listen to, noise reduction is essential. In this paper, a new
Time Adaptive Discrete Bionic Wavelet Thresholding (TADBWT) scheme is proposed. The proposed technique
uses the Daubechies mother wavelet to achieve better enhancement of speech corrupted by additive non-stationary noises
which occur in real life such as street noise and factory noise. Due to the integration of human auditory system
model into the wavelet transform, bionic wavelet transform (BWT) has great potential for speech enhancement
which may lead to a new path in speech processing. In the proposed technique, at first, discrete BWT is applied to
noisy speech to derive TADBWT coefficients. Then the adaptive nature of the BWT is captured by introducing a
time varying linear factor which updates the coefficients at each scale over time. This approach has shown better
performance than the existing algorithms at lower input SNR due to modified soft level dependent thresholding on
time adaptive coefficients. The objective and subjective test results confirmed the competency of the TADBWT
technique. The effectiveness of the proposed technique is also evaluated for speaker recognition task under noisy
environment. The recognition results show that the TADBWT technique yields better performance when compared
to alternative methods, specifically at lower input SNR.
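A minimal sketch of the wavelet-thresholding backbone is given below using PyWavelets with a Daubechies wavelet and level-dependent soft thresholding. The bionic wavelet transform and its time-adaptive linear factor are not reproduced; the wavelet choice, level, and noise estimate are illustrative assumptions.

```python
# Minimal sketch of level-dependent soft thresholding on a Daubechies DWT using
# PyWavelets. The bionic wavelet transform and its time-adaptive factor are not
# reproduced; this shows only the standard wavelet-thresholding backbone.
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db8", level=5):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    denoised = [coeffs[0]]                      # keep approximation coefficients
    for d in coeffs[1:]:
        sigma = np.median(np.abs(d)) / 0.6745   # noise estimate per scale
        thr = sigma * np.sqrt(2 * np.log(len(d)))
        denoised.append(pywt.threshold(d, thr, mode="soft"))
    return pywt.waverec(denoised, wavelet)

# Toy usage: a clean tone plus white noise standing in for noisy speech.
t = np.linspace(0, 1, 8000)
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * np.random.randn(t.size)
clean_est = wavelet_denoise(noisy)
print(noisy.std(), clean_est.std())
```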
KEYWORDS: 3D modeling, Cameras, LIDAR, 3D image processing, Image registration, Data modeling, Visualization, Visual process modeling, Image segmentation, Global Positioning System
We develop a robust framework for the registration of light detection and ranging (LiDAR) images with 2-D visual images using a method based on intensity gradients. Our proposed algorithm consists of two steps. In the first step, we extract lines from the digital surface model (DSM) given by the LiDAR image, then we use intensity gradients to register the extracted lines from the LiDAR image onto the visual image to roughly estimate the extrinsic parameters of the calibrated camera. In our approach, we overcome some of the limitations of 3-D reconstruction methods based on the matching of features between the two images. Our algorithm achieves an accuracy for the camera pose recovery of about 98% for the synthetic images tested, and an accuracy of about 95% for the real-world images we tested, which were from the downtown New Orleans area.
KEYWORDS: Cameras, LIDAR, 3D modeling, 3D image processing, Image segmentation, Calibration, Error analysis, Data modeling, CCD cameras, Global Positioning System
This paper focuses on the 2D-3D camera pose estimation using one LiDAR view and one calibrated camera view.
The pose estimation employs an intelligent search over the extrinsic camera parameters and uses an error metric
based on line-segment matching. The goal of this search process is to estimate the pose parameters without
any a priori knowledge and in less processing time. We demonstrate the validity of the proposed approach by
experimenting on two sets of perspective views using lines as features.
This paper explores the use of an error metric based on intensity gradients in an automatic camera pose recovery method for 2D-3D image registration. The method involves the extraction of lines from the 3D image and then uses intensity gradients to register these onto the 2D image. This approach overcomes the limitations of feature matching for registering the 2D-3D images. The goal of our algorithm is to estimate pose parameters without any a priori knowledge (such as GPS) and in less processing time. We demonstrate the validity of our approach by experimenting on perspective views using lines as features.
Image registration plays a vital role in many real-time imaging applications. Registering the images in a precise
manner is a challenging problem. In this paper, we focus on improving image registration error computation
using the projection onto convex sets (POCS) techniques which improves the sub-pixel accuracy in the images
leading to better estimates for the registration error. This can be used in turn to improve the registration
itself. The results obtained from the proposed technique match well with the ground truth which validates the
accuracy of this technique. Furthermore, the proposed technique shows better performance compared to existing
methods.
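To illustrate the POCS mechanism, the sketch below alternates projections onto two convex constraint sets, agreement with observed pixels and a Fourier-domain band limit, to restore an image from partial observations. The constraint sets, band-limit fraction, and toy data are assumptions for illustration; the paper's exact constraint sets for sub-pixel error estimation differ.

```python
# Minimal sketch of POCS as alternating projections onto two convex sets:
# (1) agreement with observed pixels, (2) a band-limit in the Fourier domain.
# Illustrates the mechanism only; the paper's exact constraint sets differ.
import numpy as np

def pocs_restore(observed, mask, iters=50, keep_frac=0.25):
    x = observed.copy()
    h, w = x.shape
    # Low-pass mask: keep only a central fraction of the spectrum.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    lowpass = (np.abs(fy) <= keep_frac / 2) & (np.abs(fx) <= keep_frac / 2)
    for _ in range(iters):
        # Projection 1: band-limit constraint.
        x = np.real(np.fft.ifft2(np.fft.fft2(x) * lowpass))
        # Projection 2: data-consistency constraint on known pixels.
        x[mask] = observed[mask]
    return x

rng = np.random.default_rng(0)
truth = np.outer(np.sin(np.linspace(0, np.pi, 64)),
                 np.cos(np.linspace(0, np.pi, 64)))   # smooth, band-limited scene
mask = rng.random((64, 64)) < 0.6                     # 60% of pixels observed
observed = np.where(mask, truth, 0.0)
print("mean error:", np.abs(pocs_restore(observed, mask) - truth).mean())
```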
This paper discusses the error analysis and performance estimation between two different mathematical
methods for registering a sequence of images taken by an airborne sensor. Here both methods use homography
matrices to obtain the panoramic image, but they use different mathematical techniques to obtain the same result. In
Method-I, we use Discrete Linear Transform and Singular Value Decomposition to obtain the homographies and in
Method-II we use the Levenberg-Marquardt algorithm as an iterative technique to re-estimate the homography in order
to obtain the same panoramic image. These two methods are analyzed and compared based on the reliability and robustness
of registration. We also compare their performance using an error metric that compares their registration accuracies
with respect to ground truth. Our results demonstrate that Levenberg-Marquardt algorithm clearly outperforms
Discrete Linear Transform algorithm.
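The core of Method-I is sketched below: estimating a homography from point correspondences via the direct linear transform solved with SVD. Method-II would refine such an estimate by minimizing reprojection error with Levenberg-Marquardt (not shown). The toy correspondences are illustrative.

```python
# Sketch of Method-I's core step: homography from point correspondences via the
# direct linear transform (DLT) solved with SVD. Method-II would refine this
# estimate by minimizing reprojection error with Levenberg-Marquardt.
import numpy as np

def dlt_homography(src, dst):
    # src, dst: (N, 2) corresponding points, N >= 4.
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)            # null-space vector = homography entries
    return H / H[2, 2]

src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
H_true = np.array([[1.2, 0.1, 5], [0.0, 0.9, 3], [0.001, 0.0, 1]])
p = np.c_[src, np.ones(4)] @ H_true.T
dst = p[:, :2] / p[:, 2:3]
print(np.round(dlt_homography(src, dst), 3))   # recovers H_true up to scale
```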
Image segmentation is one of the important applications in computer vision. In this paper, we present an image registration method that stitches multiple images into one complete view. We also demonstrate how image segmentation can be used as an error metric to evaluate image registration. This paper explains the error analysis using a pattern recognition algorithm, namely the watershed algorithm, for calculating the error in image registration applications. We compare a pixel intensity-based error metric with an object-based error metric for evaluating the registration results, and we explain in which situations the pattern recognition based metric is superior to conventional metrics such as mean square error.
Optical coherence tomography (OCT) is an interferometric, noninvasive and non-contact imaging technique that
generates images of biological tissues at micrometer scale resolution. Images obtained from the OCT process are
often noisy and of low visual contrast level. This work focuses on improving the visual contrast of OCT images
using digital enhancement and fusion techniques. Since OCT images are often corrupted with noise, our first
approach is to use the most effective noise reduction algorithm. This process is followed by a series of digital
enhancement techniques that are suitable to enhance the visual contrast of the OCT images. We also investigate any
gain in visual contrast if combined enhancement is employed. In the image fusion methods, images taken at different
depths are fused together using discrete wavelet transform (DWT) and logical fusion algorithms. We answer the
question of whether it is more efficient to enhance images before or after fusion. This work concludes by suggesting future
work needed to complement the current one.
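A minimal sketch of a standard DWT fusion rule is shown below: approximation coefficients are averaged and the larger-magnitude detail coefficients are kept, applied to two frames standing in for OCT images from different depths. The wavelet choice, level, and fusion rule are illustrative assumptions, not necessarily those used in the paper.

```python
# Sketch of a standard DWT fusion rule: average approximation coefficients,
# keep the larger-magnitude detail coefficients. Wavelet choice and level are
# illustrative, not necessarily those used in the paper.
import numpy as np
import pywt

def dwt_fuse(img_a, img_b, wavelet="db2", level=2):
    ca = pywt.wavedec2(img_a.astype(float), wavelet, level=level)
    cb = pywt.wavedec2(img_b.astype(float), wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]                       # approximation: average
    for da, db in zip(ca[1:], cb[1:]):
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(da, db)))      # details: max magnitude
    return pywt.waverec2(fused, wavelet)

a = np.random.rand(128, 128)    # stand-in for OCT frames at two depths
b = np.random.rand(128, 128)
print(dwt_fuse(a, b).shape)     # (128, 128)
```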