To tackle the problem of human trajectory prediction in complex scenes, we propose a model using hypergraph convolutional neural networks for social interaction (HGCNSI). Our model leverages a hypergraph structure to capture both high-order interactions and complex social dynamics among pedestrians, who often influence each other in a nonlinear and structured manner. First, we propose a social interaction module that improves the accuracy of interaction modeling by distinguishing between interacting and non-interacting pedestrians. Then, a hypergraph structure is constructed from the output of the social interaction module to capture the complex, nonlinear relations among multiple pedestrians. Furthermore, we exploit an improved attention mechanism, called scene-coordinates attention, that fuses spatial and temporal features and models the unique historical movement information of each person. Finally, we introduce the SIRF module, which filters trajectories within a single iteration to reduce computational complexity and improve prediction performance. We evaluate the proposed HGCNSI model on five publicly available datasets and demonstrate that it achieves state-of-the-art results. Specifically, the experiments show that our model outperforms existing methods in prediction accuracy, as measured by the average displacement error and the final displacement error.
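For readers unfamiliar with hypergraph convolution, the following is a minimal sketch of the standard normalized hypergraph convolution operator that underlies this kind of model. It is not the authors' implementation; the incidence matrix, interaction groups, and feature dimensions are illustrative assumptions.

```python
# Minimal sketch of a normalized hypergraph convolution layer (illustrative,
# not the authors' code). H is a |V| x |E| incidence matrix; dimensions below
# are placeholders.
import torch
import torch.nn as nn


class HypergraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, H):
        # x: (num_nodes, in_dim), H: (num_nodes, num_edges) binary incidence
        edge_w = torch.ones(H.size(1), device=H.device)          # hyperedge weights W
        Dv = (H * edge_w).sum(dim=1)                              # node degrees
        De = H.sum(dim=0)                                         # hyperedge degrees
        Dv_inv_sqrt = torch.diag(Dv.clamp(min=1e-12).pow(-0.5))
        De_inv = torch.diag(De.clamp(min=1e-12).pow(-1.0))
        W = torch.diag(edge_w)
        # X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Theta
        agg = Dv_inv_sqrt @ H @ W @ De_inv @ H.t() @ Dv_inv_sqrt @ x
        return torch.relu(self.theta(agg))


# Toy usage: 6 pedestrians, 3 hyperedges (interaction groups), 16-d features.
x = torch.randn(6, 16)
H = torch.zeros(6, 3)
H[[0, 1, 2], 0] = 1.0   # pedestrians 0-2 form one interaction group
H[[2, 3], 1] = 1.0
H[[4, 5], 2] = 1.0
out = HypergraphConv(16, 32)(x, H)
print(out.shape)  # torch.Size([6, 32])
```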
Betweenness centrality is a measure of node importance in networks, but conventional exact algorithms become prohibitively time-consuming as network size grows. This paper aims to enhance both the efficiency and the accuracy of betweenness centrality computation. We propose an algorithm based on shortest-path approximation and adaptive sampling. The algorithm first selects high-quality seed nodes according to degree, then approximates shortest paths, and finally chooses appropriate samples to approximate betweenness centrality. We conduct experiments on five different datasets, and the results show that our algorithm outperforms the baseline algorithms in terms of sample size and running time. Our algorithm not only reduces the computational cost effectively but also guarantees computational accuracy.
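As a point of reference, the sketch below shows a generic pivot-sampling betweenness estimator (Brandes-style single-source accumulation over a random sample of sources). It illustrates how sampling reduces the cost of exact computation, but it is not the paper's adaptive-sampling algorithm; the toy graph and sample size are placeholders.

```python
# Generic pivot-sampling betweenness estimator for unweighted graphs.
# Illustrative baseline only, not the paper's method.
import random
from collections import deque, defaultdict


def approx_betweenness(adj, num_samples, seed=0):
    rng = random.Random(seed)
    nodes = list(adj)
    bc = defaultdict(float)
    for s in rng.sample(nodes, num_samples):
        # BFS from s: shortest-path counts sigma and predecessor lists
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        q = deque([s])
        while q:
            v = q.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies (Brandes accumulation)
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Scale the sampled sum to estimate the sum over all source nodes
    scale = len(nodes) / num_samples
    return {v: bc[v] * scale for v in nodes}


# Toy usage on a small undirected path graph (adjacency list).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(approx_betweenness(adj, num_samples=3))
```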
Deep clustering algorithms based on graph convolutional networks are widely used because of their strong ability to mine network structure. However, the construction of neighborhood graphs may introduce noise and degrade the clustering results. Moreover, focusing on ordinary topology alone ignores the higher-order, attribute-level connections among data points. To address these problems, an unsupervised hypergraph convolutional clustering network (UHCCN) is proposed in this paper. We construct hypergraph structures from attributes and incorporate higher-order information into representation learning through hypergraph convolution. An attribute encoder extracts node features, which are then fused into the hypergraph convolution. Finally, representation learning and clustering are optimized jointly. Experiments validate the effectiveness and superiority of UHCCN.
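One common way to build an attribute hypergraph, shown below only as an illustration of the idea and not necessarily the paper's exact construction, is to let each node spawn a hyperedge containing itself and its k nearest neighbors in attribute space; the value of k and the distance metric are assumptions.

```python
# Sketch of attribute-based hypergraph construction via k-nearest neighbours.
# Each node i defines one hyperedge: {i} union its k closest nodes in
# attribute space. Illustrative only; not necessarily the paper's construction.
import numpy as np


def knn_hypergraph_incidence(X, k=3):
    # X: (n_nodes, n_attrs) attribute matrix -> H: (n_nodes, n_nodes) incidence
    n = X.shape[0]
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    H = np.zeros((n, n))
    for i in range(n):
        neighbours = np.argsort(dists[i])[: k + 1]   # includes node i itself
        H[neighbours, i] = 1.0                        # hyperedge i
    return H


X = np.random.default_rng(0).normal(size=(8, 4))
H = knn_hypergraph_incidence(X, k=2)
print(H.sum(axis=0))  # each hyperedge contains k + 1 nodes
```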
Attention mechanisms in image captioning models help the model focus on relevant regions while generating captions. However, existing attention mechanisms cannot identify the important regions and important visual features in an image. As a result, models sometimes pay excessive attention to unimportant regions and features during caption generation, producing coarse-grained or even incorrect captions. To address this problem, we propose an “Importance Discrimination Attention” (IDA) module, which discriminates important from non-important features and reduces the risk of being misled by non-important features while generating captions. We also propose an IDA-based image captioning model, IDANet, built entirely on the Transformer framework. The encoder of IDANet consists of two parts: a pretrained Vision Transformer (ViT), used to extract visual features efficiently, and a refining module added to the encoder to model the positional and semantic relationships among different grids. For the decoder, we propose the IDA-Decoder, whose framework is similar to that of the Transformer decoder. Guided by IDA, it focuses on crucial regions and features, rather than all regions and features, while generating the caption. Compared with other attention mechanisms, IDA captures the semantic relevance between important regions and other regions in a fine-grained and efficient way. The captions generated by IDANet accurately capture the relations among different objects and distinguish objects of similar size and shape. On the MSCOCO “Karpathy” offline test split, IDANet achieves a CIDEr-D score of 132.0 and a BLEU-4 score of 40.3.
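A minimal sketch of the underlying idea, importance-gated attention that down-weights unimportant grid features before standard scaled-dot-product attention, is given below. This is an illustrative reading of the abstract, not the published IDA module; the layer names and dimensions are assumptions.

```python
# Illustrative "importance-gated" attention: a learned gate scores each grid
# feature and suppresses low-importance regions before scaled-dot-product
# attention. Not the published IDA module; all dimensions are placeholders.
import torch
import torch.nn as nn


class ImportanceGatedAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, 1)   # per-region importance score

    def forward(self, query, regions):
        # query: (B, Lq, D) decoder states; regions: (B, Lr, D) visual grids
        importance = torch.sigmoid(self.gate(regions))   # (B, Lr, 1)
        gated = regions * importance                      # suppress unimportant grids
        attn = torch.softmax(
            self.q(query) @ self.k(gated).transpose(-2, -1) / regions.size(-1) ** 0.5,
            dim=-1,
        )
        return attn @ self.v(gated)


# Toy usage: 5 decoding steps attending over a 7x7 grid of 64-d features.
out = ImportanceGatedAttention(64)(torch.randn(2, 5, 64), torch.randn(2, 49, 64))
print(out.shape)  # torch.Size([2, 5, 64])
```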
To address the occlusion of depth information in X-ray images and the difficulty of detecting small-scale contraband, an improved prohibited item detection network based on YOLOX is proposed. First, a material-aware atrous convolution module (MACM) is added to the feature pyramid network to enhance the model’s multiscale fusion and its ability to extract material information from X-ray images. Second, a spatial pyramid split attention mechanism (SPSA) is proposed to fuse spatial and channel attention over spatial features at different scales. Finally, the CutMix data augmentation strategy is adopted to improve the robustness of the model. Overall performance experiments were conducted on the publicly available OPIXray dataset. The method achieves a mean average precision (mAP) of 93.10%, an improvement of 3.25% over the baseline YOLOX model. The experimental results show that our method achieves state-of-the-art detection accuracy compared with existing methods.
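The MACM description suggests parallel atrous (dilated) convolutions at multiple rates. The sketch below shows a generic ASPP-style multi-rate block of that kind; it is not the paper's exact module, and the channel counts and dilation rates are assumptions.

```python
# Generic multi-rate atrous (dilated) convolution block: parallel 3x3
# convolutions with different dilation rates, concatenated and fused by a
# 1x1 convolution. Illustrative only; not the paper's MACM.
import torch
import torch.nn as nn


class MultiRateAtrousBlock(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        )
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        feats = [torch.relu(b(x)) for b in self.branches]   # same spatial size
        return self.fuse(torch.cat(feats, dim=1))


# Toy usage on a 40x40 feature map with 256 input channels.
y = MultiRateAtrousBlock(256, 128)(torch.randn(1, 256, 40, 40))
print(y.shape)  # torch.Size([1, 128, 40, 40])
```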