9 September 2020 Spatial–temporal graph attention networks for skeleton-based action recognition
Qingqing Huang, Fengyu Zhou, Jiakai He, Yang Zhao, Runze Qin
Abstract

Skeleton-based human action recognition has attracted wide attention. Because skeleton data are naturally structured as a graph, most researchers model skeleton sequences with graph convolutional networks (GCN). However, graph convolution shares the same weight across all neighbor nodes and depends on the fixed edge connections of the graph. We introduce spatial–temporal graph attention networks (ST-GAT) to overcome these disadvantages of GCN. First, ST-GAT defines the spatial–temporal neighbor nodes of each root node and an aggregation function based on the attention mechanism. The adjacency matrix is used only to define which nodes are related; the association weights themselves are computed from the feature representations of the nodes. ST-GAT then applies the learned attention coefficient to each neighbor node to automatically learn representations of spatiotemporal skeletal features and output the classification result. Extensive experiments on two challenging datasets consistently demonstrate the superiority of our method.
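The core idea the abstract describes — using the adjacency matrix only to select related nodes, while computing aggregation weights from node features via attention — can be illustrated with a minimal NumPy sketch of single-head graph attention. The function names, the LeakyReLU slope, and the shapes below are illustrative assumptions, not the authors' exact ST-GAT implementation:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_aggregate(h, W, a, adj, i):
    """Attention-weighted aggregation for root node i (illustrative sketch).

    h   : (N, F)  node features (e.g., joint coordinates over a window)
    W   : (F, F2) shared linear transform
    a   : (2*F2,) attention vector
    adj : (N, N)  adjacency; it only defines WHICH nodes are neighbors --
                  the weights themselves come from the node features.
    """
    z = h @ W                              # transformed node features
    neigh = np.where(adj[i] > 0)[0]        # spatial-temporal neighbors of i
    # unnormalized attention: LeakyReLU(a^T [z_i || z_j]) for each neighbor j
    pairs = np.concatenate([np.tile(z[i], (len(neigh), 1)), z[neigh]], axis=1)
    e = pairs @ a
    e = np.where(e > 0, e, 0.2 * e)        # LeakyReLU, slope 0.2 (assumed)
    alpha = softmax(e)                     # attention coefficients, sum to 1
    return alpha @ z[neigh]                # weighted aggregation of neighbors
```

Unlike plain graph convolution, two neighbors of the same root node receive different weights here whenever their features differ, which is the property the abstract highlights.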

© 2020 SPIE and IS&T 1017-9909/2020/$28.00
Qingqing Huang, Fengyu Zhou, Jiakai He, Yang Zhao, and Runze Qin "Spatial–temporal graph attention networks for skeleton-based action recognition," Journal of Electronic Imaging 29(5), 053003 (9 September 2020). https://doi.org/10.1117/1.JEI.29.5.053003
Received: 10 March 2020; Accepted: 28 August 2020; Published: 9 September 2020
CITATIONS
Cited by 7 scholarly publications.
KEYWORDS: Head, RGB color model, Convolution, Data modeling, Video, Performance modeling, Neural networks