ABSTRACT
The paper proposes a technique for Human Action Recognition (HAR) based on a Convolutional Neural Network (CNN). Depth data sequences from motion-sensing devices are converted into images and fed to a CNN, rather than being processed with conventional or statistical methods. Data were obtained from 10 actions performed by six subjects, captured with the Kinect v2 sensor, and from 20 actions performed by seven subjects in the MSR Action3D data set. A custom CNN architecture consisting of three convolutional layers, each followed by max pooling, and a fully connected layer was used. Training, validation, and testing were carried out on a total of 39,715 images. The model achieved an accuracy of 97.23% on the Kinect data set and 87.1% on the MSR data set.
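The described architecture (three convolutional layers, each followed by max pooling, then a fully connected classifier) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the input resolution (64×64 grayscale depth images), channel widths, and kernel sizes are assumptions not stated in the abstract.

```python
import torch
import torch.nn as nn

class DepthActionCNN(nn.Module):
    """Sketch of a three-conv / three-maxpool CNN for depth-image HAR.

    Hyperparameters (input size 1x64x64, filter counts 32/64/128,
    3x3 kernels) are illustrative assumptions, not the paper's values.
    """

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 64 -> 32
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 32 -> 16
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 16 -> 8
        )
        # Fully connected layer maps the flattened feature map to class logits.
        self.classifier = nn.Linear(128 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = DepthActionCNN(num_classes=10)  # 10 Kinect actions
logits = model(torch.randn(4, 1, 64, 64))  # batch of 4 depth images
print(logits.shape)  # torch.Size([4, 10])
```

For the MSR Action3D experiments, the same network would simply be instantiated with `num_classes=20`.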