HActivityNet: A Deep Convolutional Neural Network for Human Activity Recognition


  • Md. Khaliluzzaman Dept. of Computer Science and Engineering, International Islamic University Chittagong (IIUC), Chattogram-4318, Bangladesh
  • Md. Abu Bakar Siddiq Sayem Dept. of Computer Science and Engineering, International Islamic University Chittagong (IIUC), Chattogram-4318, Bangladesh
  • Lutful KaderMisbah Dept. of Computer Science and Engineering, International Islamic University Chittagong (IIUC), Chattogram-4318, Bangladesh
Keywords: Human activity recognition (HAR), convolutional neural network (CNN), KTH dataset, computer vision, vanishing gradient problem


Human Activity Recognition (HAR), a broad area of computer vision research, has gained prominence in recent years because of its applications in various fields. Because human activity varies widely in action and interaction, and recognizing it requires large amounts of data and powerful computational resources, it is very difficult to recognize human activities from an image. To reduce the computational cost and address the vanishing gradient problem, in this work we propose a revised, simple convolutional neural network (CNN) model named Human Activity Recognition Network (HActivityNet) that automatically extracts and learns features and recognizes activities in a rapid, precise, and consistent manner. To address the problem of imbalanced positive and negative data, we created two datasets: HARDataset1, built from image frames extracted from the KTH dataset, and HARDataset2, prepared from frames of activity videos performed by us. Comprehensive experiments show that our model performs better than current state-of-the-art models. The proposed model attains an accuracy of 99.5% on HARDataset1 and almost 100% on HARDataset2. The proposed model also performed well on real data.
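
The abstract states that the datasets were built from extracted video frames to counter class imbalance, but it does not describe the exact procedure. As a rough illustration only, the sketch below shows one plausible way to plan a balanced extraction: every activity class contributes the same number of frames, chosen at evenly spaced positions. The function names, the per-class frame counts, and the rule that the smallest class sets the quota are all assumptions made for this example, not details taken from the paper.

```python
def evenly_spaced_indices(n_frames: int, n_keep: int) -> list[int]:
    """Return n_keep frame indices spread evenly over range(n_frames)."""
    if n_keep <= 0 or n_frames <= 0:
        return []
    n_keep = min(n_keep, n_frames)
    step = n_frames / n_keep  # step >= 1, so indices never repeat
    return [int(i * step) for i in range(n_keep)]


def balanced_frame_plan(frames_per_class: dict[str, int]) -> dict[str, list[int]]:
    """Select the same number of frames per class.

    Assumption: the smallest class sets the quota for all classes.
    """
    quota = min(frames_per_class.values())
    return {
        label: evenly_spaced_indices(n_frames, quota)
        for label, n_frames in frames_per_class.items()
    }


# Hypothetical frame counts for the six KTH activity classes (illustrative only).
counts = {
    "walking": 1200, "jogging": 900, "running": 600,
    "boxing": 1500, "handwaving": 1400, "handclapping": 1000,
}
plan = balanced_frame_plan(counts)
```

With a plan like this, each class would supply an equal number of training images, and the selected frames would span each video rather than cluster at its start. Other balancing strategies (for example, class weighting or augmentation) could serve the same purpose.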




How to Cite
Khaliluzzaman, M., Md. Abu Bakar Siddiq Sayem, & Lutful KaderMisbah. (2021). HActivityNet: A Deep Convolutional Neural Network for Human Activity Recognition. EMITTER International Journal of Engineering Technology, 9(2), 357-376. https://doi.org/10.24003/emitter.v9i2.642