Improving 3D Human Pose Orientation Recognition Through Weight-Voxel Features And 3D CNNs

  • Moch. Iskandar Riansyah (Institut Teknologi Sepuluh Nopember; Institut Teknologi Telkom Surabaya)
  • Oddy Virgantara Putra (Institut Teknologi Sepuluh Nopember; Universitas Darussalam Gontor)
  • Farah Zakiyah Rahmanti (Institut Teknologi Sepuluh Nopember; Institut Teknologi Telkom Surabaya)
  • Ardyono Priyadi (Institut Teknologi Sepuluh Nopember)
  • Diah Puspito Wulandari (Institut Teknologi Sepuluh Nopember)
  • Tri Arief Sardjono (Institut Teknologi Sepuluh Nopember)
  • Eko Mulyanto Yuniarno (Institut Teknologi Sepuluh Nopember)
  • Mauridhi Hery Purnomo (Institut Teknologi Sepuluh Nopember)
Keywords: 3D CNN, Weighted, Voxel, Human Orientation, Classification

Abstract

Preprocessing is widely used in deep learning and has been applied to both 2D and 3D computer vision tasks. In this research, we propose a weighting-based preprocessing technique, combined with a 3D CNN architecture, to improve classification performance. Unlike regular voxel preprocessing, which marks each voxel as zero or one (binary), weighting incorporates stronger structural information into the voxels. The 3D data are first represented as voxels and then weighted before entering the core 3D CNN architecture. We evaluate the approach on public data, such as the KITTI dataset, and on self-collected 3D human orientation data with four classes, using five 3D CNN architectures: VGG16, ResNet50, ResNet50v2, DenseNet121, and VoxNet. Among the five architectures tested, the 3D VGG16 architecture with this preprocessing improves accuracy and reduces error in 3D human orientation classification compared with no preprocessing or with other preprocessing methods applied to the 3D voxel data. The results show better accuracy and loss in 3D object classification than specific preprocessing methods such as binary processing within each voxel.
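The contrast between binary and weighted voxels can be illustrated with a small sketch. The abstract does not state how the weights are computed, so the code below makes an assumption for illustration only: each occupied voxel is weighted by its point count, normalized by the largest count in the grid. The function names, the sparse dictionary representation, and the `voxel_size` parameter are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Voxel = Tuple[int, int, int]
Point = Tuple[float, float, float]


def voxel_counts(points: Iterable[Point], voxel_size: float) -> Counter:
    """Count how many points fall into each cubic voxel of edge voxel_size."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    counts: Counter = Counter()
    for x, y, z in points:
        # Floor division maps each coordinate to an integer voxel index.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        counts[key] += 1
    return counts


def binary_voxels(points: Iterable[Point], voxel_size: float) -> Dict[Voxel, float]:
    """Zero-one occupancy: every voxel containing at least one point gets 1.0."""
    return {voxel: 1.0 for voxel in voxel_counts(points, voxel_size)}


def weighted_voxels(points: Iterable[Point], voxel_size: float) -> Dict[Voxel, float]:
    """ASSUMED weighting (not the paper's): point count divided by the peak count."""
    counts = voxel_counts(points, voxel_size)
    if not counts:
        return {}
    peak = max(counts.values())
    return {voxel: n / peak for voxel, n in counts.items()}


if __name__ == "__main__":
    cloud = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (0.4, 0.4, 0.4),
             (0.6, 0.1, 0.1), (1.2, 0.1, 0.1), (1.3, 0.2, 0.2)]
    print("binary:  ", binary_voxels(cloud, 0.5))
    print("weighted:", weighted_voxels(cloud, 0.5))
```

Under this assumed scheme, dense and sparse regions of the point cloud receive different values, whereas the binary grid assigns the same value to every occupied voxel. A dense array built from either dictionary could then serve as the single-channel input to a 3D CNN.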

Published
2025-06-16
How to Cite
Riansyah, M. I., Putra, O. V., Rahmanti, F. Z., Priyadi, A., Wulandari, D. P., Sardjono, T. A., Yuniarno, E. M., & Hery Purnomo, M. (2025). Improving 3D Human Pose Orientation Recognition Through Weight-Voxel Features And 3D CNNs. EMITTER International Journal of Engineering Technology, 13(1), 22-36. https://doi.org/10.24003/emitter.v13i1.847