Volume 2, Issue 4, IECE Transactions on Internet of Things
Academic Editor
Jinchao Chen
Northwestern Polytechnical University, China
IECE Transactions on Internet of Things, Volume 2, Issue 4, 2024: 83-94

Free to Read | Research Article | 08 December 2024
Optimized CNNs for Rapid 3D Point Cloud Object Recognition
1 College of Engineering, Northeastern University, Boston 02115, MA, United States
2 University of Pennsylvania, Philadelphia 19104, PA, United States
3 School of Electrical Engineering and Computer Science, Oregon State University, Corvallis 97333, OR, United States
4 Carnegie Mellon University, College of Engineering, Pittsburgh 15213, PA, United States
5 George Washington University, Washington 20052, DC, United States
6 Georgia Institute of Technology, Atlanta 30332, GA, United States
7 Faculty of Management, McGill University, Montreal H3B0C7, QC, Canada
8 Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh 15213, PA, United States
* Corresponding Author: Yiping Dong, [email protected]
Received: 09 October 2024, Accepted: 19 November 2024, Published: 08 December 2024  
Abstract
This study introduces a method for efficiently detecting objects in 3D point clouds using convolutional neural networks (CNNs). The approach uses a feature-centric voting mechanism to build convolutional layers that exploit the sparsity typical of the input data. We examine the trade-off between accuracy and speed across several network architectures and propose an L1 penalty on filter activations to increase sparsity in intermediate layers. This work is the first to propose combining sparse convolutional layers with L1 regularization for processing large-scale 3D data. We demonstrate the method on the MVTec 3D-AD object detection benchmark. The Vote3Deep models, with just three layers, outperform the previous state of the art among both laser-only and combined laser-vision methods while maintaining competitive processing speeds. These results indicate that the approach can substantially improve detection performance while remaining computationally efficient enough for real-time applications.
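
To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single-channel voxel grid and a cubic kernel of odd size; the function names (`dense_conv3d`, `voting_conv3d`, `l1_penalty`) and the `weight` parameter are hypothetical. In the voting version, only non-zero cells do any work: each one casts weighted votes into its neighbourhood, and the result matches a dense zero-padded convolution.

```python
import numpy as np


def dense_conv3d(grid, kernel):
    """Reference: zero-padded 'same' cross-correlation that visits every cell."""
    k = kernel.shape[0]
    r = k // 2
    padded = np.pad(grid, r)  # zero padding on all sides
    out = np.zeros(grid.shape, dtype=float)
    for x, y, z in np.ndindex(*grid.shape):
        out[x, y, z] = np.sum(padded[x:x + k, y:y + k, z:z + k] * kernel)
    return out


def voting_conv3d(grid, kernel):
    """Feature-centric voting: each non-zero cell votes into its neighbours.

    The votes are weighted by the spatially flipped kernel, so the output
    equals dense_conv3d while the cost scales with the number of occupied cells.
    """
    k = kernel.shape[0]
    r = k // 2
    flipped = kernel[::-1, ::-1, ::-1]
    out = np.zeros(grid.shape, dtype=float)
    for q in np.argwhere(grid != 0):
        # Destination window around q, clipped to the grid bounds.
        lo = np.maximum(q - r, 0)
        hi = np.minimum(q + r + 1, grid.shape)
        # Matching window inside the flipped kernel.
        f_lo = lo - q + r
        f_hi = hi - q + r
        out[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] += (
            grid[tuple(q)]
            * flipped[f_lo[0]:f_hi[0], f_lo[1]:f_hi[1], f_lo[2]:f_hi[2]]
        )
    return out


def l1_penalty(activations, weight):
    """L1 term added to the training loss to push activations toward zero.

    Summing absolute values is an assumption; a mean would also be plausible.
    """
    return weight * np.abs(activations).sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = np.zeros((8, 8, 8))
    grid[1, 2, 3] = 1.0   # a mostly empty grid, as with point cloud input
    grid[7, 7, 7] = -0.5
    kernel = rng.normal(size=(3, 3, 3))
    diff = np.abs(voting_conv3d(grid, kernel) - dense_conv3d(grid, kernel)).max()
    print("max abs difference:", diff)
```

The dense loop touches every cell regardless of content, whereas the voting loop runs once per occupied cell, which is why sparser intermediate activations (encouraged by the L1 term) would translate into less work in later layers.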


Keywords
object detection
L1 penalty
point cloud
MVTec 3D-AD

Funding
This work received no external funding.

Cite This Article
APA Style
Lyu, T., Gu, D., Chen, P., Jiang, Y., Zhang, Z., Pang, H., Zhou, L., & Dong, Y. (2024). Optimized CNNs for Rapid 3D Point Cloud Object Recognition. IECE Transactions on Internet of Things, 2(4), 83–94. https://doi.org/10.62762/TIOT.2024.758153

Publisher's Note
IECE stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions
Institute of Emerging and Computer Engineers (IECE) or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

IECE Transactions on Internet of Things

ISSN: 2996-9298 (Online)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/iece/

Copyright © 2024 Institute of Emerging and Computer Engineers Inc.