Scaling AI with Limited Labeled Data: A Self-Supervised Learning Approach

Praveen Kumar Myakala

doi:10.62762/TETAI.2025.607708

Volume 2, Issue 1, IECE Transactions on Emerging Topics in Artificial Intelligence

Academic Editor

Aditya Kumar Sahu

Amrita School of Computing, India

IECE Transactions on Emerging Topics in Artificial Intelligence, Volume 2, Issue 1, 2025: 26-35

Free to Read | Research Article | 15 March 2025

Praveen Kumar Myakala 1 *

1 University of Colorado Boulder, Boulder, CO 80309, United States

* Corresponding Author: Praveen Kumar Myakala, [email protected]

Received: 09 February 2025, Accepted: 01 March 2025, Published: 15 March 2025

Abstract

The scalability of modern AI is fundamentally limited by the availability of labeled data. Supervised learning achieves remarkable performance, but it relies on large annotated datasets that are expensive and time-consuming to acquire. This work explores self-supervised learning (SSL) as a way to scale AI effectively in data-scarce scenarios. The proposed SSL framework is evaluated on the EuroSAT dataset, a benchmark for land cover classification in which labeled data is limited and costly. The approach combines contrastive learning with multi-spectral augmentations, such as spectral jittering and band shuffling, and masked autoencoding that applies spatial-spectral masking based on local variance in the spectral bands, so that it captures both the spatial and the spectral characteristics of EuroSAT imagery. Experimental results show that the proposed SSL-based models achieve 81.2% accuracy with only 10% of the labeled data, outperforming supervised learning by 2.7% and semi-supervised methods by 2.1%. These results highlight the potential of SSL to reduce annotation burdens and reliance on labeled data, paving the way for more scalable, accessible, and cost-effective AI deployment in data-constrained environments.
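
To make the ingredients named in the abstract more concrete, the following is a minimal NumPy sketch of spectral jittering, band shuffling, variance-based patch masking, and a contrastive objective. It is an illustration under stated assumptions, not the author's implementation: the function names, the jitter strength, the number of shuffled bands, the patch size, the mask ratio, the choice to mask the highest-variance patches, and the simplified one-directional InfoNCE loss are all hypothetical choices that the abstract does not specify. The only dataset fact used is that EuroSAT multispectral images have 13 bands of 64 x 64 pixels.

```python
import numpy as np

def spectral_jitter(img, rng, strength=0.1):
    """Scale each band by a random factor in [1 - strength, 1 + strength]."""
    scale = rng.uniform(1.0 - strength, 1.0 + strength, size=(img.shape[0], 1, 1))
    return img * scale

def band_shuffle(img, rng, n_swap=2):
    """Permute a randomly chosen subset of n_swap bands among themselves."""
    out = img.copy()
    idx = rng.choice(img.shape[0], size=n_swap, replace=False)
    out[idx] = img[rng.permutation(idx)]
    return out

def variance_mask(img, patch=8, mask_ratio=0.5):
    """Return a boolean patch grid; True marks patches selected for masking.

    Assumption: patches with the highest local variance (averaged over
    bands) are masked. The abstract only says masking is variance-based.
    """
    bands, h, w = img.shape
    gh, gw = h // patch, w // patch
    patches = img[:, :gh * patch, :gw * patch].reshape(bands, gh, patch, gw, patch)
    score = patches.var(axis=(2, 4)).mean(axis=0)       # shape (gh, gw)
    n_mask = int(round(mask_ratio * gh * gw))
    mask = np.zeros(gh * gw, dtype=bool)
    mask[np.argsort(score.ravel())[::-1][:n_mask]] = True
    return mask.reshape(gh, gw)

def info_nce(z1, z2, temperature=0.5):
    """One-directional InfoNCE: row i of z1 should match row i of z2."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Usage on a synthetic stand-in for one EuroSAT image (13 bands, 64 x 64).
rng = np.random.default_rng(0)
img = rng.random((13, 64, 64))
view_a = band_shuffle(spectral_jitter(img, rng), rng)   # first augmented view
view_b = band_shuffle(spectral_jitter(img, rng), rng)   # second augmented view
mask = variance_mask(img)                               # 8 x 8 grid of patches
```

In a full pipeline, an encoder would map `view_a` and `view_b` to embeddings that are passed to `info_nce`, while a masked autoencoder would reconstruct the patches flagged in `mask`. Both networks are omitted here because the abstract does not describe their architecture.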

Keywords

self-supervised learning (SSL)

limited labeled data

data-scarce scenarios

contrastive learning

masked autoencoding

scalable AI

Data Availability Statement

Data will be made available on request.

Funding

This work received no external funding.

Conflicts of Interest

The author declares no conflicts of interest.

Ethical Approval and Consent to Participate

Not applicable.

Cite This Article

APA Style

Myakala, P.K. (2025). Scaling AI with Limited Labeled Data: A Self-Supervised Learning Approach. IECE Transactions on Emerging Topics in Artificial Intelligence, 2(1), 26–35. https://doi.org/10.62762/TETAI.2025.607708

Publisher's Note

IECE stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Institute of Emerging and Computer Engineers (IECE) or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

IECE Transactions on Emerging Topics in Artificial Intelligence

ISSN: 3066-1676 (Online) | ISSN: 3066-1668 (Print)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/iece/
