IECE Transactions on Intelligent Systematics, 2024, Volume 1, Issue 3: 190-202

Free Access | Research Article | 12 November 2024
1 Department of Computer Science, University of Peshawar, Pakistan
2 School of Electronic and Control Engineering, Chang’an University, Xi’an 710064, China
3 School of Computer Science and Technology, Zhejiang Gongshang University, Hangzhou 310018, China
4 School of Mathematics and Statistics, Zhejiang Gongshang University, Hangzhou 310018, China
5 Department of Computer Science and Bioinformatics, Khushal Khan Khattak University Karak, Pakistan
6 Department of Health Science and Technology, Gachon University, Incheon 21936, Republic of Korea
7 Gachon Advanced Institute for Health Sciences and Technology, Gachon University, Incheon 21936, Republic of Korea
8 Department of Computer Science and Information Technology, University of Malakand, Chakdara, Pakistan
9 School of International Education, Zhejiang Gongshang University, Hangzhou 310018, China
* Corresponding author: Samsonova Diana, email: [email protected]
Received: 01 October 2024, Accepted: 08 November 2024, Published: 12 November 2024  

Abstract
Accurately estimating effort for software development projects is a critical challenge for project managers (PMs) and researchers. A common obstacle is missing values in project datasets, which complicate effort estimation (EE). Several models have been introduced to address this issue, but none has proven entirely effective. The Analogy-Based Effort Estimation (ABEE) model, which relies on historical project data, is the most widely used approach. However, the common practice of deleting cases or cells with missing observations reduces statistical power and degrades the performance of ABEE, leading to inefficiency and bias. This study employs the Multiple Imputation (MI) technique to handle missing data by filling in incomplete cases. The original and imputed ISBSG datasets are compared for both small- and large-scale projects, and MI is evaluated against six other imputation techniques to identify the most effective method for ABEE. The results show that MI improves effort estimation, yielding more accurate and efficient outcomes while preserving valuable information throughout the project estimation process.
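To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It combines a simplified form of multiple imputation with an analogy-based estimate. The data, function names, and parameters (`HISTORY`, `impute_once`, `abee_estimate`, `mi_abee`, `m`, `k`) are all hypothetical. Missing features are filled by random draws from the observed values of the same column (a hot-deck stand-in for the model-based draws a full MI procedure would use), each completed dataset yields a k-nearest-analogy estimate, and the estimates are pooled by averaging.

```python
import math
import random
from statistics import mean

# Hypothetical toy history: (size_fp, team_size, effort_hours).
# None marks a missing value; these numbers are illustrative only.
HISTORY = [
    (100.0, 4.0, 800.0),
    (120.0, None, 950.0),
    (300.0, 10.0, 2600.0),
    (None, 9.0, 2400.0),
    (150.0, 5.0, 1200.0),
]

def impute_once(rows, rng):
    """Fill each missing cell with a random draw from that column's observed values."""
    n_cols = len(rows[0])
    observed = [[r[c] for r in rows if r[c] is not None] for c in range(n_cols)]
    return [
        tuple(v if v is not None else rng.choice(observed[c]) for c, v in enumerate(r))
        for r in rows
    ]

def abee_estimate(rows, target, k=2):
    """Analogy-based estimate: mean effort of the k most similar past projects."""
    n_feat = len(target)
    # Min-max scale each feature so no single feature dominates the distance.
    lo = [min(r[c] for r in rows) for c in range(n_feat)]
    hi = [max(r[c] for r in rows) for c in range(n_feat)]

    def scale(v, c):
        return 0.0 if hi[c] == lo[c] else (v - lo[c]) / (hi[c] - lo[c])

    def dist(r):
        return math.sqrt(
            sum((scale(r[c], c) - scale(target[c], c)) ** 2 for c in range(n_feat))
        )

    nearest = sorted(rows, key=dist)[:k]
    return mean(r[-1] for r in nearest)  # effort is the last column

def mi_abee(rows, target, m=5, k=2, seed=0):
    """Build m imputed datasets, estimate on each, and pool by averaging."""
    rng = random.Random(seed)
    estimates = [abee_estimate(impute_once(rows, rng), target, k) for _ in range(m)]
    return mean(estimates), estimates

pooled, per_run = mi_abee(HISTORY, target=(130.0, 5.0))
print(f"pooled estimate: {pooled:.1f} hours from {len(per_run)} imputed datasets")
```

Unlike listwise deletion, which would discard the two incomplete projects here, every historical project remains available as a candidate analogy in each imputed dataset. The spread of the per-run estimates gives a rough sense of the uncertainty introduced by the missing values.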

Graphical Abstract
Improving Effort Estimation Accuracy in Software Development Projects Using Multiple Imputation Techniques for Missing Data Handling

Keywords
analogy-based effort estimation
multiple imputation
software development effort estimation

Cite This Article
APA Style
Hayat, S., Akbar, W., Hussain, T., Haq, M. I. U., Hussian, A., Khalil, I., Khan, M. M., & Diana, S. (2024). Improving Effort Estimation Accuracy in Software Development Projects Using Multiple Imputation Techniques for Missing Data Handling. IECE Transactions on Intelligent Systematics, 1(3), 190-202. https://doi.org/10.62762/TIS.2024.751418


Publisher's Note
IECE stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions
IECE or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
ISSN: 2998-3355 (Online) | ISSN: 2998-3320 (Print)
Copyright © 2024 Institute of Emerging and Computer Engineers Inc.