CROSS-PROJECT SOFTWARE DEFECT PREDICTION USING ENSEMBLE LEARNING: A COMPREHENSIVE REVIEW
Keywords:
Software defect prediction, cross-project defect prediction, ensemble learning
Abstract
Software defect prediction (SDP) is a key task in software engineering that aims to identify potential defects in software projects in order to improve quality and reduce maintenance costs. Traditional defect prediction models are often limited by the scarcity of labeled data within a single project. Cross-project defect prediction (CPDP) addresses this limitation by leveraging data from other projects. This paper surveys the state of the art in CPDP, with a particular focus on ensemble learning methods. Ensemble learning, which combines multiple models to improve prediction accuracy, shows considerable promise for improving CPDP performance. The paper systematically reviews ensemble learning methods, including bagging, boosting, and stacking, and their applications in CPDP. It discusses key challenges such as data heterogeneity, model transferability, and the choice of evaluation metrics. It also highlights recent advances, presents a comparative analysis, and outlines future research directions for CPDP using ensemble learning. The aim is to provide a thorough understanding of the current state of the field and to guide future research toward more robust and accurate CPDP models.
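To make the cross-project setting concrete, the following minimal sketch illustrates bagging, one of the three ensemble families named above, under simplifying assumptions. The two software metrics (lines of code and cyclomatic complexity), all data values, and the helper names `fit_stump`, `fit_bagging`, and `predict_bagging` are hypothetical and chosen only for illustration; they are not taken from any study covered by this review. The base learner is a one-feature threshold rule (a decision stump), trained on pooled source-project data and applied to a separate target project.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Return the (feature, threshold) pair with the fewest training errors.

    The stump predicts 'defective' (1) when the chosen metric exceeds the threshold.
    """
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            errors = sum(int(r[f] > t) != y for r, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    return best[1], best[2]


def predict_stump(stump, row):
    f, t = stump
    return int(row[f] > t)


def fit_bagging(rows, labels, n_models=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Combine the stumps by majority vote."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical source data pooled from two projects: [lines of code, cyclomatic complexity].
source_rows = [
    [120, 3], [450, 12], [80, 2], [600, 15], [200, 5], [520, 14],  # project A (made up)
    [90, 4], [700, 20], [150, 3], [480, 11], [60, 1], [550, 13],   # project B (made up)
]
source_labels = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]  # 1 = defective

# Hypothetical target project: no target labels are used for training.
target_rows = [[100, 2], [650, 18], [130, 4], [500, 16]]

models = fit_bagging(source_rows, source_labels)
predictions = [predict_bagging(models, row) for row in target_rows]
print(predictions)
```

The sketch deliberately ignores the data heterogeneity problem discussed in this review: it assumes that source and target projects share the same metrics on comparable scales, which real CPDP data often do not. Boosting and stacking differ in how the base models are trained and combined, but the cross-project train and test split would be organized in the same way.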