Random Forest and Extreme Gradient Boosting with Bayesian Hyperparameter Optimization for Landslide Susceptibility Mapping in Penang Island, Malaysia
- 1 Faculty of Computer Science and Information Technology, University of Malaysia Sarawak, Malaysia
- 2 School of Science and Technology, International University College of Advanced Technology Sarawak, Malaysia
- 3 Data Science Department of Jain University, Bangalore, India
- 4 Faculty of Computing and Informatics, Universiti Malaysia Sabah (UMS), Sabah, Malaysia
Abstract
Landslide susceptibility models often face challenges of overfitting and overestimation. This research focuses on improving the predictive capabilities of the Extreme Gradient Boosting (XGBoost) and Random Forest (RF) algorithms by applying Bayesian Hyperparameter Optimization (BayesOpt). Penang Island, a region in Malaysia prone to frequent landslides, was chosen as the study area. Ten Landslide Conditioning Factors (LCFs), including elevation, slope angle, NDVI, and proximity to streams and roads, were derived using Geographic Information Systems (GIS). The 886 landslide and non-landslide data points were split 70:30 into training and testing sets, respectively. BayesOpt-RF emerged as the top-performing model among all those assessed, with an AUC of 99.50% for the Success Rate (SR) and 95.80% for the Prediction Rate (PR). RF (SR: 100.00%, PR: 95.60%), XGBoost (SR: 100.00%, PR: 95.20%), and BayesOpt-XGBoost (SR: 96.70%, PR: 93.00%) followed. While BayesOpt did not consistently improve prediction performance, it effectively minimized overfitting and ensured optimal model operation. The generated landslide susceptibility maps are significant for effective site selection, infrastructure planning, and disaster mitigation.
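The evaluation scheme described in the abstract can be illustrated with a minimal sketch. It assumes that the Success Rate is the AUC computed on the training portion and the Prediction Rate is the AUC computed on the testing portion, which is the common convention in landslide susceptibility studies but is not spelled out above. The data, the balanced class ratio, the toy scores, and the helper names `auc` and `split_70_30` are all hypothetical; the toy scores merely stand in for the output of a trained RF or XGBoost model and are not the authors' implementation.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        # Find the run of tied scores and give each the average 1-based rank.
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def split_70_30(rows, seed=0):
    """Shuffle the rows and return (training 70%, testing 30%)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical inventory: 886 points, assumed balanced between
# landslide (1) and non-landslide (0). The Gaussian score is a
# placeholder for a model's predicted susceptibility.
rng = random.Random(42)
points = []
for i in range(886):
    label = i % 2
    score = rng.gauss(0.6 if label == 1 else 0.4, 0.15)
    points.append((label, score))

train, test = split_70_30(points)

# Assumed convention: SR = AUC on training data, PR = AUC on testing data.
success_rate = auc([y for y, _ in train], [s for _, s in train])
prediction_rate = auc([y for y, _ in test], [s for _, s in test])

print(f"training points: {len(train)}, testing points: {len(test)}")
print(f"Success Rate (AUC, train): {success_rate:.3f}")
print(f"Prediction Rate (AUC, test): {prediction_rate:.3f}")
```

Under this convention, a large gap between the two values (for example an SR of 100% against a noticeably lower PR) is one sign of overfitting, which is the effect the hyperparameter optimization is reported to reduce.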
DOI: https://doi.org/10.3844/jcssp.2025.2273.2291
Copyright: © 2025 Dorothy Anak Martin Atok, Soo See Chai, Kok Luong Goh, Neha Gautam and Kim On Chin. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Bayesian Hyperparameter Optimization
- Extreme Gradient Boosting
- Landslide Susceptibility Mapping
- Random Forest