Optimizing LSTM networks and feature selection algorithms using GEE data.
Feature selection uncertainty and suboptimal model configuration are critical challenges in flood susceptibility mapping (FSM) for disaster risk reduction. This study introduces an integrated framework that couples a consensus feature selection strategy with metaheuristic-optimized deep learning for high-precision FSM in the flood-prone Khuzestan Province, Iran. An initial set of 19 factors was sourced from Google Earth Engine (GEE). For model development, 1,000 sample points were used, consisting of 500 flood points (value 1) and 500 randomly selected non-flood points (value 0); the trained model was subsequently applied to the entire study area. The most influential variables were identified using an ensemble of nine feature selection methods: Boruta, Boruta-SHAP, Elastic-Net, Mutual Information, Permutation Importance, Recursive Feature Elimination (RFE), Sequential Forward Selection (SFS), Stability Selection, and Deep Feature Importance. A frequency-based consensus rule was then applied, whereby a variable was retained only if it was selected by a majority of the methods. This process established the Normalized Difference Vegetation Index (NDVI) and Daily Minimum Temperature (TMMN) as the most critical predictors. A Long Short-Term Memory (LSTM) network was developed using this optimal feature set and subsequently enhanced through hyperparameter optimization with five metaheuristic algorithms: WOA, GWO, OOA, CSA, and HOA. Model validation demonstrated that optimization significantly improved performance, with the LSTM-WOA model emerging as superior, achieving the highest F1-Score (0.88) and Cohen's Kappa (0.75). The final FSM identified the northwestern and central regions as having the highest susceptibility. The study's innovation lies in its formalized consensus feature selection and comparative metaheuristic optimization, providing a reliable tool for FSM in arid and semi-arid regions.
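The frequency-based consensus rule can be illustrated with a minimal Python sketch. It is not the study's implementation: the function name `consensus_features`, the strict-majority threshold (more than half of the methods), and the toy votes below are assumptions made for illustration. Apart from NDVI and TMMN, the factor names are hypothetical placeholders, and only five of the nine methods are mocked.

```python
from collections import Counter


def consensus_features(selections, threshold=None):
    """Keep features selected by a majority of feature selection methods.

    selections: dict mapping a method name to the set of features it selected.
    threshold:  minimum number of votes required; defaults to a strict
                majority (an assumption, the exact cut-off is not given).
    Returns (kept_features, votes), with kept_features sorted by vote count
    (descending) and then alphabetically.
    """
    n_methods = len(selections)
    if threshold is None:
        threshold = n_methods // 2 + 1  # strict majority of methods

    # Each method contributes at most one vote per feature.
    votes = Counter(
        feature for chosen in selections.values() for feature in set(chosen)
    )
    kept = [feature for feature, count in votes.items() if count >= threshold]
    kept.sort(key=lambda feature: (-votes[feature], feature))
    return kept, votes


# Hypothetical toy votes, NOT the study's results.
toy_selections = {
    "boruta": {"NDVI", "TMMN", "elevation"},
    "elastic_net": {"NDVI", "TMMN"},
    "mutual_information": {"NDVI", "slope"},
    "rfe": {"NDVI", "TMMN", "slope"},
    "stability_selection": {"NDVI", "TMMN", "rainfall"},
}

selected, votes = consensus_features(toy_selections)
print(selected)  # features with at least 3 of 5 votes
```

With these toy votes, only NDVI (5 votes) and TMMN (4 votes) reach the majority threshold of 3. A stricter or looser rule can be tried by passing `threshold` explicitly.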