2025-09-15

Benchmarking Tree-Based Ensemble Methods for Multi-Year Daily Precipitation Forecasting Across the Contiguous United States (2000–2023)

Description

This study compares the LightGBM and XGBoost algorithms for next-day (D+1) daily precipitation forecasting. The analysis uses a dataset of 8,765 daily meteorological observations covering the contiguous United States over a 24-year period (2000–2023). It assesses predictive performance in relation to seasonal climatic variables and evaluates model robustness to interannual variability. With a pedagogical objective, the study also aims to identify the most influential climatic determinants for short-term hydrometeorological prediction.
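As a rough illustration of the next-day framing, the sketch below pairs each day's features with the following day's precipitation and then splits the pairs chronologically. The function names, the feature names, and the choice of a year-based split are assumptions made for this example, not details taken from the study; a gradient-boosting model such as LightGBM or XGBoost would then be fitted on the training pairs.

```python
from datetime import date, timedelta

def make_next_day_pairs(records):
    """Pair each day's features with the following day's precipitation.

    `records` is a list of (day, features, precip_mm) tuples sorted by date.
    A pair is built only when the next record is exactly one day later,
    so gaps in the series never produce a mislabeled target.
    """
    pairs = []
    for (day, feats, _), (next_day, _, next_precip) in zip(records, records[1:]):
        if next_day - day == timedelta(days=1):
            pairs.append((day, feats, next_precip))
    return pairs

def split_by_year(pairs, first_test_year):
    """Chronological split: train on earlier years, test on later ones."""
    train = [p for p in pairs if p[0].year < first_test_year]
    test = [p for p in pairs if p[0].year >= first_test_year]
    return train, test

# Tiny made-up series (hypothetical values, for illustration only).
start = date(2019, 12, 29)
records = [
    (start + timedelta(days=i), {"temp_c": 5.0 + i, "humidity": 60 + i}, float(i % 3))
    for i in range(6)
]
pairs = make_next_day_pairs(records)
train, test = split_by_year(pairs, first_test_year=2020)
```

Holding out whole years, rather than shuffling days at random, is one common way to check how a model copes with interannual variability, because the test period is never seen during training.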






🎯 The detailed methodology and results can be accessed through this link:

👉 Click here now: https://github.com/abdibasidadan-byte


Abdi-Basid ADAN, 2025


2025-09-14

Analysis and Downscaling of Precipitation over East Africa and Djibouti: Observed Data, GCM-CMIP6, and CORDEX

This study provides a multi-scale comparison of simulated and observed precipitation. Global simulations from the CanESM5 model (CMIP6, 282 km) are contrasted with results obtained through stochastic downscaling at 3.5 km using CSTools. Observed rainfall for 1981 is spatially interpolated using Inverse Distance Weighting (IDW).
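To make the IDW step concrete, here is a minimal sketch of Inverse Distance Weighting for a single target point. It assumes planar coordinates and a power of 2, both chosen for illustration; the function name and the sample values are hypothetical, and real station data in longitude and latitude would normally call for a great-circle distance.

```python
import math

def idw_interpolate(stations, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) station tuples.

    Each station is weighted by 1 / distance**power. A station that
    coincides with the target returns its own value, which avoids a
    division by zero.
    """
    num = 0.0
    den = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den

# Two made-up stations; the midpoint is equally far from both.
stations = [(0.0, 0.0, 100.0), (2.0, 0.0, 200.0)]
midpoint = idw_interpolate(stations, (1.0, 0.0))
```

Raising the power makes nearby stations dominate the estimate, while a lower power produces a smoother field.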

In addition, climate projections from the EC-Earth3-Veg model (CMIP6) under the SSP585 scenario are analyzed for the period 2021–2040, focusing on both the Republic of Djibouti and the wider East Africa region. Finally, downscaled daily precipitation from CORDEX (1981–1985) is generated using Nearest Neighbor and Bilinear interpolation, allowing an assessment of the sensitivity of results to methodological choices.
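The difference between the two interpolation choices can be seen on a small regular grid. The sketch below is a simplified illustration, assuming values stored at integer grid coordinates, a grid of at least 2 by 2 cells, and non-negative query positions inside the grid; the function names and sample values are hypothetical and this is not the workflow used for the CORDEX data.

```python
def nearest_neighbor(grid, x, y):
    """Value of the grid cell closest to fractional position (x, y).

    `grid[row][col]` holds values at integer coordinates (col, row).
    """
    # int(v + 0.5) rounds half up for the non-negative positions used here.
    col = min(max(int(x + 0.5), 0), len(grid[0]) - 1)
    row = min(max(int(y + 0.5), 0), len(grid) - 1)
    return grid[row][col]

def bilinear(grid, x, y):
    """Distance-weighted blend of the four cells surrounding (x, y)."""
    x0 = min(int(x), len(grid[0]) - 2)
    y0 = min(int(y), len(grid) - 2)
    tx, ty = x - x0, y - y0
    top = grid[y0][x0] * (1 - tx) + grid[y0][x0 + 1] * tx
    bottom = grid[y0 + 1][x0] * (1 - tx) + grid[y0 + 1][x0 + 1] * tx
    return top * (1 - ty) + bottom * ty

# Made-up 2 x 2 coarse grid.
coarse = [[0.0, 10.0],
          [20.0, 30.0]]
nn_value = nearest_neighbor(coarse, 0.25, 0.25)
bl_value = bilinear(coarse, 0.25, 0.25)
```

Nearest neighbor preserves the original values and produces blocky fields, whereas bilinear interpolation smooths between cells and therefore tends to damp local extremes.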




Figure 0. Comparison of rainfall variability from satellite products versus in situ observations from 1980 to 2021.


Figure 1. CMIP6 GCM CanESM5 precipitation for 1981 (spatial resolution: 282 km).


Figure 2. CMIP6 GCM CanESM5 precipitation for 1981 downscaled to 3.5 km using stochastic methods with CSTools.



Figure 3. Spatial distribution of observed rainfall in 1981 using Inverse Distance Weighting (IDW) interpolation.




Figure 4. Projected total monthly precipitation (mm) from the EC-Earth3-Veg model (CMIP6 GCM) under the SSP585 scenario over the Republic of Djibouti, 2021–2040.



Figure 5. Projected total monthly precipitation (mm) from the EC-Earth3-Veg model (CMIP6 GCM) under the SSP585 scenario over East Africa, 2021–2040.


Figure 6. Downscaled daily precipitation from CORDEX (1981–1985) using (a) Nearest Neighbor interpolation, (b) Bilinear interpolation, and (c) original CORDEX data for comparison.



Figure 7. Performance comparison of the occurrence, duration and intensity of rainfall simulated by Canadian global and regional climate models against the observed rainfall at the airport station.



Table. Accuracy of the CMIP6 GCM with Delta, QM, and EQM bias correction for precipitation at the Djibouti airport station.

                                   RMSE     PBIAS
CanESM5_Delta_ssp585     -0.005    4.942    0.024
CanESM5_QM_ssp585        -0.003    8.118    0.019
CanESM5_EQM_ssp585       -0.005    5.171    0.299
CanESM5_Delta_ssp119     -0.006    4.984    0.023
CanESM5_QM_ssp119        -0.003    8.173    0.09
CanESM5_EQM_ssp119       -0.005    5.237    0.187
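For readers unfamiliar with the two named scores, the sketch below shows one common way to compute RMSE and percent bias (PBIAS) from paired simulated and observed values. The PBIAS form used here, 100 × Σ(sim − obs) / Σ(obs), is an assumption; the table does not state which sign convention or scaling (percent or fraction) was applied, and the sample numbers are made up.

```python
import math

def rmse(simulated, observed):
    """Root mean square error between paired simulated and observed values."""
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))

def pbias(simulated, observed):
    """Percent bias, assumed here as 100 * sum(sim - obs) / sum(obs).

    Positive values mean the simulation overestimates the observed total.
    """
    total_diff = sum(s - o for s, o in zip(simulated, observed))
    return 100.0 * total_diff / sum(observed)

# Made-up daily precipitation values in mm, for illustration only.
observed = [0.0, 2.0, 4.0, 2.0]
simulated = [1.0, 2.0, 3.0, 4.0]
example_rmse = rmse(simulated, observed)
example_pbias = pbias(simulated, observed)
```

RMSE penalizes large daily errors, while PBIAS only reflects the error in the accumulated total, so a method can score well on one and poorly on the other.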

 

 


The Abdi-Basid Courses Institute (tABCi)

© 2023 Abdi-Basid ADAN

2025-09-13

Predictive Analysis of Customer Behavior in E-Commerce: Prediction of Average Order Value and Identification of High-Value Customers

Description:

This data analysis project explores customer behavior on an e-commerce platform using a dataset of key metrics: session duration, product detail views, app transactions, add-to-cart rate per session, discount rate per visited product, whether credit card information is saved, average order value (“avg order value”), and a high-value customer indicator (“high_value_customer”). The code is structured in several steps:

Data Preparation: Loading from the clipboard, cleaning (replacing commas with periods for decimals), numeric conversion, and encoding of categorical variables (e.g., yes/no via LabelEncoder). 

Regression Modeling: Use of an XGBoost model to predict average order value, with evaluation via RMSE and R² on a test set (30% of the data). Visualizations include a scatter plot of predicted vs. actual values, a correlation matrix, a box plot of average order value by card-saving status, and a histogram of prediction errors.

Classification Modeling: Logistic regression with L2 regularization to identify high-value customers, based on selected features (session duration, product views, etc.). Evaluation via ROC-AUC score and ROC curve.
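The cleaning and evaluation steps above can be sketched in plain Python. This is an illustrative stand-in, not the project's code: the helper names and sample rows are hypothetical, the yes/no mapping mimics what a label encoder would produce for those two classes, and the R² and ROC-AUC functions are written out by hand instead of using a library.

```python
def parse_decimal(text):
    """Convert '12,5' or '12.5' to a float (decimal comma to period)."""
    return float(text.strip().replace(",", "."))

def encode_yes_no(text):
    """Map 'yes'/'no' (any case) to 1/0."""
    return {"no": 0, "yes": 1}[text.strip().lower()]

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up raw rows: (average order value as text, card saved flag).
raw_rows = [("12,5", "yes"), ("30,0", "no"), ("7,25", "Yes")]
clean_rows = [(parse_decimal(value), encode_yes_no(flag)) for value, flag in raw_rows]

# Made-up classifier scores for four customers.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form of ROC-AUC is quadratic in the number of customers, which is fine for a small illustration; a library implementation would be preferable on a full dataset.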







Abdi-Basid ADAN, 09–2025

🎯 The detailed methodology and results can be accessed through this link:

👉 Click here now: https://github.com/abdibasidadan-byte

LinkedIn group page: https://www.linkedin.com/groups/ 

🏛️ tABCi Laboratories:

🧪 Abdi-Basid ADAN LABS. Focus: Interdisciplinary Sciences & Education.

🧠 The Deep Thinking Lab. Focus: Philosophy & Human Sciences.

🌍 The EcoClimate Hub. Focus: Climate Science & Environment.