2025-12-30

NCAA Match Prediction Script Using Logistic Regression

This script estimates the probability that one college basketball team beats another in a given match. It uses historical team data, regular season results, and tournament seeds to build the predictive model.

Purpose:
- Understand how to transform raw match data into features usable
by a machine learning model.
- Build a supervised model capable of predicting the match winner.
- Evaluate the model using standard metrics (log loss, ROC-AUC)
and apply it to tournament simulations.

Data:
- "teams": team information (TeamID, TeamName, first and last
Division 1 season)
- "results": regular season match results (winning team, losing team,
score, match day)
- "seed_round_slots": information on tournament seeds and match slots
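
As a rough illustration of the loading step, the sketch below reads two of these tables with pandas. The in-memory CSV text stands in for the real files, and every column name other than TeamID and TeamName (FirstD1Season, LastD1Season, Season, DayNum, WTeamID, WScore, LTeamID, LScore) is an assumption about the file layout, not something confirmed by the script. The team names are placeholders.

```python
import io

import pandas as pd

# Stand-in for the "teams" CSV file. Only TeamID and TeamName are named in the
# description above; the two season columns are assumed names.
teams_csv = io.StringIO(
    "TeamID,TeamName,FirstD1Season,LastD1Season\n"
    "1101,Team A,2014,2025\n"
    "1102,Team B,1985,2025\n"
)

# Stand-in for the "results" CSV file (assumed column names).
results_csv = io.StringIO(
    "Season,DayNum,WTeamID,WScore,LTeamID,LScore\n"
    "2024,10,1101,70,1102,60\n"
)

teams = pd.read_csv(teams_csv)
results = pd.read_csv(results_csv)

# Attach the winner's name to each result for readability.
winner_names = teams[["TeamID", "TeamName"]].rename(
    columns={"TeamID": "WTeamID", "TeamName": "WTeamName"}
)
results = results.merge(winner_names, on="WTeamID", how="left")
```

With real files, the `io.StringIO` objects would be replaced by file paths passed to `pd.read_csv`.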

Variables:
- "team_stats": number of wins and losses per team per season
- "match_data": prepared match dataset for model training
- "X", "y": features and target for training
- "model": trained logistic regression model
- "matchup_example": sample tournament matches for prediction
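
One possible way to derive "team_stats" and "match_data" from the results table is sketched below. The toy results, the column names, the helper `build_match_data`, the convention that Team1 is the lower TeamID, and the single `WinDiff` feature are all illustrative assumptions, not the script's confirmed method. Note that counting wins over the same season as the matches being labeled is a simplification.

```python
import pandas as pd

# Hypothetical toy results; column names are assumptions.
results = pd.DataFrame({
    "Season":  [2024, 2024, 2024, 2024],
    "DayNum":  [10, 12, 15, 20],
    "WTeamID": [1101, 1102, 1101, 1103],
    "WScore":  [70, 65, 80, 72],
    "LTeamID": [1102, 1103, 1103, 1101],
    "LScore":  [60, 61, 75, 70],
})

# Count wins and losses per team per season.
wins = (results.groupby(["Season", "WTeamID"]).size()
        .rename("Wins").rename_axis(["Season", "TeamID"]))
losses = (results.groupby(["Season", "LTeamID"]).size()
          .rename("Losses").rename_axis(["Season", "TeamID"]))
team_stats = (pd.concat([wins, losses], axis=1)
              .fillna(0).astype(int).reset_index())


def build_match_data(results, team_stats):
    """Hypothetical helper: one row per match, labeled 1 if Team1 won."""
    games = results[["Season", "WTeamID", "LTeamID"]].copy()
    # Order the two teams by ID so the label is not always 1.
    games["Team1"] = games[["WTeamID", "LTeamID"]].min(axis=1)
    games["Team2"] = games[["WTeamID", "LTeamID"]].max(axis=1)
    games["Label"] = (games["WTeamID"] == games["Team1"]).astype(int)
    # Join each side's season record onto the match row.
    for side in ("Team1", "Team2"):
        stats = team_stats.rename(columns={
            "TeamID": side,
            "Wins": f"{side}_Wins",
            "Losses": f"{side}_Losses",
        })
        games = games.merge(stats, on=["Season", side], how="left")
    games["WinDiff"] = games["Team1_Wins"] - games["Team2_Wins"]
    return games


match_data = build_match_data(results, team_stats)
X = match_data[["WinDiff"]]
y = match_data["Label"]
```

Ordering the teams by ID is one simple way to make sure both label values occur; without some reordering, every row built from a winner/loser table would carry the same label.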

Model:
- Logistic Regression
- It is supervised because it learns from labeled data: each historical
match has a label "1" if Team1 wins, "0" otherwise.
- Suitable for binary classification and allows estimating the probability
of a team winning.
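
To make the probability output concrete, here is a minimal sketch using scikit-learn's `LogisticRegression` on synthetic data. The single feature (Team1 wins minus Team2 wins) and the generated labels are assumptions for illustration only; they are not the script's real data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical feature: Team1 season wins minus Team2 season wins.
X = rng.integers(-15, 16, size=(200, 1)).astype(float)

# Synthetic labels: Team1 wins more often when its win difference is larger.
p_true = 1.0 / (1.0 + np.exp(-0.2 * X[:, 0]))
y = (rng.random(200) < p_true).astype(int)

model = LogisticRegression()
model.fit(X, y)

# Column 1 of predict_proba is the estimated probability of label 1
# (Team1 wins).
new_matches = np.array([[5.0], [-5.0]])
proba = model.predict_proba(new_matches)[:, 1]

# For a binary model, the same value is the sigmoid of the linear score.
manual = 1.0 / (1.0 + np.exp(-model.decision_function(new_matches)))
```

Here `proba[0]` is the estimated chance that a Team1 with five more wins than its opponent takes the match, and `proba[1]` is the estimate for the reverse situation.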

Objectives:
1. Load the necessary CSV files.
2. Compute wins and losses for each team and season.
3. Create a match dataset ready for training.
4. Normalize the data and split into training and test sets.
5. Train a supervised Logistic Regression model.
6. Evaluate the model using log loss and ROC-AUC.
7. Prepare a sample tournament matchup and predict win probabilities.
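
Steps 4 to 7 could look roughly like the sketch below. It runs on synthetic data, and the two features (win difference and seed difference), the 80/20 split, and the sample matchup values are assumptions chosen for illustration, not the script's actual settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 500

# Hypothetical features for each match (Team1 minus Team2).
win_diff = rng.integers(-20, 21, size=n)
seed_diff = rng.integers(-15, 16, size=n)
X = np.column_stack([win_diff, seed_diff]).astype(float)

# Synthetic labels: more wins and a lower (better) seed favor Team1.
logits = 0.15 * win_diff - 0.10 * seed_diff
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

# Step 4: split, then let the pipeline fit the scaler on training data only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Step 5: normalization and logistic regression in one estimator.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Step 6: evaluate predicted probabilities on the held-out matches.
test_proba = model.predict_proba(X_test)[:, 1]
test_log_loss = log_loss(y_test, test_proba)
test_auc = roc_auc_score(y_test, test_proba)

# Step 7: a sample matchup where Team1 has 8 more wins and a seed 5 better.
matchup_example = np.array([[8.0, -5.0]])
p_team1 = model.predict_proba(matchup_example)[0, 1]
```

Wrapping the scaler and the model in one pipeline keeps the test set out of the normalization statistics, and the same scaling is applied automatically when predicting on `matchup_example`.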

🎯 The detailed methodology and results are available at this link:

👉 https://github.com/




    
The Abdi-Basid Courses Institute (tABCi)



© 2025 Abdi-Basid ADAN