This project aims to build a predictive model to assess credit risk using a dataset of financial and demographic variables. The notebook walks through the full pipeline from data preprocessing to model evaluation.
The goal is to estimate each client's probability of default or financial distress using machine learning techniques. The data used for this project is available on Kaggle: https://www.kaggle.com/datasets/ranadeep/credit-risk-dataset/data
- Data loading and cleaning
- Exploratory Data Analysis (EDA)
- Feature engineering
- Model Implementation (Binary Target)
  - Model training and evaluation (Logistic Regression, XGBoost)
  - Model performance metrics (Accuracy, ROC AUC, Confusion Matrix)
- Model Implementation (Multi-class Target)
  - Model training and evaluation (XGBoost)
  - Model performance metrics (Accuracy, ROC AUC, Confusion Matrix)
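
The binary-target stage listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it uses synthetic data from `make_classification` as a stand-in for the cleaned Kaggle dataset, and the split ratio, scaler, and `random_state` values are assumptions chosen for the example. Only the Logistic Regression branch is shown; an XGBoost classifier could be swapped in for the final pipeline step.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cleaned credit data (assumption: NOT the Kaggle
# dataset). Class weights mimic an imbalanced default / non-default target.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.8, 0.2],
    random_state=42,
)

# Stratified split so both sets keep the same default rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Scale features, then fit a logistic regression baseline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Hard labels for accuracy and the confusion matrix,
# positive-class probabilities for ROC AUC.
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]

accuracy = accuracy_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_proba)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

print(f"Accuracy: {accuracy:.3f}")
print(f"ROC AUC:  {roc_auc:.3f}")
print("Confusion matrix:")
print(cm)
```

On an imbalanced target such as default prediction, accuracy alone can look good while most defaults are missed, so ROC AUC and the confusion matrix are worth reading alongside it.
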
- Pandas
- Scikit-learn
- Matplotlib / seaborn
- SHAP
- Others
- Exploration of other models (LightGBM)
- Hyperparameter Tuning
- Others
Gorka - @gorkbravo