The application is publicly available on Streamlit Community Cloud at the following link:
A model that predicts whether an individual's annual income in the US is `<=50K` or `>50K`, using synthetic data generated with CTGAN and classified with XGBoost.
The goal is to simulate a realistic dataset of US adults, build a machine learning model that predicts the annual income bracket, and visualize the results interactively with Streamlit.
- 📊 Original Dataset: UCI Adult Dataset
- 🧬 Synthetic Data Generator: CTGAN
- 🤖 Classifier: XGBoost
- 🔍 Model Explainability: SHAP values
- 🌐 Interactive Dashboard: Streamlit
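The components listed above can be chained roughly as in the sketch below. It is a minimal illustration, not the repository's actual code: the column names follow the UCI Adult documentation, the target column name `income`, the hyperparameters, and the choice to train the classifier on the synthetic sample are all assumptions.

```python
"""Sketch of a CTGAN -> XGBoost -> SHAP pipeline (assumed layout, not the repo's code)."""

# Categorical columns as named in the UCI Adult documentation.
# The target column name "income" is an assumption.
CATEGORICAL = [
    "workclass", "education", "marital-status", "occupation",
    "relationship", "race", "sex", "native-country", "income",
]


def encode_target(label: str) -> int:
    """Map an income label to 0 (<=50K) or 1 (>50K).

    Tolerates surrounding spaces and the trailing dot found in the
    UCI Adult test split.
    """
    cleaned = label.strip().rstrip(".")
    if cleaned == "<=50K":
        return 0
    if cleaned == ">50K":
        return 1
    raise ValueError(f"unexpected income label: {label!r}")


def run_pipeline(real_df, n_samples=5000, epochs=10):
    """Fit CTGAN, sample synthetic rows, train XGBoost, compute SHAP values."""
    # Heavy dependencies are imported here so encode_target stays usable alone.
    import pandas as pd
    import shap
    from ctgan import CTGAN
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    # 1. Learn the joint distribution of the real data and sample from it.
    generator = CTGAN(epochs=epochs)
    generator.fit(real_df, discrete_columns=CATEGORICAL)
    synthetic = generator.sample(n_samples)

    # 2. Binarize the target and one-hot encode the categorical features.
    y = synthetic["income"].map(encode_target)
    X = pd.get_dummies(synthetic.drop(columns=["income"]), dtype=float)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    # 3. Train the classifier and build a text performance report.
    model = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
    model.fit(X_train, y_train)
    report = classification_report(y_test, model.predict(X_test))

    # 4. Explain predictions on the held-out rows with SHAP.
    shap_values = shap.TreeExplainer(model).shap_values(X_test)
    return synthetic, model, report, shap_values
```

Calling `run_pipeline(real_df)` with the Adult data loaded into a pandas DataFrame would return the synthetic sample, the fitted model, a text report, and the SHAP values that the dashboard can then display.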
The Streamlit app includes the following sections:
- Synthetic dataset generated with CTGAN
- XGBoost model performance report
- SHAP plots to explain feature importance
- Download button to export synthetic data
👉 Try it now by clicking the badge above!
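The app sections listed above could be laid out along these lines. This is a hedged sketch rather than the contents of `streamlit_app/app.py`: only the `data/` and `outputs/` folders come from this README, while the file names, the page title, and the helper `csv_bytes` are hypothetical.

```python
from pathlib import Path

# Hypothetical file names; only the data/ and outputs/ folders come from the README.
SYNTHETIC_CSV = Path("data/synthetic_data.csv")
REPORT_TXT = Path("outputs/classification_report.txt")
SHAP_PNG = Path("outputs/shap_summary.png")


def csv_bytes(df) -> bytes:
    """Serialize a DataFrame to UTF-8 CSV bytes for st.download_button."""
    return df.to_csv(index=False).encode("utf-8")


def main() -> None:
    # Imported here so csv_bytes can be reused without Streamlit installed.
    import pandas as pd
    import streamlit as st

    st.title("US Adult Income: CTGAN + XGBoost")

    # Section 1: preview of the synthetic dataset.
    st.header("Synthetic dataset generated with CTGAN")
    synthetic = pd.read_csv(SYNTHETIC_CSV)
    st.dataframe(synthetic.head(100))

    # Section 2: saved text report of the classifier's performance.
    st.header("XGBoost model performance report")
    st.text(REPORT_TXT.read_text())

    # Section 3: SHAP plot saved by the training step.
    st.header("SHAP feature importance")
    st.image(str(SHAP_PNG))

    # Section 4: export the synthetic data as CSV.
    st.download_button(
        label="Download synthetic data (CSV)",
        data=csv_bytes(synthetic),
        file_name="synthetic_adult.csv",
        mime="text/csv",
    )
```

For `streamlit run` to render the page, the script would end with a call to `main()`; it is left out here so the helper can be imported on its own.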
Streamlit_USAdult_Income/
├── data/              # Original and synthetic data
├── outputs/           # SHAP plots and model results
├── streamlit_app/     # Streamlit app code
│   └── app.py
├── requirements.txt   # Python dependencies
└── README.md
git clone https://github.com/DanteTrb/Streamlit_USAdult_Income.git
cd Streamlit_USAdult_Income
pip install -r requirements.txt
streamlit run streamlit_app/app.py
👤 Author
Dante Trabassi
LinkedIn: https://www.linkedin.com/in/dante-trabassi-663b3718b/