15A—Health-Insurance-Claims
This repository showcases our AI4ALL Ignite project.
Health Insurance Claims
Developed a model, using Python and data analysis methods within AI4ALL’s cutting-edge Ignite Accelerator, that helps insurance carriers sort and classify health insurance claimants into three risk categories.
Problem Statement
Insurance agents, acting as field underwriters, spend working hours gathering information for insurance actuaries. The model would assist these actuaries in classifying health insurance claimants into three risk tiers while also ensuring fairness to both consumers and suppliers by identifying outliers in each classification.
Key Results
- Sorted 1338 claimants into three risk classifications.
- Classified claimants with 98.8% accuracy.
- Identified only 5% of claimants as outliers within their sorted group.
Methodologies
To accomplish this, we used K-means clustering to sort 1338 claimants into three clusters. Each cluster was then processed and labeled as preferred, standard, or high-cost. A random forest classifier was used to measure classification accuracy, and an isolation forest was used to identify outliers within each tier.
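The pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code. It assumes scikit-learn and NumPy are installed, and it uses synthetic data in place of the Kaggle dataset. The feature columns (age, BMI, charges), the rule of naming clusters by their mean charges, and the `contamination=0.05` setting are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the claimant table (hypothetical columns:
# age, bmi, charges). The real project used a Kaggle dataset instead.
n = 300
X = np.column_stack([
    rng.integers(18, 65, n),      # age in years
    rng.normal(30.0, 6.0, n),     # body mass index
    rng.gamma(2.0, 6000.0, n),    # billed charges
]).astype(float)

# Scale the features so no single column dominates the distance metric.
X_scaled = StandardScaler().fit_transform(X)

# Step 1: K-means groups the claimants into three clusters.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(X_scaled)

# Step 2: name each cluster. Assumption: the cluster with the lowest mean
# charges is "preferred" and the one with the highest is "high-cost".
mean_charges = [X[clusters == k, 2].mean() for k in range(3)]
order = np.argsort(mean_charges)
names = ["preferred", "standard", "high-cost"]
tier_names = {int(k): name for k, name in zip(order, names)}
tiers = np.array([tier_names[int(k)] for k in clusters])

# Step 3: a random forest learns to predict the tier from the features;
# its held-out accuracy measures how reproducible the tiers are.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, tiers, test_size=0.25, random_state=0
)
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
accuracy = forest.score(X_test, y_test)

# Step 4: an isolation forest fitted separately on each tier flags
# claimants that look unusual within their own group (-1 means outlier).
outlier = np.zeros(n, dtype=bool)
for name in names:
    mask = tiers == name
    iso = IsolationForest(contamination=0.05, random_state=0)
    outlier[mask] = iso.fit_predict(X_scaled[mask]) == -1

print(f"held-out accuracy: {accuracy:.3f}")
print(f"share flagged as outliers: {outlier.mean():.3f}")
```

Because the data here is synthetic, the printed numbers will not match the results reported above. Fitting the isolation forest per tier, rather than once on all claimants, follows the idea of finding outliers within each classification.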
Data Sources
Kaggle Datasets: https://www.kaggle.com/code/yash9439/health-insurance-claims-eda/notebook
Technologies Used
- Python
- pandas
- K-means clustering
- Random Forest
- Isolation Forest
Authors
This project was completed in collaboration with:
- Sebastian Davalos (sebas06lex@gmail.com)
- Drexana Rolle (drex.rolle909@gmail.com)