Employee Risk Management

Hello everyone,

This is my take on predicting whether an employee is at risk of termination.

It is a binary classification problem. The tools used are:

  • Pandas for data manipulation and ingestion

  • NumPy for multidimensional array computing

  • Matplotlib and seaborn for data visualization

  • Word Cloud for finding the most frequent strings

  • Imblearn for oversampling the minority class

  • Scikit-learn for data preprocessing
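
To illustrate what oversampling does before training, here is a minimal, library-free sketch of random oversampling. The function name `random_oversample`, the toy rows and labels, and the seed are assumptions made for illustration. The project itself used Imblearn, whose `RandomOverSampler` offers a comparable strategy; the post does not say which Imblearn sampler was actually applied.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    # Start from the original data and append the duplicated rows.
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 6 retained employees (0) and 2 terminated (1).
rows = [[25, 1], [31, 4], [45, 10], [38, 7], [29, 2], [52, 20], [23, 0], [41, 1]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training split only, so that duplicated rows cannot also appear in the data used for evaluation.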

Project Details:

Dataset Used:

https://www.kaggle.com/manasdalakoti/univai-hack-data

For modelling:

  • Random Forest Classifier:

Accuracy Reached: 95.74%

  • XGBoost Classifier:

Accuracy Reached: 93.17%

  • LightGBM (Light Gradient Boosting Machine) Classifier:

Accuracy Reached: 91.10%

  • CatBoost Classifier:

Accuracy Reached: 95.74%
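
The accuracy figures above are the share of predictions that match the true label. A minimal sketch of that calculation follows; the labels and predictions are hypothetical values, not taken from the project's dataset, and the helper name `accuracy` is an assumption (scikit-learn provides the same metric as `sklearn.metrics.accuracy_score`).

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical values for illustration only (1 = at risk of termination).
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 0, 1, 0, 0, 1, 0, 1]

score = accuracy(y_true, y_pred)
print(f"Accuracy: {score:.2%}")
```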