Prudential Life Insurance - Classification of Risk

This ‘Featured’ Kaggle competition, hosted by Prudential Life Insurance, asked participants to predict the risk classification of customers from their extensive personal and medical information. This would help Prudential shorten its current 30-day turnaround so that quotes can be produced and sent to customers faster.

The competition ran from 23-Nov-2015 to 15-Feb-2016, and 2619 teams from across the globe participated.

In this supervised learning problem, the training data set provided each customer's personal and medical information, the life insurance product chosen, and the actual risk. Risk predictions were evaluated and scored with the quadratic weighted kappa, which measures the agreement between two ratings. Refer here to understand more about how it is computed on predicted values.
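To make the metric more concrete, here is a minimal sketch of the quadratic weighted kappa in plain Python. It is not the competition's official scoring code: the function name, the parameters, and the default rating scale of 1 to 8 are assumptions made for illustration, and Python is used only as a convenient notation.

```python
def quadratic_weighted_kappa(actual, predicted, min_rating=1, max_rating=8):
    """Agreement between two lists of integer ratings on a fixed scale.

    The 1..8 default scale is an assumption for this sketch.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    n = max_rating - min_rating + 1  # number of rating categories
    if n < 2:
        raise ValueError("the rating scale needs at least two categories")

    # Observed matrix: observed[i][j] counts cases with actual rating i
    # and predicted rating j (both shifted to start at 0).
    observed = [[0] * n for _ in range(n)]
    for a, p in zip(actual, predicted):
        observed[a - min_rating][p - min_rating] += 1

    total = len(actual)
    hist_actual = [sum(row) for row in observed]
    hist_predicted = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic weight: disagreement cost grows with squared distance.
            weight = (i - j) ** 2 / (n - 1) ** 2
            # Expected count if the two ratings were independent.
            expected = hist_actual[i] * hist_predicted[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0:
        # Both raters used one and the same rating: treated here as perfect agreement.
        return 1.0
    return 1.0 - numerator / denominator
```

A score of 1 means the predicted and actual ratings agree completely, 0 means the agreement is no better than chance, and negative values mean the agreement is worse than chance. Because the weights are quadratic, predicting a risk far from the actual one is penalised much more than missing by one class.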

To predict the risk classification, I first tried a recursive partitioning (rpart) classification tree model to understand how this basic model performs. It scored 0.60115 on the Public Leaderboard. After that, I built the following set of models:

After preparing multiple predictions with various rounds, objectives, and algorithms, I took the mode of all the outputs (the most frequently predicted risk for each customer) and arrived at my final submission. At the end of the competition, I finished in the top 19% (485 out of 2619) on the Private Leaderboard.
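The mode-based combination described above can be sketched roughly as follows. This is an illustration in Python, not the code used for the submission: the function name is hypothetical, and breaking ties in favour of the lowest risk class is an assumption, since the tie-breaking rule is not stated here.

```python
from collections import Counter


def mode_ensemble(prediction_sets):
    """Combine several prediction lists by taking the per-row mode.

    prediction_sets: one list of predicted risk classes per model,
    all in the same customer order.
    """
    if not prediction_sets:
        raise ValueError("at least one set of predictions is required")
    if len({len(p) for p in prediction_sets}) != 1:
        raise ValueError("all prediction sets must have the same length")

    final = []
    # zip(*...) yields, for each customer, the votes from every model.
    for votes in zip(*prediction_sets):
        counts = Counter(votes)
        top = max(counts.values())
        # Assumed tie-break: pick the lowest class among the most frequent ones.
        final.append(min(label for label, c in counts.items() if c == top))
    return final


# Three hypothetical model outputs for three customers.
combined = mode_ensemble([[1, 8, 3], [1, 8, 4], [2, 8, 4]])
```

With an odd number of models and few classes, ties are rare, but with eight possible risk classes they can still occur, so whichever tie-breaking rule is chosen should be applied consistently.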
