Mohan Kumar .B
1. Introduction:
1.1 Problem Statement:
Khajana Bank has been losing business revenue from its loan portfolio. The bank issues loans to clients, who are expected to pay the loan EMI on a regular basis (within 30 days). If a client does not pay the pending amount within 30 days, the account becomes delinquent and may later move into default. It is vital for the bank to track such transitions (from current to 30 days delinquent) in order to avoid delinquency or bad debt over the long run of its operations.
1.2 Purpose:
The purpose of this report is to analyse Khajana Bank's historic loan-level data and to predict the probability of delinquency for each account holder.
1.3 Assumption:
• HPI life change = Current HPI rate – HPI rate on loan effective date.
• Monthly unemployment = Average number of days the account holder was unemployed in a month.
• In the dependent variable, 0 denotes a defaulter and 1 denotes a non-defaulter.
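The definitions above can be illustrated with a minimal Python sketch. The record layout and field names (`current_hpi`, `origination_hpi`, `outcome`) are hypothetical and chosen only for illustration. They are not the column names of the bank's data set, and the values are made up.

```python
from dataclasses import dataclass


@dataclass
class LoanRecord:
    # Hypothetical field names; the real data set's columns may differ.
    current_hpi: float      # HPI rate at the observation date
    origination_hpi: float  # HPI rate on the loan effective date
    outcome: int            # dependent variable: 0 = defaulter, 1 = non-defaulter


def hpi_life_change(record: LoanRecord) -> float:
    """HPI life change = current HPI rate - HPI rate on loan effective date."""
    return record.current_hpi - record.origination_hpi


def is_defaulter(record: LoanRecord) -> bool:
    """Under the coding assumed in this report, 0 marks a defaulter."""
    return record.outcome == 0


# Illustrative (made-up) values only.
example = LoanRecord(current_hpi=190.0, origination_hpi=175.5, outcome=0)
print(hpi_life_change(example), is_defaulter(example))
```

Note that 0 marks the defaulter here, the reverse of the more common convention, so a model fitted on this outcome estimates the probability of non-default unless the label is flipped.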
1.4 Data Explanation:
The data set contains the loan ID and loan origination date; account holder details such as location, payment history, unemployment history, account balance, and outstanding bill amount with its end date; the loan maturity date; the next payment due date; the account holder's debt-to-income ratio; the number of months until the loan matures; key indicators such as the change in HPI since the origination date and the FICO score; bank delinquency details; the ratio of uncleared loan value to HPI rate; the account status; and the dependent outcome.
1.5 Tools Used:
Feature Selection and Variable Significance Testing: R Statistical Tool
Exploratory Data Analysis: Tableau Visualization Tool
Descriptive Statistics: Microsoft …
Solution:
This is a supervised learning case study in which the dependent outcome is known. Using this variable, we can estimate the probability of the outcome with logistic regression. To do so, I performed feature selection using the Least Absolute Shrinkage and Selection Operator (LASSO) and then estimated the probability of delinquency for each account holder.
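The idea of LASSO-based feature selection for logistic regression can be sketched as follows. This is not the R code used for the report. It is a small, self-contained Python illustration that fits an L1-penalised logistic regression by proximal gradient descent on made-up toy data. The function names, the penalty value and the step size are all assumptions made for the example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    # Proximal step for the L1 penalty: shrinks toward zero and returns
    # exactly zero when |value| <= threshold (this is what drops features).
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, penalty=0.1, step=0.1, iterations=2000):
    """L1-penalised logistic regression via proximal gradient descent.

    Returns (intercept, coefficients); the intercept is not penalised.
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(iterations):
        grad_b, grad_w = 0.0, [0.0] * p
        for row, target in zip(X, y):
            z = intercept + sum(w * x for w, x in zip(coefs, row))
            err = sigmoid(z) - target  # gradient of the log-loss w.r.t. z
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        intercept -= step * grad_b
        coefs = [soft_threshold(w - step * g, step * penalty)
                 for w, g in zip(coefs, grad_w)]
    return intercept, coefs


def predict_probability(intercept, coefs, row):
    # Estimated probability that the outcome equals 1.
    return sigmoid(intercept + sum(w * x for w, x in zip(coefs, row)))


# Toy data (made up): the first feature drives the outcome, the second is noise.
X = [[-2.0, 1.0], [-2.0, -1.0], [-1.0, 1.0], [-1.0, -1.0],
     [1.0, 1.0], [1.0, -1.0], [2.0, 1.0], [2.0, -1.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

intercept, coefs = fit_lasso_logistic(X, y)
selected = [j for j, w in enumerate(coefs) if w != 0.0]
print("coefficients:", coefs, "selected feature indices:", selected)
```

In this toy run the L1 penalty keeps the noise feature's coefficient at exactly zero while the informative feature is retained. This is the behaviour that makes LASSO usable as a feature selection step before estimating probabilities.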
• Using the LASSO technique, the following variables were selected: Current Balance, Home Price Index Ratio, Updated FICO Score, State, Change in HPI, No. of Missed Payments, Delinquency, Months to Maturity and Monthly Unemployment.
• After the prediction model was built, it was evaluated using the Cox & Snell R-squared, which came to 0.005 (very low). The odds ratio is 1021.853, which means that for every unit change in odds, the odds ratio increases by