P2P Forum

Lending Club Discussion => Investors - LC => Topic started by: larrydag on February 10, 2019, 12:00:00 AM

Title: Lending Club loan default prediction model question
Post by: larrydag on February 10, 2019, 12:00:00 AM
I built a loan default prediction model for Lending Club about 2 years ago and I've been investing modestly with it since then. I'm getting about 5.5 to 6% adjusted return on my loans, so I think it's working fairly well. I'm trying to improve the model and hopefully one day achieve 10% returns. I'm wondering if anyone else has built similar models and come up with creative variable transformations on the historical loan data. Here are some that I've come up with:

loan_to_income = loan amount / income
payment_to_income = installment / income
time_since_earliest_credit_line = issue date - earliest credit line date
open_acc_ratio = open_acc / total_acc
curr_bal_ratio = tot_cur_bal / total_bal_ex_mort

Some of these are more predictive than others. Anyone have any other interesting transforms?
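
A minimal Python sketch of the transforms listed above, for anyone who wants to try them. It rests on assumptions that are not from the post: the field names `loan_amnt`, `annual_inc`, `installment`, `earliest_cr_line` and `issue_d`, the `Mon-YYYY` date format, and the choice of annual income for the loan ratio and monthly income for the installment ratio. The helpers `months_between`, `safe_ratio` and `add_features` are hypothetical names, and the sample record is made up. The credit-line age is computed as issue date minus earliest credit line date so that it comes out positive.

```python
from datetime import datetime


def months_between(earlier, later, fmt="%b-%Y"):
    """Whole calendar months from `earlier` to `later` (e.g. 'Mar-2005')."""
    a = datetime.strptime(earlier, fmt)
    b = datetime.strptime(later, fmt)
    return (b.year - a.year) * 12 + (b.month - a.month)


def safe_ratio(num, den):
    """num / den, or None when a value is missing or the denominator is zero."""
    if num is None or den is None or den == 0:
        return None
    return num / den


def add_features(loan):
    """Return a copy of one loan record with the derived features added."""
    out = dict(loan)
    # Assumption: annual income for the loan ratio, monthly income
    # (annual / 12) for the installment ratio.
    monthly_inc = safe_ratio(loan.get("annual_inc"), 12)
    out["loan_to_income"] = safe_ratio(loan.get("loan_amnt"), loan.get("annual_inc"))
    out["payment_to_income"] = safe_ratio(loan.get("installment"), monthly_inc)
    out["open_acc_ratio"] = safe_ratio(loan.get("open_acc"), loan.get("total_acc"))
    out["curr_bal_ratio"] = safe_ratio(
        loan.get("tot_cur_bal"), loan.get("total_bal_ex_mort")
    )
    first, issued = loan.get("earliest_cr_line"), loan.get("issue_d")
    # Issue date minus earliest credit line date, in months.
    out["time_since_earliest_credit_line"] = (
        months_between(first, issued) if first and issued else None
    )
    return out


# Hypothetical record for illustration only, not real Lending Club data.
sample = {
    "loan_amnt": 12000, "annual_inc": 60000, "installment": 400,
    "earliest_cr_line": "Mar-2005", "issue_d": "Dec-2015",
    "open_acc": 8, "total_acc": 20,
    "tot_cur_bal": 30000, "total_bal_ex_mort": 24000,
}
features = add_features(sample)
print(features)
```

Returning None for a missing or zero denominator keeps bad rows visible instead of silently turning them into zeros or infinities; how to impute them is a separate modeling decision.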

My inspiration for developing a Lending Club model came from LendingRobot  http://blog.lendingrobot.com/research/predicting-the-number-of-payments-in-peer-lending/
Title: Lending Club loan default prediction model question
Post by: Rob L on December 31, 1969, 07:00:00 PM
I recommend the book "Credit Scoring, Response Modeling and Insurance Rating" by Steven Finlay.
Also recommend the statistical package R as it's free, open source and very powerful.
Finally, I recommend the following LA thread and particularly the post by brycemason 12/23/2015.
In particular, note the reference to "the four horsemen of the consumer credit scoring apocalypse".
https://forum.lendacademy.com/index.php/topic,3570.msg31594.html#msg31593

Anyway, test transformations for covariance with your other model factors to see if they statistically add value;
loan_to_income (one of your transformations) is (was) one of the four biggies.
Good luck with that 10%!
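
One rough way to run the check described above, sketched in Python. It uses Pearson correlation (covariance scaled by the two standard deviations) rather than raw covariance, so the numbers are comparable across factors. The function names `pearson` and `correlation_report`, the 0.9 threshold, and the toy values are all illustrative assumptions. A high correlation only suggests that a new transform duplicates an existing factor; whether it adds value still has to be judged inside the model.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def correlation_report(candidate, existing, threshold=0.9):
    """Correlate a candidate feature with each existing factor.

    Returns (all correlations, names whose |r| meets the threshold).
    """
    corr = {name: pearson(candidate, values) for name, values in existing.items()}
    flagged = [name for name, r in corr.items() if abs(r) >= threshold]
    return corr, flagged


# Toy values for illustration only.
loan_to_income = [0.10, 0.20, 0.30, 0.40]
existing_factors = {
    "payment_to_income": [0.04, 0.07, 0.10, 0.13],
    "dti": [20.0, 5.0, 15.0, 10.0],
}
corr, flagged = correlation_report(loan_to_income, existing_factors)
print(corr, flagged)
```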

Title: Lending Club loan default prediction model question
Post by: AnilG on December 31, 1969, 07:00:00 PM
Three important transformations are:
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
from: AnilG on February 11, 2019, 12:29:43 AM
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
Installment to income represents capability to pay; what does loan amount to income represent?

from: Rob L on February 11, 2019, 10:30:33 AM
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
from: AnilG on February 11, 2019, 08:17:30 PM
Title: Lending Club loan default prediction model question
Post by: Roux on December 31, 1969, 07:00:00 PM
Our Data Scientist, Guangming Lang, used machine learning to mine the LC historical data. He used a combination of R and XGBoost to train our Liquid P2P loan selection models. I believe these are one-click installs on AWS if you're inclined to tackle such a project.

https://liquidp2p.com/
https://www.linkedin.com/in/gmlang/
https://www.r-project.org/about.html
https://xgboost.readthedocs.io/en/latest/
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
Thanks for all of the replies. I should have shared a little about myself and my methods. I have experience building predictive credit models in financial institutions. My primary tool for building predictive models is R. I'm very fond of the glmnet package, and my methods resemble those in Frank Harrell's "Regression Modeling Strategies".
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
Maybe you and Guangming should have a chat... lol. I’m a serial entrepreneur, not a data scientist. I knew what I wanted to build and assembled a team. He obviously was a critical team member. Guangming also authored a book on scoring consumer credit. I would be happy to show and discuss some of his work in detail if you want to pm me.


Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
from: larrydag on February 12, 2019, 08:55:24 PM
Title: Lending Club loan default prediction model question
Post by: mikedev10 on December 31, 1969, 07:00:00 PM
from: Rob L on February 13, 2019, 09:40:44 AM
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
Did you use the installment/income term in addition to loan amount/income and FICO in your multivariate logistic regression? Did you also use a separate monthly income term in your regression? If not, then your statement is disingenuous, as you didn't consider the relative importance of these terms with respect to each other. If you had considered the relative merits of these terms together in your regression, you would know that monthly income is a very important "borrower characteristics" data point and any transformation containing monthly income will be weighted heavily in a regression. The first step of any regression analysis is to identify important and influential attributes to include in the regression.

The plain-English explanation for the loan amount/income transformation is simple. This transformation represents whether a borrower with a given income can pay back the loan amount, irrespective of duration. The installment/income transformation represents whether a borrower with a given income can make regular installment payments over a given duration to pay back the loan amount. It is a "borrower indebtedness" data point and goes along with DTI.
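
To make the difference between the two ratios concrete, here is a small Python sketch. It assumes a fixed-rate, fully amortizing loan with a monthly rate equal to the annual rate divided by 12, and uses the standard annuity payment formula; the function name `monthly_installment` and the borrower figures are hypothetical. For the same loan amount and income, loan amount/income is identical at 36 and 60 months, while installment/income changes with the term and the rate.

```python
def monthly_installment(principal, annual_rate, n_months):
    """Level monthly payment on a fixed-rate, fully amortizing loan."""
    r = annual_rate / 12  # monthly rate (assumes simple division of the annual rate)
    if r == 0:
        return principal / n_months
    return principal * r / (1 - (1 + r) ** -n_months)


# Hypothetical borrower: $10,000 loan, $60,000 annual income, 12% annual rate.
loan_amount = 10_000
annual_income = 60_000
monthly_income = annual_income / 12

loan_to_income = loan_amount / annual_income  # does not depend on the term

for term in (36, 60):
    payment = monthly_installment(loan_amount, 0.12, term)
    payment_to_income = payment / monthly_income
    print(f"{term} months: installment={payment:.2f}, "
          f"loan/income={loan_to_income:.3f}, "
          f"installment/income={payment_to_income:.4f}")
```

The 60-month loan has the lower installment/income ratio even though the borrower owes the same principal, which is why the two ratios can carry different information in a model.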

When you are lending on the LC primary market, you are deciding whether to lend on the terms LC has set (interest rate, duration, installment). If you were deciding the terms of lending yourself (e.g., Prosper 1.0), then your strategy of not considering the platform-recommended terms when assessing loan quality would be effective, and you would come up with your own acceptable terms at which to lend.

Sorry to see you discontinue the lending but not surprised.

from: Rob L on February 12, 2019, 10:15:42 AM
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
from: AnilG on February 14, 2019, 09:47:15 PM
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
from: Rob L on February 13, 2019, 09:40:44 AM
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
So, you had no theoretical basis/reason for excluding "installment/income" in favor of "loan amount/income" from your model. That's all I wanted to highlight, as a forum participant reached out to me offline for more clarification on the merit of using installment over loan amount. I typically don't get into back-and-forth on internet forums. Thanks for your time in explaining the reasoning.
 
from: Rob L on February 15, 2019, 11:36:31 AM
Title: Lending Club loan default prediction model question
Post by: rawraw on February 16, 2019, 12:00:00 AM
from: larrydag on February 15, 2019, 08:55:34 PM
Title: Lending Club loan default prediction model question
Post by: larrydag on December 31, 1969, 07:00:00 PM
There are a lot of opportunities to get started in auto finance if you have the right skill sets. The typical backgrounds that auto finance companies look for are STEM degrees and business degrees. You can easily look up job descriptions on a job aggregator. Most auto finance companies are like every other company in that they want to be able to make data-driven decisions about loan applicants' ability to repay their loans. If you don't have previous financial or lending experience, I believe you can still get in the door at an analyst or IT developer level and build your experience. Even if you can't find an auto finance job, you can find an analyst job at a bank and learn about credit and lending in that position. The important things to know in auto finance are credit bureau data and loan portfolio management.

Getting started in predictive modeling is broader. There is a huge swath of companies and industries looking for that skill. In fact, even if your current job doesn't require it, you could probably take it on as a side project and show how your model would help your current organization. Here is the untold story about predictive modeling that most academics do not tell you: predictive modeling is 80% data acquisition and management and 20% modeling. So make sure you are a data skill hawk, meaning that you can download, pull, connect, manipulate, slice, dice, warehouse, store, and distribute data. That means having skills in SQL, Python, R, or another data programming tool. Trust me, your bosses would like it even if you can just manage multiple data sources and provide meaningful data analysis. Chances are the business decision makers in an organization don't know how to do this and don't know the data exists.

Getting your chops up in the statistical and applied math of predictive modeling can be done on your own via online learning or in a more structured classroom setting.  Do it in baby steps if you have no applied math background.  Start with basic statistics 101 and move on to more advanced. 
Title: Lending Club loan default prediction model question
Post by: Rob L on December 31, 1969, 07:00:00 PM
from: larrydag on February 15, 2019, 08:55:34 PM
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
from: larrydag on February 16, 2019, 08:33:29 AM
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
from: rawraw on February 16, 2019, 08:50:58 PM
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
Good luck to you, rawraw. You'll find auto finance to be a rewarding and challenging industry.
Title: Lending Club loan default prediction model question
Post by: TravelingPennies on December 31, 1969, 07:00:00 PM
from: larrydag on February 18, 2019, 05:53:00 PM