P2P Lending / NFT Lending Forum

Lending Club Discussion => Investors - LC => Topic started by: breitenm on October 19, 2012, 11:00:00 PM

Title: Additional variables in data & models for loans
Post by: breitenm on October 19, 2012, 11:00:00 PM
Hi,

Does anybody know when the new variables LC announced in their most recent blog post (http://blog.lendingclub.com/2012/09/28/investor-updates-and-enhancements/) are going to show up in the CSV files for download?

Also, when you guys are picking loans, how do you count loans marked as late (or in grace period) in the LoanStats.csv file? I'm building a model on the data and am currently counting them as bad, but that ignores the fact that many of them do recover, which may make the model's forecasts too pessimistic. Any thoughts?

Markus
Title: Additional variables in data & models for loans
Post by: Peter on October 19, 2012, 11:00:00 PM
Most people who build models for Lending Club and Prosper on the entire database use a discounting method for late loans. You can see the Lendstats model here:
http://www.lendstats.com/loansearch/lc/lcloanfilter.php
They use loss factors of 0.5 for payment plans, 0.25 for in grace period, 0.5 for 16-30 days late, 0.75 for 31-120 days late and 0.99 for defaults.
Others, such as Interest Radar, use the Lending Club recovery rate data, which is more optimistic than Lendstats:
https://www.lendingclub.com/info/statistics-performance.action
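As a rough illustration, the loss factors quoted above can be applied as a weighted sum. This is only a sketch: the status strings, the 0.0 factor for current and fully paid loans, and the choice to weight outstanding principal are assumptions for the example, not details confirmed for Lendstats or LoanStats.csv.

```python
# Loss factors as quoted in the post above. The status labels are
# assumed spellings, and the 0.0 entries are an assumption as well.
LOSS_FACTORS = {
    "Current": 0.0,             # assumption: no discount
    "Fully Paid": 0.0,          # assumption: no discount
    "Payment Plan": 0.5,
    "In Grace Period": 0.25,
    "Late (16-30 days)": 0.5,
    "Late (31-120 days)": 0.75,
    "Default": 0.99,
}


def expected_loss(loans):
    """Sum each loan's outstanding principal weighted by its loss factor.

    loans: iterable of (status, outstanding_principal) pairs.
    """
    return sum(principal * LOSS_FACTORS[status] for status, principal in loans)


# Hypothetical portfolio, made up for the example.
portfolio = [
    ("Current", 1000.0),
    ("In Grace Period", 200.0),
    ("Late (31-120 days)", 400.0),
    ("Default", 100.0),
]
loss = expected_loss(portfolio)
print(f"Discounted loss estimate: {loss:.2f}")
```

Swapping in recovery-rate-based factors would only mean changing the numbers in the dictionary.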
Title: Additional variables in data & models for loans
Post by: brycemason on October 20, 2012, 11:00:00 PM
One way to go is only to model on loans that have termed out. No discounting necessary. Downside is you're three years out of date.
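A minimal sketch of that filter, assuming each loan record carries an issue date and a term in months (the field names and sample loans here are hypothetical, not the LoanStats.csv column names):

```python
from datetime import date


def has_termed_out(issue_date, term_months, as_of):
    """Return True if the loan's full term has elapsed by the as_of date."""
    months_elapsed = (as_of.year - issue_date.year) * 12 + (as_of.month - issue_date.month)
    if as_of.day < issue_date.day:
        months_elapsed -= 1  # the current month is not complete yet
    return months_elapsed >= term_months


# Hypothetical loans, made up for the example.
loans = [
    {"id": "A", "issued": date(2009, 6, 15), "term_months": 36},
    {"id": "B", "issued": date(2010, 3, 1), "term_months": 36},
]
as_of = date(2012, 10, 20)

# Keep only loans whose final outcome is already known.
training_set = [
    loan for loan in loans
    if has_termed_out(loan["issued"], loan["term_months"], as_of)
]
print([loan["id"] for loan in training_set])
```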
Title: Additional variables in data & models for loans
Post by: TravelingPennies on October 23, 2012, 11:00:00 PM
Does the LendStats model use historic data? It seems that the data available for download shows only the current status of loans and does not include historical status changes. For example, I just dug out old loan files from September and October, and LoanID 1024323 went from Late31-120 to Late16-30 in September and then back to Current in October. Unless these changes over time are accounted for and, say, the "worst" state over the lifetime of the loan is used in the model, the discounting factor of the loan would change all the time.

Does anybody know of a way to get old versions of the LoanStats files?
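If several saved copies of the file are available, the "worst state over the lifetime" idea could be sketched as below. The severity ordering and the representation of each saved file as a loan-ID-to-status dictionary are assumptions for illustration; the status strings follow the ones mentioned in this post.

```python
# Assumed severity ranking: a higher number means a worse status.
SEVERITY = {
    "Fully Paid": 0,
    "Current": 0,
    "In Grace Period": 1,
    "Late16-30": 2,
    "Late31-120": 3,
    "Default": 4,
}


def worst_status_by_loan(snapshots):
    """Return the worst status each loan ever had across the snapshots.

    snapshots: iterable of dicts mapping loan ID -> status, one per saved file.
    """
    worst = {}
    for snapshot in snapshots:
        for loan_id, status in snapshot.items():
            if loan_id not in worst or SEVERITY[status] > SEVERITY[worst[loan_id]]:
                worst[loan_id] = status
    return worst


# Status history of the loan mentioned above, as three hypothetical snapshots.
snapshots = [
    {"1024323": "Late31-120"},
    {"1024323": "Late16-30"},
    {"1024323": "Current"},
]
worst = worst_status_by_loan(snapshots)
print(worst["1024323"])
```

With this approach a loan that recovers to Current still keeps its worst recorded status for modeling purposes.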
Title: Additional variables in data & models for loans
Post by: TravelingPennies on October 23, 2012, 11:00:00 PM
You are right that once a loan goes back to current, Lendstats and others treat it as if it had always been current. But we know that it has a higher likelihood of default than a loan that has never been late.

There is no way to obtain old versions of Loanstats.csv, because the file is updated every day. I have downloaded about a dozen versions to my computer dating back to 2010 so I can see how things change over time.
Title: Additional variables in data & models for loans
Post by: TravelingPennies on October 24, 2012, 11:00:00 PM
Could you put them in a zip file somewhere? :-) Then I can update my model to account for loans that were paying late.
Title: Additional variables in data & models for loans
Post by: TravelingPennies on October 24, 2012, 11:00:00 PM
Actually, I don't mind at all. Here is a link to a zip file with five different Loanstats files from 2010 and 2011. Warning: the file is 85 MB.
https://www.dropbox.com/s/mh8jl5lh5dfhpzu/LoanStatsArchive.zip

Let me know what your analysis finds.
Title: Additional variables in data & models for loans
Post by: TravelingPennies on October 26, 2012, 11:00:00 PM
Sweet! Thank you very much. I'll look into it and will report back :-)

The loan ratings my current model comes up with (updated occasionally) are at http://cervisia.org/lc_credit/ , btw. The model uses a variety of features I've derived from the data (among them some from the loan description), but is a simple binary model that counts late/grace/defaults as bad and fully paid as good (loans that are current aren't used). Given the recovery rate of loans, it is probably too conservative in its estimates.
Title: Additional variables in data & models for loans
Post by: TravelingPennies on October 28, 2012, 11:00:00 PM
Thanks. Look forward to your results. And thanks for the URL to your model - I have seen something similar developed by other investors.
Title: Additional variables in data & models for loans
Post by: TravelingPennies on November 16, 2012, 11:00:00 PM
Quick update: it looks like using the archived files skewed my model into predicting better (more accurate?) results for loans in lower credit tranches. It's a bit odd, and I had to change my definition of bad loans to exclude loans in grace period. It looks like almost everyone misses a payment every now and then.
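The two label definitions discussed in this thread (grace period counted as bad, or not) can be sketched as one function. The status strings and the `grace_is_bad` flag are hypothetical, and dropping grace-period loans from training when they are not counted as bad is an assumption; the post does not say whether they are dropped or treated as good.

```python
# Assumed status labels; the exact strings in LoanStats.csv may differ.
BAD_STATUSES = {"Late (16-30 days)", "Late (31-120 days)", "Default"}
GRACE_STATUS = "In Grace Period"


def label_loan(status, grace_is_bad):
    """Return 1 for a bad loan, 0 for a good loan, None to leave it out of training."""
    if status == "Fully Paid":
        return 0
    if status in BAD_STATUSES:
        return 1
    if status == GRACE_STATUS and grace_is_bad:
        return 1
    # Current loans (and grace-period loans when not counted as bad) are not used.
    return None


statuses = ["Fully Paid", "Current", "In Grace Period", "Default"]
original_labels = [label_loan(s, grace_is_bad=True) for s in statuses]
revised_labels = [label_loan(s, grace_is_bad=False) for s in statuses]
print(original_labels)
print(revised_labels)
```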