
  • Welcome to P2P Lending / NFT Lending Forum.
 

ETH.LOAN

News:

This was the original Lend Academy peer-to-peer lending forum, since forensically restored by deBanked and now reintroduced to eth.loan.

To restore access to your user account, email [email protected]. We apologize for errors you may experience during the recovery.


Additional variables in data & models for loans

Started by Peter, October 19, 2012, 11:00:00 PM


breitenm

Hi,

Does anybody know when the new variables LC announced in their most recent blog post (http://blog.lendingclub.com/2012/09/28/investor-updates-and-enhancements/) are going to show up in the CSV files for download?

Also, when you guys are picking loans, how do you count loans marked as late (or grace period) in the LoanStats.csv file? I'm building a model on the data and am currently counting them as bad, but that might discount the fact that many of them do recover. This may make the model forecast too pessimistic. Any thoughts?

Markus

Peter

Most people who build models for Lending Club and Prosper across the entire database use a discounting method for late loans. You can see the Lendstats model here:
http://www.lendstats.com/loansearch/lc/lcloanfilter.php
They use loss factors of 0.5 for payment plans, 0.25 for in grace period, 0.5 for 16-30 days late, 0.75 for 31-120 days late and 0.99 for defaults.
Others, such as Interest Radar, use the Lending Club recovery rate data, which is more optimistic than the Lendstats factors:
https://www.lendingclub.com/info/statistics-performance.action
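
As a rough illustration of the discounting approach, the sketch below applies the Lendstats loss factors quoted above to outstanding principal by loan status. The status labels, the function name `discounted_value`, and the zero loss factor for current loans are assumptions made for the example; this is not Lendstats' actual code.

```python
# Loss factors as quoted in this post. The status spellings are assumed
# and may not match the labels used in LoanStats.csv.
LOSS_FACTORS = {
    "Current": 0.0,            # assumption: current loans are not discounted
    "Payment Plan": 0.5,
    "In Grace Period": 0.25,
    "Late (16-30 days)": 0.5,
    "Late (31-120 days)": 0.75,
    "Default": 0.99,
}


def discounted_value(loans):
    """Sum outstanding principal after applying each status's loss factor.

    `loans` is an iterable of (status, outstanding_principal) pairs.
    """
    total = 0.0
    for status, principal in loans:
        total += principal * (1.0 - LOSS_FACTORS[status])
    return total


if __name__ == "__main__":
    portfolio = [
        ("Current", 100.0),
        ("In Grace Period", 100.0),
        ("Late (31-120 days)", 100.0),
    ]
    print(discounted_value(portfolio))
```

Swapping in recovery rates from another source would only mean replacing the values in `LOSS_FACTORS`.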
Publisher of the Lend Academy blog

See my returns here: http://www.lendacademy.com/returns

brycemason

One way to go is only to model on loans that have termed out. No discounting necessary. Downside is you're three years out of date.
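
A minimal sketch of that filter, assuming each loan record carries an issue date and a term in months (the function name `has_termed_out` and the field choices are hypothetical, not taken from the LoanStats.csv layout):

```python
from datetime import date


def has_termed_out(issue_date, term_months, as_of):
    """Return True if the loan's full term has elapsed by `as_of`."""
    # Whole calendar months between the issue date and the as-of date.
    months = (as_of.year - issue_date.year) * 12 + (as_of.month - issue_date.month)
    if as_of.day < issue_date.day:
        months -= 1  # the final month is not yet complete
    return months >= term_months


if __name__ == "__main__":
    today = date(2012, 10, 19)
    # A 36-month loan issued three years earlier is usable for modelling.
    print(has_termed_out(date(2009, 10, 19), 36, today))
    # One issued a year ago is still running and would be dropped.
    print(has_termed_out(date(2011, 10, 19), 36, today))
```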

TravelingPennies

Does the LendStats model use historic data? It seems that the data available for download shows only the current status of loans and does not include historical status changes. For example, I just dug out old loan files from September and October, and LoanID 1024323 went from Late31-120 to Late16-30 in September and back to Current in October. Unless these changes over time are accounted for and, say, the "worst" state over the lifetime of the loan is used in the model, the discounting factor of the loan would change all the time.

Does anybody know of a way to get old versions of the LoanStats files?
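
To make the "worst state over the lifetime" idea concrete, here is a small sketch that merges several dated snapshots and keeps the most severe status seen for each loan. The severity ordering, the status spellings, and the name `worst_status` are assumptions for illustration; each snapshot is represented as a plain dict of loan ID to status rather than a parsed CSV.

```python
# Assumed severity ranking: a higher number means a worse status.
SEVERITY = {
    "Current": 0,
    "In Grace Period": 1,
    "Late16-30": 2,
    "Late31-120": 3,
    "Default": 4,
}


def worst_status(snapshots):
    """Return {loan_id: worst status seen} across all snapshots.

    `snapshots` is a list of dicts mapping loan ID to status, one dict
    per saved copy of the loan file, in any order.
    """
    worst = {}
    for snapshot in snapshots:
        for loan_id, status in snapshot.items():
            previous = worst.get(loan_id)
            if previous is None or SEVERITY[status] > SEVERITY[previous]:
                worst[loan_id] = status
    return worst


if __name__ == "__main__":
    # The status history described above for LoanID 1024323.
    history = [
        {1024323: "Late31-120"},
        {1024323: "Late16-30"},
        {1024323: "Current"},
    ]
    print(worst_status(history))
```

With this approach the loan keeps its Late31-120 label even after it returns to Current, so the discount applied to it no longer changes from one download to the next.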

TravelingPennies

You are right that once a loan goes back to current, Lendstats and others treat it as having always been current. But we know that it has a higher likelihood of default than a loan that has never been late.

There is no way to obtain old versions of Loanstats.csv, because it is updated every day. I have downloaded about a dozen versions to my computer, dating back to 2010, so I can see how things change over time.

TravelingPennies

Could you put them in a zip file somewhere? :-) Then I can update my model to account for loans that were paying late.

TravelingPennies

Actually I don't mind at all. Here is a link to the zip file with five different Loanstats files from 2010 and 2011. Warning: the file is 85 MB.
https://www.dropbox.com/s/mh8jl5lh5dfhpzu/LoanStatsArchive.zip

Let me know what your analysis finds.

TravelingPennies

Sweet! Thank you very much. I'll look into it and will report back :-)

The loan ratings my current model comes up with (updated occasionally) are at http://cervisia.org/lc_credit/, btw. The model uses a variety of features I've derived from the data (among them some from the loan description), but is a simple binary model that counts late/grace/defaults as bad and fully paid as good (loans that are current aren't used). Given the recovery rate of loans it is probably too conservative in the estimates.

TravelingPennies

Thanks. Look forward to your results. And thanks for the URL to your model - I have seen something similar developed by other investors.

TravelingPennies

Quick update: it looks like it skewed my model into predicting better (more accurate?) results for loans in lower credit tranches. It's a bit odd and I had to change my definition of bad loans to exclude loans in grace period. It looks like almost everyone misses a payment every now and then.
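
For anyone who wants to compare the two definitions of "bad", here is a minimal sketch of the labelling step only, not the actual model. The status spellings and the name `label_loan` are hypothetical, and dropping grace-period loans from the training set (rather than counting them as good) is an assumption.

```python
# Statuses always treated as bad. Spellings are assumed, not copied
# from LoanStats.csv.
ALWAYS_BAD = {"Late16-30", "Late31-120", "Default"}


def label_loan(status, grace_is_bad=True):
    """Return 1 for bad, 0 for good, or None if the loan is not used."""
    if status == "Fully Paid":
        return 0
    if status in ALWAYS_BAD:
        return 1
    if status == "In Grace Period":
        # Original definition: bad. Revised definition: excluded
        # (assumption: dropped rather than relabelled as good).
        return 1 if grace_is_bad else None
    return None  # current loans and anything else are left out


if __name__ == "__main__":
    statuses = ["Fully Paid", "In Grace Period", "Late31-120", "Current"]
    print([label_loan(s) for s in statuses])
    print([label_loan(s, grace_is_bad=False) for s in statuses])
```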
