
  • Welcome to P2P Lending / NFT Lending Forum.
 

ETH.LOAN

News:

This was the original Lend Academy peer-to-peer lending forum, since forensically restored by deBanked and now reintroduced to eth.loan.

To restore access to your user account, email [email protected]. We apologize for errors you may experience during the recovery.

NEW LOANS:   | remoraid.eth 0.299 Ξ | pineco.eth 0.299 Ξ | seaking.eth 1.500 Ξ | ALL

LC database sloppiness

Started by Peter, April 22, 2014, 11:00:00 PM


Fred93

While processing the loans and notes files distributed by LC, I've noted a few things that make one scratch the noggin'.  Some of them are just nonsensical sloppiness, such as giving the same field, containing the same data, different names in different files, or providing the very same information in different files but with different formats or conventions.  Then there are some things that are just some kind of wrong.

Lending Club employees, feel free to defend yourself right here on the forum!

Those of you who've been thru this will no doubt chuckle, 'cause you've all been thru this before me.

This is a mix of venting and a question or two.  I'll just spew...

1. In the loans files, there are some loans that have the status of "Current" and yet have never made any payments!  It was my understanding that the status "Issued" was for new loans that have not yet made payments, and "Current" was for loans that have made payments.  That's true sometimes, however it clearly ain't that simple.  I didn't count 'em, but there are over a dozen, and they are all recent.

Examples: ID = 12385147, or 12407908, or 12625678

In each case, total_pymnt = 0 and last_pymnt is null.  These loans have made no payments.

2. In the loans files, the term of a loan is in a field named "term" and is a text string, either " 36 months" or " 60 months" (yes it begins with a space!), whereas in the notes.csv file, the same information is in a field named "LoanMaturity.Maturity", and here it is a simple numeric, ie "36" or "60".  Different name.  Different format.  Same information.

3. In the loans files, the size of the loan is called "funded_amount".  In the notes.csv file, the same information is called "AmountLent".  (OK, of course in one case it's the total loan amount, and in the other case it's the note amount, but it's the same concept, and if I read both the loan files and my notes file in to do similar analysis on them, this number goes the same place in that calculation.)

4. Interest rate is really interesting.  It should be identical, but it has a different name, different format, and different data.  I'll use numbers from Loan ID 367384.  In the loans files, the field is named "int_rate", and the data is "11.26%" (yes, the percent sign is in the data); however, in the notes.csv file the field is named "InterestRate" and the data is "0.112628".  This is not only a different format, it is a different number.  It seems to be that way for every loan.  How can this be?  Where did those extra digits come from?  They're not shown to borrowers or lenders on the web site.  Are they paying me that extra money?

5. In the loans file, there are empty fields in some loans.  These are simply empty, ie they end up null fields when you read them into a database.  In the notes.csv file, there are also some empty fields, but IN ADDITION, there are also fields that contain the text string "null"!
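For anyone reading both files into the same analysis, items 2 through 5 can be papered over with a small normalization step.  What follows is only a sketch, not anything LC publishes: the field names are the ones quoted above, the rate strings are the ones from item 4, and the term and amount values in the sample rows are made up for illustration.  The helper names (clean, normalize_row, and so on) are my own.

```python
from decimal import Decimal

# notes.csv names on the left, loans-file names on the right (as quoted in the post).
NOTE_TO_LOAN_FIELDS = {
    "LoanMaturity.Maturity": "term",
    "AmountLent": "funded_amount",
    "InterestRate": "int_rate",
}


def clean(raw):
    """Treat empty fields and the literal text 'null' the same way: as None."""
    if raw is None:
        return None
    text = raw.strip()
    if text == "" or text.lower() == "null":
        return None
    return text


def parse_term(text):
    """' 36 months' and '36' both become the integer 36."""
    return None if text is None else int(text.split()[0])


def parse_rate(text):
    """'11.26%' and '0.112628' both become a Decimal fraction."""
    if text is None:
        return None
    if text.endswith("%"):
        return Decimal(text[:-1]) / Decimal(100)
    return Decimal(text)


def parse_amount(text):
    return None if text is None else Decimal(text)


PARSERS = {"term": parse_term, "int_rate": parse_rate, "funded_amount": parse_amount}


def normalize_row(row, rename=None):
    """Rename fields to the loans-file convention, then clean and parse each value."""
    rename = rename or {}
    out = {}
    for name, raw in row.items():
        name = rename.get(name, name)
        value = clean(raw)
        parser = PARSERS.get(name)
        out[name] = parser(value) if parser else value
    return out


# Sample rows: rates are from item 4; term and amount are invented.
loan = normalize_row({"term": " 36 months", "int_rate": "11.26%", "funded_amount": "25000"})
note = normalize_row(
    {"LoanMaturity.Maturity": "36", "InterestRate": "0.112628", "AmountLent": "null"},
    rename=NOTE_TO_LOAN_FIELDS,
)
rate_gap = note["int_rate"] - loan["int_rate"]  # the "extra digits" from item 4
```

Using Decimal rather than float keeps the leftover digits exact, so the gap between the two files' rates can be compared directly instead of being lost in floating-point rounding.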

These are not the only differences.  They're a sample.

Looking at these files one might imagine that they came from different companies, but in fact they came from the very same company!

If anyone has any insight on #1 or #4 I'd be interested to hear it. 
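For #1, a rough way to count the offenders instead of eyeballing them.  This is a sketch under assumptions: "total_pymnt" and "last_pymnt" are the field names used above, but "id" and "loan_status" are my guesses at the column headers, and the sample rows are invented apart from the two loan IDs already mentioned.

```python
import csv
import io
from decimal import Decimal


def is_blank(value):
    """True for missing fields, empty strings, and the literal text 'null'."""
    return value is None or value.strip() == "" or value.strip().lower() == "null"


def current_without_payments(rows):
    """Yield the IDs of loans marked 'Current' that show no payment at all."""
    for row in rows:
        if (row.get("loan_status") or "").strip() != "Current":
            continue
        paid = row.get("total_pymnt")
        total = Decimal(0) if is_blank(paid) else Decimal(paid.strip())
        if total == 0 and is_blank(row.get("last_pymnt")):
            yield row["id"]


# Invented sample data; column headers "id" and "loan_status" are assumptions.
SAMPLE = """id,loan_status,total_pymnt,last_pymnt
12385147,Current,0,
12407908,Current,0.0,
1,Current,312.50,Mar-2014
2,Issued,0,
"""

suspects = list(current_without_payments(csv.DictReader(io.StringIO(SAMPLE))))
```

Pointing csv.DictReader at the real loans file instead of SAMPLE would give the full list of "Current" loans with nothing paid.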




rawraw

Core, have you asked LC about these concerns?  I'm sure Stephanie is waiting for your email or call.


TravelingPennies

At LC's request, I sent them a screen shot showing that the notes file has more digits in the interest rate than the loans files.  I thought words explained this one pretty well, but I figured I wouldn't argue about a picture.

I think we should all ask for a data dictionary for the notes.csv file.  That way next time some guy sees that he got paid back more principal than the "AmountLent" he won't be so befuddled.


TravelingPennies

I wonder if LC has some sort of internal or external data validation process, like an internal audit of the systems.  It seems like in their line of work, they'd need it.  But I don't know how closely they are regulated since they aren't a bank.


edward

Not sure why you need to act so harsh to posters.

I agree with your concern, lascott. Some people on here, as knowledgeable as they are, just haven't mastered basic manners. Even if someone makes a mistake or doesn't understand something, we should be helping each other, or at least having a civil discourse while we exchange opinions in an inviting atmosphere. I'd bet there are some of this forum's followers who want to ask a question or make a comment, but once they read some of the postings on here, don't dare ask a question or try to contribute. I wish the forum was friendlier. I've learned so much on here, but some days I hate having to wade through all the muck to get there.

TravelingPennies

Edward, perhaps you would be so kind as to define "muck" as it applies here.  Let me guess, your definition of "muck" is "that which does not interest Edward that day".  Do you seriously find it annoying to have to read 7 total posts per day rather than just the 4 informative ones?  This forum isn't exactly high volume.

As for being harsh to posters, I do not see what is harsh about pointing out that LC's numbers appear to be wrong.  If anything the harshness was directed at LC, not lascott.  I don't know what you guys are on about.

Is harsh anything like this edward?
Quote from: edward on August 29, 2013, 11:17:47 AM (https://forum.lendacademy.com/index.php?topic=1502.msg11542#msg11542)


TravelingPennies

I have a theory on the interest rates with six digits, such as "0.112628" in my notes.csv file.

I observed that these only occur on very old notes.  That  has led me to suspect that this is a "historical oddity", which likely has little importance for the present or future. 
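A quick way to check a theory like this is to count, per year, how many notes carry a rate with more than four decimal places.  This is only a sketch: "InterestRate" is the notes.csv field named earlier in the thread, but the date column name ("issue_date" here) and its YYYY-MM-DD format are assumptions, and the sample notes are made up.

```python
from collections import defaultdict
from decimal import Decimal


def decimal_places(rate_text):
    """Digits after the decimal point, e.g. '0.112628' -> 6 and '0.1126' -> 4."""
    return max(0, -Decimal(rate_text.strip()).as_tuple().exponent)


def long_rates_by_year(notes, date_field="issue_date", max_places=4):
    """Map each year to [notes with extra rate digits, all notes]."""
    counts = defaultdict(lambda: [0, 0])
    for note in notes:
        year = note[date_field][:4]  # assumes dates start with a four-digit year
        counts[year][1] += 1
        if decimal_places(note["InterestRate"]) > max_places:
            counts[year][0] += 1
    return dict(counts)


# Made-up notes; the "issue_date" column is hypothetical.
sample_notes = [
    {"issue_date": "2008-11-03", "InterestRate": "0.112628"},
    {"issue_date": "2013-06-17", "InterestRate": "0.1126"},
    {"issue_date": "2013-09-02", "InterestRate": "0.0790"},
]

by_year = long_rates_by_year(sample_notes)
```

If the theory holds, the first number in each pair should drop to zero after some cutoff year.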

I usually apply date cutoffs when I analyze my notes.csv file anyway, because my strategy has changed so much over time that I am usually uninterested in performance of old notes.


