Confessions of a Fintech Chief Data Scientist
January 8, 2017 | By: Justin Dickerson, PhD, MBA, PStat
My name is Justin Dickerson. For most of 2016, I was the Chief Data Scientist at Snap Advances (Snap), a merchant cash advance funding company based in Salt Lake City, Utah. I can’t discuss my awesome work at Snap for obvious reasons. And fortunately, I don’t need to in order to make the key points I want to convey through this article. That’s because I’ve also been a senior-level data scientist at two other companies, and I’m also a well-regarded statistician who holds one of the most prestigious credentials offered by the American Statistical Association.
One discovery over the past year prompted me to start collecting my thoughts for this article. I was looking at the financial performance of On Deck Capital (the largest company in the alternative fintech industry, which is also publicly traded) through the first nine months of 2016 relative to the same period in 2015. Gross revenue increased more than $22 million while net income for the same period fell nearly $50 million. I’m not an accountant, but that doesn’t sound good to me. And let’s face it, this fact doesn’t surprise anyone in our industry, especially given what’s happening at CAN Capital. But one interesting and overlooked fact is worth considering. According to my LinkedIn search, there were between 30 and 40 data scientists (all levels) working for On Deck Capital during the same time period in which net income fell by nearly $50 million. So, not only does On Deck Capital lose a lot of money, it appears it needs a lot of intellectual horsepower to figure out how to do so.
And here we are today. We’re looking at an industry full of companies trying to navigate the abyss of hyper-aggressive originators and spiraling default rates. If you’re a Chief Data Scientist for one of these companies, you’re undoubtedly feeling the heat from your management team. The problem is simple. How do you grow your business (or even stabilize it) in an environment where you have to take too many uncomfortable risks? We’ll ignore the fact that this question has plagued much larger industries for many years (e.g., trying to compete against Walmart in the retail space). Boards of Directors in alternative fintech have short memories and believe this is a problem unique to their industry and era. As a result, data scientists are at a premium as they’re seen as key players in how to resolve this crisis and steer their companies to safe harbors. Well, here is my opinion. They’re dead wrong, and here is why.
Data Scientists Are Tactical, not Strategic
This statement may end up being the most controversial thing said in the data science industry this year. But let me make my case. Of those 30-40 data scientists working for On Deck Capital, more than 80% of them have a Master’s degree in a field of study synonymous with data science. Specifically, many of them attended Columbia University’s Master’s degree program in Operations Research. The four required courses for that degree are: Optimization Models and Methods, Introduction to Probability and Statistics, Stochastic Models, and Simulation. From there, students can choose from one of six concentrations (all but one of which are targeted toward quantitative methods). Further, students selected for this program already have highly refined quantitative skills as demonstrated by the pre-requisite courses for admission (e.g., multivariate calculus, linear algebra, etc.). So, in essence, the program takes really smart quantitative people (quants) and makes them even smarter quants, while sprinkling in 6 elective courses which may or may not provide an opportunity to learn something about the “real” world of business.
Make no mistake, the students attracted to programs such as these generally aren’t the professionals you send to meet with investors and pitch them on new strategic directions for a company. They are the professionals who sit in cubicles and spend their days writing code. They are experts in programming languages such as R, Python, Java, Scala, and many others. Ironically, they are enslaved to rules similar to those which govern the supervised machine learning algorithms they create each day. They aren’t allowed to “get out of the box” and see the “forest for the trees.” If I’m portraying them as a bit robotic, that’s intentional on my part.
I don’t want to leave the impression data scientists can’t think for themselves. Specifically, those who earn a PhD are known to have such skills and are often praised for their abilities to rise above the technical chains of their existence and offer strategic direction to an organization. But they are few and far between in the data science factory found deep in the bowels of companies like On Deck Capital. Instead, more and more alternative fintech companies seek out the same “cookie-cutter” data scientist who can check off the same boxes on the hiring list. This means the data scientist role is relegated to a part of the company lacking diversity of thought, creativity, and the organizational respect needed to save a company from itself.
The Law of Diminishing Returns
One of the most intelligent questions asked of me within the alternative fintech industry was, “do we really have enough data to justify so many data scientists?” As a Chief Data Scientist, you always want to answer that question with an emphatic, “YES!” Even better, you may tell your management team you need even more data scientists to make a “real and lasting contribution to the company.” After all, the existence of your team depends on it. But when you’re away from the management team and thinking about the structure of your department, the honest Chief Data Scientist knows the company is at risk of experiencing the law of diminishing returns.
All of us can recognize the law of diminishing returns from our freshman year Economics course. In short, it’s the idea that each additional unit of input yields a smaller increase in measured output than the unit before it. For example, the reduction in default rate for a financial product is hardly ever proportional to the number of data scientists employed by the company to predict default rates. In fact, I would argue once you have more than two or three data scientists, even the largest organizations would have a difficult time justifying the payroll investment based on proportional gains in default rate management.
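To make the idea concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration only: the function name, the saturating curve, and the parameters `max_reduction` and `k` are hypothetical and are not drawn from any company's data. It simply shows what it looks like when each additional hire buys a smaller improvement than the one before.

```python
import math

def default_rate_reduction(n_scientists, max_reduction=5.0, k=0.8):
    """Hypothetical response curve (an assumption, not real data).

    Returns the total reduction in default rate, in percentage points,
    achieved by a team of n_scientists. The curve saturates at
    max_reduction, so extra hires help less and less.
    """
    return max_reduction * (1 - math.exp(-k * n_scientists))

def marginal_gains(max_team):
    """Extra reduction contributed by hire #1, #2, ..., #max_team."""
    return [
        default_rate_reduction(n) - default_rate_reduction(n - 1)
        for n in range(1, max_team + 1)
    ]

gains = marginal_gains(10)
for n, gain in enumerate(gains, start=1):
    print(f"hire #{n}: +{gain:.3f} percentage points")
```

Under these made-up parameters, the marginal gain shrinks with every hire. The exact shape of the curve in a real company is unknown; the point is only that the payroll cost per hire stays roughly flat while the benefit per hire does not.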
So, why do companies like On Deck Capital have so many data scientists? I believe it’s more akin to the comfort food we all like to eat in the winter. There is hardly anything as satisfying as my grandmother’s homemade chili during a cold Utah night. And the more of it I get, the warmer I feel! The problem is the chill of winter eventually fades, and the light of day shining on the financial statements eventually raises the question of whether we’ve simply eaten too much.
Make no mistake, NO organization needs an endless number of data scientists to be successful. In fact, I would argue two or three excellent data scientists armed with a superior data science/machine learning platform such as those offered by IBM, Microsoft, or DataRobot are more than enough to guide an organization to success. The key when thinking about staffing a data science department is to think in terms of credibility. If I have three data scientists each armed with PhD training, 15 years of industry experience, and the tools (such as a great machine learning platform) to do the mundane parts of data science usually done by legions of Master’s degree data scientists, am I more credible in the organization than I am with 30 quants who all grew up in an economy where nothing bad ever happened to financial institutions? If you want your data scientists to help your organization, you’ve got to be willing to let them into the board room and present digestible recommendations for action. So the question becomes, do I have a team that is credible enough to meet such a standard?
The Supremacy of Domain Expertise
I learned a lot during my time as a Chief Data Scientist. Since leaving Snap, I’ve established two companies. The first is Crossfold Analytics. This is my data science consulting company. We only serve the fintech industry and we spend most of our time building real-time machine learning prediction services for small to mid-sized fintech companies. And I think we’re darn good at it! The second company is Crossfold Capital. This is my independent sales organization (ISO) focusing on merchant cash advance, business loan, and factoring products. It was when I established Crossfold Capital that I learned the most valuable lesson of all about data science in alternative fintech. Nothing will ever replace the experience of working in the trenches of the business (what I call “domain” expertise). In alternative fintech, this is generally working within the trenches of a sales organization. If I could go back in time and start over as Chief Data Scientist at Snap, I would start my job by underwriting files and selling merchant cash advances for a month. Absolutely nothing I learned in math, statistics, or any quantitative subject can replace what I’ve learned running my own ISO in just the past two months. I wish every alternative fintech company would adopt a training program for data scientists that allowed them to spend their first month in the field calling on clients and working with potential customers. If you understand the business, you can bring immeasurable value to your company by blending that understanding with your technical skills as a data scientist. I truly believe such an approach could take the power of a data scientist and magnify it three-fold. Otherwise, you end up having a rogue department of quants that people in the trenches of the business either don’t understand or don’t trust.
My Recommendation to Alternative Fintech Companies
Based on what I’ve learned as an alternative fintech data science professional, I would make three recommendations to all companies in our industry. First, hire diverse talent. It’s imperative a data scientist knows enough about coding to be effective at building predictive models. But I would trade extensive coding expertise for a data scientist who also had a Bachelor’s or Master’s degree in business administration. We don’t need an army of robots in data science. We need gifted thinkers who also happen to have advanced technical skills. Second, don’t “over-eat” even though it can be cold outside. More data scientists aren’t going to solve your problems. In fact, hiring the same type of data scientist only encourages “group-think” which can actually be very detrimental to your organization. Focus on building a credible data science department, not a massive data science department. Finally, put your smartest people in the dirt of the business. Have them spend a week underwriting files. Then send them to sell your products with one of your ISO managers. Don’t treat your data scientists as fragile figurines. As a good friend of mine from Texas says about his gun collection, “they may be worth a lot, but they’re so dirty from hunting you wouldn’t know it!”
I hope my confessions help your organization navigate both fair seas and choppy water.
Last modified: January 8, 2017
Justin Dickerson has spent nearly 20 years in the data business. His company, Crossfold Analytics (www.crossfoldanalytics.com) helps small to medium sized fintech companies use data science intelligently to achieve their business goals. He can be reached at firstname.lastname@example.org, or at 385.202.4630.