Aided by the availability of vast amounts of data and computing resources, machine learning (ML) has made big strides. The financial industry, which at its heart is an information processing enterprise, holds an enormous amount of opportunity for the deployment of these new technologies.
Machine Learning for Finance is a practical guide to modern ML applied in the financial industry. The book is not only about investing or trading; it covers much more, as a direct result of the love story between computers and finance. Investment firms have customers, often insurance firms or pension funds. These customers are financial services companies themselves and, in turn, have customers of their own: everyday people who have a pension or are insured.
Let’s see what the author, Jannes Klaas, has to say about the challenges facing the finance industry and how machine learning can help solve them:
Tell us a little bit about your book, Machine Learning for Finance. What were your goals and intentions when writing it? What makes this book necessary? What gap does it fill?
Jannes Klaas: The book is by no means a complete guide to everything you might possibly want to know, but it is a practice-driven introduction that will leave the reader with some ready-to-use skills. After reading only a few chapters, readers will be able to use their new skills in their professional work. This approach differentiates the book from much of the very theory-driven literature that traditionally dominates quant finance. When I started writing this book, I felt that there was a disconnect between what many people who were doing ML knew and what many people who were doing finance knew. I saw many ML experts creating “financial models” that were worse than useless from a finance perspective. At the same time, many financial practitioners had no idea of what was happening to their field. They were worried that the days of Excel spreadsheets were over and that their skills would be outdated. And they were right about that. But they did not have a straightforward way to get an overview of the brave new world they found themselves in and acquire some essential skills. There was and still is a gap in understanding, and both sides need to upskill to make the most of the opportunity that lies in front of us. I am hoping to address the gap and contribute a little bit to the furthering of the financial profession.
What are the different ML approaches in finance? Which approach do you prefer for mapping and resolving a problem and why?
JK: Depending on the task, there are a lot of different methods. So, no single approach clearly dominates. The first question to ask here is whether you want to do supervised, unsupervised, or reinforcement learning. If you have labels, that is, you know what the true prediction should have been for your training data, then a supervised approach is usually the best. Supervised learning is where most of the commercial value lies and is usually the go-to approach for anything that concerns predictions. Unsupervised learning allows you to gather insight from your data if you don’t have labels. For instance, you might be interested in finding common factors that drive stock prices. You don’t know which factors are there or even how many, so you can use an unsupervised approach to get some insight. Reinforcement learning does not require labels, but it requires some reward signal. Say, you are interested in an optimal hedging strategy. Once again, you have no idea what the optimal strategy would have been, but you do know if you made or lost money. So you can use this knowledge as a reward signal and train the algorithm to maximize it.
From an academic perspective, I find reinforcement learning very interesting. But as a practitioner, I tend to work with supervised approaches most of the time. In general, the simpler the model, the better.
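To make the unsupervised example above concrete, here is a minimal sketch of looking for common factors in stock returns with principal component analysis (PCA). PCA is only one possible technique and is an assumption here, not something named in the interview. The returns are synthetic: two hidden factors plus noise, and all names and parameters (`explained_variance_ratio`, `n_factors`, the noise scale) are made up for illustration.

```python
import numpy as np

def explained_variance_ratio(returns):
    """Share of total variance captured by each principal component."""
    # Center each stock's return series before decomposing.
    centered = returns - returns.mean(axis=0)
    # Squared singular values are proportional to the component variances.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

# Synthetic data (assumption): 20 stocks driven by 2 hidden factors plus noise.
rng = np.random.default_rng(0)
n_days, n_stocks, n_factors = 500, 20, 2
factors = rng.normal(size=(n_days, n_factors))
loadings = rng.normal(size=(n_factors, n_stocks))
noise = 0.1 * rng.normal(size=(n_days, n_stocks))
returns = factors @ loadings + noise

ratios = explained_variance_ratio(returns)
# A sharp drop after the first few components hints at how many factors matter.
print(np.round(ratios[:4], 3))
```

No labels are involved: the number of factors is read off the data, here from how quickly the explained variance falls away. Real return data is far noisier, so the cut-off is rarely this clear.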
Why do financial models amplify biases in data? How can you combat this bias and make ML models fair and accountable?
JK: Machine learning models are made to pick up on features that discriminate between classes in the dataset (e.g. fraudulent or genuine transactions). They can even combine features to form new features that help them with discrimination. The problem is that they can also discriminate based on protected attributes, such as age, gender, or race. This can even happen if the protected attributes themselves are hidden from the model. Say you want to avoid discriminating against young people. So you do not use age as a feature, but you do use occupational status. However, the model can infer age from occupational status (e.g. student), and might end up discriminating on age anyhow. A second common issue is that many ML systems do not work equally well for everyone. Many computer vision systems, for instance, struggle to recognize the faces of people of color. This is very problematic if you use computer vision for verifying IDs, for instance. Combatting this requires you to first be aware of and open about the problem. The next step is to have a team that is diverse and can potentially spot the subtle patterns that lead to bias. Your data also needs to be representative of the diverse group of people you might be serving. Then there are a few technical approaches. One approach is to add a penalty for discrimination to the model’s loss function, so that the model has to minimize not only its prediction error but also its bias on the protected attributes. These technical solutions can definitely be part of the response, but if you do not acknowledge and constantly monitor the problem, these solutions will not save you.
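As a rough illustration of adding bias to the loss function, the sketch below trains a small logistic regression whose loss is the usual cross-entropy plus a penalty on the gap in average predicted score between two groups. Measuring bias as this "demographic parity" gap is an assumption made for the example; the interview does not specify a measure. The data is synthetic and every name (`train`, `parity_gap`, `lam`, the `proxy` feature) is hypothetical. The protected attribute is never a model input; it is used only to compute the penalty during training.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def parity_gap(p, group):
    """Difference in mean predicted score between group 1 and group 0."""
    return p[group == 1].mean() - p[group == 0].mean()

def train(X, y, group, lam, lr=0.2, steps=3000):
    """Gradient descent on cross-entropy + lam * parity_gap**2."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    # Signed averaging weights, so that s @ p equals the parity gap.
    s = np.where(group == 1, 1.0 / (group == 1).sum(),
                 -1.0 / (group == 0).sum())
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        grad_error = (p - y) / n                 # cross-entropy, w.r.t. logits
        gap = s @ p
        grad_bias = 2.0 * lam * gap * s * p * (1.0 - p)  # penalty, w.r.t. logits
        g = grad_error + grad_bias
        w -= lr * (X.T @ g)
        b -= lr * g.sum()
    return w, b

# Synthetic data (assumption): `proxy` leaks the protected group,
# in the way occupational status can leak age.
rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, size=n)
proxy = group + 0.5 * rng.normal(size=n)
other = rng.normal(size=n)
y = (rng.random(n) < sigmoid(1.5 * proxy + other - 0.75)).astype(float)
X = np.column_stack([proxy, other])   # the protected attribute is not a feature

w_base, b_base = train(X, y, group, lam=0.0)
w_fair, b_fair = train(X, y, group, lam=5.0)
gap_base = parity_gap(sigmoid(X @ w_base + b_base), group)
gap_fair = parity_gap(sigmoid(X @ w_fair + b_fair), group)
print(f"gap without penalty: {gap_base:.3f}, with penalty: {gap_fair:.3f}")
```

The weight `lam` sets the trade-off between prediction error and the bias measure. Consistent with the point made above, a penalty like this is only one part of the response: the gap still has to be monitored on fresh data, and other fairness measures may disagree with this one.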
About the Book
Machine Learning for Finance explores new advances in machine learning and shows how they can be applied across the financial sector, including in insurance, transactions, and lending. It explains the concepts and algorithms behind the main machine learning techniques and provides example Python code for implementing the models yourself.
About the Author
Jannes Klaas is a quantitative researcher with a background in economics and finance. He taught machine learning for finance as lead developer for machine learning at the Turing Society, Rotterdam. He has led machine learning bootcamps and worked with financial companies on data-driven applications and trading strategies.
Jannes is currently a graduate student at Oxford University with active research interests including systemic risk and large-scale automated knowledge discovery.