Essays on machine learning and the cross-section of stock returns

Author(s):
Maximilian Sauer, PhD
Keywords:
Machine learning, asset pricing, stock returns forecast, ML models, glassbox models, portfolio performance

Abstract:

From blackbox to glassbox: Empirical asset pricing via interpretable machine learning. This paper employs glassbox machine learning to forecast the cross-section of stock returns, addressing the critical concern of interpretability in machine learning models. I show that an additive model, incorporating non-linear feature transformations and bivariate interaction terms, achieves performance comparable to full-complexity neural networks in terms of information criterion and Sharpe ratio. Notably, the analysis reveals that half of the machine learning performance gains can be attributed to non-linear transformations of characteristics, with bivariate interactions contributing the remainder. Additionally, the study finds that non-linearities are concentrated mainly in short-term technical factors and demonstrates how the model can be aligned with economic intuition through smoothing techniques. The study highlights that there is no trade-off between interpretability and performance in asset pricing.
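
The decomposition described above (linear terms, then non-linear univariate transformations, then bivariate interactions) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are simulated, polynomial powers stand in for whatever transformations the thesis actually uses, and the helper names (`univariate_basis`, `pairwise_products`, `fit_r2`) are hypothetical. It is not the author's model.

```python
import numpy as np

def univariate_basis(X, degree=3):
    """Stack powers x, x^2, ..., x^degree of each column (additive, no cross terms)."""
    return np.hstack([X ** d for d in range(1, degree + 1)])

def pairwise_products(X):
    """All products x_i * x_j for i < j (bivariate interaction terms)."""
    n = X.shape[1]
    return np.column_stack(
        [X[:, i] * X[:, j] for i in range(n) for j in range(i + 1, n)]
    )

def fit_r2(features, y):
    """In-sample R^2 of an ordinary least squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), features])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()

# Simulated characteristics and a target with one linear, one non-linear,
# and one interaction effect (purely illustrative, not real return data).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 3))
y = (0.5 * X[:, 0] + X[:, 1] ** 2 + X[:, 0] * X[:, 2]
     + 0.1 * rng.standard_normal(2000))

r2_linear = fit_r2(X, y)                                   # linear terms only
r2_additive = fit_r2(univariate_basis(X), y)               # + non-linear transforms
r2_interact = fit_r2(
    np.hstack([univariate_basis(X), pairwise_products(X)]), y
)                                                          # + bivariate interactions
```

Comparing the three fits shows how much explanatory power each layer adds, and every fitted term remains a one- or two-variable function that can be plotted and inspected, which is the sense in which such a model is a "glassbox".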

Beyond the APT: Predicting idiosyncratic stock returns with machine learning. In this paper, I aim to bridge the gap between the literature on idiosyncratic stock returns and machine learning by investigating the predictability of residual stock returns. My key finding is that machine learning models, when trained on stock returns that have been ’cleaned’ of linear factor effects, significantly outperform those trained on raw returns. This approach effectively strips away linear factor exposure, enabling investors to achieve comparable performance with approximately one-third of the volatility. Traditional machine learning methods, designed primarily to minimize forecasting error, tend to over-rely on factor exposure. Although focusing solely on residual returns results in a marginal decrease in the average information coefficient, it considerably enhances stability over time. This leads to improved Sharpe ratios and more attractive risk-adjusted returns. The paper highlights the importance of focusing on idiosyncratic stock returns as opposed to dominant risk factors.
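
As a rough illustration of what "cleaning" returns of linear factor effects can mean, the sketch below regresses each stock's return series on a set of factor returns and keeps the residuals. The abstract does not specify the factor set or estimation window, so the simulated data, the full-sample regression, and the function name `residualize` are all assumptions; a real application would more plausibly estimate the loadings on a rolling window to avoid look-ahead bias.

```python
import numpy as np

def residualize(returns, factors):
    """Remove linear factor exposure from each stock's return series.

    returns: array of shape (T, N), one column per stock.
    factors: array of shape (T, K), one column per factor.
    Returns the (T, N) array of regression residuals.
    """
    A = np.column_stack([np.ones(len(factors)), factors])  # intercept + factors
    betas, *_ = np.linalg.lstsq(A, returns, rcond=None)    # (K + 1, N) loadings
    return returns - A @ betas

# Simulated factor model (illustrative only): returns = factors @ loadings + noise.
rng = np.random.default_rng(0)
T, N, K = 240, 50, 3
factors = 0.04 * rng.standard_normal((T, K))
loadings = rng.standard_normal((K, N))
idiosyncratic = 0.02 * rng.standard_normal((T, N))
returns = factors @ loadings + idiosyncratic

residuals = residualize(returns, factors)
```

By construction the residuals are orthogonal to the factors in sample and less volatile than the raw returns; they would then serve as the prediction target in place of raw returns.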

Publication date of the thesis
03-05-2024

Thesis committee

Supervisor: Raman Uppal, EDHEC Business School

External reviewer: Paolo Zaffaroni, Imperial College London 

Other committee members: Emmanuel Jurczenko and Enrique Schroth, EDHEC Business School