(joint work with Antoine Didisheim, Barry Ke, and Bryan Kelly) We theoretically characterize the behavior of machine learning asset pricing models. We prove that expected out-of-sample model performance, measured by the SDF Sharpe ratio and average pricing errors, is increasing in model parameterization (or ``complexity''). Our results predict that the best asset pricing models, in terms of expected out-of-sample performance, have an extremely large number of factors: more than the number of training observations or base assets. Our empirical findings verify the theoretically predicted ``virtue of complexity'' in the cross-section of stock returns and show that models with tens of thousands of factors significantly outperform simpler alternatives. We derive the {\it feasible Hansen-Jagannathan (HJ) bound}: the maximal out-of-sample Sharpe ratio achievable by a feasible portfolio strategy. The infeasible HJ bound grossly overstates the achievable maximal Sharpe ratio due to a {\it complexity wedge} that we characterize.
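As a minimal illustration of the predicted pattern (a sketch, not the estimator used in the paper), the following Python simulation times the market with a ridge-regularized combination of $P$ random Fourier features and reports the out-of-sample Sharpe ratio as $P$ grows past the training sample size $T$. All numerical choices here (the data-generating process, the sample sizes, the ridge penalty, and the grid of $P$) are hypothetical and chosen only for illustration.

\begin{verbatim}
# Illustrative sketch (hypothetical setup): out-of-sample Sharpe ratio of a
# ridge-regularized market-timing strategy built from P random Fourier
# features, with P growing well past the training sample size T.
import numpy as np

rng = np.random.default_rng(0)
T_train, T_test = 120, 500   # hypothetical sample sizes
z = 1e-3                     # hypothetical ridge penalty

def simulate(T):
    """One state variable predicts returns through a nonlinear function."""
    x = rng.standard_normal(T)
    r = 0.2 * np.sin(2.0 * x) + rng.standard_normal(T)
    return x, r

def random_features(x, P, seed=1):
    """P-dimensional random Fourier expansion of the scalar signal x."""
    g = np.random.default_rng(seed)
    w = g.standard_normal(P)
    b = g.uniform(0.0, 2.0 * np.pi, P)
    return np.sqrt(2.0 / P) * np.cos(np.outer(x, w) + b)

def ridge_weights(S, r, z):
    """Solve (S'S/T + z I)^{-1} S'r/T, using the dual form when P > T."""
    T, P = S.shape
    if P <= T:
        return np.linalg.solve(S.T @ S / T + z * np.eye(P), S.T @ r / T)
    # Push-through identity: the same solution via a cheaper T x T system.
    return S.T @ np.linalg.solve(S @ S.T / T + z * np.eye(T), r) / T

x_tr, r_tr = simulate(T_train)
x_te, r_te = simulate(T_test)

for P in [5, 50, 120, 1000, 10000, 20000]:
    pi = ridge_weights(random_features(x_tr, P), r_tr, z)
    strat = (random_features(x_te, P) @ pi) * r_te  # position * return
    print(f"P = {P:6d}   OOS Sharpe per period: {strat.mean() / strat.std():+.3f}")
\end{verbatim}

Under a setup of this kind, the printed out-of-sample Sharpe ratios tend to weakly increase in $P$, even for $P$ far above $T$, echoing the virtue of complexity; the exact pattern depends on the shrinkage level and the data-generating process.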