David H. Bailey, Jonathan M. Borwein and Marcos López de Prado
In mathematical finance, backtest overfitting denotes the use of historical market data to develop an investment strategy in which too many variations of the strategy are tried relative to the amount of data available. Backtest overfitting is now thought to be a primary reason why investment models and strategies that look good on paper often disappoint in practice. Models and strategies suffering from overfitting typically target the specific idiosyncrasies of a limited dataset, rather than any general behavior, and, as a result, often perform erratically when presented with new data.
In this study, we address overfitting in the context of designing a mutual fund or investment portfolio as a weighted collection of stocks. Very often, a newly minted equity-based fund of this type has been designed by an exhaustive computer-based search for an optimal weighting that exhibits excellent performance on, say, the past 10 or 20 years of historical market data, and the fund's marketing often highlights this backtest performance.
In the present paper, we illustrate why this backtest-driven portfolio design process often fails to deliver real-world performance. We have developed a computer program that, given any desired performance profile, designs a portfolio consisting of common securities, such as the constituents of the S&P 500 index, that achieves the desired profile based on in-sample backtest data. We then show that these portfolios typically perform erratically on more recent, out-of-sample data. This is symptomatic of statistical overfitting. Less erratic results can be obtained by restricting the portfolio to only positive-weight components, but then the results are quite unlike the target profile on both in-sample and out-of-sample data.
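The phenomenon described above can be sketched in a few lines. The toy example below is not the paper's actual program: it uses simulated pure-noise "stocks" and a least-squares fit (an illustrative stand-in for an exhaustive weight search), with all sizes and the target profile chosen arbitrarily for demonstration. Because the number of securities (500, echoing the S&P 500 constituents) exceeds the number of in-sample days, weights can be found that match the desired return profile essentially exactly in-sample, yet the same weights carry no predictive power out-of-sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 500 "stocks" whose daily returns are pure noise,
# one year (250 days) in-sample and one year out-of-sample.
n_stocks, n_in, n_out = 500, 250, 250
returns = rng.normal(0.0, 0.01, size=(n_in + n_out, n_stocks))

# Desired performance profile: a steady +0.1% per day in-sample,
# i.e. what an attractive backtest would look like.
target = np.full(n_in, 0.001)

# Fit portfolio weights on the in-sample window only. With more
# stocks than days the system is underdetermined, so the minimum-norm
# least-squares solution reproduces the target profile exactly.
w, *_ = np.linalg.lstsq(returns[:n_in], target, rcond=None)

in_sample = returns[:n_in] @ w    # backtest performance
out_sample = returns[n_in:] @ w   # subsequent real-world performance

print(f"in-sample mean daily return    : {in_sample.mean():+.6f}")
print(f"out-of-sample mean daily return: {out_sample.mean():+.6f}")
```

The in-sample portfolio hits the target almost perfectly, while the out-of-sample mean return collapses toward zero: the weights have memorized noise, which is precisely the overfitting symptom the paper demonstrates with real S&P 500 data.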