Training Machines to Trade Stocks
Vol. 22, No. 4, 2024
Dilip B. Madan and King Wang
Machines are trained to trade stocks by developing an investment policy in a Markovian context. Importantly, the investment actions affect only the immediate reward and not the state transitions. The policies are designed to maximize a nonlinear expectation of the undiscounted sum of future rewards using the methods of Now Decision Theory. The nonlinear expectations, unlike ordinary expectations, are rendered risk sensitive by distorting probabilities. The distortions employed need not be concave; they display regions of convexity, which makes them volatility desiring at low volatility levels. The technology is illustrated by applying the resulting policy function to trade 589 stocks over 15 years.
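To make the idea of a distorted (nonlinear) expectation concrete, the following is a minimal sketch and not the authors' implementation. It assumes an equally weighted sample of outcomes and a distortion applied to the empirical distribution function, so that a concave distortion overweights low outcomes and yields a conservative value. The function names and the three distortions shown (`identity`, `concave_sqrt`, `s_shaped`) are hypothetical choices for illustration; the abstract does not specify the distortions the paper uses.

```python
import numpy as np

def distorted_expectation(samples, distortion):
    """Distorted expectation of an equally weighted sample.

    Assumption: the distortion maps [0, 1] to [0, 1] with distortion(0) = 0
    and distortion(1) = 1, and is applied to the empirical CDF.
    """
    x = np.sort(np.asarray(samples, dtype=float))  # outcomes in ascending order
    n = x.size
    u = np.arange(n + 1) / n                       # empirical CDF levels 0, 1/n, ..., 1
    weights = np.diff(distortion(u))               # distorted probability of each outcome
    return float(np.dot(x, weights))

def identity(u):
    # No distortion: recovers the ordinary sample mean.
    return u

def concave_sqrt(u):
    # Concave distortion: overweights the lowest outcomes (risk averse).
    return np.sqrt(u)

def s_shaped(u):
    # Hypothetical non-concave distortion: convex on [0, 1/2], concave on [1/2, 1].
    return 3.0 * u**2 - 2.0 * u**3

samples = [-1.0, 0.0, 1.0, 2.0]
plain = distorted_expectation(samples, identity)
conservative = distorted_expectation(samples, concave_sqrt)
mixed = distorted_expectation(samples, s_shaped)
```

Under these assumptions, `plain` equals the sample mean and `conservative` falls below it. A distortion with convex regions, such as the illustrative `s_shaped`, reweights outcomes differently in different parts of the distribution, which is the kind of mechanism the abstract associates with volatility-desiring behavior.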