Financial Applications of Machine Learning

Headwinds

There are some good reasons why the methods of machine learning may never pay the rent in the context of money management.
Low Noise Tasks:
- Human beings can easily pick a person out of a crowd after seeing a photograph of that person. This is a reasonably "low noise" task for a human.
- It is an ML task on which there is progress, but the machines are still far behind the people.
- The gap now looks almost impossibly large, though the problem is so important that resources will continue to be devoted to it.
- For a simpler example, humans can read written numbers "as well as humanly possible" --- and computers cannot yet do this well enough to replace human zip code readers in post offices, though the computers are getting much closer after 30 years of trying.
High Noise Tasks:
- The returns on financial assets are very noisy. On a daily basis the mean return on the S&P 500 is typically about 0.0005 and the standard deviation is typically about 0.01, some 20 times as large. A similar story applies to individual stocks, but the multiple is even larger.
- Covariates do explain some stock returns, or have done so at some times historically. For example, Lo and MacKinlay found autoregressive coefficients of order 0.2 in weekly returns in the '80s. This was good enough to make bets on, but the current correlations are much weaker.
- Humans have not shown much skill at making superior investments. Eighty percent of money managers cannot provide enough incremental return to cover their expenses. This is one of the reasons why most retail investors would do substantially better in an index fund than in an actively managed fund. Still, for us the message is just that this is a "high noise" situation.
- Those investors who have obtained superior records (Warren Buffett, Peter Lynch, etc.) have indeed made a great deal of money. The amount that they have made clears the Bonferroni criterion for significance --- but not by much.
- Most really rich folks got that way not by investment activities but by business activities --- taking the helm and making superior decisions in lines of business.
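
The noise arithmetic in the first bullet of this list can be made concrete. The sketch below takes the figures quoted there (daily mean of about 5 basis points, daily standard deviation of about 100 basis points) and asks how many days of data it would take for the sample mean to reach a t-statistic of 2. It assumes independent, identically distributed daily returns, which is an assumption of this illustration and not a claim about real markets. The function name and the threshold of 2 are illustrative choices.

```python
import math

def days_needed(mean, sd, t_target=2.0):
    """Days of i.i.d. data needed for the sample mean's t-statistic to reach t_target."""
    # t = mean / (sd / sqrt(n))  implies  n = (t_target * sd / mean) ** 2
    return math.ceil((t_target * sd / mean) ** 2)

# Figures quoted in the text, expressed in basis points to keep the arithmetic exact.
daily_mean_bp = 5      # 0.0005
daily_sd_bp = 100      # 0.01

noise_multiple = daily_sd_bp / daily_mean_bp
n_days = days_needed(daily_mean_bp, daily_sd_bp)
years = n_days / 252   # assuming roughly 252 trading days per year

print(noise_multiple, n_days, round(years, 1))
```

Under these assumptions the answer is 1600 trading days, a bit over six years, just to distinguish the index's own mean return from zero. A strategy's incremental return over a benchmark is typically smaller than that, so the required sample grows accordingly.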

Basic Reasons To Try

While one may be unlikely to find any massively compelling off-the-shelf ML technique for making returns that exceed one's benchmark, there are good reasons to try.

Success, though unlikely, would be very well rewarded.
There are forecasting successes. An early paper by Granger discusses some of these. It also points to his best guess for the requirements of more substantial success.
- One of these is nonlinearity of the model. Granger gives some examples.
- Another is the possibility of "predictability at times." These are not Granger's words but mine.
- Still, Granger's regime switching models provide a leading case.
- One does not need predictability all the time and everywhere, just some of the time --- somewhere.
- This last property is a special feature of financial series. It would not help you in engineering. The 747 really does have to stop when it is supposed to stop --- every damn time.
"A change in magnitude can produce a change in kind"
- Chess playing programs were horribly weak. Even in 1983, I could beat the best chess playing program in the world.
- Eventually, pure computing made it possible to beat the best chess players in the world. The key was not subtle heuristics or algorithms, but the simple capacity to draw out a huge decision tree --- and to incorporate powerful opening and end game data bases.
- Now, there are programs at Radio Shack that can beat me every time.

Fancier Reasons to Try

One can put the shoe on the other foot. Markets appear to be entities that learn. This has been formalized (it's still pretty informal!) by Andy Lo's Adaptive Markets Hypothesis.

Hey, We've Already Done Some Trying!

Expert Models. In these models, we make on-line choices of a "zero or a one" for the next period. Many investment decisions fit into this framework.
- Market Up or Market Down
- "Value" outperforms or "Growth" outperforms
- Sector A does better or Sector B does better.
Why the fit is imperfect --- but not too imperfect.
- We don't have to make all or nothing bets. We can form portfolios.
- Discretized portfolios can be reframed as "experts"
- We don't care as much about beating the "best expert" as we care about doing well. Keep in mind, here the "experts" are the strategies like "100% stocks", "50% stocks and 50% bonds" and "100% bonds".
- The analytical fact is that methods like weighted majority rule will guarantee that asymptotically you will do "as well" as the "best expert."
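
The weighted majority idea in the list above can be sketched for the "zero or one" setting. This is a minimal sketch of the deterministic weighted majority vote, in which each expert that guesses wrong has its weight multiplied by a penalty `beta`. The three toy experts and the outcome sequence are made up for illustration and are not market data. The bound computed at the end is the standard mistake bound for this algorithm.

```python
import math

def weighted_majority(expert_preds, outcomes, beta=0.5):
    """Deterministic weighted majority vote over binary (0/1) experts.

    expert_preds[t][i] is expert i's guess for period t; outcomes[t] is the
    realized 0/1 outcome. Returns (our_mistakes, expert_mistakes).
    """
    n = len(expert_preds[0])
    weights = [1.0] * n
    our_mistakes = 0
    expert_mistakes = [0] * n
    for preds, y in zip(expert_preds, outcomes):
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_one >= vote_zero else 0
        if guess != y:
            our_mistakes += 1
        # Shrink the weight of every expert that was wrong this period.
        for i, p in enumerate(preds):
            if p != y:
                weights[i] *= beta
                expert_mistakes[i] += 1
    return our_mistakes, expert_mistakes

# Made-up outcomes: "down" every third period, "up" otherwise.
outcomes = [0 if t % 3 == 0 else 1 for t in range(60)]
# Hypothetical experts: always up, always down, repeat yesterday's outcome.
expert_preds = [[1, 0, (outcomes[t - 1] if t > 0 else 1)] for t in range(60)]

beta = 0.5
our_mistakes, expert_mistakes = weighted_majority(expert_preds, outcomes, beta)
best = min(expert_mistakes)
# Mistake bound: (ln N + m_best * ln(1/beta)) / ln(2 / (1 + beta))
bound = (math.log(3) + best * math.log(1 / beta)) / math.log(2 / (1 + beta))
print(our_mistakes, expert_mistakes, round(bound, 1))
```

The bound grows linearly in the best expert's mistakes plus a term logarithmic in the number of experts. That is the sense in which the method does "as well" as the best expert up to a constant factor. Randomized variants tighten the constant.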
Classification problems are somewhat natural to financial decision problems.
- Most simply: "good" investment or "not" --- based on current observables.
- Tools like boosting or bagging are immediately relevant.
- Alex Braunstein will give us a nice example where our task is to guess the sign of the next day's market return. Base-rate-informed prediction gives about 51% correct guesses, but boosted logit will do substantially better --- about 60% on the out-of-sample data.
- There is an issue about "out of sample" versus "rolling":
  - rolling is uncommon in the ML literature but it is quite a nice and logical structure --- the investigation of which may have advantages.
  - it is most natural in a financial context
  - rolling is pretty likely to do better than "training-testing" blocking.
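
The difference between "training-testing" blocking and rolling evaluation can be sketched as follows. This is an illustration of the mechanics only. The predictor is a plain base-rate guess, the sign series is simulated, and the window lengths are arbitrary, so none of it reproduces the example mentioned above. Any classifier could stand in for the hypothetical `base_rate_guess`.

```python
import random

def base_rate_guess(train_signs):
    """Predict the majority sign (+1 or -1) seen in the training window."""
    ups = sum(1 for s in train_signs if s > 0)
    return 1 if 2 * ups >= len(train_signs) else -1

def block_accuracy(signs, n_train):
    """Fit once on the first n_train days, then test on everything after."""
    guess = base_rate_guess(signs[:n_train])
    test = signs[n_train:]
    return sum(1 for s in test if s == guess) / len(test)

def rolling_accuracy(signs, window):
    """Refit on the most recent `window` days before each one-day-ahead guess."""
    hits = 0
    for t in range(window, len(signs)):
        guess = base_rate_guess(signs[t - window:t])
        if signs[t] == guess:
            hits += 1
    return hits / (len(signs) - window)

# Simulated daily signs; the 0.53 up-probability is an arbitrary assumption.
rng = random.Random(0)
signs = [1 if rng.random() < 0.53 else -1 for _ in range(1000)]

acc_block = block_accuracy(signs, 500)
acc_roll = rolling_accuracy(signs, 250)
print(acc_block, acc_roll)
```

In the rolling version every guess uses only data available before the day being predicted, and the fit is refreshed each day. That mirrors how a forecast would actually be used in a financial setting.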
Expert Methods have Investment Model Implications!
- Consider Zinkevich's method applied to the convex function "underperformance compared to the best performer." The gradient step you make moves your portfolio by buying more of what did best yesterday.
- This is a very concrete version of "momentum" investing, and it may indeed be a winning strategy.
- Zinkevich's method offers new ways to think about momentum strategies.
- For one thing, it brings in statistical ideas without requiring a statistical model.
- Most likely there is a hidden model that makes Zinkevich's method "statistical" just as AdaBoost could be made statistical.
- When the weighted experts model is applied to portfolio problems, you also get a momentum strategy story.
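
The momentum flavor of the gradient step can be seen in a small sketch. It assumes one particular reading of the loss, namely the shortfall max(returns) - weights . returns, which is linear in the weights and has gradient -returns. It also assumes a long-only, fully invested portfolio, so each step is projected back onto the simplex. The returns and the fixed step size `eta` are made up for illustration. Zinkevich's analysis uses a step size that shrinks over time.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        # Keep the threshold from the largest k for which u_k stays positive.
        if u_k - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]

def ogd_step(weights, returns, eta=0.1):
    """One projected gradient step on the loss max(returns) - dot(weights, returns)."""
    # The gradient in the weights is -returns, so descending adds eta * returns.
    moved = [w + eta * r for w, r in zip(weights, returns)]
    return project_to_simplex(moved)

w = [1 / 3, 1 / 3, 1 / 3]
yesterday = [0.02, -0.01, 0.00]   # made-up returns for three assets
w_next = ogd_step(w, yesterday, eta=1.0)
print(w_next)
```

After the step, the weight on yesterday's best performer rises and the weight on the worst performer falls. That is the "buy more of what did best" behavior described in the list above.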
Universal Portfolio Theory
- Quite a while ago, Tom Cover suggested an algorithm for portfolio allocation which had some asymptotic properties that people found charming.
- This is related to --- but distinct from --- the weighted majority rules. We'll see how this works out.
- There is a nice bibliography with hotlinks for universal portfolio theory.
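
As a rough illustration of Cover's idea, here is a grid approximation for two assets. Each grid point is a constant-rebalanced portfolio holding a fixed fraction in asset 1. The universal portfolio holds the wealth-weighted average of those fractions, starting from a uniform weighting. The grid size and the price relatives are assumptions made for this sketch, and the exact algorithm uses an integral where this sketch uses a finite grid.

```python
def universal_portfolio(price_relatives, grid_size=101):
    """Grid approximation to Cover's universal portfolio for two assets.

    price_relatives[t] = (x1, x2), each asset's price today divided by its
    price yesterday. Returns the fraction held in asset 1 for each period.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]  # candidate fractions in asset 1
    wealth = [1.0] * grid_size  # wealth so far of each constant-rebalanced portfolio
    fractions = []
    for x1, x2 in price_relatives:
        # Hold the wealth-weighted average of the candidate fractions.
        b = sum(g * s for g, s in zip(grid, wealth)) / sum(wealth)
        fractions.append(b)
        # Update each candidate's wealth with this period's price relatives.
        wealth = [s * (g * x1 + (1 - g) * x2) for g, s in zip(grid, wealth)]
    return fractions

# Made-up price relatives: asset 1 gains 10% each period, asset 2 is flat.
fractions = universal_portfolio([(1.1, 1.0)] * 5)
print(fractions)
```

Candidates that have done well carry more wealth and so pull the average toward themselves. This gives the allocation a family resemblance to the weighted expert schemes while remaining a distinct rule.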