The trading system development flowchart has two paths. One uses a traditional trading system development platform; the other uses machine learning.
Both paths begin with issue selection, data preparation, and transformations. Both lead through model fitting and validation, producing a set of trades that is the best estimate of future performance.
The upper path, indicator-based development, uses traditional trading system development platforms such as AmiBroker. The lower path uses machine learning.
The two have a fundamental difference. With traditional platforms, the technique begins with calculation of an indicator, then sees what happened after. With machine learning, the technique begins with identification of desirable trades, then sees what happened earlier. Both use models.
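The difference in direction can be sketched in a few lines of Python. This is only an illustration: the prices are made up, and the two-bar momentum indicator and the helper names are assumptions chosen for the example, not something either kind of platform prescribes.

```python
# Hypothetical closing prices, for illustration only.
closes = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0, 101.0, 105.0]

def next_day_change(prices, i):
    """Percent change from the close of bar i to the close of bar i + 1."""
    return (prices[i + 1] - prices[i]) / prices[i] * 100.0

def momentum(prices, i, lookback=2):
    """An assumed indicator: percent change over the last `lookback` bars."""
    return (prices[i] - prices[i - lookback]) / prices[i - lookback] * 100.0

# Bars that have both enough history and a following bar.
usable = range(2, len(closes) - 1)

# Indicator-first (traditional platform): compute the indicator,
# then see what happened after.
indicator_first = [
    (i, next_day_change(closes, i))
    for i in usable
    if momentum(closes, i) > 0
]

# Target-first (machine learning): identify the desirable trades,
# then see what the indicator looked like earlier.
target_first = [
    (i, momentum(closes, i))
    for i in usable
    if next_day_change(closes, i) > 0
]

print("bars where the indicator fired:", [i for i, _ in indicator_first])
print("bars followed by a gain:", [i for i, _ in target_first])
```

The two lists select different bars from the same data, which is the point: one approach conditions on the indicator, the other on the outcome.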
Models are categorized in two ways. The first is by how they are fitted:
- Supervised. The model is supplied with known target values to use during fitting.
- Unsupervised. No known target is supplied.
The second is by what they produce:
- Classification. The model assigns each observation to one of a set of categories.
- Regression. The model estimates a continuous value.
Almost all models used in trading systems are supervised models.
Models that produce signals — such as Buy, Sell, beLong, beFlat — are classification models. Models that produce estimates — such as percentage change from today’s close to tomorrow’s high — are regression models.
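As a small sketch of the distinction, the same price history can be turned into either kind of target. The prices below are hypothetical, and the choice of beLong / beFlat labels and of close-to-next-high as the regression quantity simply follows the examples in the text.

```python
# Hypothetical daily bars, for illustration only.
closes = [100.0, 101.0, 99.0, 102.0]
highs = [100.5, 101.5, 100.0, 103.0]

last = len(closes) - 1  # the final bar has no "tomorrow", so it gets no target

# Classification target: a category for each day.
class_target = [
    "beLong" if closes[i + 1] > closes[i] else "beFlat"
    for i in range(last)
]

# Regression target: a continuous value for each day, here the percent
# change from today's close to tomorrow's high.
reg_target = [
    (highs[i + 1] - closes[i]) / closes[i] * 100.0
    for i in range(last)
]

print(class_target)
print([round(value, 2) for value in reg_target])
```

A supervised model would be fitted to one of these target lists; which one determines whether it is a classification or a regression model.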
Dr. Jason Brownlee has written several excellent and applicable essays, including this one: Difference Between Classification and Regression in Machine Learning.
Recall my suggestion for the two questions you should ask yourself when designing your trading system:
- What would I like to know about tomorrow? For this article, and for most of trading system development, I would like to know whether tomorrow’s close will be higher or lower than today’s close.
- What would I do if I knew that? If I knew that the price would be higher tomorrow, I would take a long position at today's closing price (or continue to hold a long position established earlier), and hold for one day. Otherwise, I would remain flat with no position.
The first question can be answered using a classification model that predicts one of two classes — higher or lower. Depending on that prediction, I would trade long or remain flat. Whether I can trade at the close depends on several things (all of which are feasible):
- the data is available
- the signal can be computed quickly
- either the data is available shortly before the close and I trade market-on-close (MOC), or I use closing data and trade in the after-hours session.
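The answer to the second question can be written as a simple rule. The sketch below assumes the predictions already exist (here they are hard-coded, hypothetical values) and that the fill is at the closing price with no commission or slippage; the function name is an illustrative choice.

```python
def one_day_long_trades(closes, predictions):
    """Return the percent gain of each one-day long trade.

    predictions[i] is the class predicted at the close of day i for
    day i + 1: "higher" or "lower". A "higher" prediction means be long
    from close i to close i + 1; anything else means stay flat.
    """
    trades = []
    for i in range(len(closes) - 1):
        if predictions[i] == "higher":
            gain = (closes[i + 1] - closes[i]) / closes[i] * 100.0
            trades.append(gain)
    return trades

# Hypothetical prices and predictions, for illustration only.
closes = [100.0, 102.0, 101.0, 103.0]
predictions = ["higher", "lower", "higher"]

trades = one_day_long_trades(closes, predictions)
print([round(gain, 2) for gain in trades])
```

Consecutive "higher" predictions are recorded here as separate one-day results, which matches marking a continued long position to market each day.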
The data science profession makes use of many types of models, including:
- linear and non-linear classification
- linear and non-linear regression
- decision tree
- support vector machine
- nearest neighbor
- neural network
The model most often used by traditional trading system development platforms is the decision tree. The decision tree model is built into the design and capabilities of the platform, and we have no other choice.
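To see why indicator rules amount to a decision tree, consider the kind of rule such a platform typically expresses, written here in Python rather than in any platform's own language. The indicators (RSI and a moving average) and the thresholds 30 and 70 are assumptions chosen for illustration.

```python
def rule_signal(rsi, close, moving_average):
    """A hand-written decision tree built from indicator comparisons.

    Each `if` is a branch of the tree and each returned value is a leaf.
    """
    if close > moving_average:      # first split: trend filter
        if rsi < 30:                # second split: oversold in an uptrend
            return "Buy"
        return "NoAction"
    if rsi > 70:                    # second split: overbought in a downtrend
        return "Sell"
    return "NoAction"

print(rule_signal(rsi=25.0, close=105.0, moving_average=100.0))
print(rule_signal(rsi=75.0, close=95.0, moving_average=100.0))
print(rule_signal(rsi=50.0, close=105.0, moving_average=100.0))
```

In this style the developer chooses the splits and thresholds by hand or by optimization, whereas a machine learning library would fit them from the data.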
The model used in a trading system based on machine learning can also be a decision tree, but one of the more complex models is often chosen instead. Ensemble models, which combine many individual models, are very effective.
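One simple way to combine individual models is a majority vote. The sketch below is a toy illustration of that idea only: the three "models" are hypothetical one-line rules standing in for fitted classifiers, and the feature names are invented for the example.

```python
from collections import Counter

def majority_vote(models, features):
    """Return the class predicted by the most models."""
    votes = [model(features) for model in models]
    return Counter(votes).most_common(1)[0][0]

# Three hypothetical stand-ins for fitted models. Each maps a dict of
# features to one of two classes.
def momentum_model(features):
    return "higher" if features["momentum"] > 0 else "lower"

def rsi_model(features):
    return "higher" if features["rsi"] < 50 else "lower"

def trend_model(features):
    return "higher" if features["close_vs_ma"] > 0 else "lower"

models = [momentum_model, rsi_model, trend_model]
today = {"momentum": 1.2, "rsi": 45.0, "close_vs_ma": -0.5}

prediction = majority_vote(models, today)
print(prediction)
```

Using an odd number of two-class models avoids tied votes. Practical ensembles usually also vary the data or features each member is fitted on, which this sketch does not attempt.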
Two of the videos I have posted to YouTube will help in understanding and comparing these two approaches as they apply to trading systems.
I recommend that you watch one or both of these videos, then continue with this series of articles.