Statistical Validity of Non-Trend-Following Technologies
For Automated Trading Systems
In Non-Stationary Markets

by Thomas W. Wright


I am at once amused and saddened by the lack of statistical rigor employed by those among us who choose to be in this business of building automated trading systems. It behooves each of us to maintain in our minds the central question of this occupation: "Can I be statistically confident that the system on which I am working, including its parameters, is within the population of profitable systems?" Or asked another way: "Can I reject the null hypothesis that these trading results could have been achieved by chance?"
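
One common way to put the second question to a trade history is a sign-flip randomization test. The sketch below is an illustration only, not the procedure used in this article: it assumes the per-trade returns are available as a plain list, and the function name, the resample count, and the symmetry-about-zero null hypothesis are all choices made for the example.

```python
import random


def sign_flip_p_value(trade_returns, n_resamples=10_000, seed=0):
    """One-sided p-value for H0: each trade was as likely to lose as to win.

    Under H0 the sign of every trade return is arbitrary, so we flip signs
    at random and count how often the shuffled total matches or beats the
    total actually observed.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = sum(trade_returns)
    hits = 0
    for _ in range(n_resamples):
        resampled = sum(r if rng.random() < 0.5 else -r for r in trade_returns)
        if resampled >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)
```

A small p-value means the observed total would rarely arise from coin-flip trading. It says nothing about how many free parameters were tuned to produce the trade history, which is the subject of the rest of this article.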

The Demands Of Complexity Versus The Demands Of Statistical Validity

The demands of complexity are always in a tug of war with the demands of statistical validity. The demands of complexity may best be described by the Law of Requisite Variety from cybernetics. The law of requisite variety demands that an order-n problem requires an order-n solution. For example, if you are driving on the freeway and run out of gas, you have an order-1 problem. If you pull off onto the shoulder and blow out a tire, you now have an order-2 problem. Only an order-2 solution will get you going again. A not-flat tire won't do it. A can of gas won't do it. It will take both a tire and some gas to fix your order-2 problem.

This law suggests that since a market is a non-stationary process, nothing less complex than a non-stationary process can model it. Consequently, I view the solution to be a "non-stationary process" rather than an indicator, model, algorithm, or black box.

Let's say that you have correctly concluded that there are 10 essential forces driving the price and systemic non-stationarity of IBM common stock. Let's further say that you have isolated the 10 differential equations which together describe the stock market, the technical stock and computer sectors, and the economic conditions over the past year. Let's say that your model trades IBM perfectly, trading 25 times over the year.

You are exceedingly happy. You start trading, but suffer significant drawdown. Q: What happened? A: Your view of the problem solving domain was inadequate. Though you satisfied the demands of complexity, you failed to satisfy the demands of statistical validity.

The central limit theorem, stolen by the statisticians from the mathematicians, says that for our tenth-order solution to adequately explain the market forces, it must have sufficient data. It must demonstrate that, of all the possible solutions, ours is actually from the population of valid solutions. Over my 17 years studying this problem, people have brought me many trading systems which seem to trade well. They tout them as having been back-tested over 10 to 15 years, "therefore, they must be OK." I have had to explain to them that it is not the years of data that matter, but rather the number of buy/sell trading decisions made along the way. These would-be trading system builders usually do not have a clue about the subject of "degrees of complexity," which in this discussion I will call "degrees of freedom." I use "degrees of freedom" rather than "degrees of complexity" because, while the problem being solved may be measured by its complexity, problem solvers tend to curve-fit problems by adding more and more "free parameters" to their solutions. A degree of freedom is frequently associated with each free parameter in a simulated trade history. These builders usually have used far too many degrees of freedom in their models to be statistically valid. Most statisticians would demand 30 decisions for each degree of freedom (DOF).

The body of the chart gives the number of trading decisions per month required for indicators or models with 1 to 6 degrees of freedom simulating trades over 1 to 7 years. For example, if I am building an indicator with 3 degrees of freedom, and it only trades 2 times each month, then I need to validate it over at least 4 years of data. Or, if I am considering an indicator which has 2 degrees of freedom, and I only have 2 years of data, then the indicator must make at least 2.6 trades per month to be considered statistically valid. The formula for each entry in the table is TPM = (30*DOF) / (252*NYears/22), where TPM is trades per month and 252*NYears/22 is the length of the test period in months (252 trading days per year, 22 per month).
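
The formula is simple enough to tabulate directly. The following is a minimal sketch that evaluates TPM = (30*DOF) / (252*NYears/22) for 1 to 6 degrees of freedom and 1 to 7 years; the function and constant names are made up for this example, and rounding to one decimal place is an assumption chosen to match the figures quoted in the text.

```python
TRADING_DAYS_PER_YEAR = 252
TRADING_DAYS_PER_MONTH = 22
DECISIONS_PER_DOF = 30  # the "30 decisions per degree of freedom" rule of thumb


def trades_per_month(dof, n_years):
    """Minimum trades per month: TPM = (30*DOF) / (252*NYears/22)."""
    months = TRADING_DAYS_PER_YEAR * n_years / TRADING_DAYS_PER_MONTH
    return DECISIONS_PER_DOF * dof / months


def tpm_table(max_dof=6, max_years=7):
    """Rows are degrees of freedom 1..max_dof, columns are years 1..max_years."""
    return [
        [round(trades_per_month(dof, years), 1) for years in range(1, max_years + 1)]
        for dof in range(1, max_dof + 1)
    ]


if __name__ == "__main__":
    print("DOF" + "".join(f"{years:>6}y" for years in range(1, 8)))
    for dof, row in enumerate(tpm_table(), start=1):
        print(f"{dof:>3}" + "".join(f"{value:>7.1f}" for value in row))
```

For instance, `trades_per_month(2, 2)` rounds to 2.6, and `trades_per_month(2, 1)` and `trades_per_month(3, 1)` round to 5.2 and 7.9, the figures used elsewhere in this article.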

In our order-10 solution we consequently needed 300 trades (30 decisions for each of 10 degrees of freedom) and only had 25. That's at least one explanation for its failure. So one tries to lengthen the test data in order to simulate more trades, only to find that the 10 equations no longer work at all, because the years of data required to avoid "violating" the central limit theorem span qualitatively diverse market periods.

The Catch-22

If one uses enough data to be statistically valid with a useful level of confidence, the discriminating variables will come and go like tax strategies. We are forced to conclude that there can be no closed form solution.

If there cannot be an exact solution, then one can only attempt to emulate a solution, while keeping degrees of freedom, total trades, and trading frequency under control. [Don't try to understand this paragraph on first reading. Just skim to the next paragraph.] Predictive models must treat an idealized price/time series (an objective function) as the output of a non-linear dynamical system whose structure may be discovered directly or synthesized. That is, an indicator or model can at best emulate the solution as an adaptive process. The adaptive process by definition will have a half-life and consequently requires a smart analyst to keep it on track. The "half-life" comment is usually true if the input data is exogenous to price, in which case the losses may be serialized. Sometimes, if the input data is price, the algorithms used to produce the intermediate results (prior to signal generation) will sooner or later self-adapt to a trend, and the "half-life" comment is not true. In this latter case, the losses may still be serialized because the algorithms introduce too much lag to be profitable.

[This paragraph says the same thing as the preceding paragraph, but in a different way.] One may, with complete use of hindsight (yes, by cheating), construct an oscillator about zero (an objective function, a dependent variable, a perfect trader) that crosses through zero from negative to positive every time one would like to buy, and crosses through zero from positive to negative every time one would like to sell. Then, one may use statistical pattern recognition techniques (not chart patterns) to locate data (independent variables) which can be used to synthesize or emulate that perfect trader or objective function. If, during the process, one keeps a close rein on the degrees of freedom, one might emulate the oscillator such that the process is statistically valid (in which case it is expected to make money). Also, if the process is designed with adaptation in mind, one can tune it periodically (in practice, weekly) and keep it working. The process is non-linear over the long term, but parts of it are maintained/treated as if they are locally linear.
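
To make the "perfect trader" idea concrete, here is one hypothetical way to build a hindsight oscillator. It is a sketch under stated assumptions, not the construction used in this article: it smooths prices with a centered average (which deliberately looks into the future) and takes the forward difference of the result, so the oscillator turns positive at troughs and negative at peaks. The function names and the `half_window` parameter are invented for the example.

```python
def centered_average(prices, half_window):
    """Non-causal smoothing: each point averages past AND future prices."""
    n = len(prices)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)  # window is truncated at the ends
        smoothed.append(sum(prices[lo:hi]) / (hi - lo))
    return smoothed


def hindsight_oscillator(prices, half_window=2):
    """Forward difference of the smoothed series.

    Positive at bar i means the smoothed price rises from bar i to bar i+1,
    so the sign already "knows" the next move. That is the cheating.
    """
    smooth = centered_average(prices, half_window)
    return [smooth[i + 1] - smooth[i] for i in range(len(smooth) - 1)]


def zero_crossings(oscillator):
    """Return (bar index, "buy" or "sell") at each crossing through zero."""
    signals = []
    for i in range(1, len(oscillator)):
        prev, cur = oscillator[i - 1], oscillator[i]
        if prev < 0 <= cur:
            signals.append((i, "buy"))
        elif prev >= 0 > cur:
            signals.append((i, "sell"))
    return signals
```

Such a series is only a target to be emulated. Because it uses future prices, it can never be traded directly; the work lies in finding independent variables that reproduce its zero crossings without adding many degrees of freedom.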

Using the above techniques, one may satisfy both the demands of complexity and the demands of statistical validity. Of course, the usefulness of the resulting indicators will be a function of (1) the information content of the data chosen for independent variables, (2) the efficiency of the noise reduction employed, and (3) the ability to discern a buy/sell decision in data which invariably has a low signal-to-noise ratio.

Statistical Pattern Recognition

Statistical pattern recognition is that body of science, popularized in the 1980s by the American and Soviet navies, which almost totally eliminated the submarine prop wash noises detectable by enemy sonar. They got props to be so quiet that the largest amount of recognizable noise was coming from the galley cooks yelling at each other. Neural nets helped those studying this problem to realize that the remaining noise was coming from people. They then replaced the cooks with microwave ovens and frozen meals. But that's another story. We got a glimpse of that story in the movie "The Hunt for Red October." Remember, they were huddled over the sonar, wondering if the noise on the screen was from an enemy sub or a whale. Among other technologies, they were using statistical pattern recognition.

Pattern recognition is that discipline which recognizes structure in seemingly chaotic noise. There may be information in a time series, but it is covered by the noise. The signal-to-noise ratio is very small. The information about how many people stick with their positions overnight is very obvious in open interest data. But the very numbers telling us that information are themselves almost all noise, IF one is interested in whether one should be long or short. Noise frequently dominates the data, especially if one is looking at a short planning horizon and at price or volume data. The signal-to-noise ratio is quite small for that information. But it is there. And it takes statistical pattern recognition technology to find it. (It does not find much buy/sell information in price data.) To find information in a sea of noise requires two essential operations: noise reduction (or filtering) and statistical analysis. I have a mathematical transformation which removes noise (with excellent frequency response), which works with fractional days (moving averages only work with integral days), and delivers a well-behaved (oscillates about zero as does the perfect trader I'm trying to emulate) derivative surrogate of the time series. The best part is still to come - it does all that and only adds ONE degree of freedom to my solution (requiring only 30 trade decisions).

I have another statistical evaluation algorithm which can do a good job of both analyzing an input stream and converting it to a very well-behaved oscillator (hence it generates a trading signal), and it only adds ONE degree of freedom. Using those two together, I can take most data inputs and test them against my perfect trader, while keeping the degrees of freedom to only TWO or sometimes THREE. This means that I can emulate the perfect trader, generating more than 60 to 90 trades, and keep the total length of the training period to a year. That is, it will have to make from 5.2 to 7.9 trades each month to be statistically valid with 2 or 3 degrees of freedom and a year of data. The feat is somewhat remarkable. Unfortunately, the profitability has declined over time due to the non-stationarity and increased volatility of the markets.

Pity the analysts using systems which combine stochastics (with 1, 2, or 3 DOF) with moving averages (1 or 2 if crossing MAs or MACDs) and with additional rules (which each add 1 DOF). Those may have to reverse almost daily and work over 2 or 3 years to be considered statistically valid. And we all know how the markets can be qualitatively different over two years. I worked at a company which traded a 2-bullet system (stochastics and bandpass filters) with rules which had 15 degrees of freedom. My job was to validate their system. It didn't have a chance of making money.

Cross-Validation

The final step in the quest for statistical validity involves Cross-Validation. Cross-Validation is a statistical procedure used to avoid the problem of "over-fitting" the data, since many statistical patterns that appear to be useful are not real but rather "fool's gold." Cross-validation sometimes involves the sequestering of input data into two or (usually) three sets: (1) a learning set, (2) a testing set, and (3) a validation set. The three sets are subjected to successively harsher examination. Tests are performed to determine if parameter sets can migrate or adapt within and among the three test data sets. The obvious objective is to reject bogus statistical relationships before trading assets are lost. Too frequently, either the market non-stationarity or the lack of data is a problem. The market non-stationarity problem may be attacked by an adaptive walk-forward model. The lack of sufficient testing data may be attacked by a "vertical" approach, that is, by using varying sources of input data over the same short time horizon in order to increase the number of independent trading decisions. This process is especially useful for trading newly created sector funds, where parameter sets may be found that can trade the dominant stocks within the sector, and added together can trade the fund profitably.
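
The bookkeeping for the three-set split and for a walk-forward scheme can be sketched as follows. This is an illustration under assumptions, not a description of any particular validation platform: observations are assumed to be indexed 0 to n_obs-1 in time order, and the function names and the rolling (fixed-length) training window are choices made for the example.

```python
def chronological_split(n_obs, n_learn, n_test):
    """Split time-ordered observations into learning, testing, validation.

    The sets are contiguous and in time order, so the later sets are never
    seen while parameters are being fitted on the earlier ones.
    """
    if n_learn + n_test >= n_obs:
        raise ValueError("no observations left for the validation set")
    learning = range(0, n_learn)
    testing = range(n_learn, n_learn + n_test)
    validation = range(n_learn + n_test, n_obs)
    return learning, testing, validation


def walk_forward_windows(n_obs, train_len, trade_len):
    """Rolling windows: fit on train_len bars, then trade the next trade_len.

    Each step slides forward by trade_len, so every traded bar lies strictly
    after the data used to tune the parameters that trade it.
    """
    windows = []
    start = 0
    while start + train_len + trade_len <= n_obs:
        train = range(start, start + train_len)
        trade = range(start + train_len, start + train_len + trade_len)
        windows.append((train, trade))
        start += trade_len
    return windows
```

With daily bars, `walk_forward_windows(n_obs, 252, 5)` would correspond to a one-year training period retuned weekly, as described earlier. The 252 and 5 are assumed trading-day counts for a year and a week.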



© 1997-2004 Thomas W. Wright. All Rights Reserved