PDF Version: P123 Strategy Design Topic 6 – Momentum

This can be a perplexing topic for many.

We’re often told “the trend is your friend,” “don’t try to catch a falling knife,” “don’t get in front of a speeding train,” “ride your winners and sell your losers,” etc. We encounter the same idea in the many trend-oriented technical indicators built on moving averages and their countless combinations and variations. All of these reflect a belief in the efficacy of momentum, which is based on the assumption that rising stocks will continue to rise and falling stocks will continue to fall. Surely such strategies have produced good experiences; otherwise the mo-mo lingo wouldn’t be as prevalent as it is.

Would that life could be so simple.


We also hear that you should “buy low and sell high,” which is the exact opposite of momentum. We also have a family of technical indicators addressed to this, including RSI, Stochastic and more. Moreover, there’s abundant intellectual firepower dedicated to the quest to demonstrate that momentum is stupid, and too many stories of unfortunates who bought at the top. In fact, disdain for momentum investing is so prevalent that four academicians have found it worthwhile to publish a paper that articulates and refutes ten objections regularly leveled against this investment style: Clifford S. Asness, Andrea Frazzini, Ronen Israel, and Tobias J. Moskowitz, Fact, Fiction and Momentum Investing, Journal of Portfolio Management, Fall 2014 (40th Anniversary Issue).

Let’s elevate the discussion above the usual sidewalk-brawl level.

We can start with quantitative-finance research and its affinity for the acronym I.I.D., which stands for “independently and identically distributed.” That is one of the core assumptions underlying work in the field. When researchers say stocks are “identically distributed,” they are typically suggesting more than the idea that all distributions of equity returns have the same sets of characteristics; they also suggest that all these sets of returns satisfy the elements of a statistically normal distribution. That, in and of itself, is very controversial. But it’s not the subject of our immediate inquiry. Instead, we’re concerned with the other component of I.I.D., the assumption that stock returns are “independently” distributed, or put another way, that returns observed for one period have no influence on returns achieved in any other period.

That sounds good if you say it fast enough. But slowing down, consider consumer stocks that rally based on favorable news relating to employment or some other relevant metric. Are positive returns from one day to the next REALLY independent? We can easily observe serial correlation. Must we dismiss it as coincidence?

Let’s Slow Down and Try To Make Sense of All This.

Common sense makes it clear that in a vacuum (or in a basic Monte Carlo simulation), A (past returns) cannot and does not cause B (future returns). That, however, is the right answer to the wrong question. The correct question is whether persistence in C, an external phenomenon, can cause A and B, both of which are rationally related to C, to act in such a way as to give rise to the appearance that A is influencing B. If we’re right about the causal effects of C, then the answer is “yes”: we often observe the appearance of A causing B.

That is what opens the door to momentum investing – but not so widely as to foreclose the potential for you to get pummeled even if you did a good job detecting and measuring momentum. When C finally ceases to be, the stocks will cease acting as if it were still present. That’s when you need to sell your high flier or grab the fallen knife. When C vanishes, contrarianism enters. This is why both sets of contradictory adages can be simultaneously correct.
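The A, B and C relationship can be illustrated with a toy simulation. This is only a sketch under invented assumptions: the function names, the 10% standard deviations and the way persistence is modeled are all hypothetical and do not come from the tests discussed in this series. Each simulated stock gets a hidden driver C; A is never used to compute B, yet the two end up correlated whenever C tends to carry over.

```python
import random


def simulate_pairs(n_stocks, persistence, seed=0):
    """Return (A, B) return pairs for n_stocks hypothetical stocks.

    Each stock has a hidden driver C. The year-one return is A = C1 + noise.
    With probability `persistence` the same driver carries into year two
    (C2 = C1); otherwise a fresh, unrelated driver is drawn.
    A never appears in the formula for B.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_stocks):
        c1 = rng.gauss(0.0, 0.10)
        c2 = c1 if rng.random() < persistence else rng.gauss(0.0, 0.10)
        a = c1 + rng.gauss(0.0, 0.10)  # year-one return
        b = c2 + rng.gauss(0.0, 0.10)  # year-two return
        pairs.append((a, b))
    return pairs


def correlation(pairs):
    """Pearson correlation between the A values and the B values."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    return cov / (var_a * var_b) ** 0.5


persistent = correlation(simulate_pairs(5000, persistence=0.9))
vanished = correlation(simulate_pairs(5000, persistence=0.0))
print(f"C persists: corr(A, B) = {persistent:.2f}")
print(f"C vanishes: corr(A, B) = {vanished:.2f}")
```

When the driver persists, A looks like a useful signal for B; when it does not, the apparent link disappears even though nothing about A itself has changed.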

The key to momentum investing is detecting and measuring C and its persistence (or lack thereof). If you get C right, you can make a monstrous amount of money based on momentum. If your confidence is high, you should aim at the most volatile corners of the market you can find. (This, by the way, is the reason so many nano-cap negligible-liquidity Portfolio123 strategies did so well for so long; they rode the biggest C factor of all, the collapse in interest rates and the surge in excess capital, which, as a liquidity cycle peaks, flows into the lowest quality assets, the ones that couldn’t attract capital back when investors had more plausibly-valued buying opportunities. Helping too was the general persistence of other C factors relating to other aspects of the economy.)

Thinking About A, B and C in a Real-World Context

Many investors do homework to discover and assess the likely ramifications of C. That’s economic and fundamental analysis.

Others, however, may use Signal A alone, either piggybacking on the efforts of those who engage in fundamental analysis or implicitly relying on an assumption that the kinds of fundamental factors that generate A are more likely than not to be sufficiently persistent to also cause B. That, rather than a simplistic “if A then B” framework, is the logical underpinning to momentum investing. It’s also the essence of technical analysis. (Note though that many technical practitioners don’t talk in these terms. Many prefer to think in terms of stocks having souls or personalities of their own. But serious practitioners, including the kinds who invent rather than just follow trading systems, don’t use soft notions like that. They understand the role of C factors and rigorously analyze the way traders and investors behave in response to them. That’s what underpins the indicators and systems they invent.)

Let’s envision this in market terms. Assume that in Year One, the company introduces a new product that turns out to be very successful (event C_0) causing net income to exceed initial expectations (event C_1). Assume, too, that dividend rises by more than previously expected as do expectations of future dividend growth. Finally, assume the risk and required rate of return do not change. Based on the modifications investors make to two of the dividend discount model’s inputs, even the most ardent efficient-market adherent would expect the stock to perform well. (The efficient market adherent would also suggest that the stock’s adjustment to the new state of affairs would be so rapid as to eliminate the possibility that a thoughtful investor would be able to buy shares quickly enough to generate alpha as a result of having anticipated the development. But they would not deny that the stock price would have to rise.) Hence the stock displays positive movement (C_0 causes C_1 which causes A). Let’s now assume it takes more than a year for distribution of the new product to reach the point where the market is saturated and that earnings and dividend growth in year two exceed even the recently-revised expectations (C_0 and C_1 now cause C_2). That would lead to further strong share performance as valuation continues to be reassessed (C_2 causes B). So again in Year Two, the stock performs well.

Obviously A (the Year-One momentum) did not cause B (the strong Year-Two performance). Instead, A and B were both ultimately caused by C_0, a set of business fundamentals (the introduction and success of the new product), which sparked two serial and related phenomena, C_1 and C_2.

That being the case, we’re equally unable to say that the Year-Two return is truly independent of the Year-One return. While the former was not directly caused by the latter, we cannot completely separate them. A momentum investor who observed the strong returns achieved in Year One could have made an overt or implicit assumption that others (fundamental investors whose demand for and purchases of shares were causing the stock to generate superior returns) were responding to favorable business fundamentals, and that these change slowly and infrequently, making it reasonable for momentum investors to ride piggyback, so to speak, on the analytic efforts of such others. In the alternative, a fundamental investor could come upon the stock through a stock screen based on momentum and choose to do his own research, and ultimately buy the stock based on a belief that the product will continue to favorably impact earnings and dividends in Year Two.

Hence we have two scenarios in which momentum investing succeeds based on clearly identifiable and understandable reasons. And we accomplished this based entirely on the limiting assumption that P=V. If we add Noise to the mix, i.e. we recognize that P=V+N, the opportunities for successful momentum investing are amplified as investors can ride piggyback on or be inspired to do research based on the variety of non-value-based factors that could induce investors to buy or sell a stock. As was the case with value-based momentum, noise-based momentum will or will not succeed based on whether the factors that influenced returns in Year One persist in Year Two.
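To put rough numbers on the P=V case, here is a minimal sketch that assumes the constant-growth form of the dividend discount model, V = D / (R - G). Every input below (the dividends, the growth rates and the 9% required return) is invented purely for illustration; none of it comes from the original example.

```python
def ddm_value(dividend, required_return, growth):
    """Constant-growth dividend discount model: V = D / (R - G)."""
    if required_return <= growth:
        raise ValueError("required_return must exceed growth")
    return dividend / (required_return - growth)


R = 0.09  # required return, held constant as in the example

# Hypothetical expectations before the new product succeeds.
start = ddm_value(dividend=1.00, required_return=R, growth=0.04)

# Year One: C_0 and C_1 lift the dividend and the expected growth rate.
year_one = ddm_value(dividend=1.10, required_return=R, growth=0.05)

# Year Two: C_2 beats even the revised expectations.
year_two = ddm_value(dividend=1.25, required_return=R, growth=0.055)

return_a = year_one / start - 1     # event A
return_b = year_two / year_one - 1  # event B

print(f"A (Year One): {return_a:.1%}")
print(f"B (Year Two): {return_b:.1%}")
```

Both returns are positive, yet neither is computed from the other. Each comes from a fresh revision to D and G, which is the sense in which C, not A, drives B.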

So in practical terms, the controversy regarding momentum investing is not really a debate over whether it’s correct to say whether sequential events A and B are truly independent. Instead, for investors, the issue is whether C is sufficiently and systematically sustainable enough to infer, based on observation of A, the likely occurrence of B. Given this framework and an understanding that momentum investing can indeed be built on a foundation of fundamentals, let’s now consider some research into whether and how momentum may be used to generate alpha.

Examining Some Momentum Factors/Formulas/Ranking Systems

There are countless ways the presence or absence of momentum, and the strength of whatever trends might be present, can be measured. This is good in that it allows for much creativity. But it’s bad in that it is very easy to get overwhelmed by specific variations to the point where one gets completely lost in the trees and winds up unable to see the forest.

Keeping our focus on how and why momentum can work will help us cut through the clutter and quickly get to usable momentum expressions. I’ll demonstrate this by displaying results of a series of tests of 12 different and generally representative momentum approaches. All are valid to some degree. But in comparing which ones tested better, and searching for explanations in the above-discussed framework, we’ll be able to derive some general modeling principles.

All tests run from 1/2/99 to 6/10/16, operate on the PRussell3000 universe, compare a best (top 25%) group to a worst (bottom 25%) group, and exclude trading costs since I’m not trying to derive an investable strategy; instead, I’m exploring the way different kinds of models relate to one another. Each model will be tested using three different rebalancing intervals: 1 week, 4 weeks and 3 months.

We’ll start by looking at four simple models that define momentum in terms of share return over the last X number of days. As simplistic as this approach is, you’ll notice that it can be quite usable if you use a reasonable time frame for your X value and a reasonable rebalancing interval.

  • Share Return – 20 days

o  Best: FRank(“close(0)/close(20)”) > 75

o  Worst: FRank(“close(0)/close(20)”) < 25



  • Share Return – 60 days

o  Best: FRank(“close(0)/close(60)”) > 75

o  Worst: FRank(“close(0)/close(60)”) < 25



  • Share Return – 120 days

o  Best: FRank(“close(0)/close(120)”) > 75

o  Worst: FRank(“close(0)/close(120)”) < 25



  • Share Return – 240 days

o  Best: FRank(“close(0)/close(240)”) > 75

o  Worst: FRank(“close(0)/close(240)”) < 25
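For readers who want to reproduce the shape of these quartile tests outside Portfolio123, here is a rough sketch. It is a simplified stand-in, not FRank itself: the percentile scaling and tie handling are assumptions, and all function names are hypothetical.

```python
def n_day_return(closes, n):
    """Ratio of the latest close to the close n bars earlier.

    `closes` is a list of daily closes, oldest first. This mirrors the
    close(0)/close(n) expressions above. Returns None if history is short.
    """
    if len(closes) <= n:
        return None
    return closes[-1] / closes[-1 - n]


def percentile_ranks(values):
    """Map {ticker: value} to {ticker: rank}, 0 = lowest, 100 = highest."""
    ordered = sorted(values, key=values.get)
    count = len(ordered)
    if count == 1:
        return {ordered[0]: 100.0}
    return {t: 100.0 * i / (count - 1) for i, t in enumerate(ordered)}


def best_and_worst(values, upper=75, lower=25):
    """Split tickers into a best group (rank > upper) and worst (rank < lower)."""
    ranks = percentile_ranks(values)
    best = sorted(t for t, r in ranks.items() if r > upper)
    worst = sorted(t for t, r in ranks.items() if r < lower)
    return best, worst


# Hypothetical 60-day return ratios for nine made-up tickers.
ratios = {t: 1.0 + 0.01 * i for i, t in enumerate("ABCDEFGHI")}
best, worst = best_and_worst(ratios)
print("best:", best)
print("worst:", worst)
```

Swapping the lookback passed to `n_day_return` (20, 60, 120 or 240) is all it takes to move between the four variations listed above.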



It’s easy to fall into the trap of assuming it’s important to stay fresh, as “up to date” or better still “up to the minute” as possible. It’s easy to see why, given the way the stock market is depicted: from the old continually scrolling ticker tape, with traders hanging out at brokerage offices and reacting immediately to what they see, to the massive amounts of money pros spend today on Bloomberg terminals and ever-faster parsing and dissemination of data, to the overall TV/internet/movie culture that depicts the stock market in terms of flashing screens and frantic gesticulating. Bear in mind though that we don’t (at least not to my knowledge) have empirical studies indicating whether tape watchers or Bloomberg users outperform others.

So let’s slow down and remind ourselves why momentum can work. It works when it is caused by persistence of core fundamental drivers of share prices. Such persistence is not something that can be measured in minutes or a day, or often, even in a week. So as surprising as this may seem, jackrabbit-like behavior is not a good idea for momentum investors. Believe it or not, momentum requires patience. Persistence needs time to manifest and influence the stock.

This is not to say nobody can make a buck with rapid-fire trading. People can and do. But they work under entirely different frameworks than we do and have different tools and data sets. And they most definitely do not do things because somebody told them they’re supposed to do it a certain way. They study and analyze behavior of market participants at least as thoroughly as we study company fundamentals.

Just as I’ve been preaching that our modeling should be based on the DDM and logical adaptions thereof, those who succeed as traders base their strategies on models of behavior. You can even see an example of that with the test of the 20-day share returns. Superficially, it looks like support for the notion of mean reversion. But mean reversion in the market does not exist in its own right; it’s just a convenient label casual observers stick onto some sort of phenomenon. When we speak of pricing mean reversion (including bounces or corrections), we’re really talking about tendencies of a subset of traders to book quick profits, or make contrarian buys. All transactions involve buyers and sellers and both parties are influenced, each in their own way, not just by fundamentals but also price in relation to their past transaction prices and in relation to fundamentals. Mean reversion will be noticed by casual observers if, when and to the extent behavioral factors dictate that stocks move this way. And as the 20-day price change model as well as the 5-20 day SMA comparison remind us, these behavioral tendencies tend to be quite entrenched.

Unless you really study and understand the behaviors that make short-term trading what it is, I suggest you avoid the shortest measurement periods and the shortest rebalancing intervals. You can data mine these things. But making them work in real life over sustained time periods is a treacherous task, unless you really understand the whys and wherefores of what you’re seeing.

We’ll see similar ideas play out in the next three models, which are based on simple moving average comparisons.

  • 5-day Simple Moving Average relative to 20-day SMA

o  Best: FRank(“sma(5)/sma(20)”) > 75

o  Worst: FRank(“sma(5)/sma(20)”) < 25



  • 20-day Simple Moving Average relative to 60-day SMA

o  Best: FRank(“sma(20)/sma(60)”) > 75

o  Worst: FRank(“sma(20)/sma(60)”) < 25


  • 50-day Simple Moving Average relative to 200-day SMA

o  Best: FRank(“sma(50)/sma(200)”) > 75

o  Worst: FRank(“sma(50)/sma(200)”) < 25
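The moving-average comparisons reduce to one ratio per stock, which can then be ranked cross-sectionally like any other factor. A minimal sketch follows; the helper names are hypothetical and this is not Portfolio123’s own sma() implementation.

```python
def sma(closes, window):
    """Simple moving average of the most recent `window` closes.

    `closes` is a list of daily closes, oldest first.
    Returns None when there is not enough history.
    """
    if len(closes) < window:
        return None
    recent = closes[-window:]
    return sum(recent) / window


def sma_ratio(closes, short, long):
    """Short-window SMA divided by long-window SMA, e.g. sma(50)/sma(200)."""
    short_avg = sma(closes, short)
    long_avg = sma(closes, long)
    if short_avg is None or long_avg is None:
        return None
    return short_avg / long_avg


# A made-up steadily rising series and a steadily falling one.
rising = [float(p) for p in range(1, 201)]
falling = list(reversed(rising))
print("rising 50/200:", sma_ratio(rising, 50, 200))
print("falling 50/200:", sma_ratio(falling, 50, 200))
```

A ratio above 1 means recent prices sit above the longer-run average; ranking stocks on that ratio and taking the top and bottom quartiles gives the best and worst groups.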


We see, here, pretty much what we saw with simple share-price change. Patience is a virtue. Being a jackrabbit is not. The phenomenon of persistence, by its very nature, needs time to play out.

Let’s switch gears now and check five ranking systems. Three will be pure momentum. One will be a combination of momentum and analyst sentiment. The other will be all sentiment. Sentiment (which we examined closely in Topic 5) generally correlates with momentum so in one sense, it is part and parcel of the same style. But it will be interesting to see if the considerations that make momentum usable play out when we turn toward “softer” human versions of the approach.

We’ll start by looking at three basic models.

  • Basic: Momentum ranking system

o  Best: Rank>75

o  Worst: Rank <25

o  Parameters

  • Price Change (65%)
  • Close(0)/close(120) – 20%, higher is better
  • Close(0)/close(140) – 20%, higher is better
  • Close(0)/close(160) – 30%, higher is better
  • Close(0)/close(180) – 30%, higher is better
  • Up/Down Ratio (35%)
  • UpDownRatio(120,0) – 30%, higher is better
  • UpDownRatio(60,0) – 30%, higher is better
  • UpDownRatio(20,0) – 30%, higher is better



  • Alternative Momentum 1

o  Best: Rank>75

o  Worst: Rank <25

o  Parameters

  • Close(#week)/close(6*#month+#week) – 100%, higher is better
  • This is a 6-month return change stepped back to exclude the most recent week



  • Alternative Momentum 2

o  Best: Rank>75

o  Worst: Rank <25

o  Parameters

  • PctAvg(11,20,20) – 100%, higher is better
  • This is a variation on the preceding theme; here, we’re using a longer overall period and eliminating the most recent month rather than offsetting a full period by a week



  • Momentum-Sentiment combination ranking system

o  Best: Rank>75

o  Worst: Rank <25

o  Parameters

  • Price Momentum (50%)
  • Close(0)/HighVal (241,0,#High) – 25%, higher is better
  • Close(50)/sma(200) – 25%, higher is better
  • PR52WRel%Chg – 25%, higher is better
  • PR26WRel%Chg – 25%, higher is better
  • Analyst Data (50%)
  • NextFYUpRevLastWk/NextFYUpRev4WkAgo – 25%, higher is better
  • Surprise%Q1 – 25%, higher is better
  • AvgRec/AvgRec4WkAgo – 25%, lower is better
  • (CurFYEPSMean-CurFYEPS4WkAgo)/abs(CurFYEPS4WkAgo) – 25%, higher is better



  • Basic: Sentiment ranking system

o  Best: Rank>75

o  Worst: Rank <25

o  Parameters

  • Estimate Revision (50%)
  • (CurFYEPSMean-CurFYEPS4WkAgo)/abs(CurFYEPS4WkAgo) – 3%, higher is better
  • (CurQEPSMean-CurQEPS4WkAgo)/abs(CurQEPS4WkAgo) – 3%, higher is better
  • CurQEPSStdDev/abs(CurQEPSMean) – 3%, lower is better
  • Surprise (30%)
  • Surprise%Q1 – 65%, higher is better
  • Surprise%Q2 – 35%, higher is better
  • Recommendations (20%)
  • AvgRec – AvgRec4WkAgo – 75%, lower is better
  • AvgRec – 25%, lower is better


All of these ranking-based approaches appear to be viable and it’s likely many others will test well too. One important advantage of a ranking system is that it can have more than one factor, and contrary to what statisticians preach in other disciplines, multiplicity is often a good thing for investment modeling because we are not concerned with measuring anything per se but instead with creating necessarily imprecise (sometimes spectacularly imprecise) proxies for theoretical ideas (i.e. relating to DDM). When we search for the benefits of persistence, as we do when we model for momentum, there is no precise definition of persistence. Hence there is much to be said for using multiple factors in order to stay open to different ways in which persistence may manifest.
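The idea of blending several imprecise factors into one score can be sketched as follows. This is a simplified stand-in for a weighted ranking system, not Portfolio123’s actual ranking engine; the factor names, tickers and values are hypothetical, and the 65/35 weights are borrowed from the top-level nodes of the Basic: Momentum system purely for illustration.

```python
def rank_pct(values, higher_is_better=True):
    """Map {ticker: value} to {ticker: rank}, where 100 is most favorable."""
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    count = len(ordered)
    if count == 1:
        return {ordered[0]: 100.0}
    return {t: 100.0 * i / (count - 1) for i, t in enumerate(ordered)}


def composite_rank(factors, weights, higher_is_better):
    """Weighted blend of per-factor ranks, re-ranked to a 0-100 scale.

    factors:          {factor_name: {ticker: raw value}}
    weights:          {factor_name: weight}, normalized internally
    higher_is_better: {factor_name: bool}
    """
    total = sum(weights.values())
    tickers = list(next(iter(factors.values())))
    scores = {t: 0.0 for t in tickers}
    for name, values in factors.items():
        ranks = rank_pct(values, higher_is_better[name])
        for t in tickers:
            scores[t] += weights[name] / total * ranks[t]
    return rank_pct(scores)


factors = {
    "price_change": {"AAA": 1.30, "BBB": 1.10, "CCC": 0.90},
    "analyst_rec": {"AAA": 1.5, "BBB": 3.0, "CCC": 2.0},
}
weights = {"price_change": 65, "analyst_rec": 35}
direction = {"price_change": True, "analyst_rec": False}
print(composite_rank(factors, weights, direction))
```

Because each factor is converted to a rank before blending, no single noisy input can dominate through sheer scale, which is part of why multiplicity tends to help rather than hurt here.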

Interestingly, the single-factor systems that worked especially well were fairly long-term in nature, one addressing an approximately 6-month period and the other addressing a year. Use of a measurement period that long enables more varieties of persistence to fall within a single ambit.
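The “stepped back” construction used in Alternative Momentum 1 (a roughly six-month return that excludes the most recent week) can be sketched like this. The bar counts of 5 trading days per week and 21 per month are assumptions for illustration and may not match the values Portfolio123 assigns to #week and #month; the function names are hypothetical.

```python
WEEK = 5    # assumed trading days per week
MONTH = 21  # assumed trading days per month


def close_ago(closes, bars_ago):
    """Close from `bars_ago` bars back (0 = most recent).

    `closes` is a list of daily closes, oldest first.
    Returns None if the history is too short.
    """
    if bars_ago >= len(closes):
        return None
    return closes[-1 - bars_ago]


def skip_momentum(closes, lookback=6 * MONTH, skip=WEEK):
    """Return ratio over `lookback` bars, ending `skip` bars ago."""
    recent = close_ago(closes, skip)
    past = close_ago(closes, lookback + skip)
    if recent is None or past is None:
        return None
    return recent / past


# Made-up series: flat at 100, then a jump to 150 over the last five bars.
closes = [100.0] * 127 + [150.0] * 5
print("skipping last week:", skip_momentum(closes))
print("including last week:", close_ago(closes, 0) / close_ago(closes, 6 * MONTH))
```

In this made-up series the last-week jump leaves the stepped-back measure untouched, so a burst of very recent price action does not masquerade as six months of persistence.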

Another interesting thing we see is an increase in effectiveness of one-week rebalancing as we turn away from pure price-volume toward models that include reference to analyst sentiment. Is this an exception to the general principle that patience with momentum is a virtue? No, not really.

The benefit of patience is based on what is needed for fundamental persistence to become manifest. That’s important for pure price-volume-based momentum. Without persistence (and the patience it entails), we’d be stuck making the ridiculous assertion that A (past price action) directly causes B (future price action). Sentiment doesn’t work in these terms.

Sentiment, besides being able to serve as a proxy for the G item within the DDM, is also important outside the realm of valuation. It can be and often is an important generator of noise. Estimates rise. Investors react in a knee jerk manner and move the stock up. That’s noise. It’s direct and it’s immediate. That’s what allows sentiment data to be effective for the short rebalancing interval, even while it can also be usable with longer intervals (i.e. the time it takes for the things analysts expect to play out in the business world and then in the stock market).

Finally, let’s look at three larger tables that present the foregoing information in a different way. We’ll start by looking at a table that examines how all models compare to one another assuming all are rebalanced every week. The other two tables will show similar comparisons but for 4-week and 3-month rebalancing protocols.




 Obviously, this set of tests does not cover every possible variation on the momentum theme. So it is very possible you now have or may develop different approaches that are usable. As you do that, and as you test and interpret your results, keep in mind the following core principles.

  • Because Momentum is a proxy factor, one that is useful not in and of itself but because it represents something else (persistence of relevant fundamentals), aim for patience. This applies both to the factor itself (a 240-day measurement period, for example, as opposed to a 5-day period) and to the rebalancing interval. There is no single set of correct answers (sometimes 4 weeks is better than 3 months, other times, it’s the reverse). What’s most important is that you be aware of what you are trying to do and the role of patient persistence.
  • If you find something that works in a faster manner, don’t use it unless you develop a rational explanation for why it’s OK to jackrabbit. The above remarks regarding Sentiment are an example. Remember, a successful test that cannot be explained is data mining or curve fitting. That can always be accomplished. But as many have seen, live-money (out of sample) performance is a whole different matter and for that, conceptual soundness is mandatory.

Finally, remember that the tests illustrated here were academic in nature. They compared best and worst quartiles of an approximately 3,000-stock universe. Each of those quartile buckets contained approximately 750 stocks.

Recall, too, that we looked at Momentum alone. In terms of our formal theoretical structure, we could, obviously, see Momentum as Noise. And in terms of our DDM (Dividend Discount Model) construct, its role is primarily to serve as a proxy for G (Growth) under the theory that the stock would not have been behaving this way unless investors expected good things etc., etc., etc. However, none of the foregoing tests used Momentum in a full-fledged context; we did not address the relationship between P and E (or any proxy for E) nor did we address R, risk.

It’s possible you may be able to get from a concept that works in terms of 750-stock quartile buckets down to an investable portfolio of 10-40 stocks based on Momentum alone. But it won’t be easy. When building investable models, you may well find it easier to use momentum as one of a collection of relevant factors.

Here, for example, is sim data for an in-progress model I’m working on. It starts with a custom universe built from the PRussell3000 but which carves out a momentum subset consisting of the top third based on criteria similar to those of ranking system “Alternative Momentum 1” as described above. (The universe also omits ADRs and MasterLPs, stocks with market caps below $250 million, and stocks for which 60-day average dollars traded is less than $500,000.)

I am going to keep the details proprietary given the nature of future uses to which this may be put, but I can say that the ranking system applied to this Liquid Alternative Momentum 1 universe is a Quality-Value-Sentiment combination (notice how I use ranking-system-like factors to create the universe and then run a ranking system against it; this gives you a dual-layer ranked model). The Buy and Sell rules are primarily Value oriented. And, of course (for me), there is no hedging, nor are there any rules that include or exclude stocks on the basis of broad market performance.

It’s a 35-stock model and is rebalanced once every three months. Here are the sim results.





Essentially, I started with a universe defined by momentum in a broad academic sense similar to what the tables show. I worked my way down from the mega-bucket (about 900 stocks) to 35 using non-momentum DDM-related proxy factors. I could of course go the other way; use momentum ideas to filter a universe or sub-universe of value or quality stocks.
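The dual-layer structure (momentum defines the universe, then a separate ranking picks the portfolio) can be sketched as below. The actual model’s factors and rules are proprietary and are not reproduced here: every field name is hypothetical, `second_layer_score` merely stands in for whatever non-momentum ranking is applied, and the order in which the momentum cut and the liquidity filters run is an assumption.

```python
def build_universe(stocks, keep_divisor=3,
                   min_cap=250e6, min_dollar_volume=500_000):
    """Layer one: keep the top 1/keep_divisor by momentum, then filter.

    Each stock is a dict with hypothetical keys: ticker, momentum,
    market_cap, avg_dollar_volume, excluded_type, second_layer_score.
    """
    by_momentum = sorted(stocks, key=lambda s: s["momentum"], reverse=True)
    keep = max(1, len(by_momentum) // keep_divisor)
    return [
        s for s in by_momentum[:keep]
        if s["market_cap"] >= min_cap
        and s["avg_dollar_volume"] >= min_dollar_volume
        and not s["excluded_type"]  # e.g. ADRs and master LPs
    ]


def pick_portfolio(universe, n_stocks=35):
    """Layer two: rank the momentum universe on a non-momentum score."""
    ranked = sorted(universe, key=lambda s: s["second_layer_score"],
                    reverse=True)
    return [s["ticker"] for s in ranked[:n_stocks]]


def make_stock(ticker, momentum, market_cap, score):
    """Build a made-up stock record for the demo below."""
    return {"ticker": ticker, "momentum": momentum, "market_cap": market_cap,
            "avg_dollar_volume": 5e6, "excluded_type": False,
            "second_layer_score": score}


sample = [
    make_stock("T1", 1.6, 1e9, 10), make_stock("T2", 1.5, 1e8, 99),
    make_stock("T3", 1.4, 1e9, 50), make_stock("T4", 1.3, 1e9, 70),
    make_stock("T5", 1.2, 1e9, 5), make_stock("T6", 1.1, 1e9, 1),
]
universe = build_universe(sample, keep_divisor=2)
print(pick_portfolio(universe, n_stocks=1))
```

Reversing the layers, as the paragraph above suggests, would only mean sorting on `second_layer_score` in `build_universe` and on `momentum` in `pick_portfolio`.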

We’re Done!

This concludes the strategy-design seminar series.

The field is vast and ever evolving so I won’t pretend that I gave you more than a tiny fraction of everything you would want or need to know. Actually, though, nobody can do more.

What I have attempted to do is give you the perspectives you need to make sense of what you see out there, create sensible models that have a reasonable probability of performing well relative to the market with live money, and most important, the ability to discover more on your own.

