PDF Version: P123 Strategy Design Topic 3B – Designing a Sales-based Valuation strategy
As a tool for assessing stock valuation, Sales has always been solid, but it got a very bad rap from having been abused back in the late 1990s to support bullish theses on stocks that were badly overvalued on the basis of earnings (assuming they even had earnings). The key to using Sales-based valuation successfully is to recognize that although it is analogous to PE, it is not a substitute for PE. It provides different information and relates, on its own terms, to the DDM starting point. It can be used whether or not PE or any other valuation metric is used. Each stands or falls on its own terms.
The Logical Foundation of PS
Our starting point, as usual, is the Dividend Discount Model (DDM):
P = D / (k-g)
D is dividend, k is required rate of return, and g is expected growth rate.
As usual, given the real-life impracticality of this formulation, we’ll use other approaches to nudge us in the direction of stocks for which DDM valuation is more likely than not to be reasonable.
- We saw in Topic 3A that we can substitute E * PR for D (E is earnings and PR is payout ratio).
- We also saw that as a practical investment-community culture thing, we can ignore PR and just use E (we noted the tendency of investors to treat even retained earnings as if it were in their hands and voluntarily returned to the corporation to be used for reinvestment).
That got us here:
P/E = 1 / (k-g)
Moving on, we know this:
E = S * PM where S is sales and PM is profit margin
Therefore, the following holds true:
P = (S * PM) / (k – g)
Continuing with some algebraic shuffling:
P / (S * PM) = 1 / (k – g)
P / S = PM / (k – g)
Voila. That’s the formula for a correct PS ratio. As with our formulas for DDM and PE, we can’t literally use it. But also as with the formula in PE (see Topic 3A), it pinpoints some important principles we can use in strategy design.
- Increases in interest rates, a key component of required return (k), tend to depress PS.
- Increases in risk, another component of k, also tend to depress PS.
- Increases in growth tend to push ideal PS upward.
- Increases in PM also tend to push PS upward.
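The four principles above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not Portfolio123 code; the function name `justified_ps` and the sample inputs (10% margin, 10% required return, 5% growth) are made-up illustrations, not values from any real company.

```python
def justified_ps(pm: float, k: float, g: float) -> float:
    """Theoretical P/S implied by P/S = PM / (k - g).

    pm, k and g are decimals (0.10 means 10%). All inputs are hypothetical.
    """
    if k <= g:
        raise ValueError("k must exceed g for the formula to be meaningful")
    return pm / (k - g)


# Baseline: 10% margin, 10% required return, 5% growth -> P/S of about 2
base = justified_ps(pm=0.10, k=0.10, g=0.05)

# Change one input at a time, holding "all else" equal
higher_k = justified_ps(pm=0.10, k=0.12, g=0.05)   # higher rates or risk
higher_g = justified_ps(pm=0.10, k=0.10, g=0.06)   # higher growth
higher_pm = justified_ps(pm=0.12, k=0.10, g=0.05)  # higher margin

print(f"base={base:.2f} higher_k={higher_k:.2f} "
      f"higher_g={higher_g:.2f} higher_pm={higher_pm:.2f}")
```

As with the DDM itself, this is useful for seeing the direction of each effect, not for computing a literal target ratio.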
Having seen the importance of qualifying a universe on the basis of risk and growth in order to make PE work, you should already be getting ideas for how you can model with PS. Here, though, you also have to be aware of margin. To say that Stock A, with a lower PS, is better than Stock B, margin must be included in your implied “all else being equal” caveat.
Now, let’s turn to EV/S or EVS or EV2S (enterprise value to sales). The formula for it looks a lot like that for PS because it’s based on a similar logical path from DDM:
EV / S = PM / (k – g)
The difference between PS and EV2S is in how we define PM. To validate use of the P in PS (the market value of what equity holders get) we need to imagine that somewhere along the line we’ll account via PM for all costs that occur in between S and E. Hence PM means Net Profit Margin.
EV positions us differently. We’re pretending the value of the net debt is being treated as if it were equity. Since we’re turning a momentary blind eye to the fact that it really is debt, we’ll also need to turn a momentary blind eye to the interest portion of the expense.
Now comes a place where you have latitude to either approximate or geek out. If you choose the latter, you would compute your own custom margin by eliminating interest expense (and adjusting for the impact of the tax deductibility of interest). On the other hand, this can be as good a time as any to call attention to the reality that geeking out to the max isn’t always necessary or even the best course of action. As per a great adage, we’re often better off vaguely right than precisely wrong. (That’s especially so given the inherent non-usability of the DDM and the need for approximation in everything we do.) It’s fine to simply define PM as operating margin or gross margin and call it a day.
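For those who do want to geek out, here is one way the custom margin might be computed. This is a sketch under my own assumptions: the text above does not give a formula, so the add-back of after-tax interest shown here is just one common approach, and the function name and sample figures are hypothetical.

```python
def pre_interest_margin(net_income: float, interest_expense: float,
                        tax_rate: float, sales: float) -> float:
    """Margin with interest expense removed, net of its tax shield.

    ASSUMPTION: interest is added back at (1 - tax_rate) because it is
    tax deductible. This is an illustrative formula, not one from the text.
    """
    after_tax_interest = interest_expense * (1.0 - tax_rate)
    return (net_income + after_tax_interest) / sales


# Hypothetical company, figures in $ millions
net_margin = 80 / 1000
custom_margin = pre_interest_margin(net_income=80, interest_expense=20,
                                    tax_rate=0.25, sales=1000)
print(f"net margin={net_margin:.3f} custom margin={custom_margin:.3f}")
```

In this made-up case the adjustment moves the margin from 8.0% to 9.5%, which helps show why simply using operating margin is often a good-enough approximation.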
What’s Better: PS or EV2S
Both suffer from use of TTM Sales as opposed to estimated sales, which we don’t have. But Sales is subject to a lot less distortion than EPS (and its plethora of whacky cost items that may or may not be present from period to period). With TTM sales, our vulnerabilities are twofold:
- Is there a significant external (industry or economy-wide) distortion that’s temporarily inflating or depressing sales?
- Is there a corporate structural change (noteworthy acquisition or divestiture) that makes the TTM sales figure less representative of the ongoing business than we wish it were?
We will have to address these concerns through screening rules that weed out such situations and/or through the two kinds of diversification available to us (the number of stock positions we hold, which mitigates the risk of oddball data items putting companies we’d rather not see into our result sets; and factor diversification, which reduces the impact a distorted sales figure will have on a value rank or a multi-style rank).
I prefer to control sales data risk through diversification. Doing it through screening is do-able, but not so easy. Screening out unusually hot or cold business conditions is probably better done through economic data points that apply to many kinds of stocks regardless of whether you care specifically about PS or EV2S. Eliminating companies that recently acquired or divested may be unproductive from a cost-benefit standpoint; it may eliminate too many companies that may be worth owning for other reasons. So in strategizing with sales-based value, I’ll tolerate oddball ratios and expect to control risk through diversification.
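The factor-diversification argument can be illustrated with simple arithmetic. The sketch below assumes an equally weighted composite of percentile ranks, which is my simplification and not a description of any specific Portfolio123 ranking system; the rank values are invented.

```python
def composite_rank(factor_ranks: list[float]) -> float:
    """Equally weighted average of factor ranks (0-100). Hypothetical helper."""
    return sum(factor_ranks) / len(factor_ranks)


# A stock that "should" rank 50 on every factor
clean = [50, 50, 50, 50]
# Same stock, but a distorted sales figure inflates one factor rank to 95
distorted = [95, 50, 50, 50]

shift_single = distorted[0] - clean[0]
shift_composite = composite_rank(distorted) - composite_rank(clean)
print(f"single-factor shift={shift_single}, "
      f"four-factor composite shift={shift_composite}")
```

With four equally weighted factors, the 45-point distortion in one factor moves the composite by only 11.25 points.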
As to PS versus EV2S, both are usable, but I’ve been evolving in the direction of EV2S.
One advantage EV2S offers is that it allows us to make apples-to-apples comparisons among companies with substantially different capital structures. Consider Table 1, which values two different companies with comparable fundamental business characteristics (same ratio of sales from each dollar of capital invested) but different capital structures.
Table 1
Item | Company A | Company B |
Equity (Book Value) | $1 billion | $1 billion |
+ Debt | none | $1 billion |
= Enterprise Value | $1 billion | $2 billion |
If Sales to Capital is . . . | 0.5 | 0.5 |
Then Sales would be . . . | $500 million | $1 billion |
Assume Pr2Book is . . . | 1.00 | 1.00 |
MktCap (P) to Sales | 2.00 | 1.00 |
Based on . . . | $1 billion Book / $500 million sales | $1 billion Book / $1 billion sales |
EV to Sales | 2.00 | 2.00 |
Based on . . . | $1 billion EV / $500 million sales | $2 billion EV / $1 billion sales |
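The arithmetic in Table 1 can be reproduced with a short Python sketch. The function and variable names are mine. As in the table, enterprise value is taken as market cap plus debt, with cash ignored for simplicity.

```python
def valuation_ratios(book_equity: float, debt: float,
                     sales_to_capital: float, price_to_book: float) -> dict:
    """Recompute the Table 1 rows for one company (amounts in $ millions)."""
    capital = book_equity + debt
    sales = capital * sales_to_capital
    market_cap = book_equity * price_to_book
    enterprise_value = market_cap + debt  # simplification: no cash offset
    return {
        "sales": sales,
        "ps": market_cap / sales,
        "ev2s": enterprise_value / sales,
    }


no_debt = valuation_ratios(book_equity=1_000, debt=0,
                           sales_to_capital=0.5, price_to_book=1.0)
with_debt = valuation_ratios(book_equity=1_000, debt=1_000,
                             sales_to_capital=0.5, price_to_book=1.0)
print(no_debt)
print(with_debt)
```

The two companies get different PS ratios (2.00 versus 1.00) purely because of capital structure, while EV2S comes out at 2.00 for both.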
If use of debt to double the amount of available capital winds up increasing risk, that would enter the picture through a higher k value in the denominator of the equation. We cannot, however, automatically assume the extra debt adds to risk. (This is actually a substantial area of contention in academia, with the well-known team of Modigliani and Miller famous for the proposition that changes in capital structure do not impact risk.) We do not need to take sides in this debate. All we need to do is recognize that the addition of debt may or may not increase risk, and use a formula for EV2S that would equalize the ideal ratio for these two companies if it turns out that the extra debt is neutral to risk, while at the same time trusting the k item to properly address the situation if risk levels do, in fact, differ.
A Simple EV2S in a Strategy
We’ll do the same here as we did with PE. We’ll assume, based on the logical path from DDM, that lower EV2S ratios are better all else being equal but recognize that all else often is not equal. So we’ll support our use of EV2S with other relevant factors, namely operating margin, growth and risk.
Let’s start with a simple strategy:
- Universe: PRussell3000
- Benchmark: iShares Russell3000 ETF
- Max No. Stocks: (i.e., All)
- Basic Backtest period: MAX (1/2/99 – 11/15/15)
- Rebalance: 4 Weeks
- Slippage: 0.25%
- Rolling Backtest Samples: Every Week
- Length of Sample: 4 weeks
- Test Period: MAX
- Screening Rules
  - FRank("EV/SalesTTM", #industry, #asc) > 90
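To make the screening rule concrete, here is a rough Python approximation of an industry-relative percentile rank in which lower values earn higher ranks. It is only a sketch of the idea: Portfolio123’s exact FRank computation (tie handling, missing data, percentile formula) may well differ, and the tickers, industries and EV/Sales values below are invented.

```python
from collections import defaultdict


def industry_percentile_rank(stocks, lower_is_better=True):
    """stocks: list of (ticker, industry, value) -> {ticker: rank from 0 to 100}.

    Ranks are computed separately inside each industry. With
    lower_is_better=True, the lowest value in an industry gets rank 100.
    """
    by_industry = defaultdict(list)
    for ticker, industry, value in stocks:
        by_industry[industry].append((ticker, value))

    ranks = {}
    for members in by_industry.values():
        # Sort worst-first so that list position grows with attractiveness
        ordered = sorted(members, key=lambda m: m[1], reverse=lower_is_better)
        n = len(ordered)
        for position, (ticker, _) in enumerate(ordered):
            ranks[ticker] = 100.0 * position / (n - 1) if n > 1 else 100.0
    return ranks


# Two hypothetical industries with very different EV/Sales levels
stocks = [(f"A{i}", "Alpha", 0.5 + 0.25 * i) for i in range(11)]
stocks += [(f"B{i}", "Beta", 3.0 + 1.0 * i) for i in range(11)]

ranks = industry_percentile_rank(stocks)
passing = sorted(t for t, r in ranks.items() if r > 90)
print(passing)
```

Note that B0 passes even though its EV/Sales of 3.0 is as high as the most expensive Alpha stock. That is what ranking relative to industry means: each stock competes only against its peers.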
Here are the test results.
Basic Backtest:
- Annualized Return: 14.44% vs. 5.18% for Benchmark
- Standard Deviation: 80% vs. 15.86% for Benchmark
- Max Drawdown: -66.02% vs. -55.77% for Benchmark
- Sharpe: 0.59 vs. 0.28 for Benchmark
- Sortino: 0.87 vs. 0.37 for Benchmark
- Beta: 1.37
- Annualized Alpha: 9.55%
Rolling backtest:
- Average “excess” return per 4-week period: +0.89%
- During Up periods: +1.76%
- During down periods: -0.50%
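For readers who want to see how figures like these are assembled, here is a minimal sketch of the up/down split. I am assuming a period counts as “up” when the benchmark return is zero or positive and “down” otherwise; that classification rule, the function name and the sample returns are my assumptions, not taken from the platform.

```python
def rolling_summary(samples):
    """samples: list of (strategy_return, benchmark_return) per period, in %.

    Returns the average excess return overall and split by benchmark direction.
    """
    def avg_excess(rows):
        if not rows:
            return float("nan")
        return sum(s - b for s, b in rows) / len(rows)

    up = [(s, b) for s, b in samples if b >= 0]
    down = [(s, b) for s, b in samples if b < 0]
    return {"all": avg_excess(samples),
            "up": avg_excess(up),
            "down": avg_excess(down)}


# Four made-up 4-week samples: (strategy %, benchmark %)
samples = [(3.0, 1.0), (2.5, 2.0), (-4.0, -3.0), (-1.0, -2.0)]
summary = rolling_summary(samples)
print(summary)
```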
Essentially, we’re on script. EV2S worked because it was supposed to have worked. As noted, there is a rational basis for assuming low EV2S will, all else being equal, push us toward stocks more likely than not to be reasonably aligned with ideal DDM value.
But it’s not clear it worked as well as it might have. Note the high Beta and the negative average excess return in samples when the market fell. This, too, is on script. We sought out low EV2S ratios but did not do as much as we could have in addressing “all else being equal” (we didn’t completely ignore it given that the FRank was computed relative to Industry, but the commonalities among GICS industry peers seem insufficient to address “all else” to the extent we’d like).
So as with raw PE, this successfully tested strategy can often work in the real world, but our out-of-sample results have a built-in vulnerability: in down markets, often characterized by increases in risk and/or diminution in growth or margin, we’re prone to underperform. It’s not the down market per se that weakens this strategy. It’s the accompanying changes in the “all else” conditions. When we depart from the big aggregate portfolio and limit ourselves to, say, 10-20 stocks (or even worse, 5), we’re that much more vulnerable to being bitten by “all else.” So once again, as with raw PE, we have a strategy that tests well (not surprising given that so many up periods are included in the MAX testing interval) but contains the seeds of its own potential out-of-sample implosion (because we did not respect the DDM-derived theory).
Fleshing out our simple strategy to embrace “all else”
Our “all else” involves three things associated with higher EV2S ratios:
- Margin – Higher is associated with higher EV2S
- Growth – Higher associated with higher EV2S
- Risk – Lower associated with higher EV2S
As we change things up, we’ll stick with the same inputs for Universe, Benchmark, Basic Backtest, and Rolling Backtest.
- Screening Rules
  - FRank("EV/SalesTTM", #industry, #asc) > 90
  - Rating("Basic: Quality") > 65
    - This addresses risk, which I often prefer to measure using fundamentals rather than price-based metrics. In setting a cutoff, I’m keeping it very general. I don’t need the highest-quality stocks I can find. Decent quality should be fine, hence the cutoff at 65 (seeking approximately the top third).
- No. Stocks: 15
- Use Ranking System named “Supports EV2S”
  - Historic Margin and Growth – 50%
    - OpMgnTTM
      - Sort, hi is better, 25%
    - (OpMgnTTM - OpMgn%5YAvg)/abs(OpMgn%5YAvg)
      - Sort, hi is better, 25%
    - (Sales%ChgTTM - Sales5YCGr%)/abs(Sales5YCGr%)
      - Sort, hi is better, 25%
    - Sales%ChgTTM
      - Sort, hi is better, 25%
  - Expected Growth – 50%
    - LTGrthMean
      - Sort, hi is better, 100%
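Here is a simplified Python sketch of how a weighted ranking system like this one might combine its factors. It is an approximation under my own assumptions: each factor is converted to a 0-100 percentile rank and the ranks are blended by overall weight (four historic factors at 12.5% each, the growth estimate at 50%). Portfolio123’s actual handling of composite nodes, ties and missing values may differ, and the three stocks and their data are invented.

```python
def relative_change(current: float, average: float) -> float:
    """(current - average) / abs(average), the form used in the trend factors.

    Dividing by abs() keeps the sign right when the average is negative.
    """
    return (current - average) / abs(average)


def percentile_ranks(values):
    """Map each value to 0-100, where a higher value earns a higher rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(values)
    ranks = [0.0] * n
    for position, i in enumerate(order):
        ranks[i] = 100.0 * position / (n - 1) if n > 1 else 100.0
    return ranks


def composite_score(factor_values, weights):
    """Weighted sum of per-factor percentile ranks for each stock."""
    n = len(next(iter(factor_values.values())))
    totals = [0.0] * n
    for name, values in factor_values.items():
        for i, rank in enumerate(percentile_ranks(values)):
            totals[i] += weights[name] * rank
    return totals


# Overall weights implied by the 50/50 split described above
weights = {"op_margin": 0.125, "op_margin_trend": 0.125,
           "sales_growth_trend": 0.125, "sales_growth": 0.125,
           "lt_growth_est": 0.5}

# Three hypothetical stocks
factor_values = {
    "op_margin": [10, 20, 15],
    "op_margin_trend": [relative_change(10, 8), relative_change(20, 25),
                        relative_change(15, 15)],
    "sales_growth_trend": [relative_change(12, 10), relative_change(5, 10),
                           relative_change(8, 8)],
    "sales_growth": [12, 5, 8],
    "lt_growth_est": [15, 10, 12],
}
scores = composite_score(factor_values, weights)
print(scores)
```

The second stock has the best current margin but scores lowest overall, because its margin and sales trends are deteriorating and its growth estimate is the weakest of the three.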
Needless to say, this is far from the only way one might address margin, growth and risk. The possibilities are likely endless. But as to this combination of screening rules and factors, there are two points worth noting:
- There’s no need to be frantic about the risk factor. As noted, we’re not seeking conservatism per se, or even a low volatility portfolio. We’re simply incorporating a factor that helps us accept the headline notion that lower EV2S is better without having our thesis undone by a flight to risk.
- Any time one wishes to factor growth into a model, it’s hard to avoid use of historical growth data because for the most part, that’s what we have. Nevertheless, what we really need is growth in the unknowable future and often history is a poor indicator (in quant terms, growth rates tend to be weak in terms of “persistence,” something that will be discussed further when we address Growth as a topic). That’s why half of this ranking system uses the LTGrthMean item; there are questions that can be raised regarding believability, but for all its shortcomings, it may still be the best data-point we have.
Now, let’s see how this version of the model performs in test.
Basic Backtest:
- Annualized Return: 19.77% vs. 5.18% for Benchmark
- Standard Deviation: 34% vs. 15.86% for Benchmark
- Max Drawdown: -59.73% vs. -55.77% for Benchmark
- Sharpe: 0.75 vs. 0.28 for Benchmark
- Sortino: 1.11 vs. 0.37 for Benchmark
- Beta: 1.34
- Annualized Alpha: 15.51%
Rolling backtest:
- Average “excess” return per 4-week period: +1.24%
- During Up periods: +1.84%
- During down periods: +0.29%
The strategy is still volatile and is still far better in up periods than in market declines. But that’s not the end of the world. Not every strategy is stable, and it is legitimate to pursue an up-market-oriented strategy, so long as you understand that these characteristics exist. And we did improve down-market tested performance.
So it seems we do have a viable investable strategy.
Or not . . . one problem: I ran the model and looked at the passing stocks. I saw a huge concentration in biotech, not just now but in past periods as well. We could tolerate that for now and address it later, when the model is converted to a simulation and then a portfolio; in that module, we can limit industry concentration, and test again to check that performance remains OK. But that’s not ideal. If there’s something inherent in the model that makes it want to concentrate, I’d rather address it now than fight with it later on in the simulation platform.
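Eyeballing the list works, but the same check can be done mechanically. The sketch below tallies the share of positions per industry and flags anything above a limit. The 30% limit, the function names and the tickers are all hypothetical choices of mine, not platform settings.

```python
from collections import Counter


def industry_weights(holdings):
    """holdings: list of (ticker, industry) -> {industry: share of positions}."""
    counts = Counter(industry for _, industry in holdings)
    total = sum(counts.values())
    return {industry: count / total for industry, count in counts.items()}


def concentrated(holdings, limit=0.30):
    """Industries whose share of positions exceeds the (hypothetical) limit."""
    return {industry: weight
            for industry, weight in industry_weights(holdings).items()
            if weight > limit}


# A made-up 15-stock result set that is heavy in biotech
holdings = [(f"BIO{i}", "Biotechnology") for i in range(8)]
holdings += [(f"SFT{i}", "Software") for i in range(4)]
holdings += [(f"RTL{i}", "Retail") for i in range(3)]

flagged = concentrated(holdings)
print(flagged)
```

Running a check like this on the passing stocks for several past dates, not just the current one, would show whether the concentration is persistent.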
It appears the problem is our industry-sort ranking system. Often, this works to negate excess concentration by forcing us to consider the “best” in many industries, as opposed to concentrating in industries that are heavy in terms of particular characteristics. Biotech, however, is a strange animal with exceptional variety, consisting of big established firms as well as more speculative, R&D-intense firms. The key is that the numerical spread in the data on which we rank is big enough to create too many industry superstars.
We could simply eliminate biotech; there’s nothing wrong with screening out industries that you think will have characteristics incompatible with the idea that motivates your model. Often, it’s a good idea to do that. In this case, however, I’ll switch the ranking system to use of Universe sorts. I’m still at risk of concentration. But this type of risk is more tolerable to me since it’s pushing me more toward fundamentals I find desirable. So I usually sort based on Universe, and put on the brakes only if I find concentration too extreme.
Here’s what happens if the rank factors are sorted relative to Universe:
Basic Backtest:
- Annualized Return: 23.68% vs. 5.18% for Benchmark
- Standard Deviation: 59% vs. 15.86% for Benchmark
- Max Drawdown: -60.43% vs. -55.77% for Benchmark
- Sharpe: 0.86 vs. 0.28 for Benchmark
- Sortino: 1.27 vs. 0.37 for Benchmark
- Beta: 1.37
- Annualized Alpha: 19.27%
Rolling backtest:
- Average “excess” return per 4-week period: +1.54%
- During Up periods: +2.28%
- During down periods: +0.33%
Nice! We gained performance. But after the obligatory pat on my own back, I recognize that this wasn’t the goal. I just wanted to eliminate the concentration (which I did, based on having run the model and looked) without messing up performance too much (yes, I’d have settled for a reduction in order to wind up with a more palatable portfolio).
It’s still volatile and up-market oriented. Again, as long as we recognize that, it’s OK. But just for the heck of it, let’s try something different.
Combining EV2S with PE
Regardless of what you may know from other quant disciplines, less is not more when it comes to stock selection. We want stocks that perform well; we get no brownie points for limiting ourselves to stocks that do well for reason A while rejecting stocks that do well for reason B. This is especially crucial considering the nature of the data with which we work. Much of it is very “badly behaved.” Oddities happen all the time, to the point that often you aren’t really seeing what you think you’re seeing (Is EV2S high because the stock is overvalued? Or are we seeing a mirage, tracing, for example, to the company having just made a divestiture?). We can control some of this through our screening rules, but there are too many potential landmines (more than anyone can possibly catalog, as you can easily see if you make a habit of looking closely at individual companies/stocks). We can’t screen them all out. Often, factor diversification is the best way to manage the risk of an inadvertently mis-specified model (the logic being the same as diversifying among stocks to mitigate the impact of idiosyncratic risks).
The basic settings are as before. Here is the new screen:
- FRank("ProjPECurFY", #industry, #asc) > 75
- FRank("EV/Sales", #industry, #asc) > 75
Notice I cut the FRank thresholds to 75. Using 90 for two valuation thresholds may narrow the result set too much.
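A back-of-the-envelope calculation shows why. The sketch below assumes the two ranks are statistically independent and that the universe holds roughly 3,000 stocks. Both are simplifying assumptions of mine: valuation ranks are likely to be positively correlated, so the real pass counts would probably be higher, and stocks with missing estimates would drop out.

```python
def expected_passers(universe_size: int, thresholds: list[float]) -> float:
    """Expected number of stocks passing every 'rank > threshold' rule.

    ASSUMPTION: ranks are independent and uniform on 0-100, which real
    valuation ranks are not. This only illustrates the order of magnitude.
    """
    pass_rate = 1.0
    for threshold in thresholds:
        pass_rate *= (100 - threshold) / 100
    return universe_size * pass_rate


one_rule_90 = expected_passers(3000, [90])
two_rules_90 = expected_passers(3000, [90, 90])
two_rules_75 = expected_passers(3000, [75, 75])
print(f"{one_rule_90:.1f} {two_rules_90:.1f} {two_rules_75:.1f}")
```

Under these assumptions, two rules at 90 leave only about 30 candidates for the ranking system to choose 15 from, while two rules at 75 leave closer to 190.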
For the ranking (that gets us down to 15 positions), I’m going to try a change of pace. I’m going to switch to the pre-set “Comprehensive: QVGM” system. It interests me because:
- It touches all the bases I need to support the legitimacy of lower valuation ratios (growth, risk, and margin in the case of EV2S) but does so in a broader way.
- There’s a 25% allocation to historical growth here (the G in QVGM), with all its attendant limitations. But the M (Momentum) part of QVGM correlates generally with Sentiment, and for that to be positive, we’d have to assume the Street sees at least reasonable forward prospects.
- I especially cherish the 25% Q (Quality) part of the model. It’s much broader than what I’d been using before and takes the pressure off finding THE right way to define company risk.
Here are the test results:
Basic Backtest:
- Annualized Return: 17.69% vs. 5.18% for Benchmark
- Standard Deviation: 94% vs. 15.86% for Benchmark
- Max Drawdown: -65.96% vs. -55.77% for Benchmark
- Sharpe: 0.76 vs. 0.28 for Benchmark
- Sortino: 1.05 vs. 0.37 for Benchmark
- Beta: 1.04
- Annualized Alpha: 14.39%
Rolling backtest:
- Average “excess” return per 4-week period: +1.09%
- During Up periods: +1.12%
- During down periods: +1.04%
Bingo!
The backtest returns are definitely lower. But seriously, it’s just a backtest. Is 14.39% annual alpha really worse than 19.27% or even 23.68%? Seriously? No way! In the real world with live money, I’d be thrilled to get 4%-5% (actually, anything above zero). Both results suggest functionally identical conclusions: “My valuation ideas are sound and reasonably expressed in Portfolio123 language.” The test numbers don’t matter. The only thing that matters is the verbal conclusion.
Actually, though, there’s more. The beta is much lower, and the rolling test shows a much better balance between up and down periods.
- Why should I care if I just said the numbers per se don’t matter?
- They don’t.
- What matters is that the numbers confirm the idea I had before having run the test; that we could maintain a satisfactory potential return while reducing risk with a more broad-based set of factors.
Wrapping Up
We now have something that could potentially be used with live money. We still have to put in Sell rules. But that’s do-able.
That said, there is the question of whether the V (Value) part of QVGM is overkill; I did, after all, have a value-based screen.
The answer, here, is maybe. On the one hand, of all the “mistakes” investors make, having too much value (assuming we’re not undone by the dreaded “all else”) is probably the least of anyone’s problems. It’s OK to be wrong; just be a bit less wrong than everyone else. And the screen didn’t look for low value per se; it looked for attractive valuations relative to industry peers.
But no model can ever be the last word on anything. There are always better ideas around the corner. So I’ll leave it to you to explore more on your own. (As usual, the screens and ranking systems are saved under Group visibility.)
Ideas with which you can experiment:
- Change the screen to compute FRank relative to Universe
  - You’ll lose out by being more likely to see companies whose good numbers reflect the rising tide that lifts all industry boats, as opposed to company-specific strength. But you gain by having your results pushed toward industries that are better overall in terms of the qualities you seek. There is no single good-for-all-times answer. The choice is yours.
- Remove the V from QVGM; i.e., create your own QGM model.
  - Did Value overkill hurt, help, or was it neutral?
- Try QM (it might work; arguably, Momentum should correlate with growth expectations)
- Try QS (combine “Basic: Quality” and “Basic: Sentiment”). Is Sentiment, which substitutes analyst data for price data, a better or worse proxy for growth expectations?
- Try PS in lieu of EV2S
- Let your imagination run. (If you like technical analysis, try using it to tease out situations in which the market is reacting to sharp increases in growth expectations.) Share your experiences in the Group if you’d like.
Did You Notice . . .?
So who noticed what I did not do? It’s something big. It’s something many on Portfolio123 consider a critical part of strategy design.
Do you give up?
I never looked at Performance tests for the ranking systems, even the new one I created specifically for this exercise. (Even as of this writing, I still haven’t run such a test.)
It’s irrelevant. I don’t care.
How these ranking systems perform against a broad universe would have added nothing to the strategy-development process. Did you miss it as you followed the thought process? All of my thinking was pointed at how I could prevent low valuation ratios, something I know works (I know it from the theory; I don’t need tests to find that out), from being unraveled by all else not being equal. There’s no room in this process for a ranking system that tested well against a broad universe I’m not interested in using.
Get used to seeing this. I don’t test ranking systems when creating models I intend to use for my own real money. The only time I test systems is if I need to collect the information for someone else who might want to know. (If you ever have seen me publish a ranking-system test or if you see me do so in the future, know I did it only for purposes of the blog, article, etc., because I knew readers would want to see it.) To me, if a ranking system is part of an overall good model, that’s that.
P.S. Oh what the hell. I know you’re going to do it so I decided, just now, to test the “For Use With EV2S” ranking system. Run against the All Fundamentals Universe, it’s barely mediocre. But as I noted in the last topic, that’s a marshmallow universe. Stepping up to a test against the PRussell3000, the ranking system is an appalling pile of you-know-what. OK. So I know I can’t use it to pick the best 15 PRussell3000 stocks. But since I can’t imagine ever wanting to do that, I go back to what I said before: I don’t care.
Coming Attractions
Valuation based on cash flows, or P/The-Other-Kind-Of-E.