Calmar Ratio
The Calmar Ratio is annualized compound return divided by the absolute value of maximum drawdown. It measures how much return a strategy delivered per unit of its worst peak-to-trough loss, answering the question of whether the gains were worth the deepest pain along the way.
What is the Calmar Ratio?
The Calmar Ratio belongs to the family of risk-adjusted return measures, but it defines risk differently from Sharpe and Sortino. Where Sharpe divides excess return by total volatility and Sortino divides by downside deviation, Calmar divides annualized return by the single worst peak-to-trough decline the strategy suffered. The name comes from California Managed Accounts Reports, the newsletter run by Terry Young, who introduced the measure in 1991 as a more responsive alternative to longer-window return-to-drawdown ratios. It is sometimes called the return-to-drawdown ratio, and it is closely related to the older MAR ratio.
What makes Calmar distinct is that its denominator is not a statistical dispersion measure at all. Volatility and downside deviation summarize the whole distribution of returns; maximum drawdown is a single realized event, the deepest hole the equity curve ever fell into. That makes Calmar a deeply intuitive number for anyone who actually lived through a strategy. Volatility is abstract. A 45 percent drawdown is not. Calmar asks: for every point of worst-case loss you had to stomach, how many points of annual return did you collect?
As rough calibration, a Calmar ratio above 0.5 is acceptable for a long-only equity strategy, above 1 is good, and above 3 is exceptional and usually a sign of a short, lucky window. Managed-futures and trend-following programs are often judged on Calmar precisely because drawdown control is their selling point. Equity strategies that ran through 2008 or 2020 tend to post lower Calmar ratios than their Sharpe ratios would suggest, because one severe bear market can dominate the denominator for years.
Formula
Calmar Ratio = Annualized Return / |Maximum Drawdown|
Both inputs are expressed as decimals or as matching percentages. If a strategy compounds at 12 percent per year and its worst peak-to-trough fall was 30 percent, the Calmar ratio is 0.12 divided by 0.30, which equals 0.40. The numerator is the annualized growth rate, not the cumulative total return, so windows of different lengths can be compared on equal footing. The denominator is the maximum drawdown over the same window, taken as a positive magnitude.
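For readers who want the two inputs spelled out against an equity curve, one common way to write them is shown below. The symbols are notation chosen here for illustration: V_t is the portfolio value at time t, T is the window length in years, and MDD is the most negative peak-to-trough return in the window.

```latex
\[
R_{\text{ann}} = \left(\frac{V_{\text{end}}}{V_{\text{start}}}\right)^{1/T} - 1,
\qquad
\text{MDD} = \min_{t}\left(\frac{V_t}{\max_{s \le t} V_s} - 1\right),
\qquad
\text{Calmar} = \frac{R_{\text{ann}}}{\lvert \text{MDD} \rvert}
\]
% Worked example from the text: R_ann = 0.12 and MDD = -0.30
\[
\text{Calmar} = \frac{0.12}{\lvert -0.30 \rvert} = 0.40
\]
```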
The original Calmar convention measures both terms over a trailing 36-month window, which is what gives the ratio its responsiveness; a three-year lookback updates as the market changes rather than being anchored to ancient history. In backtest reporting, the more common practice is to compute Calmar over the full test window so that the worst drawdown in the entire history is reflected. SledgeKey reports Calmar over the full backtest window, using the same annualized return and the same maximum drawdown shown elsewhere on the results page, so the three numbers are internally consistent. When you compare two Calmar figures, confirm they cover the same window length, because a ratio computed over three calm years is not comparable to one computed over a decade that includes a crash.
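The two window conventions can be sketched in a few lines of Python. This is a minimal illustration, not SledgeKey's implementation: the function names, the monthly sampling, and the sample equity values are all assumptions made for the example. Passing lookback=36 on a monthly series would approximate the original trailing 36-month convention, while leaving it unset uses the full window.

```python
from typing import Optional, Sequence


def annualized_return(equity: Sequence[float], periods_per_year: int = 12) -> float:
    """Compound annual growth rate between the first and last equity value."""
    n_periods = len(equity) - 1
    if n_periods < 1:
        raise ValueError("need at least two equity points")
    years = n_periods / periods_per_year
    return (equity[-1] / equity[0]) ** (1.0 / years) - 1.0


def max_drawdown(equity: Sequence[float]) -> float:
    """Deepest peak-to-trough decline, returned as a positive fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                  # running high-water mark
        worst = max(worst, 1.0 - value / peak)   # decline from that mark
    return worst


def calmar(equity: Sequence[float], periods_per_year: int = 12,
           lookback: Optional[int] = None) -> float:
    """Calmar over the full series, or over the last `lookback` periods."""
    window = equity if lookback is None else equity[-(lookback + 1):]
    mdd = max_drawdown(window)
    if mdd == 0.0:
        return float("nan")  # the ratio is undefined when there is no drawdown
    return annualized_return(window, periods_per_year) / mdd


# Hypothetical month-end equity values covering one year (13 points).
equity = [100, 105, 110, 120, 100, 84, 90, 95, 100, 104, 108, 110, 112]

print(round(calmar(equity), 2))              # roughly 0.4: 12% return, 30% drawdown
print(round(calmar(equity, lookback=8), 2))  # shorter window, shallower drawdown
```

The shorter lookback drops the earlier peak from the sample, so the same strategy posts a much higher ratio. That is the comparability problem described above in miniature.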
Why the Calmar Ratio Matters in Backtesting
Calmar matters because maximum drawdown is the risk that actually ends strategies. Investors rarely abandon a plan because its standard deviation was a little high; they abandon it at the bottom of a brutal drawdown, when the paper loss feels permanent. A strategy with a great Sharpe ratio and a 60 percent maximum drawdown is, for most real investors, untradeable. Calmar surfaces that mismatch directly by putting the worst realized loss in the denominator.
The decision Calmar informs is whether a strategy's returns justify its worst-case exposure, and whether you could realistically have held on. Two strategies with identical annual returns but Calmar ratios of 1.0 and 0.3 are telling you that the second one put you through more than three times the worst-case pain for the same reward. For position sizing and for setting honest expectations with clients or with yourself, that is often more useful than Sharpe, because drawdown is what people feel.
The failure mode of ignoring Calmar is falling in love with a high-Sharpe strategy whose return came with a catastrophic interim loss you would never have tolerated in real time. A backtest does not feel a drawdown; a person does. Reading Calmar alongside the maximum drawdown figure keeps the worst-case loss in view rather than letting it hide inside an attractive annual return.
How SledgeKey Implements the Calmar Ratio
Calmar appears on the backtest results page as a single number alongside Sharpe and Sortino. It is computed as the strategy's annualized return divided by the absolute value of its maximum drawdown over the full test window, using exactly the annual return and maximum drawdown figures that are already displayed in the results. Because the inputs are the same numbers you can see elsewhere on the page, you can check the ratio by hand: divide the reported annual return by the reported drawdown magnitude and you should land on the Calmar figure.
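The hand check can also be scripted. The sketch below assumes the figures are read off the page as percentages and that displayed values are rounded, so it compares within a small tolerance. The function names, the tolerance, and the sample numbers are hypothetical and not taken from any SledgeKey output.

```python
def recompute_calmar(annual_return_pct: float, max_drawdown_pct: float) -> float:
    """Divide annual return by drawdown magnitude; the drawdown's sign is ignored."""
    if max_drawdown_pct == 0:
        raise ValueError("Calmar is undefined when max drawdown is zero")
    return annual_return_pct / abs(max_drawdown_pct)


def matches_reported(annual_return_pct: float, max_drawdown_pct: float,
                     reported_calmar: float, tol: float = 0.01) -> bool:
    """True if the recomputed ratio agrees with the reported one within `tol`."""
    recomputed = recompute_calmar(annual_return_pct, max_drawdown_pct)
    return abs(recomputed - reported_calmar) <= tol


# Hypothetical figures: 14.2% annual return, -31.5% max drawdown, Calmar shown as 0.45.
print(round(recompute_calmar(14.2, -31.5), 3))
print(matches_reported(14.2, -31.5, 0.45))
```

A small mismatch usually reflects rounding in the displayed inputs. A large one suggests the figures cover different windows.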
The benchmark's Calmar is computed the same way over the identical window, so the side-by-side comparison is the most useful reading. A strategy with a higher Calmar than the benchmark earned more return per unit of worst-case loss than the index did over the same period. A strategy with a lower Calmar, even with a higher total return, took a deeper worst-case hit to get there. Because Calmar depends on the single deepest drawdown, it is most meaningful over windows long enough to include at least one real market stress event; over a short, calm window the denominator can be small and the ratio flatteringly large.
Common Pitfalls
The first pitfall is window sensitivity. Because the denominator is a single worst event, the Calmar ratio depends heavily on whether your test window happens to contain a crash. A strategy backtested only over 2013 to 2019 will show a high Calmar simply because no severe bear market fell inside the window. Extend the same strategy through 2008 or 2020 and the Calmar can drop by half or more as a larger drawdown enters the denominator. A high Calmar over a short, benign window is not evidence of drawdown control; it is evidence that the window was kind.
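Window sensitivity is easy to reproduce with synthetic data. The sketch below builds a made-up monthly return series (five calm years, then a sharp decline and a partial recovery) and computes Calmar over the calm window and over the extended one. The return pattern and helper names are invented for illustration and do not model any real strategy or market.

```python
from typing import List, Sequence


def equity_from_returns(returns: Sequence[float], start: float = 1.0) -> List[float]:
    """Compound a series of periodic returns into an equity curve."""
    equity = [start]
    for r in returns:
        equity.append(equity[-1] * (1.0 + r))
    return equity


def calmar(equity: Sequence[float], periods_per_year: int = 12) -> float:
    """Annualized compound return divided by maximum drawdown magnitude."""
    years = (len(equity) - 1) / periods_per_year
    annual_return = (equity[-1] / equity[0]) ** (1.0 / years) - 1.0
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, 1.0 - value / peak)
    return annual_return / worst if worst > 0 else float("nan")


# Five calm years: nine months of +2%, then a three-month pullback of -3% per month.
calm_returns = ([0.02] * 9 + [-0.03] * 3) * 5
# The same history extended by five months of -10% and nineteen months of +2%.
stress_returns = calm_returns + [-0.10] * 5 + [0.02] * 19

calm_calmar = calmar(equity_from_returns(calm_returns))
full_calmar = calmar(equity_from_returns(stress_returns))

print(f"calm window:     {calm_calmar:.2f}")
print(f"extended window: {full_calmar:.2f}")
```

With these invented numbers the calm-window ratio sits near 1 and the extended-window ratio falls by roughly a factor of ten, even though the first five years of the strategy are identical in both runs.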
The second pitfall is instability from a single number. Maximum drawdown is among the most sample-dependent statistics in a backtest, since it is defined by exactly one peak and one trough. Change the start date by a few months and a different drawdown can become the maximum, swinging Calmar noticeably even though the strategy did not change. Volatility-based measures like Sharpe average over hundreds of observations and move smoothly; Calmar can lurch on a single data point. Treat it as a coarse gauge, not a precise dial.
The third pitfall is treating Calmar as a complete risk picture. It says nothing about how often drawdowns occurred or how long recovery took. A strategy with one deep drawdown and a strategy with five moderate ones can post the same Calmar while feeling completely different to hold. Reading Calmar next to the full list of drawdown episodes gives the texture that a single ratio cannot.
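One way to get that texture is to enumerate every drawdown episode rather than only the deepest. The sketch below is an illustrative approach under simple assumptions: an episode starts at a high-water mark, ends when equity regains that mark, and is left open if it never does. The Episode type, the function name, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Episode:
    peak: int                 # index of the high-water mark before the decline
    trough: int               # index of the lowest point in the episode
    recovery: Optional[int]   # index where the prior peak is regained, None if still underwater
    depth: float              # peak-to-trough decline as a positive fraction


def drawdown_episodes(equity: Sequence[float]) -> List[Episode]:
    """Split an equity curve into separate peak-to-recovery drawdown episodes."""
    episodes: List[Episode] = []
    peak_idx = 0
    trough_idx: Optional[int] = None  # None while equity sits at a high-water mark
    for i in range(1, len(equity)):
        if equity[i] >= equity[peak_idx]:
            if trough_idx is not None:  # the open episode has just recovered
                depth = 1.0 - equity[trough_idx] / equity[peak_idx]
                episodes.append(Episode(peak_idx, trough_idx, i, depth))
                trough_idx = None
            peak_idx = i
        elif trough_idx is None or equity[i] < equity[trough_idx]:
            trough_idx = i              # new low within the current episode
    if trough_idx is not None:          # still underwater at the end of the series
        depth = 1.0 - equity[trough_idx] / equity[peak_idx]
        episodes.append(Episode(peak_idx, trough_idx, None, depth))
    return episodes


# Hypothetical equity values; indices stand in for consecutive periods.
equity = [100, 110, 99, 104, 112, 120, 90, 96, 108, 126, 120]
for ep in drawdown_episodes(equity):
    length = "open" if ep.recovery is None else ep.recovery - ep.peak
    print(f"peak={ep.peak} trough={ep.trough} depth={ep.depth:.1%} periods_to_recover={length}")
```

Only the deepest episode feeds the Calmar denominator. The count of episodes and the time each took to recover are what the ratio leaves out.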
Calmar lives or dies on whether your test window contains a crash. A strategy that looks excellent over a calm five-year window can see its Calmar cut in half once a real bear market enters the sample. Before trusting a high Calmar, confirm the window is long enough to include at least one genuine market stress event.