Wimbledon and the Art of Risk Management
Plus: Hedge Fund Closures, KKR/Global Atlantic, Rocket Mortgage
Marc Rubinstein | Jul 10, 2020
Welcome to issue #8 of Net Interest, my newsletter on finance industry themes, and hello to all new subscribers — you join a growing community of over 2,000 net interesting people. Each week I deep-dive into one theme and highlight a few others. This week the deep-dive connects Wimbledon to ideas around risk. If you know anyone who may be interested in this or any of the other themes addressed over the past few weeks, feel free to forward and invite them to subscribe. Thanks!
Wimbledon and the Art of Risk Management
This weekend should have been Wimbledon finals weekend. Maybe Roger Federer would have extended his record to nine wins in the men’s. Or the tournament could have done what it does best and thrown up a surprise like Simona Halep’s resounding victory over Serena Williams last year. Alas, we’ll never know. For the first time since the Second World War, the All England Lawn Tennis Club took the decision to cancel.
Not that the Club will be out of pocket. Unlike most other major events planned for 2020, it had pandemic insurance in place. For seventeen years it paid an annual premium of around £1.5 million for insurance; its payout this year is expected to be around £114 million.
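The arithmetic behind that trade can be checked in a few lines. This is a rough sketch that uses only the approximate figures quoted above; it assumes the premium was flat across all seventeen years and ignores the time value of money, and the variable names are purely illustrative.

```python
# Approximate figures quoted in the text, in millions of pounds.
annual_premium = 1.5
years_paid = 17
expected_payout = 114.0

# Assumption: the premium was the same every year and money has no time value.
total_premiums = annual_premium * years_paid
payout_multiple = expected_payout / total_premiums

# Annual claim probability at which the premium equals the expected payout.
break_even_probability = annual_premium / expected_payout
break_even_years = 1 / break_even_probability

print(f"Total premiums paid: £{total_premiums:.1f}m")
print(f"Payout as a multiple of premiums: {payout_multiple:.1f}x")
print(f"Break-even: one claim every {break_even_years:.0f} years")
```

On these numbers the cover pays for itself if a cancellation of this scale comes along more often than roughly once every 76 years.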
The story goes that the Club was motivated to take out insurance by SARS. But that would underestimate the Club’s literacy with extreme events.
Anyone who’s been to Wimbledon in the past ten years will have seen the plaque on a brick wall outside Court 18.
It commemorates the longest tennis match in history. In June 2010, American 23rd seed John Isner played French qualifier Nicolas Mahut in the first round of the men’s singles. Played out over three consecutive days, the match took 11 hours and 5 minutes to conclude, with Isner the eventual winner after 183 games. The previous record was a match in Paris that lasted six and a half hours.
Unlike the pandemic, the event didn’t have a negative financial impact on the Club. But it was no less extreme. The probability of a match lasting 183 games or more is 1 in a billion. On those odds, another pandemic will occur before a match lasts that long again. (And to prove it, the same pair-up Isner vs Mahut was drawn at Wimbledon the following year, itself an unlikely event although not as unlikely, and Isner won in straight sets in just over 2 hours.)
A pandemic and a tennis match lasting over 11 hours are both infrequent events. When we think about risk it can be useful to think about it in two dimensions. One dimension is frequency, meaning how likely an event is to occur; the other is severity, meaning the impact if it does occur.
Isner vs Mahut was a low frequency, low severity event — unlikely enough to merit a plaque, but not an event that has long-lasting impact. A pandemic is a low frequency, high severity event. No-one is in any doubt as to the severity of what we are currently going through, even though they may have underestimated it in February.
Over the years a number of risk management models have emerged to measure risk across both of these dimensions. It is the low frequency occurrences that typically confound.
One model financial institutions use to measure risk is Value-at-Risk (VaR). It harnesses historical data on price movements to estimate a worst-case loss at a given level of confidence. To illustrate, take a look at Goldman Sachs. In the first quarter of this year Goldman reported an average VaR of US$81 million. This means that 95 times out of 100 (its chosen level of confidence) the firm shouldn’t lose more than US$81 million in a single trading day. As it turned out, Goldman lost more than US$81 million twice in the 62 days that comprise the quarter. At a 95% confidence level you would expect about three such days in 62, so that is within its bounds of confidence.
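To put a number on whether two breaches in 62 days is consistent with a 95% VaR, a simple binomial check helps. The sketch below assumes breaches are independent from one day to the next, which real markets do not guarantee; the function name is hypothetical and this is not a description of how Goldman backtests its model.

```python
from math import comb

def exceedance_probability(days, confidence, at_least):
    """P(at least `at_least` VaR breaches in `days` days), assuming breaches
    are independent and each day breaches with probability 1 - confidence."""
    p = 1.0 - confidence
    below = sum(
        comb(days, k) * p**k * (1 - p) ** (days - k) for k in range(at_least)
    )
    return 1.0 - below

days, confidence, observed = 62, 0.95, 2

expected_breaches = days * (1 - confidence)
p_two_or_more = exceedance_probability(days, confidence, observed)
p_two_or_fewer = 1.0 - exceedance_probability(days, confidence, observed + 1)

print(f"Expected breaches: {expected_breaches:.1f}")
print(f"P(2 or more breaches): {p_two_or_more:.2f}")
print(f"P(2 or fewer breaches): {p_two_or_fewer:.2f}")
```

Under these assumptions two breaches is slightly fewer than the three or so the model implies, so the quarter gives no reason to doubt the stated confidence level.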
The problem with Value-at-Risk is what happens when the future doesn't look very much like the past. Goldman samples from five years of historical data, weighted to give prominence to more recent data. That’s fine as long as the future reflects the past five years, but what if it doesn’t? The firm recognises the shortcoming. It acknowledges that “VaR is most effective in estimating risk exposures in markets in which there are no sudden fundamental changes or shifts in market conditions.”
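A toy version of a weighted historical VaR shows both the idea and its blind spot. The text does not describe Goldman's actual weighting scheme, so the exponential decay used here is an assumption, as are the function name, the `decay` parameter and the made-up profit-and-loss series.

```python
def weighted_historical_var(pnl, confidence=0.95, decay=0.99):
    """Historical-simulation VaR with exponentially decaying weights.

    pnl: daily profit-and-loss figures, oldest first.
    decay: hypothetical weighting parameter; 1.0 weights every day equally.
    Returns the loss (as a positive number) at the given confidence level.
    """
    n = len(pnl)
    raw = [decay ** (n - 1 - i) for i in range(n)]  # newest day gets weight 1
    total = sum(raw)
    # Pair each day's loss with its normalised weight, largest loss first.
    ranked = sorted(((-p, w / total) for p, w in zip(pnl, raw)), reverse=True)
    tail = 1.0 - confidence
    cumulative = 0.0
    for loss, weight in ranked:
        cumulative += weight
        if cumulative >= tail:
            return loss
    return ranked[-1][0]

# Made-up data: 200 calm days followed by 20 volatile ones.
calm = [1, -1] * 100
stressed = [10, -10] * 10

var_before_stress = weighted_historical_var(calm)
var_equal_weights = weighted_historical_var(calm + stressed, decay=1.0)
var_recent_weighted = weighted_historical_var(calm + stressed, decay=0.9)

print(var_before_stress, var_equal_weights, var_recent_weighted)
```

In this example the VaR estimated from the calm period alone is 1, which says nothing about the losses of 10 that follow. Weighting recent data more heavily helps the model catch up once the stress has arrived, but only after the fact.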
Unfortunately sudden changes are a feature of markets. A former Goldman Sachs CFO told the FT back in 2007 that his firm was “seeing things that were 25-standard deviation moves, several days in a row.” A 25-sigma event is really infrequent — more so even than an 11 hour tennis match. None of us should have the fortune to witness one. Yet the number of multi-sigma events we’ve seen even since then exceeds what the Chinese would wish on their worst enemy.
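Just how infrequent can be worked out if daily moves are assumed to follow a normal distribution. That assumption is exactly what the episode calls into question, so the sketch below is best read as a measure of how badly the model fits rather than of how unlucky anyone was. The 252-day trading year is a conventional assumption and the function name is illustrative.

```python
from math import erfc, sqrt

def normal_tail_probability(sigmas):
    """One-sided probability of a move of at least `sigmas` standard
    deviations, assuming moves are normally distributed."""
    return 0.5 * erfc(sigmas / sqrt(2))

p_25_sigma = normal_tail_probability(25)

trading_days_per_year = 252  # assumption: conventional trading-year length
expected_wait_years = 1 / (p_25_sigma * trading_days_per_year)

tennis_match_probability = 1e-9  # the "1 in a billion" figure quoted earlier

print(f"P(25-sigma day): {p_25_sigma:.2e}")
print(f"Expected wait: {expected_wait_years:.2e} years")
print(f"Rarer than the tennis match: {p_25_sigma < tennis_match_probability}")
```

Under normality the expected wait for a single such day is many orders of magnitude longer than the age of the universe, which suggests the distribution rather than the luck was at fault.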
As Howard Marks’ backgammon buddy, Bruce Newberg, says: “There’s a big difference between probability and outcome. Probable things fail to happen – and improbable things happen – all the time.”
In order to accommodate this, another model is used: the stress test. Unlike VaR measures, which have an implied probability because they are calculated at a specified confidence level, stress tests simply model outcomes based on predefined scenarios. They completely ignore the concept of frequency and dwell entirely on the severity of a particular scenario.
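In its simplest form a stress test is just a portfolio revalued under a handful of named scenarios, with no probability attached to any of them. The sketch below assumes a linear portfolio; the exposures, scenario names and shock sizes are all hypothetical and chosen only to show the mechanics.

```python
# Hypothetical exposures: change in value (US$ million) per one-unit move
# in each risk factor.
exposures = {"equities": 12.0, "credit_spreads": -8.0, "rates": -5.0}

# Hypothetical scenarios: the assumed move in each risk factor.
scenarios = {
    "equity_crash": {"equities": -30.0, "credit_spreads": 20.0, "rates": -1.0},
    "rate_shock": {"equities": -5.0, "credit_spreads": 5.0, "rates": 10.0},
}

def scenario_pnl(exposures, shocks):
    """Profit or loss under one scenario: the sum of exposure x shock.
    Factors missing from the scenario are assumed not to move."""
    return sum(
        sensitivity * shocks.get(factor, 0.0)
        for factor, sensitivity in exposures.items()
    )

results = {name: scenario_pnl(exposures, shocks) for name, shocks in scenarios.items()}
worst_scenario = min(results, key=results.get)

for name, pnl in results.items():
    print(f"{name}: {pnl:+.0f}")
print(f"Worst scenario: {worst_scenario}")
```

Nothing in the output says how likely either scenario is. The answer is only as good as the scenarios chosen and the assumption that exposures respond linearly.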
Although stress tests have long been used by portfolio managers and trading desks to manage risk, they became part of regulators’ armoury at the tail end of the financial crisis. Initially they were employed as an extension of informal ‘burn down’ exercises that analysts were conducting on banks at the time — to see how bank capital positions could withstand severe losses. In his memoir, Stress Test, Tim Geithner, former US Treasury Secretary, says:
“I first described the plan as a ‘valuation exercise’. We would come to call it the stress test. The plan aimed to impose transparency on opaque financial institutions and their opaque assets in order to reduce the uncertainty that was driving the panic. It would help markets distinguish between viable banks that were temporarily illiquid and weak banks that were essentially insolvent. Then it would help stabilize the strong as well as the weak by mobilizing a combination of private and public capital.”
He goes on to say that the stress test would end up having many other virtues he didn't foresee at the time, that it would be “the gift that keeps on giving” as a regulatory tool.
That’s not a bad thing. Stress tests have many merits. They continue to provide investors with the transparency on financial institutions Geithner aimed for. They form the foundation for how much capital a bank is required to have. And in addition they provide regulators with a consistent, horizontal view of system-wide risk.
But they also have drawbacks. Three in particular stand out, two of which have echoes in other domains where stress tests are used — healthcare and construction.
The Millennium Bridge problem
Stress tests have been used in the construction industry for many years. You wouldn’t want to drive over a bridge that hadn’t been stress tested. So when the Millennium Bridge in London began to sway from side to side moments after opening to pedestrians in June 2000 it caused some surprise.
The bridge was quickly closed while engineers conducted an investigation. Here’s what they found:
“Chance footfall correlation, combined with the synchronization that occurs naturally within a crowd, may cause the bridge to start to sway horizontally. If the sway is perceptible, a further effect can start to take hold. It becomes more comfortable for the pedestrians to walk in synchronization with the swaying of the bridge. The pedestrians find this makes their interaction with the bridge more predictable and helps them maintain their lateral balance. This instinctive behaviour ensures that the footfall forces are applied at a resonant frequency of the bridge, and with a phase such as to increase the motion of the bridge. As the amplitude of the motion increases, the lateral force imparted by individuals increases, as does the degree of correlation between individuals. The frequency ‘lock-in’ and positive force feedback caused the excessive motions observed at the Millennium Bridge.”
In other words, a feedback loop develops. People naturally fall into step with one another; that causes a bit of swaying; people respond in a way that makes them feel more comfortable; the swaying gets worse. In order to capture that feedback loop, stress tests would have had to have incorporated pedestrians’ response to each other and then their response to the bridge.
Second-order effects are a difficult thing to capture. Many people predicted Brexit but only a subset of them predicted the market’s response; many people predicted Trump but likewise only a subset predicted the market’s response. The pandemic’s a bit different — not many people predicted it, but if they had, it is highly unlikely that many of them would have predicted the market’s response.
Portfolio managers typically stress their portfolios according to historic scenarios such as the unpegging of the Swiss Franc, the ‘Taper Tantrum’ and the like. Had the Trump or the pandemic scenarios been used before they happened, they wouldn’t have appeared credible. (“Yeah, yeah, we’re modelling a complete shutdown of the global economy with unemployment rising to 15% and GDP cratering by 30% so we’ll pencil in, what, a 5% fall in equity markets?”)
Probing in the wrong place
A second drawback, drawn from medicine, is that the stress test may be probing in the wrong place.
Doctors perform cardiac stress tests by getting patients to run on treadmills and monitoring their pulse and blood pressure. Stress tests are able to detect arteries that are severely narrowed as a result of cholesterol build-up; these are what cause symptoms. However, these severe narrowings don’t necessarily cause heart attacks, which often result from lesser blockages that rupture and form clots. Not until the blood clot is formed and heart muscle is starved of oxygen will symptoms be felt, which is why people who have heart attacks often have no warning symptoms and can undergo a perfect stress test days before keeling over.
In July 2016 Banco Popular of Spain passed a stress test conducted by the European Banking Authority. Within 12 months it had failed. The stress test examined solvency in an adverse scenario for the general economy; what brought Popular down was liquidity in an adverse scenario for the company specifically.
Gaming the system
The third drawback is that the system can be gamed. This is where financial stress tests differ from cardiac stress tests or engineering stress tests. Neither heart muscle nor steel girders know they’re being stress tested; people do. As Richard Feynman said, “Imagine how much harder physics would be if electrons had feelings.”
There is some evidence that regulatory stress tests are increasingly being gamed. Researchers have found that as stress tests in the US have evolved, the outcomes have become more predictable. Others have shown that in Europe the flexibility that exists in the design and use of banks’ own models is systematically exploited to minimise projected losses. And Risk magazine reported last year on a leaked memo in which a US bank approached a European bank to swap out risky assets as a window-dressing exercise ahead of its forthcoming stress test.
A similar phenomenon has long been observed on trading desks. If interest rate risk is being measured, traders will optimise instead to the slope of the yield curve; if yield curve risk is being measured they will optimise to the curvature of the yield curve. At every iteration, more complexity is being layered in and as complexity is layered in, more cracks for gaming open up.
Like the Millennium Bridge problem, this drawback also highlights a feedback mechanism. But this one’s different; the feedback here is designed specifically not to fit into a model but to thwart it. This is the reason why no single risk management model can cover all risks. Once a model is specified, people will try to find a way round it.
This gives rise to a paradox. Perhaps it’s a corollary to Goodhart’s Law. Overt risk management has a tendency to add complexity to a system, and complexity increases endogenous risk within the system. (In fact, the concept of risk is riddled with paradox. The Stanford Encyclopedia of Philosophy states: “When there is a risk, there must be something that is unknown or has an unknown outcome. Therefore, knowledge about risk is knowledge about lack of knowledge. This combination of knowledge and lack thereof contributes to making issues of risk complicated from an epistemological point of view.”)
The best way out is not to focus on a single metric. Unfortunately bank regulators have fallen into the trap of focusing on one — the stress test. This came to the fore this year when US regulators did not want to be distracted from their annual stress test cycle by a real-life stress test. So rather than re-running the 2009 ‘valuation exercise’, they fudged it.
Another way out is not to overthink it. At my hedge fund we had a story pinned up on our wall. It was about Bobby Steinberg, ex-head of Bear Stearns' risk arbitrage department, and Ace Greenberg, who was running the firm:
“The arbitrage department was long a stock involved in a merger, and the department was at the firm's position limit. Steinberg thought it was incredibly attractive, and he went to Ace for permission to exceed the limit. Ace listened very carefully, and when Steinberg was done, he looked up and said, 'Bobby, that sounds really great, that sounds like an amazing story, but I have just one question for you. Do you think we have position limits around here to stop people from buying stocks they don't like?'”
There’ll be no surprises out of Wimbledon this year. But our broader exposure to infrequent events remains undiminished. Next week US banks begin reporting their results and we’ll get a peek at how they’re faring so far in this real-life stress test. Modelling specific outcomes is hard, but simple things like position limits and low leverage provide resilience to defend against risk.
More Net Interest
Hedge Fund Closures
A number of hedge funds announced their closure this week. One of them is my old firm’s flagship fund; others include Sloane Robinson and Paulson. I’ve been invested in them all in the past and it got me thinking about the survival rate of funds. A quick check through my records shows that I have been a limited partner in 38 hedge funds at various times over a 20 year span. Within that sample, 26 have now closed down, i.e. around two-thirds. (Disclosure: one of them was my own.)
Various reasons were cited for the closures: the manager wants to retire and there’s no successor, he wants to manage a family office, the fund is too small, the opportunity set has diminished. One common theme is that fund performance was typically quite weak in the run-up to the fund being shut down. This presents a problem for investors. Investors who have persisted with the fund through the period of underperformance are forcibly stopped out before they can benefit from any recovery. The option granted to managers to close the fund with 30 or 90 days’ notice creates an asymmetry which my sample set suggests is quite often crystallised.
Of course, investors know this, which is why they tend not to hang around too long through periods of poor performance. The full potential for performance reversion is undermined when the manager retains a liquidation option. So the game becomes a finitely repeated game which feeds back into a short-term focus on monthly performance.
The inability of hedge funds to fashion themselves into enduring institutions is reflected in the share price performance of the few that have tried. Sculptor Capital Management, formerly known as Och-Ziff, came to the market in 2007 and its stock price is down 96% since. Contrast that with private equity: Blackstone also came to the market in 2007 and its stock is up 72%; KKR came in 2010 and its stock is up 230%.
There’s a host of reasons for the differences. Limited partners in private equity funds are locked in so can’t redeem at the whiff of poor performance; they may not even see any poor performance because private investments are not as readily marked to market.
One difference though is that private equity firms have secured captive clients for their funds. Apollo pioneered this strategy with the creation of insurance company Athene. This week KKR announced the acquisition of Global Atlantic, a retirement and life insurance business with US$70 billion of assets. The acquisition takes the share of ‘permanent’, i.e. captive, capital from 9% of KKR’s assets under management to 33%. In so doing it bolsters the durability of KKR’s franchise.
Rocket Mortgage
Quicken Loans was the subject of More Net Interest a few weeks ago, after it announced its intention to IPO. As a reminder, it is the largest mortgage originator in the US with an 8% market share. Its parent, Rocket Companies, filed its S-1 prospectus this week. It reveals a fundamental difference between private companies and public companies.
In common with most mortgage companies, private or public, Rocket services mortgages as well as originating them. It has long-term mortgage servicing relationships over US$344 billion of loans, on which it earns a fee of 0.310% per annum. What complicates this business is its accounting. Under US accounting standards, servicing relationships are treated as a financial asset, which means the future income stream is capitalised into a ‘mortgage servicing asset’. The problem is that the fair value of that asset isn’t stable. As interest rates move up and down, the propensity to remortgage moves down and up. That creates churn in the servicing portfolio, and the fair value of the asset shifts accordingly. Any change in fair value of the asset has to be reported through the income statement. Over the past five quarters Rocket absorbed US$2.6 billion of mortgage servicing write-downs as rates moved from 2.6% to 0.6%. The write-downs overwhelmed Rocket’s reported earnings, which were just US$991 million over the five-quarter period.
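To get a feel for the scale of those write-downs, it helps to set them against the servicing fees themselves. The sketch below uses only the figures quoted above and assumes, for simplicity, that the fee is earned on the full loan balance and that both stay constant; the variable names are illustrative.

```python
# Figures quoted in the text, in billions of US dollars.
serviced_loans = 344.0
servicing_fee_rate = 0.00310   # 0.310% per annum
write_downs = 2.6              # over the five-quarter period
reported_earnings = 0.991      # over the same five quarters

# Assumption: the fee applies to the whole balance and both stay constant.
annual_servicing_fee = serviced_loans * servicing_fee_rate

write_downs_vs_earnings = write_downs / reported_earnings
write_downs_in_fee_years = write_downs / annual_servicing_fee

print(f"Annual servicing fee: US${annual_servicing_fee:.2f}bn")
print(f"Write-downs as a multiple of earnings: {write_downs_vs_earnings:.1f}x")
print(f"Write-downs in years of fees: {write_downs_in_fee_years:.1f}")
```

On these figures the servicing book earns a little over US$1 billion a year, so the write-downs taken over five quarters are equivalent to more than two years of fees.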
Public companies tend to hedge against such write-downs so as to avoid the volatility in earnings they create. As a private company Rocket hasn’t done that. The company says it “recently began hedging a portion of the risks associated with such fluctuations” but it remains to be seen by how much. In the first quarter, hedge gains offset 6% of the raw fair value change; in the case of Wells Fargo that number was 89%.
Perhaps as a tech platform Rocket has license to exhibit greater earnings volatility. Indeed, its S-1 kicks off, “Numbers and money follow, they do not lead.”