“The oldest and strongest emotion of mankind is fear, and the oldest and strongest kind of fear is fear of the unknown.” – H.P. Lovecraft
Like so many bloggers, I’ve been reading lots about the coronavirus, puzzling over numbers, looking at charts, double-checking facts. I don’t usually spend this much time studying secondary and tertiary sources. Since my beat is business and wealth building, I prefer to draw my ideas and advice from personal experience. But in this case, I have no choice. If I’m going to write about the virus, I’m going to have to study it. And that means relying on information I am not qualified to evaluate. Are the data accurate? Is the logic sound? Are there missing pieces?

Reading the News Again: How Many Could Die?

For 20 years, I’ve read what news I read in the evening. I never wanted to deplete my morning energy by focusing on problems that were beyond my control. But for the past month, I’ve been starting each day by checking two charts: one that tracks the stock market and another that tracks the coronavirus. I have a detached curiosity about the stock numbers. But my interest in the coronavirus numbers is visceral and strong.

By next week, if not before, everyone in the United States will know someone who has been infected. I already know a half-dozen. The pandemic and its economic aftermath are going to have a psychological effect on Americans that will last for the next 50 years.

Since I began tracking the data, the number of diagnosed cases has gone up every day. So too, happily, has the number of recoveries. But the data point I find myself stuck on is the number of deaths. It has gone up every single day – and in the past week, at an increasingly alarming rate. So every day, I wonder: How many will die? Will it be millions? Will it be hundreds of thousands? Or will it be fewer than 100,000, putting the coronavirus pandemic in “bad flu” territory?

Thirty days ago, the numbers were small but the projections were big.
Based on the “consensus” opinion then, the virus had an infectious rate of 3 (one person infects an average of three others) and a case fatality rate (CFR – the number of deaths compared to the number of cases diagnosed as positive) of 6% (6 of every 100 people diagnosed die). Putting these numbers into probability calculations, the mathematical models I was looking at were projecting that nearly 100% of the population would be infected and that 6% of those infected would die. That amounted to a projected US death toll of about 20 million people (6% of 330 million). And that wasn’t counting the many more that would die indirectly from heart attacks, strokes, and car accidents because access to hospital beds and ventilators would be so limited.

That was the direst study I found. Others projected the number infected would be 200 million, with a death toll of 12 million. The most optimistic projection as to number infected was 60 million, with a death toll of 3.6 million.

On March 16, Imperial College London issued a report based on slightly lower infectious and case fatality rates. This report projected that the US death toll would reach 2.2 million by the end of August. Again, that wasn’t counting the indirect deaths. The most optimistic projection I saw at that time was a death count of 1.8 million (based on a CFR of 3% and 60 million cases).

All of these projections were being reported in front-page stories and on every TV news show. And hundreds more were being discussed online, along with heart-wrenching human-interest stories and all sorts of conspiracy theories. I was spending four hours a day reading. And every day, I felt like I knew less than I did the day before.

And Then I Figured Something Out…

What I didn’t know then was that most of those early death tolls were projections of what would happen if the virus continued to spread and kill at the speed and rate it had been spreading and killing up to that point.
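The early projections quoted above all come down to the same multiplication: the number of people infected times the case fatality rate. Here is a minimal Python sketch of that arithmetic, using only the figures cited above. The function name and scenario labels are my own, chosen for illustration.

```python
US_POPULATION = 330_000_000  # population figure used in the text


def projected_deaths(infected: float, cfr_percent: float) -> float:
    """Deaths implied by a number of infections and a CFR given in percent."""
    return infected * cfr_percent / 100


# Each scenario pairs a number infected with a CFR, as quoted in the text.
scenarios = {
    "whole population infected, CFR 6%": (US_POPULATION, 6),
    "200 million infected, CFR 6%": (200_000_000, 6),
    "60 million infected, CFR 6%": (60_000_000, 6),
    "60 million infected, CFR 3%": (60_000_000, 3),
}

for label, (infected, cfr) in scenarios.items():
    millions = projected_deaths(infected, cfr) / 1e6
    print(f"{label}: {millions:.1f} million deaths")
```

The point of laying it out this way is that both inputs are guesses. Halve either one and the projected death toll halves with it.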
What those early calculations didn’t take into account was what epidemiologists call adaptiveness – the ways a population changes its behavior as awareness of a significant danger spreads. This includes all the things people are doing now to lessen the chance of catching the disease – washing hands, disinfecting surfaces, social distancing, and isolating. These behaviors slow the rate of contagion. With social distancing, for example, the infectious (or reproductive) rate declines. Instead of each victim infecting three others, that rate might drop to 2.5, then to 2.0, and so on. Once it falls below 1.0, the number of people that get infected starts going down. So too does the number of deaths.

Another problem with those early mortality estimates was how they determined the lethality of the disease. The early numbers – first from Wuhan and then from Seattle – were very high: 6% and higher. The mainstream interpretation was that 6 or more of every 100 people that caught the virus would die from it. But that was wrong for several reasons.

First, the deaths in the USA in the first two weeks were concentrated in nursing homes and cruise ships, where the average age of those infected was considerably higher than the norm. Since, as is the case with most viruses, this one is much more likely to kill older people and people with compromised immune systems, one would expect those early fatality rates to be disproportionately high. In the state of Washington, for example, the first cases were in nursing home residents. That produced a highly distorted CFR. (At one nursing home, 34 of 81 infected residents died. That is a CFR of 42%!) This anomaly, along with the data coming from Wuhan, is the reason the early fatality projections for the US were between 6% and 12%.

So that was the first problem: an overstated estimate of how lethal the virus is.
The second problem was the way the early media coverage misunderstood the data the CDC (and other groups) were publishing about the lethality of the disease that coronavirus causes: COVID-19. The lethality of COVID-19 was expressed in terms of the CFR, which, as I said, is a ratio that compares the number of deaths to the number of diagnosed cases. It doesn’t take a degree in statistics to figure out what’s wrong with that:

* Since the symptoms, for most people, are similar to the flu, many people that get it wouldn’t go for testing and, thus, wouldn’t be diagnosed.
* Of those that would go for testing, any that didn’t have advanced symptoms and a connection to a carrier would be turned away because of the scarcity of testing kits.

I asked a doctor friend of mine about this. My hypothesis was that if you could know how many people were actually infected, and compared that to the number of deaths, you would have a real fatality rate that was lower than the 3% figure being talked about then. He agreed. He said, “They call it the denominator problem.”

It works like this: When you underestimate the denominator, you overestimate the ratio. Thus, for the reasons cited above, the denominator (cases diagnosed) is likely to be a gross understatement of the meaningful statistic (the number of people that actually have the virus).

So why were they using this faulty ratio? “Because,” my friend said, “you cannot measure what you don’t know.” To make a scientific measurement, you must stick to the facts. In the case of measuring lethality, there are only two relevant facts: the number of cases diagnosed as positive and the number of deaths.

In the beginning of the outbreak, the CFR will give you a rate that is higher, even considerably higher, than the real death rate for the reasons pointed out above. But as the days and weeks go by and you get a larger percentage of the population tested, this distortion will diminish.
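To see the denominator problem in action, here is a small Python sketch. The numbers in it are hypothetical, picked only so the CFR comes out at the 3% figure mentioned above. The assumption that nine people go undiagnosed for every one diagnosed is likewise illustrative, not a measured value.

```python
def fatality_rate_percent(deaths: int, infections: float) -> float:
    """Deaths as a percentage of infections."""
    return 100 * deaths / infections


# Hypothetical inputs, for illustration only.
deaths = 30
diagnosed = 1_000
undiagnosed_per_diagnosed = 9  # assumed, not measured

# CFR: deaths over diagnosed cases only (the understated denominator).
cfr = fatality_rate_percent(deaths, diagnosed)

# Adjusted rate: deaths over everyone assumed to be infected.
actual_infections = diagnosed * (1 + undiagnosed_per_diagnosed)
adjusted = fatality_rate_percent(deaths, actual_infections)

print(f"CFR on diagnosed cases: {cfr:.1f}%")
print(f"Rate on assumed actual infections: {adjusted:.1f}%")
```

The deaths stay the same in both calculations. Only the denominator changes, and the rate shrinks in proportion to how many infections were missed.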
And that’s what has happened since I’ve been looking at it. The CFR in the US has dropped from 6% to 3% to about 1.7% today. Will that continue to drop? Definitely. Up to now, we’ve had just a fraction of 1% of the US population tested. As tests ramp up quickly, so will the diagnosed cases. And as the ratio of diagnosed cases to deaths increases (as it will), the CFR will continue to go down.

To get to a realistic lethality rate, we have to take another guess: We have to guess how many Americans have the virus but have not yet been diagnosed with it. This is the denominator problem I mentioned above. Considering that 80% of those that get COVID-19 have mild symptoms, and that we’ve been able to test so few, my guess has been that for every person diagnosed, there were 10 others that had it but had not been diagnosed. A recent report I read that summarized estimates from top epidemiologists concluded that diagnosed cases are about 9% of actual cases. Close enough. So let’s use my 10% guess to keep the arithmetic simple.

What that means is that the number of Americans that have the disease right now (as I write this) is about 10 times larger than the diagnosed cases. Ten times the current diagnosed cases (139,061) is about 1.4 million. So now we have an “adjusted” infected count of 1.4 million and a death count of 2,428. And to get a realistic fatality rate, all we have to do is divide 2,428 by 1.4 million. Right?

But wait… there’s more

Alas, no. There is another problem with the CFR: It doesn’t make sense to compare the number of deaths to date to the number of cases to date. That’s because people that die from COVID-19 don’t die overnight. Based on the numbers so far, it seems to take 10 days to two weeks. Therefore, the correct ratio should be the number of deaths to date over the number of cases diagnosed 10 days to two weeks earlier.
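Here is a rough Python sketch of that lagged comparison. The case series and the death count are hypothetical, made up purely to show the mechanics. The helper name `lagged_cfr` is my own.

```python
def lagged_cfr(deaths_to_date, cumulative_cases, lag_days):
    """Deaths to date over cases diagnosed `lag_days` earlier, in percent.

    `cumulative_cases` is a list of daily cumulative totals, oldest first.
    A lag of 0 gives the ordinary CFR.
    """
    if not 0 <= lag_days < len(cumulative_cases):
        raise ValueError("lag_days must fit inside the case history")
    return 100 * deaths_to_date / cumulative_cases[-1 - lag_days]


# Hypothetical data: cumulative cases growing 10% a day over 15 days.
cumulative_cases = [round(1_000 * 1.1 ** day) for day in range(15)]
deaths_to_date = 60  # hypothetical

for lag in (0, 10, 14):
    rate = lagged_cfr(deaths_to_date, cumulative_cases, lag)
    print(f"lag of {lag:2d} days: {rate:.2f}%")
```

Because cases are still climbing, the count from 10 or 14 days ago is smaller than today’s, so the lagged ratio comes out higher than the ordinary CFR.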
This sounds like a problem that could be easily solved: Simply compare today’s death count against the number of cases diagnosed 10 to 14 days ago. But if you try that for several days in a row, you will see that the number you get keeps moving because you are working with two sets of numbers (death counts and diagnosed cases) moving at the same time. So, no, we can’t arrive at a precise number. But we can arrive at a range. The comparisons I did since the beginning of the month increased the CFR by a factor of 2.5 to 4. That would make the lethality rate somewhere between 0.85% (0.34% x 2.5) and 1.36% (0.34% x 4). Okay, so that gives us a real fatality rate as a range of 0.85% to 1.36%.

How many will be infected?

Let’s move on to the other metric we need to estimate the death toll: the R0 or reproductive rate – i.e., the rate at which the virus will spread from one person to others in close contact. Like the case fatality rate, this one has been going down in the past month. Since I’ve been tracking it, it’s gone down from 3.0 to 2.3. A reproductive rate of 2.3 means that each person that gets the virus will infect 2.3 more. 2.3 might not sound scary, but take a look at how fast it turns into 2.7 million:

- 2.3 x 2.3 = 5.29
- 7.59 (5.29 + 2.3) x 2.3 = 17.4
- 25.0 (17.4 + 7.6) x 2.3 = 57.6
- 82.6 (57.6 + 25.0) x 2.3 = 189.9
- 272.6 (189.9 + 82.6) x 2.3 = 626.9
- 899.5 (626.9 + 272.6) x 2.3 = 2,068.9
- 2,968.4 (2,068.9 + 899.5) x 2.3 = 6,827.4
- 9,795.8 (6,827.4 + 2,968.4) x 2.3 = 22,530.3
- 32,326.1 (22,530.3 + 9,795.8) x 2.3 = 74,350.0
- 106,676 (74,350.0 + 32,326.1) x 2.3 = 245,355.1
- 352,031.1 (245,355.1 + 106,676) x 2.3 = 809,671.5
- 1,161,702.6 (809,671.5 + 352,031.1) x 2.3 = 2,671,916.0
- 3,833,618.6 (2,671,916.0 + 1,161,702.6) x 2.3 = 8,817,322.8
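
For anyone who wants to rerun the list, here is a short Python sketch of the same step-by-step arithmetic: at each step, the running total of people infected so far is multiplied by 2.3, and the result is added to the total. This mirrors the back-of-the-envelope method of the list and is not an epidemiological model. The function name is my own. It carries full precision, so its figures will not match hand-rounded ones digit for digit.

```python
R0 = 2.3  # reproductive rate used in the text


def compound_spread(r0: float, steps: int):
    """Yield (step, running_total, newly_infected) for each step of the list."""
    total = r0  # the first person infects r0 others
    for step in range(1, steps + 1):
        newly_infected = total * r0
        yield step, total, newly_infected
        total += newly_infected  # running total carried into the next step


for step, total, newly_infected in compound_spread(R0, 13):
    print(f"step {step:2d}: {total:>14,.1f} x {R0} = {newly_infected:>14,.1f}")
```

Changing `R0` to a lower value and rerunning shows how sensitive the outcome is to that one number.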