Qualitative vs. Quantitative Analysis: why we have no idea what is going on with COVID19

To continue the previous point that the COVID-19 data is problematic (scientific word for “sucks”), the results over the last few days just reinforce the fact that doing quantitative analysis on data of such poor quality is frustrating at best, and misleading at worst.  And that has enormous policy implications.

Why is this? We need to know at least three key values to get a handle on this pandemic:

  1. The number of people who have died from it (how bad is it?);
  2. The number of people who got it and were spreading it (how infectious is it?);
  3. The total number of people who have been exposed to it to a sufficient degree to either beat it or get sick (how far along in the pandemic are we?).

There are some other “great to know” data points, but these are the absolutely essential numbers we need to have at least some level of confidence in to decide things like what mitigation measures are required, whether the pandemic is tapering off, and when (and under what conditions) it is safe to “reopen”.  The problem is we should have almost zero confidence in any of these three numbers.

Take something ‘simple’ like the number of people who have died. It should be simple: someone died from an acute-onset respiratory disease.  You test for the SARS-COV-2 virus, and if that is positive, you can call it COVID-19.  This is vital to have nailed down, because it tells us how bad this thing is.  Yet we don’t know that number with any degree of confidence. As noted previously, the testing situation is a nightmare and getting worse, not better.  There hasn’t been enough testing, so in many cases the diagnosis is based only on symptoms.  The testing itself is problematic, and the newer, fast tests are of really dubious accuracy (as low as 30% by some studies).  There are no universally applied standards, and no uniformly collected and available data sets, for what is and isn’t a SARS-COV-2 primary cause of death.
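To see how badly a low-sensitivity test distorts the counts, here’s a back-of-envelope sketch.  The cohort size is an arbitrary assumption for illustration; the 30% figure is the worst case cited above (treating “accuracy” as sensitivity, i.e. the fraction of true infections the test actually catches):

```python
# Back-of-envelope sketch: how test sensitivity distorts confirmed counts.
# Assumptions: a hypothetical cohort of 1000 truly infected people, and
# the test's only failure mode is missing true infections (false negatives).

def undercount_factor(sensitivity):
    """If only `sensitivity` of true infections test positive, confirmed
    counts must be scaled by 1/sensitivity to estimate true infections."""
    return 1.0 / sensitivity

true_infected = 1000  # hypothetical, for illustration only
for sens in (0.90, 0.70, 0.30):
    confirmed = true_infected * sens
    print(f"sensitivity {sens:.0%}: {confirmed:.0f} confirmed, "
          f"true count understated by {undercount_factor(sens):.1f}x")
```

If a death is only attributed to COVID-19 on a positive test, the same factor understates the death count, too.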

So the bottom line is that plots like this:

[Figure: deaths per 10,000 people]

showing the deaths per 10,000 people, are nearly worthless for decision making at this point.  Look at what happened to the New York numbers last week, when they arbitrarily changed their selection criteria.  The other two vital questions, how many people have it and are actively spreading it, much less how many have already been exposed and beaten it, are even harder to answer: no reliable testing, no standardized criteria.  That’s not to say we can’t extrapolate, interpolate, and try to draw some actionable conclusions – we have to do that no matter how bad the data.  It’s just that this data is far noisier than it should be, and thus the decision making is far more uncertain and risky than it needs to be.

This is the sad and embarrassing situation in the US.  Other western countries like Iceland, Denmark, and Germany are in vastly better shape with respect to knowing what is going on.  The US, despite much greater resources, is a mess due to its fragmented political structure (local vs. state vs. federal) and its convoluted health care delivery system.

And no reliable information means that decisions are arbitrary and driven by politics, not science.  And before one side or the other claims the mantle of “good science”, that applies to both parties – the shoddy foundation for this sad situation has been laid by administrations of both parties over the last 40 years.  There was an opportunity, after the end of the Cold War, to take care of many social and structural issues that had been subsumed by that conflict.  Instead, both parties squandered the opportunity.  Disasters like this were, and are, inevitable.

4 thoughts on “Qualitative vs. Quantitative Analysis: why we have no idea what is going on with COVID19”

  1. Do we have anything like these numbers for regular flu seasons? I realize the shot changes things, but I suspect we know next to nothing about the real death rate, the numbers of people who get it, survive it and return to life, and/or the people who do not get a severe case and think it is a cold.

    • Oh my gosh, we have TONS of data on influenza, broken down by strain, etc. Like all data it has uncertainty bounds, but as an analogy, for SARS-COV-2, we are debating if the earth is flat or round. For influenza, it’s the shape of the spheroid and where the gravitational anomalies are. OK, maybe that’s not a simpler analogy, but you get the point 🙂

  2. I would be interested in your thoughts regarding the studies coming out of Santa Clara, Los Angeles, and Germany which extrapolate an overall fatality rate of 0.1-0.4%, as well as the Icelandic study supporting evidence that 50% of those infected are asymptomatic. Additionally, the Santa Clara study predicts the infection rate is 28-55% higher than believed. Some are saying the more widespread infection rates and the high percentage of asymptomatic carriers are supportive of continued enforced social distancing measures, while others maintain a fatality rate comparable to that of the seasonal flu would make those measures far less value added. Could you share your opinions?

    • It’s interesting that the data from the Diamond Princess cruise ship and other early data pointed to about those numbers, so the data has been pointing that way for some time. I pulled up my estimate/forecast summaries from late February that were based on that data, and they had an asymptomatic rate of 45% of those infected, and an overall fatality rate of 0.22%. That hasn’t changed much since then – what has happened is there has been time to do focused studies in other settings. As for the argument between mitigation (social distancing, shutdowns, etc.) vs. resuming normal life, it’s all about who and how many people you are willing to kill. Both options carry risk and will cause harm. The tragedy is that if we had a solid testing and response plan, we wouldn’t have to make that choice … we would have options, and data to decide between them.
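The extrapolation in the reply above can be sketched as a simple calculation.  The counts below are hypothetical, chosen only to illustrate the arithmetic (they are not the actual cruise ship data); the 45% asymptomatic fraction is the figure from the discussion:

```python
# Illustrative sketch: extrapolating an infection fatality rate (IFR) when
# a large fraction of infections are asymptomatic and therefore untested.
# Simplifying assumption: every symptomatic case in the population is found,
# and all deaths occur among the symptomatic.

def infection_fatality_rate(deaths, symptomatic_cases, asymptomatic_fraction):
    """Scale symptomatic cases up to total infections, then divide deaths
    by that total."""
    total_infected = symptomatic_cases / (1 - asymptomatic_fraction)
    return deaths / total_infected

# Hypothetical closed population, e.g. 500 symptomatic cases and 2 deaths:
ifr = infection_fatality_rate(deaths=2, symptomatic_cases=500,
                              asymptomatic_fraction=0.45)
print(f"Implied IFR: {ifr:.2%}")  # prints "Implied IFR: 0.22%"
```

Note how sensitive the result is to the asymptomatic fraction: the denominator nearly doubles as that fraction goes from 0% to 45%, which is why the serology studies matter so much.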
