Graphing the outbreak: why some look like a mountain, some like an S

Lots of graphs are making the rounds, and in the White House briefing yesterday (31 March) there was a lot of talk about the Institute for Health Metrics and Evaluation (IHME) and other models of resource use or peak deaths. There seems to be some confusion about what the graphs are showing, so here is a brief 😛 overview of the two most common types you will see. The key difference is that they are trying to answer two different questions: what the total of something will be, and what the peak rate of something will be.

Let’s take a detailed look at mortality in Spain, since it is right at its peak and might be a good analog for what New York is facing. Note that in all of my analyses I’m focusing on deaths. The reason is that the definition of a “case” or “hospitalization,” much less “infected,” is really fuzzy because, to be blunt, the testing is inconsistent and, well, a mess. But deaths are being tracked a bit better (at least outside China). It’s also important to recall that deaths lag “cases,” in this disease typically by about 15 or more days. So you will see the number of cases per day start to drop even though the death rate is still climbing or steady.
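To make the lag concrete, here is a minimal Python sketch. It is not the method behind any of the plots in this post. The case curve and the fatality fraction are made-up numbers chosen only for illustration, and the only figure taken from the text is the 15-day lag.

```python
LAG_DAYS = 15             # lag between cases and deaths quoted in the text
FATALITY_FRACTION = 0.02  # hypothetical value, for illustration only

# Hypothetical triangular case curve: rises to a peak on day 20, then falls.
cases = [max(0, 1000 - 50 * abs(day - 20)) for day in range(60)]

# Toy assumption: deaths on a given day are a fixed fraction of the
# cases reported LAG_DAYS earlier.
deaths = [
    FATALITY_FRACTION * cases[day - LAG_DAYS] if day >= LAG_DAYS else 0.0
    for day in range(60)
]

day = 25  # five days after the case peak
cases_falling = cases[day] < cases[day - 1]
deaths_rising = deaths[day] > deaths[day - 1]
print(f"day {day}: cases falling = {cases_falling}, deaths rising = {deaths_rising}")
```

In this toy setup the deaths curve is simply the case curve scaled down and shifted later, so for a couple of weeks after cases peak the two move in opposite directions.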

Here is a plot of the deaths in Spain, along with a statistical prediction (red dashed line), and the rate at which those deaths are happening (black line, exaggerated by a factor of 10 to make it easier to see). In other words, the total deaths and the deaths per day. You can clearly see the two curve shapes:

The peak death rate per day happens in the middle of the outbreak. In this case, Spain probably hit the peak sometime in the last couple of days. The model forecast 900 per day, and 908 were reported Monday. Hopefully this forecast holds true, and Spain (and Italy) are starting to recover. Italy has reported a drop in new cases, so the death rate should drop quickly as well. I tend to use the cumulative mortality plots since they show you the end point and where you are with respect to that end total. You can estimate the rates based on how steep the curve is. In technical terms, the total deaths to date is a cumulative function, and the rate of change is the first derivative of that function (not for the faint of math heart).
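For the curious, the relationship between the “S” and the “mountain” can be sketched in a few lines of Python. This assumes a logistic curve for the cumulative total, which is a common choice for an S shape but not necessarily the statistical fit used in the plot. The parameter values are hypothetical and not fitted to Spain or anywhere else.

```python
import math

def cumulative_deaths(day, total, midpoint, growth):
    """Logistic (S-shaped) curve: total deaths to date on a given day."""
    return total / (1.0 + math.exp(-growth * (day - midpoint)))

# Hypothetical parameters chosen only for illustration, not fitted to data.
TOTAL, MIDPOINT, GROWTH = 10_000, 30, 0.25

days = list(range(61))
cumulative = [cumulative_deaths(d, TOTAL, MIDPOINT, GROWTH) for d in days]

# Deaths per day: the day-to-day difference of the cumulative curve,
# a discrete stand-in for the first derivative.
daily = [cumulative[i] - cumulative[i - 1] for i in range(1, len(cumulative))]

# daily[k] is the increase that ends on day k + 1.
peak_day = daily.index(max(daily)) + 1
print(f"peak of about {max(daily):.0f} deaths per day around day {peak_day}")
```

The daily curve rises and falls (the mountain), while the cumulative curve only ever climbs and flattens out (the S). The mountain peaks where the S is steepest, which for this curve is the midpoint.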

All of my graphs other than the above example are normalized for population. What that means is that the denominator is fixed so we can compare regions of different sizes. I really wish everyone would do this because otherwise you can’t see what is going on. For example, the population of Spain is 46.6 million, and New York State has around 20 million. So 100 deaths in Spain is the equivalent of roughly 43 deaths in New York (100 × 20 / 46.6). So when you are looking at a graph, be very aware of what question it is trying to answer, and how it is scaled!
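The normalization itself is just arithmetic, sketched below in Python. The function names are made up for this example, and the populations are the rounded figures quoted above, so a more exact New York figure would shift the result slightly.

```python
def per_10k(deaths, population):
    """Deaths per 10,000 population."""
    return deaths / population * 10_000

def equivalent_deaths(deaths, from_population, to_population):
    """Scale a death count from one region to another with the same per-capita rate."""
    return deaths * to_population / from_population

SPAIN_POP = 46_600_000     # figure quoted in the text
NEW_YORK_POP = 20_000_000  # "around 20 million" in the text, rounded

spain_rate = per_10k(100, SPAIN_POP)
ny_equivalent = equivalent_deaths(100, SPAIN_POP, NEW_YORK_POP)
print(f"100 deaths in Spain = {spain_rate:.4f} per 10,000")
print(f"same rate in New York = {ny_equivalent:.1f} deaths")
```

Dividing by population before plotting puts every region on the same axis, which is what makes the curves directly comparable.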

Here is the mortality data as of this morning for a variety of states and countries. This is the “cumulative deaths per 10,000 population” plot, which is trying to answer three questions: how do different regions compare, where are they in the overall progression of the outbreak, and how many total deaths can we expect?

As can be seen, New York is just entering the steepest part of the curve. If these trends and statistical projections pan out, New York can expect between five and six thousand deaths over the next two weeks, and the US as a whole between 75,000 and 125,000 by June. That is in keeping with what the more complex models are showing as well; my “in-house” model gives around 130,000. Hopefully the mitigation measures will start to “bend” that curve, but remember that what we are seeing now in deaths is the result of actions taken three to four weeks ago, so be patient and follow the mitigation guidelines so you or someone you care about won’t become a patient!
