Hello and welcome! My background is in both public health (worked at CDC for 4.5 years) and higher education. So the way that I approach COVID analysis is as a translator - I speak the language of both scientists and those who are non-scientists. My goal is to bridge that gap so that everyone has the data they need to make evidence-based decisions during a pandemic when the stakes are high for many.
There are a LOT of data out there and, honestly, it can be really overwhelming to think about it all, much less access the information you need.
How do I identify high quality data?
•In all things, the best information is local, recent and adjusted for population. LOCAL because that is the area most likely to affect you. The more local you can get, the better. RECENT because the folks who were sick in the spring are no longer participating in this pandemic and aren’t contributing to current disease transmission. At the end of the pandemic, it will be interesting to see how many cumulative cases there have been, but while we’re in the pandemic, we need to know what’s happening now. ADJUSTED FOR POPULATION because 10 cases in a small town are a much bigger deal than 10 cases in a large city.
•Pay attention to trends, not single-day occurrences. You’ll notice that most graphs have a 7-day moving average. This takes the average of the most recent 7 days and it is effectively a trendline, allowing us to see the trend through the noise of day to day variation.
•Note what the axes on a graph are showing you, familiarize yourself with the color coding. The public health graphs are not the graphs we learned in high school and/or college.
•Are you looking at total counts or percentages?
•Data are not all-powerful. We have to have the right data set to answer the question.
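To make two of the points above concrete, here is a minimal sketch of adjusting for population and of a 7-day moving average. The town and city populations and the daily counts are made-up numbers for illustration only, and the function names are my own.

```python
def rate_per(cases, population, per=100_000):
    """Cases per `per` residents (adjusting a raw count for population)."""
    return cases * per / population


def moving_average(daily_counts, window=7):
    """Trailing average: one value per day once a full window is available."""
    return [
        sum(daily_counts[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(daily_counts))
    ]


# Hypothetical: the same 10 cases in a small town and in a large city.
small_town = rate_per(10, 2_000)
large_city = rate_per(10, 400_000)
print(small_town, large_city)

# Hypothetical daily counts with low weekend reporting (the "noise").
daily = [12, 15, 11, 14, 13, 3, 2, 16, 18, 15, 17, 19, 4, 5]
trend = moving_average(daily)
print([round(value, 1) for value in trend])
```

The raw counts are identical, but the rate in the small town is far higher. The moving average smooths over the weekend dips so the underlying direction is easier to see.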
What are the most important things to track?
The deeply dissatisfying answer to this question is that we have imperfect data so there isn’t one single disease metric that matters more than the others. The data are imperfect for a variety of reasons such as problems with delayed reporting, test-seeking behaviors, asymptomatic disease, etc. Because the data are imperfect, getting the best estimate of how we are doing at any given time requires combining different data streams. The four main categories of data that I track are testing, cases, hospitalizations and deaths.
Testing
The gold standard in testing is the RT-PCR test that looks for the genetic material (called RNA, analogous to our DNA) of the virus. But there are also antibody/serology tests that look for recent or past infection, as well as a relatively new test called the rapid antigen test that looks for the presence of viral proteins rather than genetic material.
Things I track in this category are total test output and percent positive rate. As the adage goes, an ounce of prevention is worth a pound of cure. Without a vaccine, the closest thing we have to prevention is testing - being able to rapidly identify who is sick and isolate them and then test and quarantine their close contacts, identified through contact tracing. So test output monitors demand for testing. Percent positive rate tells us, of all the tests performed, how many were positive. This is important because we want to be testing widely enough that we aren’t likely to be missing cases. The goal set by the World Health Organization is to be at or below 5% for a sustained period of time before advancing through a phased reopening plan. The higher the percent positive, the more likely it is that we are missing cases and that they are unknowingly contributing to disease transmission. The percent positive rate often needs to be paired with the recent case rate to tell us how much disease transmission there is in a local area, and how much of it is likely to be under-represented because of inadequate testing.
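As a quick illustration of the percent positive calculation, here is a small sketch. The test counts are hypothetical, and the 5% figure is the WHO goal mentioned above.

```python
WHO_GOAL_PERCENT = 5.0  # goal described in the text: at or below 5%


def percent_positive(positive_tests, total_tests):
    """Of all the tests performed, what percentage were positive?"""
    if total_tests == 0:
        raise ValueError("no tests were performed")
    return 100 * positive_tests / total_tests


# Hypothetical week of testing in one county.
pct = percent_positive(positive_tests=120, total_tests=1500)
above_goal = pct > WHO_GOAL_PERCENT
print(pct, above_goal)
```

A value above the goal suggests testing is not wide enough and that cases are likely being missed.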
Cases
There are two ways to graph cases - by date of report and by date of symptom onset. In public health, it’s very common to graph outbreaks by date of symptom onset because it allows us to better see the pandemic as it happened. However, this approach is very susceptible to delayed reporting, so there is often a 14-day window of uncertainty during which the data are most likely to change. The Kansas Department of Health and Environment (KDHE) seems to be using a 7-day window of uncertainty (gray box on the graph below), which is a bit unusual. Let’s look at the reasons why that might be.
Below is a timeline of the typical disease course for COVID-19 illness.
Reference 1 from the figure.
The median length of time between exposure and symptoms is 6 days. A person might initially shrug off the symptoms, thinking it’s just a cold or seasonal allergies. It might take a few days before they seek out a test or care from their physician. This delay is typically another 3-7 days. From there, tests can take anywhere from 20 minutes to obtain a result (e.g. rapid antigen test) up to 14 days during peak disease periods. At least, that’s what we experienced in the South during the summer surge. But when the report of illness finally comes in to the Department of Health and Environment, they backdate the case and add it to the graph on the date when the person first started experiencing symptoms. Most of the cases are reported within 14 days, but there are exceptions. What this means, though, is that our most reliable look at the pandemic is based on data that are 2 weeks old. The other benefit of this method of graphing is that it can make adjustments for data dumps - old case information that gets reported very late. But the window of uncertainty can make things very confusing. It makes it look like there’s always a downward trend in the most recent data when really, that’s an artifact of how infectious disease and human behavior work. See? These aren’t the graphs we learned in high school.
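One way to picture the window of uncertainty is to set aside any symptom-onset dates that are too recent to trust. This is only a sketch with invented dates and counts, not how the state builds its graph. The `window_days` parameter is my own name for the 14-day (or 7-day) window.

```python
from datetime import date, timedelta


def is_provisional(onset_date, as_of, window_days=14):
    """True if the onset date falls inside the window of uncertainty."""
    return as_of - onset_date < timedelta(days=window_days)


# Hypothetical cases by date of symptom onset, as known on October 15.
as_of = date(2020, 10, 15)
cases_by_onset = {
    date(2020, 9, 25): 410,
    date(2020, 10, 5): 300,   # likely to rise as late reports arrive
    date(2020, 10, 13): 95,   # very incomplete so far
}

# Keep only the dates old enough that the counts are unlikely to change much.
stable = {
    day: count
    for day, count in cases_by_onset.items()
    if not is_provisional(day, as_of)
}
print(stable)
```

The apparent drop in the most recent dates reflects missing reports, not necessarily fewer infections.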
A common question is whether these case counts are accurate. For example, if a person tests positive multiple times, are their results added together in the case count? The answer to that is no. Each state has a list of diseases that laboratories and healthcare providers are required by law to report to the Department of Health and Environment when they are identified or diagnosed. So you can imagine that if both the laboratory and the physician are required to notify, then duplicate reports are coming in all the time for the same individual. But this system existed long before COVID-19’s arrival and the department is able to de-duplicate those results. Each case represents just one individual, regardless of how many times they were tested. If anything, our case reporting is likely an under-count because of how many COVID-19 cases are asymptomatic. CDC estimates 40% of cases are asymptomatic, and those people likely never seek a test. So what we are counting is likely just the tip of the iceberg, so to speak.
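The idea behind de-duplication can be sketched in a few lines. This is a simplified illustration, not the health department's actual process. Real matching relies on details like name and date of birth, while the `person_id` field here is a hypothetical stand-in.

```python
# Hypothetical reports: person A is reported by both the laboratory and the
# physician, and then tests positive a second time.
reports = [
    {"person_id": "A", "source": "laboratory"},
    {"person_id": "A", "source": "physician"},
    {"person_id": "B", "source": "laboratory"},
    {"person_id": "A", "source": "laboratory"},
]


def count_cases(reports):
    """Count unique individuals, no matter how many reports mention them."""
    return len({report["person_id"] for report in reports})


case_count = count_cases(reports)
print(len(reports), "reports,", case_count, "cases")
```

Four reports come in, but only two people are counted as cases.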
I mentioned above that you want to look at local data as much as possible, ideally adjusted for population. That information is available in the “case rate” tab of the KDHE report. The state adjusts for population by showing cases per 1,000 residents based on census data. Most external resources use cases per 100,000 residents, so if you look at the COVID Tracking Project, Harvard Global Health Institute, New York Times coronavirus tracker or other resources, understand that their numbers aren’t directly comparable to what the state provides. But the state’s case rate does allow you to see how your county is doing relative to the statewide average. For example, in the screenshot below, Kansas has a case rate of 21.52 per 1,000. But the pandemic isn’t being experienced equally across all counties. In Seward County, the case rate is 76.5 per 1,000, but in Saline County the rate is 14.9 per 1,000.
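If you want to compare the state's numbers with an outside source, the conversion is simple. Here is a small sketch using the rates quoted above. The function name is my own.

```python
def per_1000_to_per_100k(rate_per_1000):
    """Convert a rate per 1,000 residents to a rate per 100,000 residents."""
    return rate_per_1000 * 100


# Case rates per 1,000 residents quoted in the text.
rates_per_1000 = {"Kansas": 21.52, "Seward County": 76.5, "Saline County": 14.9}

for place, rate in rates_per_1000.items():
    per_100k = per_1000_to_per_100k(rate)
    relative = rate / rates_per_1000["Kansas"]
    print(f"{place}: {per_100k:.0f} per 100,000 ({relative:.1f}x statewide)")
```

By this arithmetic, Seward County's rate is roughly 3.6 times the statewide rate, while Saline County's is about 0.7 times.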
Below the case rate map in the KDHE report is a second map that shows where the newly identified cases were located. Remember how I said that the best data are recent? This is an example of that. The map has a drop-down menu where you can choose different dates in the state’s history.
Hospitalizations
For hospitalizations, I tend to track current hospitalizations, new admissions, new ICU admissions, adult ventilator usage and ICU occupancy rate. ICUs and ventilators are among the most critical and limited resources in treating severe forms of coronavirus disease.
You can see how many emergency department (ED) visits are for COVID-19-like illness (CLI, see red line). This graph is updated weekly and it shows that ED visits for CLI have been higher than what is typically seen for influenza-like illness in years past. You can access this graph by clicking on the “Hospital Summary” button on the KDHE report.
The other place to see hospital-associated information is by clicking on the “Hospital Capacity” button. A screenshot is provided below. The graphs include data on beds and ventilators available, those in use by all patients, and those in use specifically by COVID-19 patients. This tells us the burden of COVID-19 on hospitals but also their capacity for coping with additional needs for intensive care, such as car accidents, complications after surgery, heart attacks and strokes, etc.
In the boxes above the graphs, you can see a snapshot of the most recent data for the state and for health care coalition regions. Again, remember that local and recent data are the most important.
Deaths
I wish we didn’t have to talk about this metric, but it’s an unfortunate part of infectious disease and this pandemic, in particular. I mostly pay attention to deaths by date of death and I also check something called excess deaths to gain a sense for how many deaths we might be missing due to problems with testing, reporting, etc.
Deaths suffer from some of the same delayed reporting problems that cases do, but to an even greater extent. Below you can see the same timeline that I introduced for cases, but now adding when death typically occurs, about 26 days after exposure. Most deaths are reported within a week. But there can be delays (see dashed line) due to investigations of death by the coroner or medical examiner, etc.
Just beware that this means there is a window of uncertainty for death reporting too. The graph will likely always make it look like deaths are decreasing. I certainly want that to be true, but only because it actually is true. We just don’t have good reporting in the most recent week or two.
The KDHE has a lot of other really great data on demographics for cases and deaths and we’ll dig into that in another newsletter. But I wanted to get us all on the same page for evaluating data.
Because we didn’t have robust testing ability in the early days of the pandemic, because COVID-19 can look like so many other things, and because not every death has an autopsy, we are likely under-counting deaths. This is a problem throughout the country and the world. To see around this instance of imperfect data, we can compare the total deaths that have occurred this year to past years. The deaths above what we would expect from past years are called excess deaths. You can check them any time and for any state of interest by visiting the CDC’s excess deaths tracker.
First, let’s start with how to read the graph they provide. Along the x-axis (horizontal) you see blue bars that each represent a week’s worth of data. The y-axis (vertical) tells us how many people died in a given week, of any cause (not just COVID-19). The orange line is a threshold, based on the most recent 5 years’ worth of data, that estimates what would be “above normal” levels of death. It adjusts seasonally. Any bar that surpasses the orange line is a week of excess deaths. Those weeks are also marked with a red plus sign to make it easier to spot them. Since the arrival of the pandemic, and more so since July, weekly deaths in Kansas have surpassed that threshold. The previous time that we surpassed the threshold was February 2018, during a particularly difficult flu season. The difference so far between a bad season of influenza and COVID-19 on this graph is that the excess is bigger and more sustained. With the CDC link, you can also look at how excess deaths have been distributed by age, race and other demographics for Kansas.
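The logic of the graph can be sketched as follows: compare each week's deaths to that week's threshold and flag the weeks that exceed it. The weekly numbers below are invented for illustration and are not Kansas data. The CDC's actual threshold comes from its own statistical model.

```python
# Hypothetical weekly deaths from all causes (the blue bars).
weekly_deaths = [510, 495, 530, 560, 590, 575]
# Hypothetical seasonal threshold for each week (the orange line).
threshold = [520, 520, 525, 525, 530, 530]

# Weeks where the bar rises above the line (the red plus signs).
excess_weeks = [
    week
    for week, (deaths, limit) in enumerate(zip(weekly_deaths, threshold))
    if deaths > limit
]

# Total deaths above the threshold across the flagged weeks.
deaths_above_threshold = sum(
    weekly_deaths[week] - threshold[week] for week in excess_weeks
)
print(excess_weeks, deaths_above_threshold)
```

A single flagged week can happen in a bad flu season. A long, unbroken run of flagged weeks is what "bigger and more sustained" looks like in the data.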
The New York Times takes this graph a step further and counts the number of deaths above threshold for each state. Kansas has experienced 8% excess deaths this year compared to the threshold. As of September 5th, excess deaths indicated that Kansas had lost 900 more people than usual, but the reported COVID deaths at that time were 481. Not all of these additional deaths will be due to COVID, but it’s reasonable to expect that many of them are. The only thing that has changed this year compared to the past 5 years is the arrival of the pandemic.
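The gap between the two numbers above is simple arithmetic, sketched here with the figures quoted for September 5th. The variable names are my own.

```python
excess_deaths = 900          # deaths above usual, as of September 5th
reported_covid_deaths = 481  # reported COVID deaths at that time

# Excess deaths not accounted for by reported COVID deaths.
unaccounted = excess_deaths - reported_covid_deaths

# Share of the excess explained by reported COVID deaths.
share_reported = reported_covid_deaths / excess_deaths
print(unaccounted, f"{share_reported:.0%}")
```

Reported COVID deaths explain a little over half of the excess, which leaves more than 400 additional deaths to account for.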
A bit of housekeeping: I plan to report on Kansas twice a week for now, summarizing data weekly for the state and putting Kansas into the larger national and global context of the pandemic. As the pandemic intensifies in Kansas, and as reader support increases, I can add more reports as needed. For me, there’s a balance between giving you the information you need to make evidence-based decisions and not overwhelming you with emails that are too frequent to be meaningful. If you ever have questions, you can reply to this email when it arrives in your inbox or reach me at amberschmidtke.phd@gmail.com. I’m an educator and love to answer questions. We can also continue the conversation via Facebook, Twitter and my podcast, Public Health for the People. Right now, the podcast is mainly focused on Georgia statistics, but it is evolving to serve a broader audience. New episodes launch every Wednesday and this week’s episode talks about ways to manage risk during Halloween, the November election, and the arrival of the influenza season.
The podcast is available on Apple Podcasts, Spotify, Google Podcasts, etc. That’s it for now. Be safe and be well!
References
https://wwwn.cdc.gov/nndss/conditions/coronavirus-disease-2019-covid-19/case-definition/2020/
https://www.who.int/publications/i/item/public-health-criteria-to-adjust-public-health-and-social-measures-in-the-context-of-COVID-19
https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html
https://www.coronavirus.kdheks.gov/160/COVID-19-in-Kansas
https://www.kdheks.gov/epi/disease_reporting.html
https://www.cdc.gov/nchs/nvss/vsrr/covid19/excess_deaths.htm
https://www.nytimes.com/interactive/2020/05/05/us/coronavirus-death-toll-us.html
Now that I have relocated to the Midwest, I am adding the state of Kansas to my analysis. Kansas COVID-19 Updates is a free newsletter that depends on reader support, with free and paid subscription options available.
My Ph.D. is in Medical Microbiology and Immunology. I've worked at places like Creighton University, the Centers for Disease Control & Prevention and Mercer University School of Medicine. All thoughts are my professional opinion and should not be considered medical advice.