NCSL Podcasts

Disease Forecasting, Nowcasting and Scenario Modeling | OAS Episode 196

Episode Summary

On this podcast, we sat down with Dr. Roni Rosenfeld, a computer scientist and a leader in the field of disease forecasting. Rosenfeld explained how researchers working on disease forecasting have taken weather forecasting as their model in creating tools to better understand the path of infectious diseases.

Episode Notes

The ability to forecast how an infectious disease like COVID-19 will behave is a critical tool for public health officials.

On this podcast, we sat down with Dr. Roni Rosenfeld, a computer scientist and a leader in the field of disease forecasting. Rosenfeld leads the machine learning department at Carnegie Mellon University in Pittsburgh and also works with Carnegie Mellon’s Delphi Research Group, which is one of several organizations that are part of the newly developed Outbreak Analytics and Disease Modeling Network established by the Centers for Disease Control and Prevention. 

Rosenfeld explained that, over more than a decade, researchers working on disease forecasting have taken weather forecasting as their model in creating usable tools to better understand the path of infectious diseases. He explained the type of data disease forecasters use, everything from hospital records to Google searches, to develop their forecasts and how that information can help those in health care. He also discussed why it’s important for legislators and others in state government to understand how to use and interpret disease forecasting.


Episode Transcription

Ed:      Hello and welcome to “Our American States,” a podcast from the National Conference of State Legislatures. I’m your host, Ed Smith.


RR:      Predicting is trying to tell you what’s going to happen. Forecasting is trying to tell you for anything that could happen how likely is it to happen. 


Ed:      That was Dr. Roni Rosenfeld, a leader in the field of disease forecasting and my guest on the podcast. Rosenfeld is a computer scientist who heads the machine learning department at Carnegie Mellon University in Pittsburgh. He is a professor in the department. He also works with Carnegie Mellon’s Delphi Research Group, which is one of several organizations that are a part of the newly developed Outbreak Analytics and Disease Modeling Network established by the Centers for Disease Control and Prevention.


            Rosenfeld explained that over more than a decade, researchers working on disease forecasting have taken weather forecasting as their model in creating tools to better understand the path of infectious diseases. He explained the type of data disease forecasters used--everything from hospital records to Google searches--to develop their forecasts and how that information can help those in health care. He also discussed why it is important for legislators and others in state government to understand how to use and interpret disease forecasting.


Here is our discussion.


Roni, welcome to the podcast.


RR:      Thank you. It is a pleasure to be here.


Ed:      So, to get started, tell the listeners a little bit about your background and how you got into the field of disease forecasting.


RR:      Sure. My initial training is in physics and mathematics. I then did a Ph.D. in computer science and statistical modeling. I’ve been working in computer science and machine learning for the last 30 years or so. About 15 years ago, I started studying virology and immunology, which led me to epidemiology. At the time, I noticed that there was a vast and established science of epidemiology, but that the technological aspect of it was not as well developed. So, it’s a little bit like having a science of weather, of how weather evolves and of understanding weather, but not having the weather forecasting technology to go with it. And that’s what drew me in to work on epidemiological forecasting. 


            We were very pleased to learn recently that our Delphi Research Group, of which I am part, will be a network partner with the new CDC Center for Forecasting and Outbreak Analytics. 


Ed:      As I understand it, there is a big distinction between predicting and forecasting, even though most of us probably use those terms interchangeably. Can you talk about what that difference is and why forecasting is a more useful tool in making decisions? 


RR:      Sure. So predicting is trying to tell you what is going to happen. Forecasting is trying to tell you for anything that could happen how likely is it to happen. And in some sense, predicting is more attractive because people really want to know what is going to happen and, if they had their druthers, they would just have you tell them: tell me what’s going to happen. But the reality is that very often, usually, we cannot tell people what’s going to happen. There is a significant amount of uncertainty in the situation, so we cannot commit to one answer, and in that sense it is more helpful, if you think about it, to tell you the truth, which is that multiple things can happen, but some of them are more likely than others. And how likely is important, you know. Sometimes something has only a 5% chance of happening, but if the repercussions are severe, you want to know that and you want to take action even though it is only 5%. But if you knew it was not 5%, but 0.01%, that’s a completely different story, right? So, knowing how likely things are to happen, even when they are less than 50% likely to happen, is quite useful and that’s what we are trying to provide.


Ed:      Well, that makes a lot of sense even to someone like me and I hope to our listeners. Along with forecasting, there is also scenario modeling and nowcasting. And I wonder if you could break those down for us and explain how each one is used.


RR:      Sure, so it might be easiest to do it again in comparison with weather forecasting, which has really been an inspiration for our work over the last decade. In weather forecasting, knowing what the weather is right now is not difficult. What you need is to measure the temperature and the wind direction and strength and humidity and so forth, and to measure it all over the world. It’s obviously not logistically easy, but it has been done. It’s done through international collaboration. It’s not scientifically difficult to know what the state of the weather is right now in the world, and then weather forecasting is free to take that state and try to project it forward to see what the weather will be an hour from now, a day from now or a week from now. 


            In epidemiology, actually, it is nearly impossible. It is practically impossible to know what the current state of the world is. Nobody really knows how many people are infected with a particular virus in a particular city in a particular week. The uncertainty in weather forecasting is only about the future. The uncertainty in epidemic forecasting is also about the present and even the past. So nowcasting is basically about … forecasting the present. It is trying to understand and to estimate, in the most accurate way possible, what the current state of an epidemic is: the current rate of infection, the prevalence of a particular disease, the incidence of the disease and so forth. And it’s not only useful for making decisions that have to do with the current situation, but it is also a necessary condition for being able to do a good job of forecasting the future, because if you don’t know the present, it is that much harder to know what the future will be like. 


            You mentioned scenario modeling. Scenario modeling is a form of forecasting, but it’s conditional forecasting. Rather than asking how likely different things are to happen, it is asking how likely different things are to happen assuming this happens or this doesn’t happen. So, I will give you an example. We may be able to give you pretty good estimates of how bad things are going to be with regard to, say, hospital load or load in the health care system under the current particular strain of a virus or infectious disease. But there is always the chance that there will be a mutation and it would, you know, we never can underhold a situation. You might want to estimate how likely that is to happen, but it is a separate problem, and sometimes we can do so and sometimes we cannot. Sometimes it is useful to say: if there is no new variant, hospitals should be able to handle this fine. If there is a new variant, I don’t know. And this “I don’t know” part is actually quite important and it is something that sets forecasting apart from prediction. 


            It’s really important to know when you know and when you don’t know and to be able to express that very clearly. And it is not usually black or white. Sometimes you feel like a situation is fairly well understood and the forecasts are quite good. And sometimes you think they are not very good. They don’t have enough data to be based on and you’d rather just say I don’t know. Scenario modeling is sort of an acknowledgement that there are some things, some future developments, we don’t know about, and we will tell you what we think will happen under each one of them.


            (TM):  7:47


Ed:      I think it is difficult for lay people sometimes to accept that the answer sometimes is we don’t know because there is a real desire to have an absolute black and white response. So, it is very interesting that you say that. 


RR:      Absolutely. But it is so important for us to resist the temptation to provide an answer in those circumstances and to only provide it when we feel that we have a basis for it. 


Ed:      Can you talk a little bit about how you develop these forecasts?  What sort of datasets do you tap into?


RR:      So, over the last 10 years, a large part of what we have done is seek out data sources that help shed light on the current state of any epidemic and on its recent past so that you can try to project that into the future. So, there are a variety of data sources, and the more you have, the more robust your system is and the more reliable it is. So, I’ll just give you a few examples. There are the standard data streams that most people are familiar with post-pandemic, such as the reporting of cases, of hospitalizations and of deaths from a particular pathogen. These are quite helpful, but they don’t give you the full picture. We know that they are limited. We know that they capture some things and, with others, they can be biased at times. And they are not always available in real time. Other data sources can play a complementary role to sort of robustify your understanding, triangulate your understanding, crosscheck to make sure that the picture you get from other readings is consistent with what you get from this main public health reporting. So, things like the frequency with which people search the web for different topics. So, search query volume. The kind of signals that Google puts out in aggregate for large cities or states, saying what fraction of queries were about a particular condition. It has been shown to be very strongly correlated with how many people actually suffer from this condition. So, that’s one example. 


            Another example would be over-the-counter product sales. How many thermometers are being sold in a particular city in a particular week? Now you might think that if I’m sick, I don’t buy a thermometer because I have one at home. But it is also true that when people do buy a thermometer, it is usually because not only do they not have one, but they have an acute need for it. So again, it has been shown that there is a very strong correlation between thermometer sales and febrile illnesses. These are 2 examples. Another example would be anonymous data from testing companies that provide tests for a variety of conditions like flu or cough or other diseases. Other sources would be aggregated statistics from insurance claims. If you check to see how many times insurance claims were filed for particular conditions, looking only at the numbers of how many were filed in a particular region, you can use that as a signal. It is a very informative signal. So, we gather maybe 10 different signals from 10 different sources. Add to them the cases and hospitalizations and deaths reported by public health officials and you get a much more robust and actually a much more accurate picture of where things are.


Ed:      Even if the forecasting is good, it is only useful if you are able to actually spur a whole bunch of different people into action. I’m just wondering, how does it go from a forecast using all these different data sources to actually having people in the health care community respond to an upcoming, or potentially upcoming, disease outbreak?


RR:      So, there are a variety of decision points that public health officials face that would benefit from having better nowcasts and better forecasts. One common one that happens even in nonemergencies is the timing of different public health campaigns. When should you send reminders to pharmacies to stock up on certain products? When should you start public health campaigns to educate people to vaccinate themselves, especially vulnerable populations, before a wave of, say, flu comes around? A lot of what public health does is communication. Communication and education to their stakeholders, and that communication has to be timed right because of the limited attention span of people, because of wanting to hit things just at the right time. So, communication is one thing. Another very important area of decision-making is in planning capacity, especially at a time like this when the health care industry is suffering from a severe shortage of health care personnel. It is very important to know when they would be needed so you don’t schedule elective surgery at the wrong time and then find out that you are out of beds. Or schedule vacations, or make sure you have the right equipment at the right place. Capacity issues can benefit tremendously from forecasts. And then, of course, interventions. That’s much harder because we are not providing direct information on how different interventions would affect the state of an epidemic or pandemic, but scenario modeling does some of that.


Ed:      Thanks Roni. We will be right back with the rest of our discussion after this short break.


            (TM):  13:18


            I’m back with Dr. Roni Rosenfeld discussing disease forecasting. Roni, I’ve asked this question of a lot of people on this podcast and I think you are the perfect person to ask. What lessons did you take away from the pandemic?  I’m sure there were a lot of them, but just in terms of how to be better prepared. What were the big takeaways for you?


RR:      Yes. Quite a few things and we are still working on them, but I can share a few of them with you. One is that our forecasts are only as good as the data we have. We’ve made tremendous progress over the last 10 years in identifying and procuring new data sources. And also in analyzing the data sources we have and understanding them better, because often data is not what you think it is until you actually dig in and do the statistical analysis. So, we’ve gotten to a pretty good point, but I think we can do better. And a lot of what I see my group doing is seeking out additional data sources. The more data we have, the more accurate both our nowcasts and our forecasts are going to be. In that regard, I would say we are behind weather forecasting. Well, they began quite a few decades before us, and they’ve made steady progress over the years, not just with the algorithms, but also with their methods for acquiring the data. 


            The other point I would make is the importance of communication, not only for public health, which we talked about, but for our own forecasting work: explaining, as we’ve done in this podcast, what it is that we are trying to achieve with forecasting. What are we basing it on? When CDC puts out their own forecast, they are not just asking somebody to squint and tell them what they think. It’s based on a thorough process of consultation with a variety of data sources and a variety of modeling groups, trying to reach a consensus view that sort of integrates all of this information in. All of this needs to not just be communicated, but also to be clear. We need to build trust in these messages. We need to show people in non-emergency times that these things are useful, that they can be useful on a day-to-day basis and that they could be doubly useful during an emergency.


Ed:      So, our audience is of course largely legislators, state legislative staff, other people interested in state policy and I wonder if for that group, people who are looking at what their policy role is in this, what would you share with them?  What should they know about disease forecasting?


RR:      A few things. One is I would reemphasize my last point about trust. We need to build trust in the community that does the forecasting, in its track record and in its ability. There is a tendency to be drawn to outlandish predictions, especially during an emergency when people are desperate for a definite answer of what’s going to happen. You need to resist that temptation and listen to people who have been producing these forecasts consistently for a long period of time and for whom you have a track record of how well they did. 


            The other thing I would say is the need for data, the education and understanding that these forecasts are only as good as the data they are based on and that we need to make sure that the public good in the form of aggregated statistics is not impeded so that we put together the policy and legal mechanisms for making these data available to make forecasts. 


Ed:      To follow up on that last question and to wrap up, I wonder what the general public ought to know about disease forecasting.


RR:      I would like it very much if the general public knew that disease forecasting has been developing as a technology for over a decade now. That it’s gotten to a place where I think it is useful in a variety of situations and that it can be made much more useful still with additional data. The use of disease forecasting is not just for public health officials, but also for individuals and for a variety of organizations that have to make decisions, just like you have to make a decision about rain. You know, is it going to rain? You are not told whether it is going to rain or not. You are told whether there is a 90% chance or a 20% chance that it will rain, and then you make your own decision. The same way, and arguably more meaningfully, if you are told as an individual, as, say, a member of a medically sensitive group or as a child of an elderly person who is trying to help them make decisions, what level of risk you can expect for your loved one at a certain time, it can help you and them make informed decisions. The same is true for large organizations making decisions for their employees. 


Ed:      I think we all listen to the weather forecast every day and it seems perfectly normal so it ought to be just as normal I think to listen to a disease forecast. Thank you so much. I wonder is there anything else you would like to share with the listeners before we say goodbye?


RR:      We all hear about the flu season coming and we know that it is a fairly long season, starting some time in September or October and lasting until March or April. But very few people know that a particular wave of flu that typically comes in the winter actually comes over a fairly short period of time, six to eight weeks, during which most of the cases and most of the hospitalizations and deaths due to flu happen. One of the things that forecasting and nowcasting can help with is making people aware of when that flu wave is coming and helping them sharpen their decisions. You know, I don’t want to avoid crowded places for the entire 6-month period, but if I’m immunocompromised or very frail, I might want to avoid them for 4 to 6 weeks.


Ed:      Yeah, that is a very good concrete example and I think everyone is much more attuned to that after a couple of years of the pandemic than they were before, so I think that’s very helpful. Roni, thank you so much for taking the time to do this. Take care.


RR:      Thank you.


Ed:      I’ve been talking with Dr. Roni Rosenfeld of Carnegie Mellon University about disease forecasting and how it can help health care providers and states be better prepared to deal with infectious disease outbreaks. Thanks for listening.


            You can check out all the podcasts from the National Conference of State Legislatures by searching for NCSL podcasts wherever you get your podcasts. Tim Storey, NCSL’s CEO, hosts “Legislatures: The Inside Storey,” where he focuses on leadership and legislatures. The “Our American States” podcast dives into some of the most challenging public policy issues facing legislators. On “Across the Aisle,” host Kelley Griffin tells stories of bipartisanship. Also check out our special series “Building Democracy” on the history of legislatures. 


            (TM):  20:51