- New research looked at whether we may be able to predict where the second outbreak will occur based on Google searches of common COVID-19 symptoms.
- Researchers used Google Trends to measure interest in specific GI symptoms related to COVID-19 to gauge actual incidence of COVID-19.
- A problem with this type of data is that there’s potential for selection bias, which means the results are not indicative of the whole population.
As the United States heads into the colder months, you may be hearing chatter about a new surge of COVID-19 as people congregate indoors.
New research looked at whether we may be able to predict where the second outbreak will occur based on Google searches of common COVID-19 symptoms.
A new study published by the American Gastroenterological Association suggests that increased internet search interest for gastrointestinal (GI) symptoms may predict COVID-19 outbreaks in the United States.
Researchers used Google Trends to measure interest in specific GI symptoms related to COVID-19 to gauge actual incidence of the disease. Data was analyzed from 15 states over 13 weeks between Jan. 20 and April 20. Common GI symptoms related to COVID-19 include:
- loss of taste
- abdominal pain
- loss of appetite
- diarrhea
The research found that Google search interest in loss of taste, loss of appetite, and diarrhea increased 4 weeks before a spike in COVID-19 cases in most states.
“Our results show that Google searches for specific, common GI symptoms correlated with incidence of COVID-19 in the first weeks of the pandemic in five states with high disease burden,” said the report. “Our results suggest that increased search volume for common GI symptoms may predict COVID-19 case volume, with 4 weeks as the optimal gap between increase in search volume and increased caseload.”
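The idea of an "optimal gap" can be illustrated with a lagged correlation: compare search interest in one week with case counts a fixed number of weeks later, and see which gap gives the strongest relationship. The article does not describe the study's statistical method, so the following is only a minimal sketch of that general technique. The function names and the weekly numbers are made up for illustration and are not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(search, cases, lag):
    """Correlate search interest in week t with case counts in week t + lag."""
    if lag == 0:
        return pearson(search, cases)
    # Drop the last `lag` weeks of searches and the first `lag` weeks of cases
    # so that each search value is paired with the cases `lag` weeks later.
    return pearson(search[:-lag], cases[lag:])


def best_lag(search, cases, max_lag):
    """Return the lag (in weeks) with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(search, cases, lag))


# Hypothetical 13 weeks of relative search interest (NOT study data).
search_interest = [10, 12, 11, 15, 30, 55, 80, 70, 50, 40, 35, 30, 28]

# Hypothetical weekly cases, built to follow the searches by exactly 4 weeks.
weekly_cases = [5, 5, 5, 5] + [value * 10 for value in search_interest[:9]]

for lag in range(7):
    r = lagged_correlation(search_interest, weekly_cases, lag)
    print(f"lag {lag} weeks: r = {r:.2f}")

print("best lag:", best_lag(search_interest, weekly_cases, 6))
```

Because the made-up case series was constructed to trail the searches by 4 weeks, the sketch picks a 4-week gap. Real data would be far noisier, and a strong correlation at some lag would not by itself show that searches predict cases.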
“This is not the first time Google searches have been used to predict epidemics,” said Dr. Elena Ivanina, gastroenterologist, Lenox Hill Hospital.
She’s referring to Google Flu Trends (GFT), a project launched in 2008 that studied trending Google searches related to flu symptoms in an effort to predict flu outbreaks approximately 2 weeks before the Centers for Disease Control and Prevention (CDC) reported them. The study, published in the journal Nature, was Google’s attempt to use big data methods to estimate flu trends in real time.
Unfortunately, the project missed the mark. Search terms picked by GFT did not reflect the actual incidence of illness and repeatedly produced inflated case estimates across the country. Not only that, the project completely missed the 2009 H1N1 pandemic.
“Since a 2009 article in Nature pointing out the potential of using online searches for health-seeking information as a way to understand the transmission of the H1N1 influenza — a novel pandemic — there has been a lot of interest in harnessing the power of search engine data to predict outbreaks of infectious diseases,” said Jennifer Horney, founding director of the epidemiology program at the University of Delaware.
“However, a 2014 article in Science pointed out that Google’s Flu Trends, which was later taken down, was predicting more than twice the number of doctor visits for influenza-like illness than the CDC was reporting,” she said.
Whether Google searches can reliably predict COVID-19 outbreaks is not yet known. Based on the failure of GFT, the methodology appears to need some fine-tuning.
“The problem with these systems is the same problem we have with any syndromic surveillance system — what is being reported is a constellation of symptoms, or searches, and not an official diagnosis,” said Horney. “This is problematic in terms of identifying cases of COVID-19 because it’s a disease that is asymptomatic in 50 to 80 percent of the cases, so there would be no searches on Google since there are no symptoms.”
Another challenge, she points out, is that as we move into influenza season, many symptoms of COVID-19 overlap with those of other respiratory infections, which makes it harder to tell from searches alone which illness people have.
On the other hand, Ivanina believes that the method can be effective, but needs more work.
“There may be inaccuracies in the Google data, and it is also important to distinguish whether people are looking up symptoms for themselves, or because they are generally anxious about an epidemic. Ideally, only the data from people searching about their own symptoms would be used,” she said.
An additional problem is that these types of data have potential for selection bias: the people who search for symptoms tend to have internet access and a high level of health literacy, so the results are not representative of the whole population.
“In this case, those with lower access to and literacy around the internet may also be most vulnerable to COVID-19 infection — because they work an essential job or at a job that cannot be done remotely,” said Horney.
For search data to be useful, the disease would need a very specific set of symptoms that rules out other possible diseases.
“This type of data would be most useful in detecting a disease with a very specific set of symptoms that ruled out differential diagnoses,” said Horney. “It would also be most effective when a large majority of those infected were symptomatic.”
Ivanina adds that if public health officials want to use big data to predict the next outbreak, the methodology must be fine-tuned before it can be considered.