Researchers Warn Against the Rise of ‘Big Data Hubris’
The problem is that Flu Trends has gotten it badly wrong in at least two cases. The reason for these errors is remarkably simple: the flu was in the news, and people were therefore more interested in and/or concerned about its symptoms. Use of the key search terms rose, and, at some points, Google Flu Trends predicted double the number of infected people that Centers for Disease Control data later revealed. (One of these cases was the global pandemic of 2009; the second was an early and virulent start to the season in 2013.)
On its own, this isn’t especially damning. But the authors note that Flu Trends has consistently overestimated actual cases, estimating high in 93 percent of the weeks in one two-year period. You can do just as well by taking the lagging CDC data and putting it into a model that contains information about past flu dynamics. And they point out that, unlike the Flu Trends algorithm, this sort of model can be improved.