The same type of data used by the NSA to track terrorists can be used by public health researchers to combat the spread of diseases.
"Big data" has become quite the buzzword this year, especially after reports surfaced that the National Security Agency (NSA) has collected and stored individuals' phone records and Internet browsing histories, opening the debate about what should (and shouldn’t) be done with all of that information. But the possibilities for these massive data sets carry far beyond the hunt for terrorists. Public health officials use the same kind of data for a similar goal: saving lives.
Researchers can now pull together huge amounts of information – think Google Trends, Twitter messages about flu symptoms, or frequency of visits to WebMD – with the aim of tracking diseases. Officials have sought such information for decades, but the increasing availability of big data makes the hunt faster and more accurate than ever before.
In the United States, researchers at Johns Hopkins University have been working on ways to track the flu by aggregating tweets. The program TwitterHose allows anyone to download about 1 percent of the tweets made in an hour, selected at random, giving researchers a useful cross section of Twitter users. Paid helpers sift through the tweets, flagging any that mention getting the flu or feeling flu-like symptoms. Researchers then use location data to figure out where individual Twitter users are reporting being sick. When the Johns Hopkins team matched its results against the Centers for Disease Control and Prevention (CDC) statistics on flu outbreaks – which usually run about two weeks behind real time – it found that it could accurately predict the CDC reports well before they were released.
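For readers curious what the flag-and-count step might look like in practice, here is a minimal Python sketch. It is not the Johns Hopkins team's method: the article describes paid human reviewers, and this example swaps in a simple keyword match for illustration. The term list, the `mentions_flu` and `flu_counts_by_region` names, and the `text` and `region` fields are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical term list. The article describes human reviewers flagging
# tweets; keyword matching is a stand-in used here for illustration only.
FLU_PATTERN = re.compile(r"\b(flu|fever|chills)\b", re.IGNORECASE)


def mentions_flu(text):
    """Return True if the text contains one of the illustrative flu terms."""
    return FLU_PATTERN.search(text) is not None


def flu_counts_by_region(tweets):
    """Count flagged tweets per region.

    Each tweet is assumed to be a dict with a 'text' string and a 'region'
    value (None when no location data is available).
    """
    counts = Counter()
    for tweet in tweets:
        # Skip tweets with no usable location, then apply the keyword flag.
        if tweet.get("region") and mentions_flu(tweet["text"]):
            counts[tweet["region"]] += 1
    return counts


# Made-up sample data, not real tweets.
sample = [
    {"text": "Stuck in bed with the flu", "region": "MD"},
    {"text": "Fever and chills all night", "region": "MD"},
    {"text": "She is fluent in French", "region": "NY"},
    {"text": "I think I have the flu", "region": None},
]

print(flu_counts_by_region(sample))
```

The word-boundary match keeps "fluent" from being counted as a flu mention, though a keyword filter would still miss sarcasm and secondhand reports that a human reviewer could catch. Regional counts like these could then be lined up week by week against the official figures.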