Internet Queries Help Track Waterborne Disease
By Kelly A. Reynolds, MSPH, PhD
Disease surveillance is a complicated task in which tools for data collection and reporting often lag far behind real-time events. In the last decade, there has been a surge of cell phone use and signal access around the world. In 2013, there were approximately 96 cell phone service subscriptions for every 100 people globally. These connections allow digital data collection, in which hints about habits, behaviors and health can be tracked.
Traditional disease surveillance problematic
In the US, traditional means for surveillance of disease in the population include reporting from physicians or diagnostic laboratories. Whether and when people seek medical attention, whether the physician orders diagnostic tests and whether the laboratory is required to report the results to the health department all affect whether that disease occurrence gets counted. Particularly problematic is the use of physician and lab records for tracking waterborne disease, which is generally mild to moderate and often does not warrant a visit to the doctor or a health department investigation. Nonetheless, tens of millions of cases of waterborne disease occur in the US each year, resulting in a significant economic burden. Diarrhea remains one of the top 10 causes of global mortality, with 1.5 million deaths a year (more than AIDS, malaria and measles combined). (Anon. n.d.) While the Centers for Disease Control and Prevention (CDC) has an active surveillance network (known as FoodNet) for tracking foodborne illnesses in the US, there is no equivalent program for water-related illnesses.
In January of 2010, a 7.0-Mw earthquake hit Haiti, affecting an estimated three million people and killing over a hundred thousand. Less than a year later and for the first time in over a hundred years, a cholera outbreak spread throughout the country. What set this outbreak apart, however, was that public health surveillance was not limited to hospital reports and phone interviews. With the help of mobile phones, researchers tracked population movement in relation to the spread of disease. This helped ensure that public health officials were prepared to respond to health and medical needs in the appropriate regions. In addition, Twitter data following the Haiti earthquake accurately tracked the cholera outbreak faster than officials using traditional methods. (Bates. 2017)
Digital epidemiology is a growing trend for population and disease surveillance. Today, about 86 percent of the world’s population is within cellular range and therefore has potentially trackable movements and behaviors. (Bates. 2017) Even in some of the poorest regions, such as the continent of Africa, there are 64 cell subscriptions per 100 people. While mobile phone locations can be passively tracked, intentional behaviors with our computers or cell phones provide important data on purchasing probabilities, political perceptions and even illness information. Up to 52 percent of Americans seek health-related information on the Internet each year. By evaluating searches for symptoms and remedies, data can be analyzed to determine illness incidence by region and over time.
The Internet, social media and sickness tracking
Google was an early pioneer in the use of Internet searches to track disease trends. Around 2008, Google introduced its controversial Google Flu Trends (GFT) tool. GFT kept track of specific user search terms to determine where the flu was striking around the globe. Early assessments indicated successful predictions of disease trends up to two weeks sooner than the CDC. In subsequent years, however, GFT grossly under- or over-estimated influenza cases. For example, during the 2012-2013 flu season, the tool predicted two-fold more illness than actual case numbers from the CDC. Ultimately, confounders such as the prominence of flu reports in the news were found to skew the results, so models were adjusted to compensate. (Lampos et al. 2015) (Yang et al. 2015)
HealthMap was developed by a team of researchers in 2006 to collect data from a variety of online sources, including blogs, Twitter and local news articles, and filter them for real-time surveillance. In 2014, by tracking media and social media posts, HealthMap identified the Ebola outbreak in West Africa before the World Health Organization. Others have used query surveillance to track Salmonella, norovirus and Listeria outbreaks, and found it useful in early identification efforts. (Bahk et al. 2015)
Free apps are also available where participants voluntarily report if they or others around them are ill. The app Outbreaks Near Me can inform users how to avoid locations where disease cases are occurring. As users become better informed and disease incidence reporting becomes easier through online platforms, real-time monitoring is likely to improve.
Waterborne disease online tracking
Researchers from Australia are working to expand the use of Internet search queries to a wider variety of infectious diseases. They found that data on official notifications of 17 infectious diseases (27 percent of the total tested) from 2004 to 2013 were significantly correlated with identifiable search terms. (Milinovich et al. 2014) Included in the list were vectorborne (dengue and chikungunya virus infection), bloodborne (hepatitis B and C), sexually transmitted (chlamydia) and vaccine-preventable illnesses (chickenpox, measles, shingles, meningococcal disease). In addition, the primarily waterborne disease cryptosporidiosis and campylobacteriosis (a food- and waterborne illness) were effectively tracked.
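The screening idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the method used by Milinovich et al.: the weekly counts, the search terms, the 0.7 cutoff and the function names are all made-up assumptions chosen to show how official notifications might be compared against search volume.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def screen_terms(notifications, searches_by_term, threshold=0.7):
    """Return the search terms whose volume tracks notifications closely.

    The threshold is an arbitrary illustrative cutoff, not a published value.
    """
    flagged = {}
    for term, series in searches_by_term.items():
        r = pearson(notifications, series)
        if r >= threshold:
            flagged[term] = r
    return flagged


# Hypothetical weekly data (invented for illustration only).
notifications = [2, 3, 5, 9, 14, 11, 6, 3]
searches_by_term = {
    "cryptosporidiosis symptoms": [20, 28, 51, 88, 140, 115, 61, 33],
    "weather forecast": [50, 48, 52, 49, 51, 50, 47, 53],
}

flagged = screen_terms(notifications, searches_by_term)
for term, r in flagged.items():
    print(f"{term}: r = {r:.2f}")
```

A real screening would also need a significance test and an adjustment for seasonal trends. The sketch only shows the basic comparison of two time series.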
Other potential digital crumbs leading to advanced disease prediction could be reports of boil-water notices or pipe breaks in water distribution systems. Researchers from the University of Michigan recently proposed the use of data mining techniques to match the relationship between pipe breaks and gastrointestinal illness. Pipe breaks were found to positively correlate with Internet search volume for terms like diarrhea and vomiting. (Shortridge and Guikema. 2014)
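One simple way to look for such a relationship is to ask how many days after a pipe break the search volume rises. The Python sketch below is a hedged illustration, not the analysis Shortridge and Guikema performed: the daily counts are invented, the two-day delay is built into the fake data on purpose, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(events, searches, max_lag):
    """Correlate events on day t with search volume on day t + lag."""
    results = {}
    for lag in range(max_lag + 1):
        # Drop the last `lag` event days and the first `lag` search days
        # so that the two series line up with the chosen delay.
        x = events[: len(events) - lag]
        y = searches[lag:]
        results[lag] = pearson(x, y)
    return results


# Hypothetical daily data (invented): searches rise two days after breaks.
pipe_breaks = [0, 1, 0, 0, 3, 0, 0, 1, 0, 0, 2, 0, 0, 0]
diarrhea_searches = [10, 10, 10, 15, 10, 10, 25, 10, 10, 15, 10, 10, 20, 10]

by_lag = lagged_correlations(pipe_breaks, diarrhea_searches, max_lag=4)
best_lag = max(by_lag, key=by_lag.get)
print(f"strongest correlation at a lag of {best_lag} days")
```

With real data the signal would be far noisier, and a correlation at some lag would not by itself show that pipe breaks caused the illness.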
Some of the cons of online disease monitoring include potential inaccuracies, manipulation of data, privacy issues and unequal distribution of responses relative to cellular access or social media use. Social media sites like Twitter and Facebook can provide more resolution than search terms alone, since people tend to share more details of personal involvement, such as stating that they don’t feel well or that they missed school or work because they are sick, but this level of detail raises further privacy concerns.
Consistency is also important for comparing trends over time. If some factor in society or the media increases individual tendencies to search a topic, such as the well-publicized lead exposures in Flint, Michigan, Internet search queries may not reflect true risks.
Both critics and promoters of using Internet search queries and other digital data systems seem to agree that they can enhance disease surveillance and help to validate traditional tracking methods. To have the most beneficial effect, identification of disease trends needs to be both accurate and timely. Discovering an outbreak one to two weeks after the fact may not leave time for response and prevention of further cases. One thing seems certain: digital epidemiology will continue to be used and improved in the future.
- Anon. n.d. Top Ten Leading Causes Of Death In The World–WorldAtlas.com. Retrieved April 19, 2017.
- Bahk, Gyung Jin, Yong Soo Kim and Myoung Su Park. 2015. “Use of Internet Search Queries to Enhance Surveillance of Foodborne Illness.” Emerging Infectious Diseases 21(11):1906–12. Retrieved April 19, 2017.
- Bates, Mary. 2017. “Tracking Disease: Digital Epidemiology Offers New Promise in Predicting Outbreaks.” IEEE Pulse 8(1):18–22. Retrieved April 14, 2017.
- Lampos, Vasileios, Andrew C. Miller, Steve Crossan and Christian Stefansen. 2015. “Advances in Nowcasting Influenza-like Illness Rates Using Search Query Logs.” Scientific Reports 5:12760. Retrieved April 19, 2017.
- Milinovich, Gabriel J. et al. 2014. “Using Internet Search Queries for Infectious Disease Surveillance: Screening Diseases for Suitability.” BMC Infectious Diseases 14(1):690. Retrieved April 14, 2017.
- Shortridge, Julie E. and Seth D. Guikema. 2014. “Public Health and Pipe Breaks in Water Distribution Systems: Analysis with Internet Search Volume as a Proxy.” Water Research 53:26–34. Retrieved April 14, 2017.
- Yang, Shihao, Mauricio Santillan and S.C. Kou. 2015. “Accurate Estimation of Influenza Epidemics Using Google Search Data via ARGO.” Proceedings of the National Academy of Sciences of the United States of America 112(47):14473–78. Retrieved April 14, 2017.
About the author
Dr. Kelly A. Reynolds is an Associate Professor at the University of Arizona College of Public Health. She holds a Master of Science Degree in public health (MSPH) from the University of South Florida and a doctorate in microbiology from the University of Arizona. Reynolds is WC&P’s Public Health Editor and a former member of the Technical Review Committee. She can be reached via email at firstname.lastname@example.org