6 PEJ JULY•AUGUST/2014

By Kent Bottles, MD, Edmon Begoli and Brian Worley, PhD

Understanding the Pros and Cons of Big Data Analytics

Information Technology

In this article…
Look at how mining big data can lead to medical breakthroughs if the data are analyzed thoroughly.

Big data is a term that is being applied to almost every human endeavor. What is big data? How will the concept of big data affect health care in the coming years? What can physician leaders and hospital system executives do to use big data to decrease per capita cost and increase the quality of the health care they deliver to their patients?

The digitalization of information has created more data, and the development of cloud computing and ever-faster computers has made this data more accessible and useful. The insurance company John Hancock’s 600 megabytes of data was thought to be the largest amount of data available to any one organization in the 1950s. By the 1970s, this honor went to Federal Express’s 80 gigabytes; during the 1990s, WalMart was believed to have the most data with 180 terabytes. In the early 2000s, Google had accumulated 25 petabytes of data; today most analysts believe Facebook has the most data, an estimated 100 petabytes.¹

The International Data Corp. reported that the amount of digital data exceeded 1 zettabyte in 2010; in 2011 this number was almost 2 zettabytes. Understanding the magnitude of the increase in data is difficult, but this market research firm states there are “nearly as many bits of information in the digital universe as stars in our physical universe.”² Google’s Eric Schmidt claims that every two days we create as much information as we did from the dawn of civilization up until the year 2003.³

When you have that much data you can do things differently.
In the book Big Data: A Revolution That Will Transform How We Live, Work, and Think, Viktor Mayer-Schoenberger and Kenneth Cukier define big data as the ability to extract new insights and create new forms of value by analyzing large data sets to find actionable correlations. They expand on this insight by writing: “In a big data world…we won’t have to be fixated on causality; instead we can discover patterns and correlations in the data that offer us novel and invaluable insights. Big data is about what, not why.”⁴

Big data in medicine

Physicians are trained to generate hypotheses that can be tested by the double-blind clinical trial, which uses randomization to ensure that the only difference between the control group and the treated group is the therapy or procedure under investigation. Evidence-based medicine focuses on treatments that have survived this rigorous and expensive testing. One drawback of this traditional approach is that, by expert estimates, only about 25 percent of what doctors do is truly evidence-based.

Big data’s focus on correlations, not causality, is difficult for physicians biased toward the biomedical model, where the focus is finding the cause of the disease in order to effectively treat it.

“We’ve been so focused on generating hypotheses, but the availability of big data sets allows the data to speak to you. Meaningful things can pop out that you hadn’t expected. In contrast, with a hypothesis, you’re never going to be truly surprised at your result,” Stanford cardiologist Euan Ashley said.⁵

Atul Butte, MD, PhD, believes that actionable correlations that could help patients are just waiting to be discovered by mining existing health care data sets. “I don’t think enough people study the measurements that have already been made.
Hiding within those mounds of data is knowledge that could change the life of a patient, or change the world.”⁶

The big data approach has already begun to disrupt health care in ways that are only now becoming appreciated. The convergence of genomics, wireless sensors, imaging, information systems, social networks, cloud computing power, and ubiquity of smartphones described by Eric Topol in The Creative Destruction of Medicine has given us a glimpse of a new kind of personalized medicine made possible by big data studies.⁷