Am J Health Behav 2001;25(3):285-289

Data Mining and Healthcare Informatics

Gerald R. Hobbs, PhD

Gerald R. Hobbs Jr., PhD, Associate Professor, Statistics, Department of Mathematical Statistics, and Adjunct Associate Professor, Community Medicine, West Virginia University School of Medicine, Morgantown, WV. Address correspondence to Dr. Hobbs, Community Medicine, PO Box 9145, West Virginia University School of Medicine, Morgantown, WV. Email: ghobbs@stat.wvu.edu

Objective: To acquaint members of the Academy with a relatively recent development in the area of data exploration and statistical analysis.
Methods: A review of the concepts and methods inherent in data-mining, with a special emphasis on those methods applicable to predictive modeling.
Results: Data-mining is demonstrated to be a useful tool for researchers in those circumstances where large amounts of information are available.
Conclusions: With the advent and proliferation of on-line data collection, truly massive databases are now available to health care researchers. In that situation, data-mining methods yield some unique opportunities to researchers who wish to develop prediction models and to establish associations.

The related terms “healthcare informatics” (HI) and “medical informatics” (MI) have been around for approximately 30 years now. Interestingly, definitions of each term vary widely, and the distinction between them is not always clear. Usually, the person doing the defining manages to come up with wording that establishes his or her own particular set of skills and interests as just what HI/MI is all about. It is only slightly flippant, then, to say that either of the terms can be used to describe almost anything that relates to the use of computers in a health-related setting. Academic treatment of the subject has been similarly diffuse.
Any number of Web sites may be accessed that range in emphasis from biomedical engineering to statistics to cognitive science. Many of those Web sites are associated with colleges and universities.

Early computer applications in the health care area predate the use of terms like HI and MI. Such applications were usually oriented toward performing a single labor-intensive task quickly. For example, computers were used to process information rapidly so that medical diagnostic equipment could provide almost immediate feedback to clinicians, say, in the form of an EKG. In the same vein, computing devices were used to monitor hemodynamic characteristics such as drug concentrations by researchers in specialties such as pharmacology or anesthesiology.

Subsequently, computers were and, of course, still are used in the context of information systems. At first, those systems were used at local levels. That is, they were used in individual hospitals and in clinics for purposes such as billing and patient record keeping. Now, of course, those systems have been used to create more massive databases. Examples include those databases that are maintained on behalf of Medicare, Medicaid, various health maintenance organiza-