ASA STATS #25, Spring 1999

A Primer on Density Estimation for the Great Home Run Race of ’98

JEROME P. KEATING
Division of Mathematics and Statistics
University of Texas at San Antonio
501 West Durango Boulevard
San Antonio, TX 78207

DAVID W. SCOTT¹
Department of Statistics, MS-138
Rice University
6100 Main Street
Houston, TX 77005-1892

Questions about graphing data are frequently asked throughout courses in Statistics. The most basic questions concern how to form density estimates. Our students often ask how the smooth density estimates in certain articles are constructed and want to know how to produce such smooth graphs themselves. In this article, we review a fundamental approach to density estimation and illustrate the procedure on the lengths of home runs hit by Sammy Sosa and Mark McGwire in the Great Home Run Race of ’98.

The Data: In the “Great Home Run Race of ’98,” Sammy Sosa of the Chicago Cubs and Mark McGwire of the St. Louis Cardinals battled throughout the last two months of the season for the title of baseball’s greatest single-season home run hitter. As the sun sets on this magnificent season, we review their quest for Roger Maris’ record of 61 home runs, set in 1961. His record retains an asterisk, because Maris hit his 61 home runs in a 162-game season, whereas Babe Ruth hit 60 home runs in a 154-game season. After the first 154 games of the 1961 season, Maris had 58 home runs. Ruth was a larger-than-life figure, whose magnetism and charisma lent more weight to his records. His premature death, no doubt, contributed all the more to the lore that followed the name of Ruth throughout baseball history. These factors had more to do with the well-known asterisk than did the length of the season.

As the 1998 season dawned, McGwire started faster. By the end of April, McGwire had 11 home runs, whereas Sosa had 6; by the end of May, McGwire had 27 to Sosa’s 13.
In June, Sosa went into overdrive, hitting 20 home runs, a monthly record, and closing McGwire’s lead to only 4 home runs by month’s end. Sosa hit one more than McGwire in July, and on August 19, against St. Louis in Chicago’s Wrigley Field, Sosa went ahead of McGwire by hitting his 48th home run in the fifth inning. However, his lead was short-lived (58 minutes), as McGwire tied him with a home run in the 8th inning of the same game and reclaimed the lead with a solo home run two innings later.

Sammy Sosa tied Mark McGwire again at 55 on August 31, at 62 on September 13, at 63 on September 16, and at 65 on September 23. On the last Friday of the season, September 25, for only the second time in the season, Sosa took the lead with a 462-foot home run in the Astrodome. However, this lead lasted but 45 minutes, as McGwire struck back with his 66th home run in the bottom of the fourth inning in St. Louis. Both players surpassed the mark of 61 home runs in the first 154 games of the season, removing the need for any asterisks on their records. Just as McGwire started strongly, he finished the same way. McGwire’s surge of five home runs on the final weekend of the season propelled him to a magnificent 70-home run season.

¹ Research supported in part by NSF grant DMS 96-26187.

It would be myopic to concentrate solely on McGwire’s season, for in doing so we would miss Sammy Sosa’s magnificent season within a season. While Mark McGwire’s 1998 home runs are accentuated by some heroic distances, Sammy Sosa’s 1998 home run rate is an overlooked topic. It is an understatement to say that Sosa is a streak hitter. From May 25 (the Cubs’ 50th game) through September 13 (the Cubs’ 150th game), Sammy Sosa hit an incredible 53 home runs in only