QUANTIFYING HISTORICAL TRENDS IN THE COMPLETENESS OF THE AVIAN FOSSIL RECORD

Clint A. Boyd¹ & Daniel Ksepka²
¹Jackson School of Geosciences, The University of Texas at Austin, Austin, Texas 78712, clintboyd@stratfit.org; ²Department of Marine, Earth, & Atmospheric Sciences, North Carolina State University, Raleigh, North Carolina 27695, daniel_ksepka@ncsu.edu

ABSTRACT
Estimating the completeness of the fossil record has long been a central area of research in paleontology. Improvements in the perceived completeness of the fossil record may be driven either by new discoveries or by reinterpretation of known fossils, but the relative impact of these processes on our understanding of the fossil record has never been quantitatively explored. Here, we evaluate trends in observed patterns of relative completeness of the avian fossil record over the past century using a new methodology that clarifies the differential impact of the discovery of new fossils versus phylogenetic revision of known material. Dates of discovery and recognition for the oldest fossil representatives of 75 major avian lineages were collected for the historical period between 1910 and 2010. Using a comprehensive phylogeny, we calculated minimum implied stratigraphic gaps (MIG) across these lineages. Our results show a reduction in global MIG values of ~50% over the past century in avian paleontology. A pronounced increase in the average rate of global MIG reduction is noted post-1970s compared to pre-1970s (290.5 Ma versus 31.9 Ma per decade, respectively). While the majority of improvement in the avian fossil record has come from new discoveries, substantial improvement (~22.5%) has resulted from restudy and phylogenetic revision of previously described fossils over the last 40 years. A minimum estimate of MIG indicates that at least 1.38 Ga of gaps remain to be filled between the predicted and observed first appearances of major lineages of crown Aves, implying that much progress is still needed.
However, a notable tapering of the rate of global MIG reduction occurs between 1990 and 2010, suggesting that we may be approaching an asymptote of oldest-record discoveries for birds, though only future observations can determine whether this is a real pattern or a historical anomaly.

CONCLUSIONS
Average standing levels of global MIG vary by a maximum of 20.1%, from a high of 1.54 Ga in the 1990 historical bin to a low of 1.23 Ga in the 1980 historical bin (Table 1; Fig. 5). Overall, between 1910 and 2010, global MIG remains relatively constant. However, when the historical baseline is incorporated into the analysis (Table 1; Fig. 6), global MIG values decrease 49.5%, from 2.73 Ga in 1910 to 1.38 Ga in 2010. The majority of this improvement (~86%) occurred between 1971 and 2010 (Figs. 5 and 6).

During the 1910 to 1970 interval, slightly more MIG is filled in during each decade in the Sampled Fossil Record dataset than in the Recognized Fossil Record dataset (39.6 Ma/decade versus 31.9 Ma/decade). This indicates that relevant fossils were being discovered during this period but were regularly being misidentified. In contrast, from 1971 to 2010 an average of 225.2 Ma/decade was filled in for the Sampled Fossil Record, but 290.5 Ma/decade was filled in for the Recognized Fossil Record. The difference (65.3 Ma/decade, or 22.5% of total MIG reduction) is due to taxonomic revision of previously described fossils.

The data from the Recognized Fossil Record analysis show a dramatic decrease in the amount of global MIG reduced each decade over the past 20 years, dropping by 26% between 1990 and 2010 (Table 1; Fig. 4). More importantly, improvement due to taxonomic revision increases from 17.8% of total MIG reduction in 1990 to 41.1% in 2010. Only future observations can determine whether these are real patterns or historical anomalies.
Either way, barring the discovery of fossils that substantially push back the minimum age for the origin of crown-clade Aves, new discoveries cannot continue to reduce global MIG values at the average post-1970s rate over the long term.

Causes?
We hypothesize that the large gains observed between 1971 and 2010 correspond to the advent of cladistic methodologies and their adoption by the avian paleontology community. Although few actual phylogenetic analyses including fossils were conducted in the 1960–1980 interval, synapomorphy-based approaches to identifying fossil specimens began to supplant criteria based on simple similarity during this time. Additional studies are needed to determine whether clades favored with “better” fossil records show comparable amounts of global MIG at equivalent taxonomic resolutions, or whether the trends noted here are specific to Aves.

Figure 1: Example illustrating how new fossil discoveries can increase total minimum implied gap (MIG) for a clade. At left, a cladogram depicts the relationships and stratigraphic ranges (black bars) of 3 taxa in clade X. Implied gaps are depicted with dashed lines. At right, the effects of a new fossil discovery extending the range of taxon B are shown. The new fossil reduces the MIG for the clade containing taxa B and C. However, the new fossil also implies additional, previously unrecognized gaps in the fossil records of taxa A and C, resulting in an overall increase in MIG for clade X. [Panel labels: Taxon A, Taxon B, Taxon C; Time (Ma), 0 to 25; MIG for Clade X = 10 Ma; MIG for Clade X = 20 Ma; 5 Ma MIG added by fossil discovery; 10 Ma MIG added by fossil discovery.]

Figure 2: Example illustrating how the historical stratigraphic data were incorporated into the stratigraphic consistency calculations via the insertion of a sister taxon for each terminal taxon. The original terminal taxon is assigned the historical age datum (e.g., the fossil record for that clade as known in 1910), while the sister taxon is assigned the modern age datum (i.e., the fossil record as known today). [Panel labels: 1910 data and 2010 data for Taxon A, Taxon B, and Taxon C; Time (Ma), 0 to 25.]

The majority of the data presented here will be published in the forthcoming paper: Ksepka, D. T., and C. A. Boyd. 2012. Quantifying historical trends in the completeness of the fossil record and the contributing factors: an example using Aves. Paleobiology 38(1):112–125.

Figure 5: Graph of the MIG range values for each historical bin. Values were calculated independently for each historical bin and are not calculated using the current fossil record as a baseline. [Axes: MIG Range Values (Ma), 0 to 2000, versus Historical Bins, 1910 to 2010.]

Figure 7: Bar graph showing the reduction in global MIG values for each historical bin predicted by the Sampled Fossil Record dataset (green) compared to the observed reduction in global MIG values for each historical bin based on the Recognized Fossil Record dataset (orange). Note the dramatic increase in both predicted and observed global MIG reduction starting in the 1980 historical bin. [Axes: MIG Reduction (Ma), -200 to 300, versus Historical Bins, 1910 to 2010.]

Figure 6: Graph of the MIG range values for the Recognized Fossil Record dataset (orange) and the Sampled Fossil Record dataset (green). Both datasets display a marked change in slope in the 1971–2010 era. The r² value for the best-fit linear trend line for the Recognized Fossil Record dataset values for the 1971–2010 era is 0.997, and for the Sampled Fossil Record dataset values it is 0.989.
[Figure 6 axes: MIG Range Values (Ma), 0 to 3000, versus Historical Bins, 1910 to 2010.]

Figure 3: Screenshot of the program Assistance with Stratigraphic Congruence Calculations version 4.0b (Boyd et al., 2011), which was used to calculate values of MIG range in this study.

PURPOSE OF RESEARCH
A common misconception that permeated 20th-century reviews of the fossil record was the portrayal of the avian record as meager, characterized by long gaps and comprising mainly small scraps of delicate, hollow bones. Perpetuation of this myth was fueled by both the relatively small historical population of avian paleontologists (versus, for example, fossil mammal workers; Olson, 1985) and the late adoption of cladistic methodologies by ornithologists (see Cracraft, 1980). These factors contributed to a status quo in which many fossils were misidentified, resulting in apparent large gaps in the records for many avian clades despite the fact that unrecognized older fossils belonging to those clades had already been recovered.

Improvement in the perceived completeness of the fossil record of a given clade through time can be attributed to two primary phenomena: the discovery of new fossils or the recognition of previously overlooked records (typically through phylogenetic revision of known fossils). However, the differential impact of these phenomena on our understanding of the fossil record has never been quantitatively explored. Here, we evaluate the relative completeness of the fossil record of the clade Aves using a new methodology and explore trends over the last 100 years.

Central Research Questions
1. What is the pattern of gains in completeness of the fossil record of birds over the past century?
2. What percentage of these gains has come from new discoveries and what percentage has come from restudy of previously described fossils?
3. Do the data at hand suggest significant advances will continue to be made on one or both fronts?
METHODS
We used the stratigraphic consistency metric minimum implied gap (MIG; Benton, 1994) to quantify the total implied missing fossil record for Aves. To facilitate comparisons, the last 110 years (1900 to 2010) were divided into 11 historical bins, each representing 10 years. The paleontological literature was surveyed for this time period, and the dates of original description and subsequent taxonomic revision of oldest known records (OKRs sensu Walsh, 1998) were collected for 75 major avian clades. Using these data, each terminal taxon in each historical bin was assigned to one of 28 pre-defined age bins that each span 2.5 Ma, with the youngest temporal bin set equal to zero (taxon restricted to Holocene deposits) and the oldest temporal bin ranging from 65.0 to 67.5 Ma. In cases where the uncertainty surrounding the age of a fossil spanned multiple temporal bins, the age was set at the midpoint.

Frame of reference is important when analyzing historical trends in the perceived completeness of the fossil record because discovering a new OKR for a taxon can sometimes increase total minimum implied gap (see Fig. 1 for an explanation). Therefore, in addition to calculating standard MIG values for each historical bin, a baseline for historical comparisons was incorporated into analyses 2 and 3 to correct for artificial increases in MIG across time. In these latter analyses, the modern fossil record was used as a proxy for the ‘true’ fossil record. This was accomplished by calculating MIG values from a tree in which each terminal branch is duplicated (i.e., a sister taxon is added to each terminal taxon; see Fig. 2). One of the duplicated branches was assigned the 2010 OKR and the other was assigned the OKR for the historical bin being considered (e.g., 1910). This novel methodology ensures that discoveries of older fossils always reduce MIG.
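The two ideas above, that a new OKR can raise standard MIG and that duplicating each terminal against the modern record removes that effect, can be illustrated with a small sketch. It is not the ASCC implementation: the tree, the taxon names, the function names, and the ages are all hypothetical, and MIG is computed here simply as the sum, over every pair of sister lineages, of the difference between their oldest known ages.

```python
from typing import Dict, Tuple, Union

# A tree is either a terminal name or a pair of subtrees (strictly bifurcating).
Tree = Union[str, Tuple["Tree", "Tree"]]


def oldest_and_mig(tree: Tree, ages: Dict[str, float]) -> Tuple[float, float]:
    """Return (oldest known age in the subtree, summed implied gaps), both in Ma."""
    if isinstance(tree, str):
        return ages[tree], 0.0
    left_age, left_gap = oldest_and_mig(tree[0], ages)
    right_age, right_gap = oldest_and_mig(tree[1], ages)
    # Sister lineages originate together, so the younger record implies a gap.
    gap_here = abs(left_age - right_age)
    return max(left_age, right_age), left_gap + right_gap + gap_here


def mig(tree: Tree, ages: Dict[str, float]) -> float:
    """Total minimum implied gap (Ma) for the whole tree."""
    return oldest_and_mig(tree, ages)[1]


def add_baseline(tree: Tree, historical: Dict[str, float], modern: Dict[str, float]):
    """Duplicate every terminal: one copy keeps the historical OKR,
    its new sister is assigned the modern OKR (hypothetical helper)."""
    ages: Dict[str, float] = {}

    def duplicate(node: Tree) -> Tree:
        if isinstance(node, str):
            ages[node + "_hist"] = historical[node]
            ages[node + "_modern"] = modern[node]
            return (node + "_hist", node + "_modern")
        return (duplicate(node[0]), duplicate(node[1]))

    return duplicate(tree), ages


# Hypothetical clade X: taxon A is sister to (B, C); ages are OKRs in Ma.
clade_x = ("A", ("B", "C"))
okr_1910 = {"A": 15.0, "B": 5.0, "C": 15.0}
okr_2010 = {"A": 15.0, "B": 25.0, "C": 15.0}  # an older fossil of B is found

standard_1910 = mig(clade_x, okr_1910)
standard_2010 = mig(clade_x, okr_2010)
baseline_1910 = mig(*add_baseline(clade_x, okr_1910, okr_2010))
baseline_2010 = mig(*add_baseline(clade_x, okr_2010, okr_2010))

print(standard_1910, standard_2010)  # standard MIG rises after the discovery
print(baseline_1910, baseline_2010)  # baseline-corrected MIG falls
```

With these made-up ages the standard value rises from 10 to 20 Ma, while the baseline-corrected value falls from 40 to 20 Ma, because the duplicated terminals charge the 1910 record for the 20 Ma of taxon B's history that had not yet been found.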
Using these methods, three sets of analyses were conducted:

Analysis 1: Standard MIG Calculation
Values of MIG range were calculated for each of the 11 historical bins (1910–2010) using standard methods (e.g., Wills, 1999; Pol and Norell, 2006).

Analysis 2: Recognized Fossil Record Dataset
This analysis calculated MIG values using the observed historical pattern of description and subsequent taxonomic revision of the OKRs for each of the 75 sampled clades and incorporated the modern fossil record as a frame of reference using the method shown in Figure 2.

Analysis 3: Sampled Fossil Record Dataset
This analysis calculated MIG values assuming a hypothetical situation in which fossils were always correctly referred to a given clade when they were first described, and it incorporated the modern fossil record as a frame of reference. The difference between the values obtained in analyses 2 and 3 provides a quantitative estimate of the cumulative effect of taxonomic revision of known fossils over the past 110 years.

Phylogenetic Framework
Temporal data were analyzed using the phylogeny of Hackett et al. (2008), with individual species exemplars from that phylogeny collapsed into the 75 higher-level taxa analyzed.

Software Implementation
Calculation of MIG range values was conducted using the software program Assistance with Stratigraphic Congruence Calculations (ASCC; Boyd et al., 2011) version 4.0.0b (Figure 3). Thirty-three analyses were conducted, one for each of the eleven historical bins included in each of the three analyses. Each analysis was run for 1,000,000 replications to ensure that the full range of age variation was taken into account (Table 1).
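As a rough check on how the difference between analyses 2 and 3 becomes the "new discoveries" and "cumulative effect of taxonomic work" columns of Table 1, the sketch below differences consecutive historical bins. It summarizes each bin by the midpoint of its minimum and maximum MIG, which is an assumption made here for illustration rather than a stated step of the method, and the variable and function names are hypothetical. The input values are copied from Table 1 for the 1910 bin and the 1970 through 2010 bins.

```python
# (minimum MIG, maximum MIG) in Ma, copied from Table 1 for selected bins.
sampled = {
    1910: (2465.4, 2573.5), 1970: (2229.2, 2334.3), 1980: (1984.3, 2091.6),
    1990: (1716.2, 1811.0), 2000: (1485.7, 1578.0), 2010: (1339.0, 1423.1),
}
recognized = {
    1910: (2679.9, 2788.9), 1970: (2486.9, 2599.4), 1980: (2179.7, 2285.9),
    1990: (1856.1, 1957.0), 2000: (1585.0, 1676.4), 2010: (1339.0, 1423.1),
}


def midpoint(bounds):
    """Midpoint of a (min, max) MIG pair; an illustrative summary only."""
    low, high = bounds
    return (low + high) / 2.0


def reduction(dataset, start, end):
    """MIG (Ma) filled in between two historical bins of one dataset."""
    return midpoint(dataset[start]) - midpoint(dataset[end])


# Per-bin decomposition for the bins after 1970:
# expected reduction from discoveries alone, and the remainder
# attributable to taxonomic work (observed minus expected).
decomposition = {}
bins = [1970, 1980, 1990, 2000, 2010]
for previous, current in zip(bins, bins[1:]):
    new_discoveries = reduction(sampled, previous, current)
    observed = reduction(recognized, previous, current)
    decomposition[current] = (new_discoveries, observed - new_discoveries)

# Average rates per decade over the two eras discussed in the text.
rates = {
    "sampled 1910-1970": reduction(sampled, 1910, 1970) / 6,
    "recognized 1910-1970": reduction(recognized, 1910, 1970) / 6,
    "sampled 1971-2010": reduction(sampled, 1970, 2010) / 4,
    "recognized 1971-2010": reduction(recognized, 1970, 2010) / 4,
}

for year, (discovery, taxonomy) in decomposition.items():
    print(f"{year}: new discoveries {discovery:.1f} Ma, taxonomic work {taxonomy:.1f} Ma")
for label, rate in rates.items():
    print(f"{label}: {rate:.1f} Ma/decade")
```

Under the midpoint assumption, the computed rates agree with the per-decade figures quoted in the text, and the per-bin values agree with the corresponding Table 1 columns to within rounding.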
CHRONOGRAM
Figure 4: Chronogram illustrating the results of the analysis of the Recognized Fossil Record dataset. For each lineage, black bars represent the fossil record as known in 1910, blue bars represent the fossil record as known in 1970, and red bars indicate the fossil record as known in 2010. Grey dashed lines indicate inferred missing fossil records (i.e., ghost lineages). Our knowledge of the avian fossil record improved at a substantially higher rate from 1971 to 2010 than from 1910 to 1970 (see Figures 3 and 4 for further details). Abbreviations: Plio. = Pliocene; Plei. = Pleistocene; Quat. = Quaternary.
[Time axis: 70 to 0. Interval labels: Upper Cretaceous, Paleogene, Neogene, Quat.; Cretaceous, Paleocene, Eocene, Oligocene, Miocene, Plio., Plei.]
[Lineages shown, in order: Struthionidae, Rheidae, Tinamidae, Casuariidae, Apterygidae, Anhimidae, Anseranatidae, Anatidae, Megapodiidae, Cracidae, Phasianidae, Phoenicopteriformes, Podicipediformes, Phaethontidae, Pteroclididae, Columbiformes, Mesitornithidae, Eurypygidae, Rhynochetidae, Steatornithidae, Nyctibiidae, Podargidae, Caprimulgidae, Aegothelidae, Trochilidae, Apodidae, Hemiprocnidae, Opisthocomidae, Otididae, Cuculiformes, Heliornithidae, Gruidae, Psophiidae, Musophagiformes, Gaviiformes, Sphenisciformes, Procellariiformes, Ciconiidae, Fregatidae, Sulidae, Anhingidae, Phalacrocoracidae, Ardeidae, Threskiornithidae, Pelecanidae, Balaenicipitidae, Scopidae, Burhinidae, Charadrii, Haematopodidae, Lari, Turnicidae, Jacanidae, Pedionomidae, Cariamidae, Falconidae, Passeriformes, Psittaciformes, Cathartidae, Sagittariidae, Pandionidae, Accipitridae, Coliiformes, Strigiformes, Leptosomatidae, Trogoniformes, Upupiformes, Bucerotiformes, Galbulidae, Pici, Meropidae, Coraciidae, Todidae, Alcedinidae, Momotidae.]

RESULTS SUMMARY
Analysis 1
A general trend of slight reduction in global MIG values through time is interrupted by a sharp increase in global MIG between 1980 and 1990 (Fig. 5). As a result, the 2010 and 1910 global MIG values are nearly identical (Table 1).
Analysis 2
Global MIG was reduced by 49.5% overall between 1910 (2.73 Ga) and 2010 (1.38 Ga). There is a sharp increase in the rate of global MIG reduction starting in 1971 (31.9 Ma/decade before versus 290.5 Ma/decade after).

Analysis 3
The 1910 historical bin value is 215.0 Ma lower in Analysis 3 than in Analysis 2. This represents the starting “Taxonomic Debt” (i.e., unrecognized OKRs due to inaccurate taxonomic referrals prior to 1910). From 1910 to 1970 the average rate of predicted global MIG reduction (green line in Fig. 6) is higher than the average rate of observed global MIG reduction (39.6 Ma/decade versus 31.9 Ma/decade), while from 1971 to 2010 the average rate of observed MIG reduction outpaces the average rate of predicted MIG reduction (290.5 Ma/decade versus 225.2 Ma/decade). These differences signal the negative effects of inaccurate taxonomic referrals and the positive effects of taxonomic revision, respectively.

Table 1: Resulting minimum and maximum values of MIG for each historical bin for all three analyses. The ‘new discoveries’ column indicates how much MIG reduction should have occurred during each historical bin. The ‘cumulative effect of taxonomic work’ column describes how taxonomic work on new and previously known fossils either underperformed (negative values) or outperformed (positive values) the expected reduction in global MIG values due to new discoveries. All MIG values are given in Ma.

Historical   Standard Calculation  Sampled Fossil        Recognized Fossil     New           Cumulative Effect
Bin          of Scores             Record Dataset        Record Dataset        Discoveries   of Taxonomic Work
             Min MIG    Max MIG    Min MIG    Max MIG    Min MIG    Max MIG
1910         1340.2     1418.0     2465.4     2573.5     2679.9     2788.9     -             -
1920         1333.9     1408.7     2419.7     2528.3     2674.2     2781.8     35.5          -29.1
1930         1333.5     1418.8     2363.8     2468.5     2659.7     2767.7     67.9          -53.6
1940         1359.4     1438.9     2343.2     2455.0     2640.7     2753.4     17.1          -0.4
1950         1359.4     1438.9     2343.2     2455.0     2640.7     2753.4     0.0           0.0
1960         1316.4     1396.5     2320.5     2427.4     2598.0     2708.4     25.2          18.7
1970         1235.6     1319.7     2229.2     2334.3     2486.9     2599.4     92.2          17.9
1980         1182.9     1274.8     1984.3     2091.6     2179.7     2285.9     243.8         66.5
1990         1492.4     1583.0     1716.2     1811.0     1856.1     1957.0     274.3         51.9
2000         1416.4     1503.0     1485.7     1578.0     1585.0     1676.4     231.8         44.1
2010         1339.0     1423.1     1339.0     1423.1     1339.0     1423.1     150.8         98.9

LITERATURE CITED
Benton, M. J. 1994. Paleontological data and identifying mass extinctions. Trends in Ecology and Evolution 9:181–185.
Boyd, C. A., T. P. Cleland, N. L. Marrero, and J. A. Clarke. 2011. Exploring the effects of phylogenetic uncertainty and consensus trees on stratigraphic consistency scores: a new program and a standardized method. Cladistics 27:52–60.
Cracraft, J. 1980. Phylogenetic theory and methodology in avian paleontology: a critical appraisal. Contributions in Science of the Natural History Museum of Los Angeles County 330:9–16.
Hackett, S. J., R. T. Kimball, S. Reddy, R. C. K. Bowie, E. L. Braun, M. J. Braun, J. L. Chojnowski, W. A. Cox, K.-L. Han, J. Harshman, C. J. Huddleston, B. D. Marks, K. J. Miglia, W. S. Moore, F. H. Sheldon, D. W. Steadman, C. C. Witt, and T. Yuri. 2008. A phylogenomic study of birds reveals their evolutionary history. Science 320:1763–1768.
Olson, S. L. 1985. The fossil record of birds. Pp. 79–238 in D. S. Farner, J. R. King, and K. C. Parkes, eds. Avian biology. Academic Press, New York.
Pol, D., and M. A. Norell. 2006. Uncertainty in the age of fossils and the stratigraphic fit to phylogenies. Systematic Biology 55:512–521.
Wills, M. A. 1999. Congruence between stratigraphy and phylogeny: randomization tests and the gap excess ratio. Systematic Biology 48:559–580.

Answers to Central Research Questions