Comparison of Athletic Performances across Disciplines

Chris Barnes
University of Canberra
christopher.john.barnes@gmail.com

Abstract

The extreme value (EV) distribution describes the asymptotic behaviour of the extremes of all stationary distributions in terms of a limiting, three-parameter distribution function. This result can be used to compare elite sport performances for several purposes.

1. By regressing out the most significant fixed and random effects for a given discipline, gender and class combination, the resulting residuals can be fitted to an EV distribution to determine the optimised parameters describing gender/discipline/class combinations and their uncertainties. This allows the objective ranking of athletes across events in terms of their percentiles.
2. Use of the regression models allows objective estimates of likely future event standards for performances in heats, semis and finals placings.
3. Determinations of distributional parameters allow comparison between gender, class (including able-bodied or Athletes With Disabilities (AWD) performances) and disciplines via percentiles.
4. Deviations in the regression parameters may indicate the varying effects of performance enhancement over the time-span of Olympic cycles.

Keywords: Elite sport; performance analysis; Extreme Value theorem; Weibull distribution; performance prediction

1. Introduction

In selecting athletes for a track and field team, fairly stringent constraints are generally placed on the number of athletes that can be accommodated. Perhaps the most stringent of these, at least for major meets, are those set by the organisers, whose logistic constraints limit the total number of competitors.
There are also constraints imposed by team management from their own logistics and budget, and through consideration of the standard of the meet: there is generally little advantage in sending a large number of athletes who are likely to be eliminated in their first round or heat, or who have no realistic chance of placing near the front of the race. It will almost always be preferable for these athletes to compete at a lower level, and to gain experience and confidence from meets where they have some realistic chance of influencing the major placings. But these selection criteria are somewhat nebulous and lack a clear definition of eligibility, whereas eventually a choice must be made that an athlete is, or is not, of the requisite performance standard. Furthermore, since the entrance standard set by the organisers is usually the more stringent, the last athlete selected will often be chosen at the expense of an athlete in an entirely different discipline or event. For a number of reasons it is preferable that these decisions be as transparent as possible and based on objective criteria (rather than purely on the selectors' subjective experience of relative merits).

In Athletes With Disabilities (AWD) track and field competition, there is the additional difficulty that at anything other than World Championship (WCh) or Paralympic level, there may be too few competitors in any one class to yield the closeness of competition or excitement that makes a good spectacle; yet the entertainment spectacle is where an increasingly large proportion of financial support originates. This is because the number of AWD athletes in an event is virtually always smaller than the potential pool of able-bodied (AB) athletes in the corresponding event, and there are often 30 or more AWD classes for each equivalent AB event.
Even at the highest international level, available numbers are insufficient to guarantee an appropriate level of competition in some event/class combinations, let alone at the far more numerous sub-national AWD competitions that allow us to identify athletic talent initially. In the past, a number of somewhat ad hoc solutions to this dilemma have been tried, ranging from a failure to cater for the particular event/class combination (or equivalently, allowing more severely handicapped classes to compete in a higher

Proceedings of the Twelfth Australasian Data Mining Conference (AusDM 2014), Brisbane, Australia