Forensic Science International: Genetics 58 (2022) 102680
Available online 9 March 2022
1872-4973/© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Journal homepage: www.elsevier.com/locate/fsigen

Research paper

The transitivity of the Hardy–Weinberg law

Jan Graffelman a,b,*, Bruce S. Weir b

a Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Carrer Jordi Girona, 1-3, 08034, Barcelona, Spain
b Department of Biostatistics, University of Washington, University Tower, 15th Floor, 4333 Brooklyn Avenue, Seattle, WA 98105-9461, United States of America

ARTICLE INFO

Keywords: Bi-allelification; Polymorphism reduction; Indel; Microsatellite; Restricted permutation test; Hardy–Weinberg equilibrium; Exact test

ABSTRACT

The Hardy–Weinberg law is shown to be transitive in the sense that a multi-allelic polymorphism that is in equilibrium will retain its equilibrium status if any allele together with its corresponding genotypes is deleted from the population. Similarly, the transitivity principle also applies if alleles are joined, which leads to the summation of allele frequencies and their corresponding genotype frequencies. These basic polymorphism properties are intuitive, but they have apparently not been formalized or investigated. This article provides a straightforward proof of the transitivity principle, and its usefulness in genetic data analysis is explored, using high-quality autosomal microsatellite databases from the US National Institute of Standards and Technology. We address the reduction of multi-allelic polymorphisms to variants with fewer alleles, two in the limit.
Equilibrium test results obtained with the original and reduced polymorphisms are generally observed to be coherent, in particular when results obtained with length-based and sequence-based microsatellites are compared. We exploit the transitivity principle in order to identify disequilibrium-related alleles, and show its usefulness for detecting population substructure and genotyping problems that relate to null alleles and allele imbalance.

1. Introduction

The Hardy–Weinberg law is a cornerstone principle of modern genetics, and marked the foundation of population genetics [1]. For an autosomal diploid variant, the principle establishes that genotype frequencies attain a stable composition in one generation of time; remaining, in the absence of disturbing forces, unaltered afterwards. For bi-allelic variants this implies the genotype frequencies will have relative frequencies ($f_{AA} = p^2$, $f_{AB} = 2pq$, $f_{BB} = q^2$), where $p$ and $q$ are the allele frequencies of A and B respectively, with $p + q = 1$. The Hardy–Weinberg principle becomes more complicated if one considers, for example, X chromosomal variants [2], systems with multiple alleles [3–6], systems with null alleles [7,8], copy number variation [9,10] or polyploid species [11,12]. The statistical methodology needed to address all these complications often lags behind, as exemplified by the fact that adequate statistical procedures for testing X chromosomal variants have only been recently developed [13,14]. In forensic genetics, Hardy–Weinberg proportions (HWP) are often assumed, in for instance matching probability calculations [15], and in the subdivided population model, the Balding–Nichols model [16].
The Hardy–Weinberg law is also crucial for the quality control of microsatellite data, statistical tests for HWP being routinely applied to autosomal microsatellites, also known as Short Tandem Repeats or STRs [17,18], indels [19], sequence-based STRs [20], Single Nucleotide Polymorphism (SNP) panels [21,22] and microhaplotypes (MHs; [23]). The analysis of STR data is often complicated by genotyping error and by the presence of individuals that stem from different ethnicities or ancestries. Genotyping error, if substantial, can bias allele and genotype frequencies and so negatively affect all subsequent analysis of the data. Population substructure (in the form of ethnicities or genetic ancestries), when not accounted for, can provoke spurious findings in association studies, can lead to rejection of HWP when in fact subpopulations provide no evidence against it [24], and can suggest linkage disequilibrium (LD) between variants that are in fact independent in subgroups.

The Hardy–Weinberg law is transitive in the sense that it carries over to reduced polymorphisms that can be generated from STRs by elimination or joining of alleles. For STRs, next generation sequencing has revealed additional sequence diversity [20,25,26], thereby increasing the number of STR alleles. Sequence-based (SB) STRs can always be reduced to length-based (LB) STRs, and this is important for backward compatibility with previous LB work. Under the usual assumption of absence of disturbing forces (no mutation, migration, genotyping error, selection, etc.) Hardy–Weinberg equilibrium is generally expected to hold, and in practice it is indeed mostly not rejected in statistical tests when these assumptions are met.

* Corresponding author at: Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Carrer Jordi Girona, 1-3, 08034, Barcelona, Spain. E-mail address: jan.graffelman@upc.edu (J. Graffelman).
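The two reduction operations described above, deleting an allele and joining alleles, can be illustrated numerically. The Python sketch below is not the authors' code and does not implement any of the statistical tests used in the paper; it works on exact population frequencies rather than sample counts. All function names (`hw_genotype_freqs`, `delete_allele`, `join_alleles`, `in_hwp`) and the tri-allelic example frequencies are hypothetical, chosen only for illustration.

```python
from itertools import combinations_with_replacement
from math import isclose


def hw_genotype_freqs(allele_freqs):
    """Genotype frequencies under Hardy-Weinberg proportions (HWP)."""
    freqs = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p, q = allele_freqs[a], allele_freqs[b]
        # Homozygotes have frequency p^2, heterozygotes 2pq.
        freqs[(a, b)] = p * p if a == b else 2 * p * q
    return freqs


def allele_freqs_from_genotypes(geno):
    """Allele frequencies implied by a genotype distribution."""
    out = {}
    for (a, b), f in geno.items():
        # Each genotype contributes half its frequency per allele copy.
        out[a] = out.get(a, 0.0) + f / 2
        out[b] = out.get(b, 0.0) + f / 2
    return out


def delete_allele(geno, allele):
    """Drop every genotype carrying `allele`, then renormalise."""
    kept = {g: f for g, f in geno.items() if allele not in g}
    total = sum(kept.values())
    return {g: f / total for g, f in kept.items()}


def join_alleles(geno, a1, a2, new):
    """Relabel a1 and a2 as `new`, summing the genotype frequencies."""
    merged = {}
    for (a, b), f in geno.items():
        a = new if a in (a1, a2) else a
        b = new if b in (a1, a2) else b
        key = tuple(sorted((a, b)))
        merged[key] = merged.get(key, 0.0) + f
    return merged


def in_hwp(geno, tol=1e-9):
    """True if `geno` matches the HWP implied by its own allele frequencies."""
    expected = hw_genotype_freqs(allele_freqs_from_genotypes(geno))
    return all(isclose(geno.get(g, 0.0), e, abs_tol=tol)
               for g, e in expected.items())


# Hypothetical tri-allelic polymorphism in exact equilibrium.
geno = hw_genotype_freqs({"A": 0.5, "B": 0.3, "C": 0.2})
print(in_hwp(delete_allele(geno, "C")))           # allele C removed
print(in_hwp(join_alleles(geno, "B", "C", "X")))  # B and C merged into X
```

Under these assumptions both checks print `True`: after deleting C the remaining genotype frequencies are again of the form p², 2pq, q² with the rescaled allele frequencies 0.625 and 0.375, and after joining B and C the merged allele X has frequency 0.5 with genotype frequencies 0.25, 0.5 and 0.25.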
https://doi.org/10.1016/j.fsigen.2022.102680
Received 9 September 2021; Received in revised form 12 February 2022; Accepted 20 February 2022

If the equilibrium assumption holds