Multiset Canonical Correlation Analysis

Jan de Leeuw

Version 0.08, December 01, 2015

Abstract

The Burt matrix collects all bivariate cross tables, and/or covariance matrices, of $m$ variables in a single matrix. Various forms of canonical analysis based on the Burt matrix are discussed.

Contents

1 Basic Theory
    1.1 Definition
    1.2 MCCA Eigenvalues
    1.3 Least Squares Loss Function
2 Multiple Correspondence Analysis
    2.1 The Burt Table
    2.2 Using Homogeneity
    2.3 Induced Correlation Matrices
    2.4 Binary Variables
    2.5 On Being Normal
3 Linked Singular Value Decompositions
    3.1 Pairwise Canonical Correlation Analysis
    3.2 Two Sets of Variables
    3.3 KPL Diagonalization
4 Appendix: Code
5 Appendix: NEWS
References

Note: This is a working paper which will be expanded/updated frequently. One way to think about this paper is as an update of De Leeuw (1982), using more modern computing and reporting machinery. The directory gifi.stat.ucla.edu/burt has a PDF copy of this article, the complete Rmd file with all code chunks, and R and C files with the code.

1 Basic Theory

1.1 Definition

Suppose $X_1, \cdots, X_m$ are data matrices, where $X_j$ is $n \times k_j$. Define $X := [X_1 \mid \cdots \mid X_m]$ and $C := X'X$. The matrix $C$ has $k_j \times k_\ell$ submatrices $C_{j\ell} = X_j'X_\ell$. Also define $D$ as the direct sum $D := D_1 \oplus \cdots \oplus D_m$, where $D_j := C_{jj}$. From now on we suppose that all $D_j$ are non-singular. For $m = 3$, for example, we have

$$
C = \begin{bmatrix}
C_{11} & C_{12} & C_{13} \\
C_{21} & C_{22} & C_{23} \\
C_{31} & C_{32} & C_{33}
\end{bmatrix},
$$