Original article

Independent component analysis for rectal bleeding prediction following prostate cancer radiotherapy

Auréline Fargeas a,b,⇑, Oscar Acosta a,b, Juan David Ospina Arrango a,b, Amine Ferhat a,b, Nathalie Costet a,b, Laurent Albera a,b, David Azria d, Pascal Fenoglietto d, Gilles Créhange e, Véronique Beckendorf f, Mathieu Hatt g, Amar Kachenoura a,b, Renaud de Crevoisier a,b,c

a INSERM, UMR1099, Rennes; b LTSI, Université de Rennes 1; c Département de radiothérapie, Centre Eugène Marquis, Rennes; d Département d’oncologie radiothérapie, INSERM U860, centre de recherche en cancérologie de Montpellier, CRLC Val-d’Aurelle Paul-Lamarque; e Centre Georges François Leclerc, Dijon; f Centre Alexis Vautrin, Vandoeuvre les Nancy; and g LaTIM, INSERM UMR1101, IBSAM, CHRU Morvan, Brest, France

Article info
Article history: Received 29 July 2016; Received in revised form 18 November 2017; Accepted 20 November 2017; Available online xxxx
Keywords: Prostate cancer radiotherapy; Toxicity; Rectal bleeding; Predictive model; Independent component analysis

Abstract
Background and purpose: To evaluate the benefit of independent component analysis (ICA)-based models for predicting rectal bleeding (RB) following prostate cancer radiotherapy.
Materials and methods: A total of 593 irradiated prostate cancer patients were prospectively analyzed for Grade 2 RB. ICA was used to extract two informative subspaces (patients presenting RB or not) from the rectal dose-volume histograms (DVHs), enabling a set of new pICA parameters to be estimated. These DVH-based parameters, along with others derived from principal component analysis (PCA) and functional PCA, were compared to “standard” features (patient/treatment characteristics and DVH bins) using the Cox proportional hazards model for RB prediction.
The whole cohort was divided into: (i) a training set (N = 339) for ICA-based subspace identification and Cox regression model identification and (ii) a validation set (N = 254) for evaluating RB prediction capability using the C-index and the area under the receiver operating characteristic curve (AUC), by comparing predicted and observed toxicity probabilities.
Results: In the training cohort, multivariate Cox analysis retained pICA and PC as significant predictors of RB, with a C-index of 0.65. For the validation cohort, the C-index increased from 0.64 when pICA was not included in the Cox model to 0.78 when pICA parameters were included. When pICA was not included, the AUCs for 3-, 5-, and 8-year RB prediction were 0.68, 0.66, and 0.64, respectively. When pICA was included, the AUCs increased to 0.83, 0.80, and 0.78, respectively.
Conclusion: Among the many extracted or calculated features, ICA parameters improved RB prediction following prostate cancer radiotherapy.
© 2017 Elsevier B.V. All rights reserved.

Radiotherapy and Oncology xxx (2017) xxx–xxx

Toxicity prediction following radiotherapy requires the integration of many heterogeneous variables, including patient (clinical history, age, etc.) and treatment (DVH, treatment techniques, etc.) characteristics, into predictive models. As predictive models draw on a large amount of data, overfitting may occur, for instance when there are too many parameters for the number of observed events. Improving toxicity prediction thus becomes a trade-off between including a large amount of data (gathering as much information as possible) and not including so much that overfitting occurs. To overcome this issue, feature extraction/reduction strategies have recently emerged with advances in machine-learning methods. Principal component analysis (PCA) and functional PCA (FPCA) can address the need for dimensionality reduction and have demonstrated effective predictive capacity for rectal toxicity in prostate cancer radiotherapy [1,2].
FPCA enables a functional DVH representation that overcomes the issues of correlation between neighboring DVH bins [2]. More specifically, PCA decomposes the data onto a set of orthogonal basis vectors, yielding features with maximized variance. The orthogonality constraints imposed by PCA can, however, be relaxed by using more statistical information, such as mutual independence [3]. These relaxed constraints lead to the concept of independent component analysis (ICA) [4], which enables an observed multidimensional vector (i.e., a rectal DVH) to be decomposed into several components that are as statistically independent as possible [5]. For predicting rectal bleeding (RB) following prostate cancer radiotherapy, we thus propose using ICA to estimate two informative subspaces of patients, one with RB and one without, enabling a normalized distance (pICA) to be computed

https://doi.org/10.1016/j.radonc.2017.11.011
0167-8140/© 2017 Elsevier B.V. All rights reserved.
⇑ Corresponding author at: Laboratoire Traitement du Signal et de l’Image (LTSI), Université de Rennes 1, Campus de Beaulieu, Bât. 22, 35042 Cedex Rennes, France. E-mail address: aureline.fargeas@univ-rennes1.fr (A. Fargeas).
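
The introduction contrasts the decorrelation that PCA enforces with the mutual independence that ICA seeks. As a minimal sketch of why independence is the stronger requirement, the toy example below builds two variables that are exactly uncorrelated yet fully dependent. The distribution, the variable names, and the helper functions are assumptions made for illustration only; none of this is data or code from the study.

```python
from fractions import Fraction
from itertools import product

# Toy distribution (illustrative assumption, not study data):
# X takes the values -1, 0, 1 with equal probability, and Y = X**2.
xs = [-1, 0, 1]
joint = {(x, x * x): Fraction(1, 3) for x in xs}  # P(X = x, Y = y)


def expect(f):
    """Expectation of f(x, y) under the joint distribution."""
    return sum(prob * f(x, y) for (x, y), prob in joint.items())


def marginal(index, value):
    """Marginal probability that coordinate `index` equals `value`."""
    return sum(prob for key, prob in joint.items() if key[index] == value)


# Second-order statistic: covariance, the quantity PCA drives to zero.
covariance = (
    expect(lambda x, y: x * y)
    - expect(lambda x, y: x) * expect(lambda x, y: y)
)

# Independence requires P(x, y) == P(x) * P(y) for every pair of values.
independent = all(
    joint.get((x, y), Fraction(0)) == marginal(0, x) * marginal(1, y)
    for x, y in product(xs, [0, 1])
)

print("covariance:", covariance)
print("independent:", independent)
```

Here the covariance vanishes even though Y is a deterministic function of X, so a criterion based only on second-order statistics cannot tell the two apart; criteria based on independence use the additional information carried by the full joint distribution.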