randomForestSRC: Competing Risks Vignette

Hemant Ishwaran, Thomas A. Gerds, Bryan M. Lau, Min Lu and Udaya B. Kogalur

Introduction

Here we outline the extension of random survival forests [1] to competing risks given in [2]. Users who are unfamiliar with random survival forests should first read the random survival forests vignette [3].

In competing risks, unlike survival where there is only one event type, the individual is subject to $J > 1$ competing risks. As in survival data, a complication is that the individual can be right-censored. Formally, let $T^o$ be the true event time and let $\delta^o \in \{1,\ldots,J\}$ record the event type. Let $C^o$ denote the true censoring time. In the presence of right-censoring we only observe $T = \min(T^o, C^o)$ and the censoring indicator $\delta = \delta^o \cdot I\{T^o \le C^o\}$. Thus for each individual we observe one of two things: either the time the event occurs, $T = T^o$, together with the type of event that occurred, $\delta = \delta^o \in \{1,\ldots,J\}$; or, if the individual is right-censored, the censoring time $T = C^o$ with censoring indicator $\delta = 0$.

Competing Risk Splitting Rules

There are three splitting rules used by the package to grow a competing risk tree:

1. Generalized log-rank test, specified by splitrule = "logrank". This tests for equality of the event-specific hazard and is most appropriate when the analysis focuses on determining factors for event-specific risk. The generalized log-rank test is based on the weighted difference of the Nelson-Aalen event-specific cumulative hazard estimates in the daughter nodes.

2. Gray’s test, specified by splitrule = "logrankCR", which is the default used by the package. This is a modification of Gray’s test [4] and tests for the equality of the cause-specific cumulative incidence. This is most appropriate when the goal is long-term probability prediction.

3. Composite (weighted) splitting.
This is specified using cause, an integer value between 1 and $J$ indicating the event of interest for splitting a node, where splitting is based on either the generalized log-rank test or Gray’s test, as specified by splitrule above. If cause is not specified, the default is to use a composite splitting rule that averages equally over all events. The cause option can also be a vector of non-negative weights of length $J$ specifying a weight for each event (for example, passing a vector of ones reverts to the default composite split-statistic).

Goals of Competing Risks

In competing risks, we are interested in predicting events and discovering risk factors affecting event times. For the latter, we distinguish between risk factors for the cause-specific hazard and risk factors for the cumulative incidence.

The cause-specific hazard function for event $j = 1,\ldots,J$ given a covariate $X$ is

$$
\alpha_j(t|X) = \lim_{\Delta t \to 0} \frac{P\{t \le T^o \le t + \Delta t,\ \delta^o = j \mid T^o \ge t, X\}}{\Delta t} := \frac{f_j(t|X)}{S(t|X)}.
$$

Here $S(t|X) = P\{T^o \ge t \mid X\}$ is the event-free survival probability function given $X$. The cause-specific hazard function describes the instantaneous risk of event $j$ for subjects that currently are event-free. Factors found to change the