randomForestSRC: Competing Risks Vignette

Hemant Ishwaran, Thomas A. Gerds, Bryan M. Lau, Min Lu and Udaya B. Kogalur

Introduction

Here we outline the extension of random survival forests [1] to competing risks given in [2]. Users who are unfamiliar with random survival forests should first read the random survival forests vignette [3].

In competing risks, unlike survival where there is only one event type, the individual is subject to $J > 1$ competing risks. As in survival data, a complication is that the individual can be right-censored. Formally, let $T^o$ be the true event time and let $\delta^o \in \{1,\ldots,J\}$ record the event type. Let $C^o$ denote the true censoring time. Under right-censoring we only observe $T = \min(T^o, C^o)$ and the censoring indicator $\delta = \delta^o \cdot I\{T^o \le C^o\}$. Thus, for each individual we observe either the time of the event, $T = T^o$, together with the type of event that occurred, $\delta = \delta^o \in \{1,\ldots,J\}$, or, if the individual is right-censored, the censoring time $T = C^o$ with censoring indicator $\delta = 0$.

Competing Risk Splitting Rules

There are three splitting rules used by the package to grow a competing risk tree:

1. Generalized log-rank test, specified by splitrule = "logrank". This tests for equality of the event-specific hazard and is most appropriate when the analysis focuses on determining factors for event-specific risk. The generalized log-rank test is based on the weighted difference of the Nelson-Aalen event-specific cumulative hazard estimates in the daughter nodes.

2. Gray's test, specified by splitrule = "logrankCR", which is the default used by the package. This is a modification of Gray's test [4] and tests for equality of the cause-specific cumulative incidence. It is most appropriate when the goal is long-term probability prediction.

3. Composite (weighted) splitting.
This is specified using the option cause. Setting cause to an integer value between 1 and $J$ selects the event of interest for splitting a node, where splitting is based on either the generalized log-rank test or Gray's test, as specified by splitrule above. If cause is not specified, the default is to use a composite splitting rule that averages equally over all events. The option can also be a vector of non-negative weights of length $J$ specifying a weight for each event (for example, passing a vector of ones reverts to the default composite split-statistic).

Goals of Competing Risks

In competing risks, we are interested in predicting events and discovering risk factors affecting event times. For the latter, we distinguish between risk factors for the cause-specific hazard and risk factors for the cumulative incidence.

The cause-specific hazard function for event $j = 1,\ldots,J$ given a covariate $X$ is

$$
\alpha_j(t|X) = \lim_{\Delta t \to 0} \frac{P\{t \le T^o \le t + \Delta t,\ \delta^o = j \mid T^o \ge t,\ X\}}{\Delta t} := \frac{f_j(t|X)}{S(t|X)}.
$$

Here $S(t|X) = P\{T^o \ge t \mid X\}$ is the event-free survival probability function given $X$. The cause-specific hazard function describes the instantaneous risk of event $j$ for subjects that are currently event-free. Factors found to change the