Journal of Data Analysis and Information Processing, 2023, 11, 11-36
https://www.scirp.org/journal/jdaip
ISSN Online: 2327-7203
ISSN Print: 2327-7211
DOI: 10.4236/jdaip.2023.111002, Jan. 31, 2023
Modelling Key Population Attrition in the HIV and AIDS Programme in Kenya Using Random Survival Forests with Synthetic Minority Oversampling Technique-Nominal Continuous
Evan Kahacho¹*, Charity Wamwea¹, Bonface Malenje¹, Gordon Aomo²
¹Department of Statistics and Actuarial Sciences, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya
²Monitoring and Evaluation, Kenya Red Cross Society-Global Fund Unit, Nairobi, Kenya
Abstract
HIV and AIDS remain a major public health concern and are among the epidemics that the world resolved to end by 2030, as highlighted in the Sustainable Development Goals (SDGs). Considerable effort has gone into reducing new HIV infections, yet a significant number of new infections are still reported. HIV prevalence is skewed towards key populations, which include female sex workers (FSW), men who have sex with men (MSM), and people who inject drugs (PWID). This retrospective study focused on key populations enrolled in a comprehensive HIV and AIDS programme by the Kenya Red Cross Society from July 2019 to June 2021. Individuals who were lost to follow-up, defaulted (dropped out, transferred out, or relocated), or died were classified as attrition, while those who were active and alive at the end of the study were classified as retention. The study used density analysis to determine spatial differences in key population attrition across the 19 targeted counties, and used Kilifi County as an example to map attrition cases in smaller administrative areas (sub-county level). Because attrition cases were far fewer than retention cases, the synthetic minority oversampling technique-nominal continuous (SMOTE-NC) was used to balance the datasets. A random survival forests model was then fitted to the balanced dataset. The model correctly identified attrition cases using the predicted ensemble mortality, and their survival time using the estimated Kaplan-Meier survival function. The predictive performance of the model was strong, with concordance indices greater than 0.75, well above the 0.5 expected from random chance.
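To make the concordance index concrete, the following is a minimal sketch of Harrell's C-index in plain Python. It is an illustration only, not the authors' implementation: the function name `concordance_index`, the simplified handling of tied times (such pairs are skipped), and the toy data are all assumptions made for this example. A pair of individuals is comparable when the one with the shorter follow-up time experienced the event (attrition); the pair is concordant when that individual also has the higher predicted risk (for example, the ensemble mortality).

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    times       -- observed follow-up times
    events      -- 1 if attrition was observed, 0 if censored (retained)
    risk_scores -- higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the individual with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip pairs with tied times or where the earlier time is censored
        # (a simplification assumed for this sketch).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event has higher risk: correct order
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four individuals, the third is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]

perfect = concordance_index(toy_times, toy_events, [0.9, 0.7, 0.4, 0.1])
uninformative = concordance_index(toy_times, toy_events, [0.5, 0.5, 0.5, 0.5])
print(perfect, uninformative)  # 1.0 0.5
```

A model that assigns every individual the same risk scores 0.5 on this measure, which is the random-chance baseline against which values above 0.75 are judged.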
How to cite this paper: Kahacho, E., Wamwea, C., Malenje, B. and Aomo, G. (2023) Modelling Key Population Attrition in the HIV and AIDS Programme in Kenya Using Random Survival Forests with Synthetic Minority Oversampling Technique-Nominal Continuous. Journal of Data Analysis and Information Processing, 11, 11-36.
https://doi.org/10.4236/jdaip.2023.111002
Received: October 10, 2022
Accepted: January 28, 2023
Published: January 31, 2023
Copyright © 2023 by author(s) and
Scientific Research Publishing Inc.
This work is licensed under the Creative
Commons Attribution International
License (CC BY 4.0).
http://creativecommons.org/licenses/by/4.0/
Open Access