Towards Real-time User QoE assessment via Machine Learning on LTE network data

Umair Sajid Hashmi *†, Ashok Rudrapatna *, Zhengxue Zhao *, Marek Rozwadowski *, Joseph Kang *, Raj Wuppalapati * and Ali Imran *
* Bell Labs Consulting, Murray Hill, NJ, USA
† School of Electrical and Computer Engineering, University of Oklahoma, Tulsa, OK, USA

Abstract—It is well known that current reactive network management will be unable to support the exponential increase in complexity and rapidity of change in future cellular networks. With this in mind, the goal of this paper is to investigate the applicability of machine learning and predictive models to assess cell-level user quality of experience (QoE) in real time. For this purpose, we leverage a 5-week LTE metrics dataset collected at cell-level granularity for a national LTE network operator. Domain knowledge is applied to assess user QoE with network key performance indicators (KPIs), namely scheduled user throughput, inter-frequency handover success rate and intra-frequency handover success rate. Results indicate that applying a boosted trees model on a subset of carefully selected non-collinear features allows high-accuracy threshold-based estimation of user throughput and inter-frequency handover success rate. We also exploit the periodic nature of cell data characteristics and apply a recently developed time series prediction model known as PROPHET for future QoE estimation. By employing machine learning and data analytics on network data within an end-to-end framework, network operators can proactively identify low-performance cell sites along with the influential factors that impact cell performance. Based on the root cause analysis, appropriate corrective measures may then be taken for low-performance cell sites.

Index Terms—Machine learning, user quality-of-experience, PROPHET model, gradient boosted trees, mobile network data

I. INTRODUCTION

As we approach the era of 5th generation wireless networks (5G), network operators are striving to cope with exponentially growing data demands and the diversity of devices connected to the network. Ensuring adequate user quality of experience (QoE) in ultra-dense 5G networks becomes increasingly important for multiple reasons, for instance, reducing customer churn as well as supporting advanced 5G AR (augmented reality) / VR (virtual reality) use cases. Existing user QoE assessment techniques are rendered impractical in these scenarios due to their passive operation. Whether the assessment is done via mobile applications, probe measurements or MDT (Minimization of Drive Test) reports, there is an inherent delay in these methodologies which causes any remedial action to be sluggish. Since network states in dense networks change rapidly, real-time evaluation of the user QoE is essential to enable proactive network management.

In this paper, we apply well-established machine learning (ML) methodologies to investigate real-time user QoE measurement via cell-level surrogate key performance indicators (KPIs). We employ real-time counters and metrics measured at different network elements of a country-wide LTE operator. The data was collected over a total duration of 10 weeks with a time granularity of 1 hour. From the cell-level performance data, we shortlist three user QoE assessment KPIs: i) downlink user throughput, ii) handover (HO) success percentage due to mobility from one cell to another, and iii) handover success rate from one radio frequency (RF) band to another within the same cell. While the selected KPIs do not necessarily express the QoE at user level, they are suitable indicators of the average perceived QoE for an arbitrary user within a cell's coverage area.
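As a rough illustration of how hourly cell-level KPIs of the kind listed above could be turned into threshold-based labels, consider the following minimal sketch. It is not the pipeline used in this work: the record layout, the field names, and both threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds, for illustration only; they are not taken
# from the paper or from any operator's configuration.
THROUGHPUT_MIN_MBPS = 5.0
HO_SUCCESS_MIN_PCT = 98.0


@dataclass
class CellHour:
    """One hourly record for one cell (hypothetical layout)."""
    cell_id: str
    hour: int                  # hour index within the collection period
    dl_user_tput_mbps: float   # downlink user throughput
    ho_attempts: int           # handover attempts counted in this hour
    ho_successes: int          # successful handovers in this hour


def ho_success_pct(rec: CellHour) -> Optional[float]:
    """Handover success percentage, or None if no handover was attempted."""
    if rec.ho_attempts == 0:
        return None
    return 100.0 * rec.ho_successes / rec.ho_attempts


def label_low_qoe(rec: CellHour) -> dict:
    """Flag each KPI that falls below its (assumed) threshold."""
    rate = ho_success_pct(rec)
    return {
        "low_throughput": rec.dl_user_tput_mbps < THROUGHPUT_MIN_MBPS,
        # A cell-hour with no handover attempts is not flagged.
        "low_ho_success": rate is not None and rate < HO_SUCCESS_MIN_PCT,
    }


# Made-up example records.
records = [
    CellHour("cell_A", 0, 12.4, 200, 199),
    CellHour("cell_A", 1, 3.1, 180, 171),
    CellHour("cell_B", 0, 8.0, 0, 0),
]
labels = [label_low_qoe(r) for r in records]
```

Binary labels of this kind are what a classification model would then be trained to predict from the remaining network counters.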
The three-fold goal of this work is summarized as follows:

• We validate our hypothesis that user QoE can be accurately estimated by using a limited set of real-time LTE network metrics. By applying off-the-shelf regression and classification models, we investigate the performance of the algorithms in terms of their ability to accurately predict, from a set of LTE counters, whether a cell's performance in terms of the three QoE KPIs is below the stated threshold.

• After establishing that the real-time QoE performance level of a cell can be accurately estimated, we apply a recently developed time series based predictive model known as PROPHET on the LTE data to predict future cell-level QoE performance. The QoE performance prediction is performed for near-future and further-future scenarios, to assess how well the model can capture the hourly variation over the next 48 hours in the former case, and the weekly trend in the user QoE KPI in the latter.

• From the correlation and feature importance analysis, we identify multi-collinearity amongst selected metrics. By training our ML models on non-collinear features, we obtain the relative influence of metrics on the user QoE KPIs. Consequently, the network operator can improve the overall user QoE