Analyzing Ola Data for Precise Price
Prediction Using XGBoost Technique
Comparing with LASSO Regression
G. Venkat Sai Tarun (a,1) and P. Sriramya (b)
(a) Research Scholar, Dept. of CSE, Saveetha School of Engineering, SIMATS, Chennai
(b) Professor, Dept. of AI&DS, Saveetha School of Engineering, SIMATS, Chennai
Abstract: This study applies the XGBoost algorithm and Lasso regression to cab
fare prediction and compares their r-squared, Mean Square Error (MSE), Root MSE
(RMSE), and RMSLE values. The algorithm should be efficient enough to produce
the exact fare amount of the trip before the trip starts. The sample size for
implementing this work was N=10 for each of the groups considered. The procedure
was iterated 20 times for efficient and accurate cab price prediction, with
G power of 80%, a threshold of 0.05, and a 95% CI for the mean and standard
deviation. The sample size calculation was done using clincalc, and the pretest
analysis was kept at 80%. The statistical analysis shows that the significance
values for r-squared and MSE were 0.63 and 0.581 (p>0.05), respectively. The
XGBoost algorithm gives a slightly better accuracy rate, with a mean r-squared
of 72.62%, while the Lasso regression algorithm has a mean r-squared of 70.47%.
Thus, for fare prediction in the online booking of cabs or taxis, the XGBoost
algorithm gives slightly better r-squared and MSE values than the Lasso
regression algorithm.
Keywords. XGBoost regression, LASSO regression, Fare prediction, Novel
exploratory data analysis, Machine Learning.
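The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function names and the sample fares below are hypothetical and are not taken from the Ola dataset or from the authors' implementation; the formulas are the standard textbook definitions, and RMSLE is assumed here to mean root mean squared logarithmic error.

```python
import math


def mse(y_true, y_pred):
    """Mean Square Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root MSE: square root of the MSE, in the same unit as the fare."""
    return math.sqrt(mse(y_true, y_pred))


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error (inputs must be non-negative)."""
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical fares, for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print("MSE:      ", mse(actual, predicted))        # 6.5
print("RMSE:     ", rmse(actual, predicted))
print("RMSLE:    ", rmsle(actual, predicted))
print("r-squared:", r_squared(actual, predicted))  # approximately 0.948
```

A higher r-squared and a lower MSE, RMSE, or RMSLE indicate a closer fit, which is the sense in which the two algorithms are compared.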
1. Introduction
The objective of this study is to use a machine learning method, the XGBoost
algorithm, to predict the fare amount for online cab services before the trip starts,
and to compare its r-squared and MSE values with those of the Lasso regression
algorithm [1]. The central importance of this study lies in predicting the prices of
online cab services: the price of an upcoming trip can be shown before the trip starts.
This process is referred to as price prediction. The price prediction computes the
trip's fare from the given attribute values, which are the central inputs to the
prediction. The attributes include location, date-time, passenger count, and fare
amount. The existing fare amount should be changed or updated through the
program, and the fare amount is updated through the weather conditions,
1 P. Sriramya, Dept. of AI&DS, Saveetha School of Engineering, SIMATS, Chennai, India. E-mail: sriramyap@saveetha.com
Advances in Parallel Computing Algorithms, Tools and Paradigms
D.J. Hemanth et al. (Eds.)
© 2022 The authors and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/APC220050