ReScience C
Replication / ML Reproducibility Challenge 2021
[Re] Replication study of 'Data-Driven Methods for
Balancing Fairness and Efficiency in Ride-Pooling'
Vera Neplenbroek¹,², Sabijn Perdijk¹,², and Victor Prins¹,²
¹ University of Amsterdam, Amsterdam, the Netherlands
² Equal contributions
Edited by: Koustuv Sinha, Sharath Chandra Raparthy
Reviewed by: Anonymous Reviewers
Received: 04 February 2022
Published: 23 May 2022
DOI: 10.5281/zenodo.6574683
Reproducibility Summary
Scope of Reproducibility
We evaluate the following claims related to fairness-based objective functions presented
in [1]: (1) For the four objective functions, the success rate in the worst-served
neighborhood increases monotonically with respect to the overall success rate. (2) The
proposed objective functions do not lead to a higher income for the lowest-earning
drivers, nor a higher total income, compared to a request-maximizing objective function.
(3) The driver-side fairness objective can outperform a request-maximizing objective in
terms of overall success rate and success rate in the worst-served neighborhood. This
means that this objective, whilst reducing the spread of income, also positively impacts
rider fairness and profitability.
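To make the quantities in these claims concrete, the following is a minimal sketch of how they could be computed. It is not taken from the original code base: the function names, the input layout (a list of (neighborhood, served) pairs and a list of per-driver incomes), and the use of the population standard deviation as the measure of income spread are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import pstdev


def success_rates(requests):
    """Return (overall success rate, success rate of the worst-served neighborhood).

    `requests` is a hypothetical list of (neighborhood, served) pairs,
    where `served` is True if the ride request was fulfilled.
    """
    if not requests:
        raise ValueError("at least one request is required")
    total = defaultdict(int)
    served_count = defaultdict(int)
    for neighborhood, served in requests:
        total[neighborhood] += 1
        served_count[neighborhood] += int(served)
    overall = sum(served_count.values()) / sum(total.values())
    worst = min(served_count[n] / total[n] for n in total)
    return overall, worst


def income_summary(driver_incomes):
    """Summarize driver incomes; the spread measure used here is an assumption."""
    return {
        "total": sum(driver_incomes),
        "lowest": min(driver_incomes),
        "spread": pstdev(driver_incomes),  # population standard deviation
    }


# Made-up toy data, not taken from the study.
toy_requests = [("A", True), ("A", True), ("A", False),
                ("B", True), ("B", False), ("B", False)]
overall, worst = success_rates(toy_requests)
summary = income_summary([10.0, 20.0, 30.0])
```

In this toy example, neighborhood B is the worst-served one, so `worst` is the fraction of B's requests that were served, while `overall` pools the requests of both neighborhoods.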
Methodology
The code provided by [1] was used as a base for our re-implementation in PyTorch. We
evaluate the claims by the original authors by (a) replicating their experiments,
(b) testing for sensitivity to a different value estimator, (c) examining sensitivity to
changes in the preprocessing method, and (d) testing for generalizability by applying
their method to a different dataset.
Results
We reproduced the first claim, since we observed the same monotonic increase of the
success rate in the worst-served neighborhood with respect to the overall success rate.
We did not reproduce the second claim, since we found that the driver-side fairness
objective function obtains a higher income for the lowest-earning drivers than the
request-maximizing objective function. We reproduced the third claim, since the
driver-side objective function performs best in terms of overall success rate and
success rate in the worst-served neighborhood, and also reduces the spread of income.
Changes to the value estimator, the preprocessing method and even the dataset all led to
consistent results regarding these claims.
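As an illustration of what checking the first claim involves, the sketch below tests whether the worst-served neighborhood's success rate is non-decreasing once the experimental runs are ordered by overall success rate. The function name, the tolerance parameter, and the reading of "monotonically" as non-decreasing are assumptions, and the numbers are made up rather than results from this study.

```python
def worst_rate_is_monotonic(points, tolerance=0.0):
    """Check that the worst-neighborhood rate never drops as the overall rate rises.

    `points` is a hypothetical list of (overall_rate, worst_rate) pairs,
    one per experimental run. `tolerance` allows for small noisy dips.
    """
    ordered = sorted(points, key=lambda point: point[0])
    worst_rates = [worst for _, worst in ordered]
    return all(later >= earlier - tolerance
               for earlier, later in zip(worst_rates, worst_rates[1:]))


# Made-up values for illustration only.
increasing_runs = [(0.6, 0.30), (0.5, 0.20), (0.7, 0.45)]
dipping_runs = [(0.5, 0.30), (0.6, 0.20)]

increasing_ok = worst_rate_is_monotonic(increasing_runs)
dipping_ok = worst_rate_is_monotonic(dipping_runs)
```

A nonzero tolerance may be appropriate when runs are noisy, but choosing it is a judgment call that this sketch does not settle.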
Copyright © 2022 V. Neplenbroek, S. Perdijk and V. Prins, released under a Creative Commons Attribution 4.0 International license.
Correspondence should be addressed to Vera Neplenbroek (vera.neplenbroek@student.auc.nl)
The authors have declared that no competing interests exist.
Code is available at https://github.com/Veranep/rideshare-replication (DOI 10.5281/zenodo.6501799; SWH swh:1:dir:f5439c1a7a15c4eb709da6f32eb252679a1d44bd).
Data is available at https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page.
Open peer review is available at https://openreview.net/forum?id=BEhgn2zm3CK.
ReScience C 8.2 (#29) – Neplenbroek, Perdijk and Prins 2022