Extracting Corrective Actions from Code Repositories
Yegor Bugayenko
Huawei
Russia
yegor256@huawei.com
Kirill Daniakin
Mirko Farina
Firas Jolha
Artem Kruglov
Giancarlo Succi*
g.succi@innopolis.ru
Innopolis University
Russia
Witold Pedrycz
University of Alberta
Canada
wpedrycz@ualberta.ca
ABSTRACT
Simple detection of bugs, defects or anomalies during software
development is not enough; it is necessary to apply corrective
actions to eliminate them. To find out whether an anomaly exists in
any software, we can measure the quality attributes using software
metrics. The main goal of this paper was to find out and explain
how to meaningfully attribute metrics to useful corrective actions.
CCS CONCEPTS
• Information systems → Data management systems; • Software
and its engineering → Software development process management.
KEYWORDS
software metrics, repository, information retrieval, anomaly,
corrective action, project management
ACM Reference Format:
Yegor Bugayenko, Kirill Daniakin, Mirko Farina, Firas Jolha, Artem Kruglov,
Giancarlo Succi, and Witold Pedrycz. 2022. Extracting Corrective Actions
from Code Repositories. In 19th International Conference on Mining Software
Repositories (MSR ’22), May 23–24, 2022, Pittsburgh, PA, USA. ACM, New
York, NY, USA, 2 pages. https://doi.org/10.1145/3524842.3528517
1 INTRODUCTION
Since the establishment of TPS [3], the problem of measuring and
improving the quality and performance of various production en-
vironments has been the subject of intense research. In software
engineering, it gave rise to a number of frameworks for managing
the development process, such as COBIT, ITIL, and CMMI [1, 2].
Such frameworks provide grounds for achieving specific business
goals within the project; however, none of them describes practices
of situational management, that is, how to relate
* Corresponding author.
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9303-4/22/05. . . $15.00
anomalies and metrics of the project to specific preventive and/or
corrective actions.
2 ANALYSIS OF PREVIOUS FINDINGS
We conducted an exploratory search and found seven papers on
this topic, which we subsequently reviewed (for details, please see
https://bit.ly/3Bn148x).
The papers we reviewed specified corrective actions for quantified
anomalies detected in software systems. Across these papers,
we identified 90 anomalies and 140 possible corrective actions.
It is worth noting that only a handful of the papers we analysed
outlined connections between metrics and anomalies.
To overcome this limitation, we manually clustered (by similarity)
the anomalies and the corrective actions we collected. Subsequently,
we related them to key process areas of CMMI and organized the
metrics into three main categories (process, product, and resources).
Finally, we connected the metrics to specific anomalies when such
a connection was specified. The results¹ demonstrated that several
important areas within the CMMI have been previously overlooked
and, as a consequence, that the connections between metrics and
anomalies have not been fully comprehended.
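The catalogue described above could be represented roughly as follows. This is a minimal sketch, not the authors' data model: the class names, the `unlinked` helper, and the two example entries are hypothetical and only illustrate the three metric categories and the metric-to-anomaly links mentioned in the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class MetricCategory(Enum):
    # The three categories named in the text.
    PROCESS = "process"
    PRODUCT = "product"
    RESOURCES = "resources"


@dataclass(frozen=True)
class Metric:
    name: str
    category: MetricCategory


@dataclass
class Anomaly:
    name: str
    process_area: str  # CMMI process area the anomaly was related to
    metrics: List[Metric] = field(default_factory=list)
    corrective_actions: List[str] = field(default_factory=list)


def unlinked(anomalies):
    """Names of anomalies for which no metric connection was specified."""
    return [a.name for a in anomalies if not a.metrics]


# Hypothetical entries, for illustration only (not taken from the study).
effort_variance = Metric("effort variance", MetricCategory.PROCESS)
catalogue = [
    Anomaly("schedule slippage", "Project Monitoring and Control",
            metrics=[effort_variance],
            corrective_actions=["re-plan the remaining iterations"]),
    Anomaly("unstable requirements", "Requirements Management"),
]
print(unlinked(catalogue))
```

Keeping the metric list optional makes the gaps visible: an anomaly with no linked metric is exactly the kind of missing connection the review reports.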
3 INDUSTRIAL PROBLEM
This instructive failure prompted us to speculate that it might be
possible to relate all other anomalies and associated preventive
and corrective actions by using Machine Learning (ML) techniques
[4, 5] with the intent of producing an advisory system, deployable
during all phases of software development, capable of enriching
and improving project management processes.
4 OUR RESULTS
To achieve this goal, we have been conducting targeted interviews
with senior Huawei managers. We also tried to identify anomalies
in an unsupervised way, by applying multiple clustering techniques
to several software metrics collected from several repositories. We
succeeded in identifying anomalies as clusters of abnormal metric
values and used statistical tests (such as the t-test and F-test) to
ascertain the contribution of different metrics in predicting the
emergence of specific anomalies.
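The two steps above can be sketched in a few lines. The text does not say which clustering techniques or metrics were used, so everything specific here is an assumption: one-dimensional k-means with k = 2, the smaller cluster treated as anomalous, Welch's variant of the t-test, and made-up per-commit numbers.

```python
from math import sqrt
from statistics import mean, variance


def two_means(values, max_iter=50):
    """Split 1-D values into two clusters (k-means, k=2); label 1 = upper cluster."""
    lo, hi = min(values), max(values)  # start the centroids at the extremes
    labels = [0] * len(values)
    for _ in range(max_iter):
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        lower = [v for v, lab in zip(values, labels) if lab == 0]
        upper = [v for v, lab in zip(values, labels) if lab == 1]
        if not upper:  # all values identical: nothing to separate
            break
        new_lo, new_hi = mean(lower), mean(upper)
        if (new_lo, new_hi) == (lo, hi):  # centroids stopped moving
            break
        lo, hi = new_lo, new_hi
    return labels


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical per-commit metrics (illustrative numbers, not from the study).
lines_changed = [12, 15, 11, 14, 13, 16, 12, 95, 102, 14]
files_touched = [2, 3, 2, 4, 3, 2, 3, 20, 24, 3]

labels = two_means(lines_changed)
# Assumption: the smaller cluster is the anomalous one.
anomalous_label = 1 if sum(labels) <= len(labels) / 2 else 0
anomalous = [i for i, lab in enumerate(labels) if lab == anomalous_label]

# Does a second metric differ between anomalous and normal commits?
inside = [files_touched[i] for i in anomalous]
outside = [v for i, v in enumerate(files_touched) if i not in anomalous]
t_stat = welch_t(inside, outside)
print(anomalous, round(t_stat, 2))
```

A large absolute t statistic suggests that the second metric also separates anomalous from normal observations. Turning it into a p-value would additionally require the t distribution's degrees of freedom, which this sketch omits.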
If our preliminary findings are confirmed, we may be able to
build an ML model capable of identifying anomalies and then
automatically recommending certain corrective and preventive actions.
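To make the envisioned advisory step concrete, the sketch below uses a nearest-centroid rule followed by a table lookup. It is a stand-in, not the model the authors propose to build: the centroids, the anomaly labels, and the listed actions are all hypothetical.

```python
from math import dist

# Hypothetical cluster centroids over (lines changed, files touched).
CENTROIDS = {
    "normal": (13.0, 3.0),
    "oversized_change": (98.0, 22.0),
}

# Hypothetical anomaly-to-action catalogue; a real one would be derived
# from the reviewed literature and the interviews.
ACTIONS = {
    "normal": [],
    "oversized_change": [
        "split the change into smaller commits",
        "request an additional code review",
    ],
}


def recommend(observation):
    """Label an observation by its nearest centroid and look up its actions."""
    label = min(CENTROIDS, key=lambda name: dist(CENTROIDS[name], observation))
    return label, ACTIONS[label]


print(recommend((90, 18)))
```

Separating detection (the centroids) from advice (the lookup table) lets the action catalogue be revised by project managers without retraining anything.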
¹ https://github.com/fras-jolha/CAPA