Functionality Based Code Smell Detection and Severity
Classification
Omkarendra Tiwari and Rushikesh K. Joshi
{omkarendra,rkj}@cse.iitb.ac.in
Department of Computer Science and Engineering
Indian Institute of Technology Bombay
Mumbai, India
ABSTRACT
The Long Method code smell is a symptom of design defects caused
by implementing multiple tasks within a single method. It limits
reusability, evolvability and maintainability of a method. In this
paper, we present a functionality based approach for detecting long
methods. Functionalities are identified through a novel block-based
dependency analysis technique called Segmentation, which clusters
sets of statements into extract method opportunities (or tasks). The
approach uses the interdependencies among the extract method
opportunities identified within a method to measure the
severity of the long method smell. The approach is validated on a
Java-based open-source code base. A comparison with an expert's
assessment shows that the approach is promising in detecting severe
methods irrespective of their sizes.
CCS CONCEPTS
· Software and its engineering → Software maintenance tools;
Maintaining software.
KEYWORDS
code smell, long method smell severity, extract method opportunity,
refactoring, segmentation
ACM Reference Format:
Omkarendra Tiwari and Rushikesh K. Joshi. 2020. Functionality Based Code
Smell Detection and Severity Classification. In 13th Innovations in Software
Engineering Conference (formerly known as India Software Engineering
Conference) (ISEC 2020), February 27–29, 2020, Jabalpur, India. ACM, New York,
NY, USA, 5 pages. https://doi.org/10.1145/3385032.3385048
1 INTRODUCTION
The term code smell [8] is popularly used to refer to design
defects such as methods with multiple responsibilities, misplaced
functionality and non-cohesive classes. To mitigate such problems,
refactoring is applied, which changes the defective source code in
order to improve its internal structure without altering its external
behavior. New feature addition, bug fixing, and improved reusability
and readability are among the primary motivations for applying
refactoring [8, 12].
Murphy-Hill et al. [9] reported extract method to be among
the most frequent refactorings applied by users. Studies show that
refactoring tools are underutilized, as developers prefer to apply
refactorings manually [9, 12, 15]. The factors behind this underutilization
are typically identified as (i) lack of awareness about code smells and
available tools (awareness), (ii) lack of IDE and tool integration
that would make automated suggestions more visible to the user
(opportunity), and (iii) unspecified limitations of the tools, which leave
developers to perform refactoring activities manually (trust).
The long method smell is reported as the most popular code smell
after duplicate code in a survey conducted by Yamashita and Moonen
[15], which included people with different responsibilities (e.g.
developers, team leads, architects, project managers) from 29 countries.
Chatzigeorgiou and Manakos [5] observed that most
long methods survive multiple successive versions of software.
They also noted that the time and effort required for maintenance may
lead experts to refactor only selected long methods, such
as the ones that are subject to change in subsequent versions.
Prioritization of long methods to be refactored can be based
on different criteria, such as the number of calls to a method, its
feature extension possibilities, and its complexity and readability.
In this paper, we develop a severity measure that tries to capture
such factors for long methods. Designing automated support for
the severity measure has its own challenges as outlined below.
• Consolidated knowledge of the system. For large software, it
is difficult to find a single person or group of people with
detailed knowledge of the whole system who can assist in
marking the severity of the components. In such scenarios, a
tool's assistance can be helpful in identifying long methods
and assessing their severity.
• Human aspects. Different experts may apply different strategies
and consider different properties, based on their expertise,
when identifying and prioritizing long methods.
• Combined approach. An effective approach would be to combine
multiple attributes for measuring long method severity.
To name a few, structural, textual and historical aspects of
the software can be used.
A severity measure motivated by improved readability may consider
structural aspects of the method in addition to textual
information. An approach focused on ease of debugging may aim
to refactor a method with highly interdependent extract method
opportunities (EMOs), so that bugs can be localized. For improved
reusability, an approach would require knowledge of the input and
output dependencies of EMOs across the software.
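To make the notion of input and output dependencies between EMOs concrete, the following minimal Java sketch may help. It is an illustration only and is not taken from the code base studied in this paper; the class InvoiceExample and all method names are hypothetical. The method summarize mixes two tasks, each a candidate EMO, and the second task depends on the first only through the variable total.

```java
import java.util.List;

// Hypothetical example; class and method names are illustrative assumptions.
class InvoiceExample {

    // A method that implements two tasks in one body.
    // Each commented region is a candidate extract method opportunity (EMO).
    static String summarize(List<Integer> prices) {
        // Task 1: aggregate the prices (input: prices, output: total).
        int total = 0;
        for (int p : prices) {
            total += p;
        }

        // Task 2: build the summary text (inputs: prices, total).
        // Its only dependency on Task 1 is the value of `total`.
        StringBuilder sb = new StringBuilder();
        sb.append("items=").append(prices.size());
        sb.append(", total=").append(total);
        return sb.toString();
    }

    // Task 1 extracted as its own method.
    static int computeTotal(List<Integer> prices) {
        int total = 0;
        for (int p : prices) {
            total += p;
        }
        return total;
    }

    // Task 2 extracted; its inputs are now explicit parameters.
    static String formatSummary(int itemCount, int total) {
        return "items=" + itemCount + ", total=" + total;
    }

    // Behavior-preserving composition of the two extracted tasks.
    static String summarizeRefactored(List<Integer> prices) {
        return formatSummary(prices.size(), computeTotal(prices));
    }

    public static void main(String[] args) {
        List<Integer> prices = List.of(10, 20, 30);
        System.out.println(summarize(prices));
        System.out.println(summarizeRefactored(prices));
    }
}
```

Here the two tasks share a single value, so extraction is straightforward. Tasks that exchange many variables in both directions would be more tightly interdependent and harder to separate.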
In this paper, we propose an approach for computing long method
code smell severity with a focus on studying the effectiveness of the