Automatically Prioritizing and Assigning Tasks from Code Repositories in Puzzle Driven Development

Yegor Bugayenko
Huawei, Russia
yegor256@huawei.com

Ayomide Bakare, Arina Cheverda, Mirko Farina, Artem Kruglov, Yaroslav Plaksin, Giancarlo Succi*
g.succi@innopolis.ru
Innopolis University, Russia

Witold Pedrycz
University of Alberta, Canada
wpedrycz@ualberta.ca

* corresponding author

ABSTRACT
Automatically prioritizing software development tasks extracted from code could provide significant technical and organizational advantages. Tools exist for the automatic extraction of tasks, but they still lack the ability to capture their mutual dependencies and, hence, the capability to prioritize them. Solving this important puzzle is the goal of the presented industrial challenge.

CCS CONCEPTS
• Software and its engineering → Error handling and recovery; Software configuration management and version control systems; • Social and professional topics → Quality assurance.

KEYWORDS
task prioritization, text tagging, software development

ACM Reference Format:
Yegor Bugayenko, Ayomide Bakare, Arina Cheverda, Mirko Farina, Artem Kruglov, Yaroslav Plaksin, Giancarlo Succi, and Witold Pedrycz. 2022. Automatically Prioritizing and Assigning Tasks from Code Repositories in Puzzle Driven Development. In 19th International Conference on Mining Software Repositories (MSR ’22), May 23–24, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 2 pages. https://doi.org/10.1145/3524842.3528512

1 MOTIVATION
This research focuses on how to automatically assign to developers the tasks that may be extracted from the source code, where they are embedded as todo or fixme markers. This stems from the “Puzzle Driven Development” patent, whose central idea is that most software algorithms and constructs can be decomposed into smaller elements [1]. Such elements may not necessarily be implemented
all at once to make the system work; some of them may be left in the form of stubs, thus being “puzzles” for future iterations.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
MSR ’22, May 23–24, 2022, Pittsburgh, PA, USA
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9303-4/22/05...$15.00
https://doi.org/10.1145/3524842.3528512

2 THE PROBLEM
There is a rich literature in software engineering describing this process, especially with reference to quality and maintenance [5]. Attempts have also been made to automate or streamline puzzle allocation. For example, Zerocracy¹ created a bot that extracts todo markers from source code and turns them into GitHub tickets; however, neither Zerocracy nor its current competitors² have succeeded in automatically prioritizing puzzle-based tasks. Therefore, the problem of automatic puzzle allocation remains particularly thorny, especially for very large IT companies (such as Huawei). Solving it would constitute a significant advance in the field, because it would: (a) reduce the size of the backlog, thus making projects more manageable [4]; (b) allow people to identify high-priority tasks faster; (c) promote a feeling of fairness and freedom at work; and (d) guarantee that everyone is focused on achieving the key goals of the organization.
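To make the extraction step mentioned above concrete, the following is a minimal sketch of how todo and fixme markers could be pulled out of source text. It is not Zerocracy's implementation: the regular expression, the `Puzzle` record, and the `extract_puzzles` function are hypothetical names introduced only for this illustration, and real tools may rely on a richer marker format.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical marker pattern (an assumption for this sketch): a "#" or "//"
# comment containing "todo" or "fixme" (case-insensitive), an optional colon,
# and then free text up to the end of the line.
MARKER = re.compile(r"(?:#|//)\s*(todo|fixme)\b:?\s*(.*)", re.IGNORECASE)


@dataclass(frozen=True)
class Puzzle:
    """One task found in the source (illustrative record, not a PDD type)."""
    line: int  # 1-based line number of the marker
    kind: str  # "todo" or "fixme", lower-cased
    text: str  # description following the marker


def extract_puzzles(source: str) -> List[Puzzle]:
    """Return every todo/fixme marker found in `source`, in file order."""
    puzzles = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            kind, text = match.groups()
            puzzles.append(Puzzle(number, kind.lower(), text.strip()))
    return puzzles


if __name__ == "__main__":
    sample = """\
def pay(order):
    # TODO: validate the currency before charging
    charge(order)  // fixme handle declined cards
"""
    for puzzle in extract_puzzles(sample):
        print(puzzle)
```

A sketch like this only recovers a flat list of tasks; it carries no information about how the tasks depend on one another, which is the gap the challenge addresses.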
Recent research in cognitive anthropology and psychology [3], as well as anecdotal evidence from agile approaches [2], suggests that the allocation of tasks among knowledge workers is more effective if driven, whenever possible, by the developers themselves.

3 OUR SOLUTION
Taking this important consideration as our starting point and expanding on previous work conducted by Zerocracy, we attempted to automate task prioritization and allocation from code repositories by using Machine Learning techniques. In our current approach³, we represented a puzzle as a vector of numeric properties that are available only in the PDD context and not for regular GitHub issues, such as the depth of the puzzle in a hierarchy, the number of previously closed siblings in the tree, and so on. We predicted the priority of a puzzle by maximizing the distance between the

¹ A Project Manager That Never Sleeps (2018) by Yegor Bugayenko; https://www.yegor256.com/2018/03/21/zerocracy-announcement.html; accessed on 11/09/2021
² See todo and check-todo products at GitHub Marketplace at https://github.com/marketplace; accessed on 14/02/2022
³ Our entire code base is available at https://github.com/InnoPDDTeam/pdd-data-analysis