MoDLF – A Model-Driven Deep Learning Framework for Autonomous Vehicle Perception (AVP)

Aon Safdar, Farooque Azam, Muhammad Waseem Anwar*, Usman Akram, Yawar Rasheed

Department of Computers and Software Engineering, College of EME, National University of Sciences and Technology, Pakistan
* Department of Innovation, Design and Engineering, Mälardalen University, Västerås, Sweden

(aon.safdar20, yawar.rasheed18)@ce.ceme.edu.pk, (farooq, usman.akram)@ceme.nust.edu.pk, muhammad.waseem.anwar@mdu.se

ABSTRACT

Modern vehicles are extremely complex embedded systems that integrate software and hardware from a large set of contributors. Modeling standards like EAST-ADL have shown promising results in reducing complexity and expediting system development. However, such standards are unable to cope with the growing demands of the automotive industry. A typical example of this phenomenon is autonomous vehicle perception (AVP), where deep learning architectures (DLA) are required for computer vision (CV) tasks like real-time object recognition and detection. Existing modeling standards in the automotive industry cannot manage such CV tasks at a higher abstraction level. Consequently, system development is currently accomplished through modeling approaches like EAST-ADL, while DLA-based CV features for AVP are implemented in isolation at a lower abstraction level. This significantly compromises productivity due to integration challenges. In this article, we introduce MoDLF – a Model-Driven Deep Learning Framework to design deep convolutional neural network (DCNN) architectures for AVP tasks. In particular, Model-Driven Architecture (MDA) is leveraged to propose a metamodel along with a conformant graphical modeling workbench to model DCNNs for CV tasks in AVP at a higher abstraction level. Furthermore, Model-to-Text (M2T) transformations are provided to generate executable code for MATLAB® and Python.
The framework is validated via two case studies on benchmark datasets for key AVP tasks. The results show that MoDLF effectively enables model-driven architectural exploration of deep ConvNets for AVP system development while supporting integration with established standards like EAST-ADL.

CCS Concepts

• Software and its engineering → Software notations and tools → Context specific languages → Domain specific languages

KEYWORDS

Model-Driven Architecture, Model transformation, Low code, Autonomous vehicle perception, Deep learning, Computer vision

ACM Reference format:

Aon Safdar, Farooque Azam, Muhammad Waseem Anwar, Usman Akram and Yawar Rasheed. 2022. MoDLF – A Model-Driven Deep Learning Framework for Autonomous Vehicle Perception (AVP). In Proceedings of the 25th ACM International Conference on Model Driven Engineering Languages and Systems (MODELS '22), October 23–28, 2022, Montreal, QC, Canada. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3550355.3552453

1 Introduction and Motivation

Over the past decade, scene perception, decision-making, and control mechanisms in autonomous vehicles (AVs) and advanced driver assistance systems (ADAS) have become increasingly reliant on deep learning (DL) solutions with demonstrated performance in terms of efficiency and safety [1]. Deep learning in AVs is extensively applied to computer vision (CV), path planning, control algorithms, and sensor fusion tasks [2]. For autonomous vehicle perception (AVP), the visual scene information is first collected, and then distinct CV tasks are performed through a learning model for decision making. Our motivation to propose a comprehensive modeling framework for AVP arises from challenges faced during DL solution development for AVP and during model-driven system development for embedded automotive systems. We introduce these challenges in the following two subsections.
1.1 Deep Learning for AVP

CV tasks such as object classification, detection, and semantic segmentation are an integral part of AVP and must be performed under the overriding constraints of real-time inference and the limited hardware capability onboard AVs [3]. To achieve this, the designers of AVP strive to develop deep learning architectures (DLA) that deliver high-precision (maximizing inference accuracy) and high-speed (minimizing inference time/complexity) performance on these critical CV tasks. This is mainly achieved via DLA exploration and extensive experimentation [4]. Contextually, DLA exploration refers to changes to the structure of layers, weights, biases, and activations that define the learning algorithm. Consequently, various DLAs exist that are trained on benchmark datasets to find the best weights for the training data. Changes including, but not limited to, variations in the volume/depth of layers, layer combinations, model concatenation, hyperparameters, optimizers, regularizers, etc. allow developers to come up with different backbones, necks, and heads to build models suited to different tasks [5]. Over time, various state of the

___________________
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
MODELS '22, October 23–28, 2022, Montréal, QC, Canada
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9466-6/22/10...$15.00
https://doi.org/10.1145/3550355.3552453
___________________
MoDLF artifacts are available at: https://doi.org/10.5281/zenodo.7011240
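The DLA exploration described in Section 1.1 can be illustrated with a minimal sketch. The helper functions below are hypothetical (not part of MoDLF or its generated code): they merely compare the parameter budget of candidate convolutional backbones as depth and channel widths vary, the kind of trade-off a designer weighs when balancing inference accuracy against onboard hardware limits.

```python
# Hypothetical sketch of DCNN architectural exploration: comparing the
# parameter budget of candidate backbones as depth/width vary.
# Function and variable names are illustrative only.

def conv2d_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Parameter count of one conv layer: kernel weights plus biases."""
    return in_ch * out_ch * kernel * kernel + out_ch

def backbone_params(channels: list[int], kernel: int = 3, in_ch: int = 3) -> int:
    """Total parameters of a plain sequential convolutional backbone."""
    total, prev = 0, in_ch
    for ch in channels:
        total += conv2d_params(prev, ch, kernel)
        prev = ch
    return total

# Two candidate backbones for the same task: shallow-wide vs. deep-narrow.
shallow = backbone_params([64, 128])
deep = backbone_params([16, 32, 64, 128])
print(f"shallow-wide: {shallow} params, deep-narrow: {deep} params")
```

Sweeping such configurations (plus layer combinations, hyperparameters, optimizers, and regularizers) and measuring accuracy versus inference cost is, in essence, the exploration loop that MoDLF aims to lift to the model level.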