Computers in Biology and Medicine
journal homepage: www.elsevier.com/locate/compbiomed

Refinement-based modeling of the ErbB signaling pathway

Bogdan Iancu a,b,1, Usman Sanwal a,b,1, Cristian Gratie a,b, Ion Petre a,c,d,*

a Computational Biomodeling Laboratory, Turku Centre for Computer Science, Finland
b Department of Computer Science, Åbo Akademi University, Finland
c Department of Mathematics and Statistics, University of Turku, Finland
d National Institute for Research and Development in Biological Sciences, Romania

ARTICLE INFO

Keywords: Computational modeling; Model construction; Refinement; ErbB signaling pathway; ODE-based models; Event-B; Invariant

ABSTRACT

The construction of large scale biological models is a laborious task, which is often addressed by adopting iterative routines for model augmentation, adding certain details to an initial high level abstraction of the biological phenomenon of interest. Refitting a model at every step of its development is time consuming and computationally intensive. The concept of model refinement brings about an effective alternative by providing adequate parameter values that ensure the preservation of its quantitative fit at every refinement step. We demonstrate this approach by constructing the largest-ever refinement-based biomodel, consisting of 421 species and 928 reactions. We start from an already fit, relatively small literature model whose consistency we check formally. We then construct the final model through an algorithmic step-by-step refinement procedure that ensures the preservation of the model's fit.

1. Introduction

Mechanistic control of cellular activity is intricate and making predictions about its system-level behavior is highly difficult. Our ability to make such predictions can be essential not only in reversing the dynamics of cellular impairment, but also in directing cellular activity towards a more favorable behavior.
Mathematical modeling is essential in making such predictions, but its use as a standard procedure in practical applications is severely limited by the large number of parameters that must be either fixed or estimated, see Ref. [1]. Estimating a massive number of parameters requires a large volume of data and makes model fitting computationally intensive. For this reason, we focus on refinement-based model construction as an intermediary step in the model development cycle.

Stepwise refinement emerged from the field of software engineering. It was first introduced as a concept in parallel computing and expanded quickly, giving rise to the framework of refinement calculus, where it is promoted as a method that ensures correctness preservation, see Ref. [2].

In the field of systems biology, model refinement becomes crucial in the model development cycle. Model fit is greatly affected by changes in the number of reactants, reactions, modules, etc. Fitting a considerably large model is not only tedious for the modeler but also computationally intensive, since most parameter estimation routines take considerable time to complete and require massive amounts of computational resources. Hence, an iterative approach that repeats the entire model fitting procedure after every change is not feasible for large models. As an alternative, we consider an approach that ensures the preservation of model fit at every refinement step. The approach was discussed in the literature for rule-based models, see Refs. [3,4]. For reaction-based models with quantitative dynamics described by ODEs, the method was referred to as quantitative model refinement, see Ref. [5], and was then extended and called fit-preserving data refinement [6]. In this paper we discuss the implementation of the largest-ever model built through model refinement, describing the ErbB signaling pathway.
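To make the idea of fit preservation concrete, the following minimal sketch uses a hypothetical one-reaction toy model, not the ErbB model and not necessarily the exact procedure of Refs. [5,6]. A basic reaction A + B -> C with mass-action kinetics is refined by splitting A into two subspecies A1 and A2. The species names, the rate constant, the initial values and the choice of giving both refined reactions the same rate constant as the basic reaction are all assumptions made for illustration. Under that choice, the sum A1 + A2 obeys the same ODE as A, so the refined model reproduces the basic model's behavior without any refitting.

```python
# Toy illustration (hypothetical model, not the ErbB model from the paper).
# Basic model:    A + B -> C                  mass-action rate constant k
# Refined model:  A1 + B -> C,  A2 + B -> C   (A is split into A1 and A2)
# Assumption: both refined reactions reuse the basic rate constant k.


def basic_rhs(state, k):
    """Right-hand side of the basic ODE system for (A, B, C)."""
    a, b, _c = state
    flux = k * a * b
    return [-flux, -flux, flux]


def refined_rhs(state, k):
    """Right-hand side of the refined ODE system for (A1, A2, B, C)."""
    a1, a2, b, _c = state
    flux1 = k * a1 * b
    flux2 = k * a2 * b
    return [-flux1, -flux2, -(flux1 + flux2), flux1 + flux2]


def rk4(rhs, state, k, dt, steps):
    """Integrate with the classical fourth-order Runge-Kutta scheme."""
    trajectory = [list(state)]
    for _ in range(steps):
        k1 = rhs(state, k)
        k2 = rhs([s + 0.5 * dt * d for s, d in zip(state, k1)], k)
        k3 = rhs([s + 0.5 * dt * d for s, d in zip(state, k2)], k)
        k4 = rhs([s + dt * d for s, d in zip(state, k3)], k)
        state = [
            s + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
            for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
        ]
        trajectory.append(list(state))
    return trajectory


K, DT, STEPS = 0.5, 0.01, 1000  # illustrative values only

# The initial amount of A (1.0) is distributed over A1 and A2 (0.25 + 0.75).
basic = rk4(basic_rhs, [1.0, 0.8, 0.0], K, DT, STEPS)
refined = rk4(refined_rhs, [0.25, 0.75, 0.8, 0.0], K, DT, STEPS)

# Largest deviation between A and A1 + A2 over the whole trajectory.
max_gap = max(abs(b[0] - (r[0] + r[1])) for b, r in zip(basic, refined))
print(f"max |A - (A1 + A2)| = {max_gap:.2e}")
```

The printed gap should be at the level of floating-point rounding, whatever split of the initial amount of A over A1 and A2 is chosen. In this toy case the refined parameters follow directly from the basic ones. Larger models with several refined species per reaction need a more careful assignment of rate constants, which this sketch does not cover.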
Our refinement approach is based on data refinement, where a finite set of subspecies of a given species in the initial model are substituted in the refined model for their corresponding ‘parent’ species in the initial model. We started with a model of the EGFR (ErbB1) signaling pathway proposed in Refs. [7,8]. Throughout the paper, the model from Ref. [7] is referred to as the basic model. We refined this model to include four different types of receptor tyrosine kinases, ErbB1–4, structurally related to the epidermal growth factor receptor, EGFR, and two types of ligands, EGF and HRG, and we compared the computational effort needed to build it with that of [9]. We used logic-based formal methods support based on modeling with Event-B [10] to

https://doi.org/10.1016/j.compbiomed.2019.01.016
Received 10 October 2018; Received in revised form 18 January 2019; Accepted 19 January 2019
* Corresponding author. Computational Biomodeling Laboratory, Turku Centre for Computer Science, Finland. E-mail address: ion.petre@utu.fi (I. Petre).
1 Authors with equal contribution.
Computers in Biology and Medicine 106 (2019) 91–96
0010-4825/ © 2019 Elsevier Ltd. All rights reserved.
T