SBMLChecker, a Semantic approach for SBML model reliability evaluation

Mathialakan Thavappiragasam, Carol M. Lushbough, Etienne Z. Gnimpieba
Computer Science Department, University of South Dakota, 414 E. Clark St. Vermillion, SD 57069, USA
{Mathialakan.Thavappi; Carol.Lushbough; Etienne.Gnimpieba}@usd.edu

ABSTRACT
In Systems Biology model design, reliability evaluation remains a challenging requirement. Before applying a model to a given process or using it in an in silico study, a systems biologist needs to be assured of the model's quality. The key problem remains the relation between the model and the biologist's question. Several algorithms have been designed to validate models, but they only check the correctness of syntax (e.g., the Online SBML Validator). These algorithms do not consider the semantic annotation of a model, which defines its biological context. In our approach, we measure model reliability using a combination of meaning (semantics) and syntax. This approach allows researchers to identify a model that really fits their needs and application domain. It also provides a unique identification for each model element (compound, reaction, and compartment) in order to facilitate Systems Biology operations such as merging, splitting, and simulation. Our algorithm implementation, called SBMLChecker, is written in Java and connected to the model database BIOMODELS through a RESTful API; it is available online at http://jacksons.usd.edu/SBMLC/. The command-line version has been deployed on the BioExtract server, at bioextract.org, so that it can be integrated into automated, shareable scientific workflows.

Keywords: semantic, syntax, annotated URL id, SBML, biological model

1. INTRODUCTION
The Systems Biology Markup Language (SBML) is the common format for representing mathematical models of biological systems. Although it is used by over 250 tools (SBML.org), it still falls short in several respects when it comes to providing the appropriate model in the right context.
The reliability of a model depends considerably on the context in which the model was designed. The development of semantic annotation of biological elements allows systems biologists to connect the design context (domain ontology) to a model.

Semantics in biological modeling
Several organizations (EBI [2], NCBI [3]) maintain databases (Biomodels [2], Protein, Gene, etc.) and/or ontologies (Gene Ontology [4]) in order to manage biological components (e.g., reactions, species) in a standard way. They try to categorize the components already defined and to identify relationships among them. Each database assigns a unique id to each element and tracks the relevant details (e.g., properties, description) under that id. Furthermore, several web applications (e.g., KEGG Mapper) provide services that map the same component across different resources [5]. Some of them offer web services, especially RESTful services, that software tool developers can use (e.g., the web services for GO terms and annotation provided by EMBL-EBI) [6]. A single component can be annotated by multiple databases and/or ontologies. SBML defines an annotation tag for annotating biological components; each annotation carries resources that give the database and the id [7]. For example, the reaction MTHFR, [5,10-methylene-tetrahydrofolate] + [NADPH] → [5-methyl-tetrahydrofolate], in BIOMD0000000018 has the annotations "urn:miriam:ec-code:1.5.1.20" and "urn:miriam:kegg.reaction:R01224": its enzyme (EC) id is 1.5.1.20 and its KEGG reaction id is R01224.

SBML reliability evaluation in existing tools (Online SBML Validator)
Model reliability checking should ensure correctness in both syntax and semantics (meaning). The Online SBML Validator provided by SBML.org offers services to test the syntax and internal consistency of an SBML model.
This system checks the following aspects of a model [1]:
- Consistency of measurement units associated with quantities (SBML L2V4 rules 105nn)
- Correctness and consistency of identifiers used for model entities (SBML L2V4 rules 103nn)
- Syntax of MathML mathematical expressions (SBML L2V4 rules 102nn)
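
The difference between a syntactic check of this kind and the semantic annotations discussed earlier can be illustrated with a minimal Python sketch. It is not the SBMLChecker implementation (which is written in Java) and not the validator's code. The function names is_valid_sid and parse_miriam_urn are hypothetical, and the sketch assumes only the SBML identifier grammar (a letter or underscore followed by letters, digits, or underscores) and the "urn:miriam:<database>:<id>" form of the annotations quoted for reaction MTHFR.

```python
import re
from typing import Tuple

# SBML SId grammar: a letter or underscore, then letters, digits, or underscores.
SID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_sid(identifier: str) -> bool:
    """Syntactic check (hypothetical helper): is the string a well-formed SBML id?"""
    return SID_PATTERN.fullmatch(identifier) is not None


def parse_miriam_urn(urn: str) -> Tuple[str, str]:
    """Semantic hook (hypothetical helper): split a MIRIAM URN into (database, id).

    Assumes the 'urn:miriam:<database>:<id>' layout used in the examples above.
    """
    # maxsplit=3 keeps any further colons inside the id part.
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[0] != "urn" or parts[1] != "miriam":
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    database, entry_id = parts[2], parts[3]
    if not database or not entry_id:
        raise ValueError(f"incomplete MIRIAM URN: {urn!r}")
    return database, entry_id


if __name__ == "__main__":
    # The identifier is syntactically fine, but syntax alone says nothing
    # about which biological reaction it denotes.
    print("MTHFR is a valid id:", is_valid_sid("MTHFR"))

    # The annotations carry that meaning as (database, id) pairs.
    for urn in ("urn:miriam:ec-code:1.5.1.20",
                "urn:miriam:kegg.reaction:R01224"):
        print(parse_miriam_urn(urn))
```

A well-formed identifier passes the first function whatever it is meant to denote, whereas the second function recovers the database and id pair that ties the element to a specific biological entity. A semantic approach builds on this second kind of information.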