GITIRBio: A Semantic and Distributed Service Oriented-Architecture for Bioinformatics Pipeline

Luis F. Castillo¹, Germán López-Gartner², Gustavo A. Isaza¹*, Mariana Sánchez², Jeferson Arango¹, Daniel Agudelo-Valencia², Sergio Castaño¹

¹ Systems and Informatics Department, GITIR Research Group. Caldas University, Street 65 # 26-10 Manizales, Colombia
² Biology Sciences Department, GITIR Research Group. Caldas University, Street 65 # 26-10 Manizales, Colombia

Summary

The need to process large quantities of data generated from genomic sequencing has resulted in a difficult task for life scientists who are not familiar with the use of command-line operations or developments in high performance computing and parallelization. This knowledge gap, along with unfamiliarity with necessary processes, can hinder the execution of data processing tasks. Furthermore, many of the commonly used bioinformatics tools for the scientific community are presented as isolated, unrelated entities that do not provide an integrated, guided, and assisted interaction with the scheduling facilities of computational resources or distribution, processing and mapping with runtime analysis. This paper presents the first approximation of a Web Services platform-based architecture (GITIRBio) that acts as a distributed front-end system for autonomous and assisted processing of parallel bioinformatics pipelines that has been validated using multiple sequences. Additionally, this platform allows integration with semantic repositories of genes for search annotations. GITIRBio is available at: http://c-head.ucaldas.edu.co:8080/gitirbio

1 Introduction

A comprehensive database of biological data, particularly genomic data, is being developed in the scientific community for a wide variety of organisms.
This database will not only characterize individuals within species but also species as a whole, driving the exponential growth of biological databases and creating the need for powerful new tools to organize, analyze, and visualize this information. This trend is affecting all fields of knowledge, including medicine, agriculture, animal and plant breeding, ecology and environmental sciences, as well as general industry, and it is allowing the emergence of new research specialties such as computational biology, bioinformatics and biocomputing. New databases describing the sequences of new genomes, transcriptomes and proteomes appear constantly, at a rate that exceeds the capacity to process such data by several orders of magnitude. Researchers need to analyze the immense amount of genomic data available today to assign the biological functions of complex genetic, biochemical and physiological processes. This analysis requires high computational capability to perform numerous tasks efficiently, including but not limited to sequence assembly, sequence alignment, functional and structural annotation, structural biology analysis, molecular modeling, gene interaction networks, molecular phylogenetics and comparative genomics.

* To whom correspondence should be addressed. Email: gustavo.isaza@ucaldas.edu.co
Journal of Integrative Bioinformatics, 12(1):255, 2015 http://journal.imbio.de doi:10.2390/biecoll-jib-2015-255
Copyright 2015 The Author(s). Published by Journal of Integrative Bioinformatics. This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License (http://creativecommons.org/licenses/by-nc-nd/3.0/).

The overall annotation process consists of identifying biological characteristics associated