A Modified GA-based Workflow Scheduling for Cloud Computing Environment

Safwat A. Hamad
Department of Computer Science, Faculty of Computers & Information, Cairo University, Cairo, Egypt
mcssafr@gmail.com

Fatma A. Omara
Department of Computer Science, Faculty of Computers & Information, Cairo University, Cairo, Egypt
f.omara@fci-cu.edu.eg

Abstract: Cloud computing has become an important topic in the area of high-performance distributed computing. Task scheduling is considered one of the most significant issues in Cloud computing, because users pay for the resources they use based on time. Therefore, distributing Cloud resources among users' applications should maximize resource utilization and minimize task execution time. The goal of task scheduling is to assign tasks to appropriate resources so as to optimize one or more performance parameters (e.g., completion time, cost, resource utilization). Task scheduling belongs to the category of NP-complete problems, so heuristic algorithms can be applied to solve it. In this paper, an enhanced dependent task scheduling algorithm based on the Genetic Algorithm (DTGA) is introduced for mapping and executing an application's tasks. The aim of the proposed algorithm is to minimize the completion time. Its performance has been evaluated using the WorkflowSim toolkit and the Standard Task Graph Set (STG) benchmark.

Keywords: Cloud Computing; Task Scheduling; Genetic Algorithm; Directed Acyclic Graph; Optimization Algorithm

I. INTRODUCTION

Cloud computing is an emerging technology that has gained great popularity in recent years. It offers users high scalability, reliability, security, cost-effective mechanisms, group collaboration, and ease of access to various applications [1].
In addition, Cloud computing provides dynamic services over the Internet, namely Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS) [2]. Cloud computing faces several challenges (e.g., security, performance, and resource management), and task scheduling is considered one of the main challenges related to resource management [3]. In general, task scheduling is the problem of assigning tasks to the machines that will execute them. In the Cloud computing environment, scheduling means that a large number of tasks are executed on the available resources in a suitable way with respect to one or more objectives (e.g., minimizing completion time, minimizing the cost of task execution, or maximizing resource utilization) [3]. Therefore, task scheduling is considered one of the main factors affecting the reliability and performance of Cloud services [2]. Generally, the problem of assigning tasks to the apparently unlimited computing resources of the Cloud is NP-complete.

In the task scheduling process, users' jobs are submitted to the Cloud scheduler. The Cloud scheduler then queries the Cloud information service about the status of the available resources and allocates the tasks to different resources (i.e., virtual machines) according to the task requirements [2]. A good task scheduler must assign tasks to virtual machines in an optimal way [3]. Task scheduling is therefore a challenging problem in the Cloud computing environment. Researchers have applied heuristic methods to this problem in an attempt to obtain an optimal solution [4]. Meta-heuristic techniques address it by providing near-optimal solutions, and they have gained wide popularity in recent years because of their efficiency and effectiveness in solving large and complex problems.
Many Meta-heuristic algorithms exist (e.g., Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Ant Colony Optimization (ACO)) [5]. Furthermore, task scheduling algorithms differ according to the dependency among the tasks to be scheduled. In dependent task scheduling, precedence orders exist among the tasks, so a task can be scheduled only after all of its parent tasks have finished executing. Otherwise, the tasks are independent of each other and can be scheduled in any sequence. Dependent task scheduling is known as workflow scheduling, and independent task scheduling is known as independent scheduling [5].

The aim of this paper is to develop a workflow scheduling algorithm for the Cloud computing environment, based on the Genetic Algorithm, that allocates and executes dependent tasks so as to improve task completion time. The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 describes a model of the task scheduling problem. Section 4 describes the principles of the modified GA-based dependent task scheduling. Section 5 discusses the configuration of the WorkflowSim simulator, the implementation of the proposed Genetic Algorithm, and the performance evaluation. Finally, Section 6 gives the conclusion and future work.

International Journal of Computer Science and Information Security (IJCSIS), Vol. 15, No. 8, August 2017, p. 276, https://sites.google.com/site/ijcsis/, ISSN 1947-5500
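As an illustration of the precedence constraint described in the Introduction (a task may start only after all of its parent tasks have finished), the following Python sketch orders the tasks of a small workflow and computes the completion time of one given task-to-VM mapping. It is not the DTGA algorithm proposed in this paper. The function names, task names, runtimes, and VM mapping are hypothetical, data-transfer costs between tasks are ignored, and tasks mapped to the same VM are assumed to run one at a time in the computed order.

```python
from collections import deque


def topological_order(parents):
    """Return tasks so that every parent precedes its children (Kahn's algorithm).

    `parents` maps each task to the list of tasks it depends on.
    """
    children = {task: [] for task in parents}
    pending = {task: len(ps) for task, ps in parents.items()}
    for task, ps in parents.items():
        for p in ps:
            children[p].append(task)

    ready = deque(task for task in parents if pending[task] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in children[task]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(order) != len(parents):
        raise ValueError("dependency graph contains a cycle")
    return order


def completion_time(parents, runtime, vm_of):
    """Completion time (makespan) of a fixed task-to-VM mapping.

    A task starts when all of its parents have finished and its VM is free.
    """
    vm_free = {}  # time at which each VM becomes idle
    finish = {}   # finish time of each task
    for task in topological_order(parents):
        parents_done = max((finish[p] for p in parents[task]), default=0.0)
        start = max(parents_done, vm_free.get(vm_of[task], 0.0))
        finish[task] = start + runtime[task]
        vm_free[vm_of[task]] = finish[task]
    return max(finish.values(), default=0.0)


# Hypothetical four-task workflow: t1 -> {t2, t3} -> t4
parents = {"t1": [], "t2": ["t1"], "t3": ["t1"], "t4": ["t2", "t3"]}
runtime = {"t1": 2, "t2": 3, "t3": 4, "t4": 1}
vm_of = {"t1": "vm0", "t2": "vm0", "t3": "vm1", "t4": "vm0"}

print(topological_order(parents))
print(completion_time(parents, runtime, vm_of))
```

With this mapping, t2 and t3 run in parallel on different virtual machines after t1, and t4 waits for both. Mapping every task to a single VM would serialize the same workflow and lengthen the completion time, which is the kind of difference a scheduler tries to exploit.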