Machine Translation System as Virtual Appliance: For Scalable Service Deployment on Cloud

Pawan Kumar
Expert Software Consultants Ltd., New Delhi, India
hawahawai@gmail.com

B. D. Chaudhary
Motilal Nehru National Institute of Technology Allahabad, India
bdc@mnnit.ac.in

Rashid Ahmad
LTRC, International Institute of Information Technology Hyderabad, India
rashid.ahmed@research.iiit.ac.in

Rajeev Sangal
LTRC, International Institute of Information Technology Hyderabad, India
sangal@iiit.ac.in

Abstract— A Machine Translation (MT) system is a complex natural language processing (NLP) system composed of a large number of heterogeneous modules. Deploying such a system, even on a stand-alone machine, is a cumbersome, knowledge-intensive, and time-consuming task that takes hours to load, configure, and run. Because an MT system goes through frequent and regular updates, mainly to improve its accuracy and performance, this deployment has to be repeated with each new release. Further, when such a system must be deployed on a cloud infrastructure, mainly to enable auto-scaling of computational resources under varying load, deployment becomes even more complicated and time-consuming. This paper proposes that every software version of a complex NLP application such as an MT system be built and released as a virtual appliance that even a common user can deploy easily and with very little setup time. It discusses the experiments performed to build the MT system into a virtual appliance, for stand-alone deployment as well as for cloud deployment, and reports deployment time measurements for both scenarios. Deploying the virtual MT appliance took 130 seconds on a stand-alone system; deploying it on a large number of virtual machines in the cloud took 150 seconds on average, in contrast to the several hours previously required to deploy MT applications.
Keywords— Deployment; Machine Translation; NLP Application; Virtual Appliance; Cloud; Auto scaling.

I. INTRODUCTION

Most NLP applications, such as Machine Translation (MT) systems, are composed of a large number of heterogeneous modules, and these modules in turn depend on a complex set of environmental dependencies to perform a given task. Resolving such dependencies at deployment time is hard, technically intensive, time-consuming, and undesirable.

Software deployment is defined as the process between the acquisition and the execution of the software. It is a post-development activity that takes care of user-centric customization and configuration of the software system. At times this process can be quite complex and may require extensive involvement and expertise from developers and system administrators. Apart from its technical complexity, deployment may be time-consuming (on the order of hours). It has been found [2, 5] that, in general, 19% of the total cost of operation (TCO) of a software system goes to deployment. As an MT system is far more complex and technically intensive, it is fair to expect its TCO to be far higher.

Unlike generic applications, an NLP application such as an MT system goes through frequent and regular updates, mainly to improve its accuracy and performance and to increase its domain coverage. Every new release of an MT system requires a fresh deployment from scratch, which aggravates the administrative burden and, in turn, inflates the total cost of operation further. Furthermore, the response of an MT system becomes exponentially slow with growing load.
Scaling up computation resources with growing load, mainly to give users a better response time, requires additional financial commitment from the service provider, which cannot be made on the fly. Cloud infrastructure offered by a third party, where computation resources can be scaled on the fly, seems to be the most appropriate platform for offering services of this type. Deploying an application on cloud infrastructure, compared to a stand-alone system, is far more complex, technically intensive, and time-consuming. Hence, the total cost of deployment for such applications becomes even more significant when they are deployed on the cloud. For a complex and technically intensive application like an MT system, which is to be distributed to lay users and which may have frequent software updates, the need to minimize deployment time and reduce technical complexity becomes imperative.

2013 IEEE Seventh International Symposium on Service-Oriented System Engineering. 978-0-7695-4944-6/12 $26.00 © 2012 IEEE. DOI 10.1109/SOSE.2013.69