FAST: A fully asynchronous and status-tracking pattern for geoprocessing services orchestration

Huayi Wu a, Lan You a,b,*, Zhipeng Gui a,c, Shuang Gao a, Zhenqiang Li a, Jingmin Yu a

a The State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
b Faculty of Computer Science and Information Engineering, Hubei University, Wuhan, China
c School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China

Article info

Article history:
Received 23 January 2014
Received in revised form 30 April 2014
Accepted 10 June 2014
Available online 18 June 2014

Keywords: Geoprocessing services orchestration; Scientific workflow; WS-BPEL; GeoChaining; Asynchronous; Status-tracking

Abstract

Geoprocessing service orchestration (GSO) provides a unified and flexible way to implement cross-application, long-lived, and multi-step geoprocessing service workflows by coordinating geoprocessing services collaboratively. Usually, geoprocessing services and geoprocessing service workflows are data and/or computing intensive, and this intensity can make the execution of a workflow time-consuming. Because it initiates an execution request without blocking other interactions on the client side, an asynchronous mechanism is especially appropriate for GSO workflows. Many critical problems remain to be solved in existing asynchronous patterns for GSO, including difficulties in improving performance, tracking status, and clarifying the workflow structure. These problems make it challenging to orchestrate workflows efficiently, make statuses instantly available, and construct clearly structured GSO workflows. A Fully Asynchronous and Status-Tracking (FAST) pattern that adopts asynchronous interactions throughout the whole communication tier of a workflow is proposed for GSO. The proposed FAST pattern includes a mechanism that actively pushes the latest status to clients instantly and economically.
An independent proxy was designed to isolate the status-tracking logic from the geoprocessing business logic, which helps form a clear GSO workflow structure. A workflow was implemented in the FAST pattern to simulate the flooding process in the Poyang Lake region. Experimental results show that the proposed FAST pattern can efficiently tackle data/computing intensive geoprocessing tasks. The performance of all collaborative partners was improved by the asynchronous mechanism used throughout the communication tier. The status-tracking mechanism helps users retrieve the latest running status of a GSO workflow efficiently and instantly. The clear structure of the GSO workflow lowers the barriers for geospatial domain experts and model designers to compose asynchronous GSO workflows. Most importantly, it provides better support for locating and diagnosing potential exceptions.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

To build large-scale and complex geospatial simulation and analysis models, scattered geoprocessing services distributed on the web are integrated into a geoprocessing services workflow (Brauner et al., 2009). The emergence and spread of these geoprocessing service workflows improves the interoperation and collaboration of distributed geoprocessing functions, which significantly enhances the capacity to derive geoinformation and knowledge over a network (Zhao et al., 2012b). As a special kind of Web Services Orchestration (WSO) (Peltz, 2003) in the geospatial domain, Geoprocessing Service Orchestration (GSO) provides a unified and flexible way to implement a cross-application, long-lived, and multi-step geoprocessing service workflow by coordinating geoprocessing services collaboratively.

A web service can be implemented as either synchronous or asynchronous, according to the type of communication mechanism deployed.
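The difference between a blocking call and an asynchronous invocation with pushed status updates can be illustrated with a minimal Python sketch. It is not the FAST implementation or any WS-BPEL construct: the step names, the `push_status` callback, and the in-memory `received` list are hypothetical stand-ins, and `asyncio.sleep` only imitates a long-running geoprocessing step.

```python
import asyncio


async def run_workflow(steps, push_status):
    """Run hypothetical workflow steps in order, pushing each status change."""
    for name, seconds in steps:
        await push_status(f"{name}: started")
        await asyncio.sleep(seconds)  # stand-in for a long geoprocessing step
        await push_status(f"{name}: finished")
    return "done"


async def main():
    received = []  # messages seen by the (hypothetical) client

    async def push_status(message):
        # Hypothetical client callback; a real client would expose an endpoint.
        received.append(message)

    steps = [("clip", 0.01), ("classify", 0.02)]  # assumed step names

    # Asynchronous invocation: scheduling the task returns a handle at once,
    # so the client is not blocked while the workflow runs.
    task = asyncio.create_task(run_workflow(steps, push_status))
    received.append("client: free to do other work")

    result = await task  # collect the final result when it is ready
    return received, result


if __name__ == "__main__":
    messages, result = asyncio.run(main())
    for line in messages:
        print(line)
    print("result:", result)
```

In this sketch the client never polls: every status change arrives through the callback, which is the behaviour the abstract describes as actively pushing the latest status to clients.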
Most geoprocessing algorithms, however, are unsuitable for provision as synchronous web services, because a geoprocessing algorithm includes multiple processing steps and each step might involve data and/or computing intensive calculations. Such data and/or computing-intensive geoprocessing algorithms usually consume much time. For instance, a GSO workflow for simulating a flooding process in the Poyang Lake region may run for hours or even days when dealing with fine-resolution images. It is very common for a time-consuming geoprocessing algorithm to take more time than the typical Hypertext Transfer Protocol (HTTP) transaction time-out duration.

Computers & Geosciences 70 (2014) 213–228
journal homepage: www.elsevier.com/locate/cageo
http://dx.doi.org/10.1016/j.cageo.2014.06.005
0098-3004/© 2014 Elsevier Ltd. All rights reserved.
* Corresponding author. E-mail address: youlan@whu.edu.cn (L. You).

Theoretically, asynchronous mechanisms would