M. Bubak et al. (Eds.): ICCS 2004, LNCS 3036, pp. 499–502, 2004. © Springer-Verlag Berlin Heidelberg 2004

An NAT-Based Communication Relay Scheme for Private-IP-Enabled MPI over Grid Environments

Siyoul Choi¹, Kumrye Park¹, Saeyoung Han¹, Sungyong Park¹, Ohyoung Kwon², Yoonhee Kim³, and Hyoungwoo Park⁴

¹ Dept. of Computer Science, Sogang University, Seoul, Korea {adore, namul, syhan, parksy}@sogang.ac.kr
² Korea University of Technology and Education, Chonan, Korea
³ Sookmyung Women’s University, Seoul, Korea
⁴ Korea Institute of Science and Technology Information, Daejeon, Korea

Abstract. In this paper we propose a communication relay scheme that combines NAT with a user-level proxy to support private IP clusters in Grid environments. Compared with the user-level two-proxy scheme used in PACX-MPI and Firewall-enabled MPICH-G, the proposed scheme improves both latency and bandwidth between nodes located in two private IP clusters. Since the proposed scheme is portable and provides high performance, it can be easily applied to any private-IP-enabled solution, including the private-IP-enabled MPICH solution for the Globus toolkit.

1 Introduction

As cluster systems become more widely available, it becomes feasible to run parallel applications across multiple private clusters at different geographic locations as a Grid environment. However, in the MPICH-G2 library [1], an implementation of the Message Passing Interface standard over Grid environments, two nodes located in different private clusters cannot communicate with each other directly across the public network unless additional functions are added to the library.
In PACX-MPI [2], another implementation of MPI that aims to support the coupling of high-performance computing systems distributed in a Grid, communication among multiple private IP clusters is handled by two user-level daemons that allow the library to bundle communications and avoid having thousands of open connections between systems. However, since these daemons are implemented as proxies running in user space, the total bandwidth is only about half of the bandwidth obtained from kernel-level solutions [3]. The approach also suffers from higher latency due to the additional overhead of TCP/IP stack traversal and of switching between kernel and user mode.

This paper proposes an NAT-based communication relay scheme, combining the NAT service with a user-level proxy, for private-IP-enabled MPI solutions over Grid environments. In our approach, only incoming messages are handled by a user-level proxy, which relays them to the proper nodes inside the cluster, while outgoing messages are handled by the NAT service at the front-end node of the cluster. This brings