How to install the Grid Engine software on hosts with multiple network interfaces

This document describes how to install Grid Engine on machines with multiple network interfaces (multi-homed hosts). It was originally Solaris-specific; the GNU/Linux parts were added later. A special case, using the Solaris(TM) Operating Environment IP Multipathing (IPMP) technology for IP failover, is described in a separate HOWTO.

Systems on multiple networks

Suppose we have two Ethernet interfaces (say hme0 and hme1) on each machine. One interface (hme0) is associated with general traffic, NFS file sharing and so on, and the other (hme1) is dedicated to Grid Engine communications. We would like to set up Grid Engine so that it communicates only on the dedicated Grid Engine network. In this example, there is a Grid Engine master node (sun-master) and three exec hosts (sun-1, sun-2, sun-3).

Setting up the networks

On Solaris, /etc will contain a file called hostname.hme0 populated with the hostname (e.g. sun-1) and another file called hostname.hme1 populated with the Grid Engine interface name (e.g. sun-1-grid). On GNU/Linux, eth0 would typically be the interface corresponding to the canonical name returned by hostname(1), possibly configured via /etc/network/interfaces or /etc/sysconfig/network-scripts/. The /etc/hosts file should, of course, contain entries for the SGE interface as well as the standard interface. The Grid Engine network entries might look like this:

#
# Grid Engine Network 
#
192.168.7.2     sun-master-grid
192.168.7.3     sun-1-grid
192.168.7.11    sun-2-grid
192.168.7.12    sun-3-grid
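
Before installing, it can help to confirm that every host has its Grid Engine entry. The following is a minimal sketch, not part of the original procedure: the function name missing_grid_entries is hypothetical, and the "-grid" suffix is simply the naming convention used in this example.

```shell
# Print each expected Grid Engine name that has no entry in a
# hosts-format file. Hypothetical helper; the "-grid" suffix follows
# the naming convention of this document's example.
missing_grid_entries() {
    hosts_file=$1
    shift
    for host in "$@"; do
        # Strip comments, then look for the name among the host name
        # fields (field 2 onwards) of any line.
        if ! sed 's/#.*//' "$hosts_file" |
            awk -v name="${host}-grid" '
                { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
                END { exit !found }'; then
            echo "${host}-grid"
        fi
    done
}
```

Running missing_grid_entries /etc/hosts sun-master sun-1 sun-2 sun-3 on each machine should print nothing when all four entries are present.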

When both networks are functioning correctly, install Grid Engine.

Making Grid Engine use the SGE network

Modify the configuration
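
One mechanism Grid Engine provides for mapping host names is the host_aliases(5) file in the cell's common directory, alongside the act_qmaster file that records the master's name. The following is a minimal sketch under that assumption and is not necessarily the exact procedure intended here. The function name grid_host_aliases is hypothetical, and the line format (the name Grid Engine should use first, then its alias) should be checked against host_aliases(5) for your version.

```shell
# Print host_aliases(5)-style lines for the given hosts: the name
# Grid Engine should use first, followed by the name it stands in for.
# Hypothetical helper; the "-grid" suffix is this document's convention
# and the line format is an assumption to verify for your installation.
grid_host_aliases() {
    for host in "$@"; do
        printf '%s-grid %s\n' "$host" "$host"
    done
}
```

With the daemons stopped, the output of grid_host_aliases sun-master sun-1 sun-2 sun-3 could be reviewed and then saved as $SGE_ROOT/$SGE_CELL/common/host_aliases. The act_qmaster file in the same directory would then need to name sun-master-grid.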

Start up Grid Engine

Start up the master node now:

# /etc/init.d/sgemaster start
   starting sge_qmaster
# 

Now start up the SGE exec hosts:

# pdsh ... /etc/init.d/sgeexecd start
...
#

Check it has worked

Snoop the network to check that the correct interfaces are being used:

# qsub -q sun-1 test.sh
# snoop -V -d hme1 sun-1-grid
sun-master-grid -> sun-1-grid TCP D=46883 S=536 Syn Ack=2694354350 \
  Seq=2537161622 Len=0 Win=49640 Options=<mss 1460,nop,nop,sackOK>
sun-1-grid -> sun-master-grid TCP D=536 S=46883 Ack=2537161623 \
  Seq=2694354350 Len=0 Win=49640

snoop is a Solaris utility; use tcpdump(1) more generally, or some other packet capture tool for your operating system.
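
As a rough illustration of the tcpdump equivalent, the sketch below prints a capture command suited to the operating system. The function name capture_cmd is hypothetical, and eth1 is only an assumed name for the Grid Engine interface on GNU/Linux. The snoop form is the one shown above.

```shell
# Print a packet capture command for the given OS name (as reported by
# uname -s), interface and host. Hypothetical helper for illustration.
capture_cmd() {
    os=$1
    iface=$2
    host=$3
    case "$os" in
        # Solaris: the snoop invocation used in this document.
        SunOS) printf 'snoop -V -d %s %s\n' "$iface" "$host" ;;
        # Elsewhere: tcpdump without name resolution (-n), on one
        # interface (-i), filtered to traffic to or from the host.
        *) printf 'tcpdump -n -i %s host %s\n' "$iface" "$host" ;;
    esac
}
```

For example, capture_cmd "$(uname -s)" eth1 sun-1-grid prints the command to run as root on the current system. Traffic between the master and the exec host should appear on the dedicated interface only.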

Trademarks

Sun and Solaris are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.