Scyld ClusterWare HPC: Administrator's Guide
This chapter discusses how to configure a cluster without the BeoSetup graphical configuration tool. Beginning administrators are strongly encouraged to use BeoSetup. However, if you cannot get X running on the master node after installation, or if you want to set up the cluster remotely and X forwarding is too slow, then this chapter is for you.
Note: There is a race condition among BeoSetup, Beoserv, and manual edits made to the configuration file. Do not manually edit the config file while an instance of BeoSetup is running as root. As soon as you have finished editing and saving the configuration file, notify Beoserv and BPmaster to re-read the file by sending each a SIGHUP signal.
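The reload step described in the note can be sketched in Python as follows. This is a minimal illustration, not part of ClusterWare: the process names `beoserv` and `bpmaster` are assumed spellings of the daemons named above, the helper names are hypothetical, and PIDs are discovered by scanning `/proc/<pid>/comm` on Linux.

```python
import os
import signal
from pathlib import Path


def find_pids(name, proc_root="/proc"):
    """Return the PIDs whose /proc/<pid>/comm equals `name` (Linux only)."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # process exited or is not readable
        if comm == name:
            pids.append(int(entry.name))
    return sorted(pids)


def reload_daemons(names=("beoserv", "bpmaster"), proc_root="/proc", kill=os.kill):
    """Send SIGHUP to every matching process and return what was signalled.

    The daemon names are assumptions; adjust them to match your installation.
    """
    signalled = []
    for name in names:
        for pid in find_pids(name, proc_root):
            kill(pid, signal.SIGHUP)
            signalled.append((name, pid))
    return signalled
```

Run as root on the master node, `reload_daemons()` returns a list of `(name, pid)` pairs. An empty list means no matching process was found and nothing was reloaded.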
All setup information about the cluster is stored in configuration files. The BeoSetup tool simply edits these files. You may prefer to edit these files directly in your favorite editor.
The file /etc/beowulf/config is the main config file for the cluster. The config file is organized using keywords and values, which are used to control most aspects of running the cluster, including the following:
The name, IP address and netmask of the network interface connected to the internal network
The name of the network ports used by the file transport services
The IP address range to assign to the compute nodes
The MAC (hardware) address of each identified node accepted into the cluster
The node number and IP address assigned to each hardware address
The default kernel and kernel command line to use when creating a boot file
The list of shared library directories to cache on the compute nodes
The list of files to prestage on the compute nodes
Compute node file system startup policy
The name of the final boot file to send to the compute nodes at boot time
The hostname and hostname aliases of compute nodes
Compute node policies for handling local disks and file systems, responding to master node failure, etc.
The following sections briefly discuss some key aspects of the configuration file. See the Reference Guide for details on the specific keywords and values in /etc/beowulf/config.
Keep the IP address range as small as practical, because the cluster utilities loop through every address in it. A few spare addresses are a good idea to allow for growth in the cluster, but a large number of addresses that will never be used wastes resources.
When new nodes boot up, they send DHCP packets to the network in order to get an IP address assigned to them. The master will detect these DHCP packets, and record the MAC address that the packets are coming from in the unknown addresses file /var/beowulf/unknown_addresses.
At any time, you can copy the MAC addresses out of this file and into the cluster config file /etc/beowulf/config with the node keyword prefixing the MAC address. The order of the node lines in the config file dictates the order in which node numbers and IP addresses are assigned to the hardware addresses.
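Copying addresses from the unknown addresses file into the config file can be sketched as below. The sketch assumes that MAC addresses appear in colon-separated form in `/var/beowulf/unknown_addresses` and that a node line is simply the `node` keyword followed by the address. Both are assumptions, so check the Reference Guide for the exact syntax. The function name `node_lines` is hypothetical.

```python
import re

# Assumed MAC notation: six colon-separated hex octets.
MAC_RE = re.compile(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}")


def node_lines(unknown_text, already_configured=()):
    """Build candidate node lines, in file order, skipping duplicates.

    `already_configured` lists MAC addresses that the config file
    already contains; comparison is case-insensitive.
    """
    seen = {mac.lower() for mac in already_configured}
    lines = []
    for mac in MAC_RE.findall(unknown_text):
        if mac.lower() in seen:
            continue
        seen.add(mac.lower())
        lines.append(f"node {mac}")
    return lines
```

Because the order of node lines determines node numbering, the sketch preserves the order in which addresses appear in the input. Reorder the output by hand before appending it to the config file if you need a specific numbering.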
Node numbers and IP addresses are assigned to the compute nodes in order, beginning with node 0 and the first IP address specified on the IPRANGE line of the config file.
Multiple ranges may be specified. Nodes will be assigned to each range in the order they appear in the file, unless the node number option is included on the IPRANGE line. If this option is used for a range, then the addresses in that range will be assigned beginning with the specified node number.
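The assignment rule described above can be modeled with a short Python sketch. It is an illustration of the described behavior, not Beoserv's implementation. Each range is given as a hypothetical tuple `(first_ip, last_ip, start_node)`, where `start_node` is `None` when the node number option is absent. The sketch also assumes that numbering continues from the last assigned node after an explicitly numbered range, which the text does not state.

```python
import ipaddress


def assign_addresses(ranges):
    """Map node numbers to IP addresses for ranges listed in file order."""
    table = {}
    next_node = 0  # numbering begins with node 0
    for first, last, start_node in ranges:
        if start_node is not None:
            next_node = start_node  # node number option given for this range
        low = int(ipaddress.IPv4Address(first))
        high = int(ipaddress.IPv4Address(last))
        for value in range(low, high + 1):
            table[next_node] = str(ipaddress.IPv4Address(value))
            next_node += 1
    return table


if __name__ == "__main__":
    example = [
        ("192.168.0.100", "192.168.0.103", None),
        ("192.168.1.10", "192.168.1.11", 32),
    ]
    for node, address in assign_addresses(example).items():
        print(node, address)
```

In this example the first range supplies nodes 0 through 3, and the second range starts at node 32 because of the explicit node number.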
The node directive assigns node numbers to specific MAC addresses. When Beoserv assigns an IP address to a node and admits it to the cluster, a line is appended to the config file with the node's number, MAC address and IP address. You can use this line to reserve a node number so that it cannot be assigned to any other node. This option is useful if you want to temporarily reassign a compute node but reserve its place in the cluster. To do this, comment out the remainder of the line after the node keyword and node number, for example:
    # Node 12 is reassigned as a temporary nfs server
    NODE 12 # 00 12 34 56 78 192.168.0.112
To add a shared library to the list of libraries cached on the compute nodes, just append its name to a line beginning with the libraries keyword. The space character is the delimiter.
You may also specify entire directories of shared libraries to be cached on this line by inserting the directory path. Note that only the shared libraries in these directories will be cached, not binaries or other data files.
The prestage keyword names specific files in the libraries directories that will be explicitly pulled to each compute node at node startup.
The nodename keyword in the master's /etc/beowulf/config affects the behavior of the ClusterWare NSS. Using the nodename keyword, you can redefine the primary hostname of the cluster, define additional hostname aliases for compute nodes, and define additional hostnames (and hostname aliases) for entities loosely associated with the compute node's cluster position.
nodename [name-format] <IPv4 Offset or base> <netgroup>
The presence of the optional IPv4 argument determines whether the entry is for compute nodes (i.e., the entry will resolve to the 'dot-number' name) or for non-cluster entities that are loosely associated with the compute node. When an IPv4 argument is present, the nodename keyword defines an additional hostname that maps to an IPv4 address loosely associated with the node number. When no IPv4 argument is present, the nodename keyword defines the hostname and hostname aliases for the clustering interface (i.e., the compute nodes). Subsequent nodename entries without an IPv4 argument specify additional hostname aliases for compute nodes.
In either case, the format string must contain a conversion specification for node number substitution. The conversion specification is introduced by a '%'. An optional following digit in the range 1..5 specifies a zero-padded minimum field width. The specification is completed with an 'N'. An unspecified or zero field width allows numeric interpretation to match compute node hostnames. For example, n%N will match n23, n+23, and n0000023. By contrast, n%3N will only match n001 or n023, but not n1 or n23.
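The matching rules for the %N conversion can be illustrated with a small Python sketch. It is one reading of the description above, not the ClusterWare NSS implementation. The function name `match_node` is hypothetical, and a nonzero width is interpreted strictly: the digits must equal the node number zero-padded to that width.

```python
import re

# "%", an optional width digit (0..5), then "N".
SPEC = re.compile(r"%([0-5]?)N")


def match_node(name_format, hostname):
    """Return the node number if `hostname` matches `name_format`, else None."""
    spec = SPEC.search(name_format)
    if spec is None:
        raise ValueError("name format needs a %N conversion")
    width = int(spec.group(1) or 0)
    prefix = name_format[:spec.start()]
    suffix = name_format[spec.end():]
    if len(hostname) < len(prefix) + len(suffix):
        return None
    if not (hostname.startswith(prefix) and hostname.endswith(suffix)):
        return None
    digits = hostname[len(prefix):len(hostname) - len(suffix)]
    if width == 0:
        # Numeric interpretation: optional "+" and any number of leading zeros.
        if not re.fullmatch(r"\+?[0-9]+", digits):
            return None
        return int(digits)
    if not re.fullmatch(r"[0-9]+", digits):
        return None
    node = int(digits)
    # Assumed strict reading: exactly the zero-padded form of the node number.
    return node if digits == f"{node:0{width}d}" else None
```

With this sketch, `match_node("n%N", "n+23")` and `match_node("n%3N", "n023")` both return 23, while `match_node("n%3N", "n23")` returns `None`.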
host [MACADDR] <IPADDR> <HOSTNAME>
The host keyword affects both the Beoserv DHCP server and how the ClusterWare NSS responds to hostname lookups. The host keyword maps a non-cluster entity to a hostname and an IP address. In addition, if the host entry includes a MAC address, the ClusterWare DHCP server on the configured master node will respond to DHCP queries from that MAC address and serve it the specified IP address, which must be outside the IP address range served to compute nodes.
Currently, the ClusterWare DHCP server has no method of sending a hostname along with an IP address.
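The host syntax and the requirement that a host address lie outside the compute node range can be checked with a sketch like the following. The helper names are hypothetical, and the sketch assumes that the MAC address, IP address, and hostname are each a single whitespace-separated token, which the syntax line suggests but does not guarantee.

```python
import ipaddress


def parse_host_line(line):
    """Split a host line into (mac_or_None, ip, hostname)."""
    fields = line.split()
    if len(fields) not in (3, 4) or fields[0] != "host":
        raise ValueError(f"not a host entry: {line!r}")
    mac = fields[1] if len(fields) == 4 else None  # MAC address is optional
    ip, hostname = fields[-2], fields[-1]
    ipaddress.IPv4Address(ip)  # raises ValueError if the address is malformed
    return mac, ip, hostname


def outside_node_range(ip, range_first, range_last):
    """True if `ip` is not within the range served to compute nodes."""
    address = ipaddress.IPv4Address(ip)
    first = ipaddress.IPv4Address(range_first)
    last = ipaddress.IPv4Address(range_last)
    return not (first <= address <= last)
```

A pre-flight check could parse each host line and reject any entry for which `outside_node_range` returns `False` against the configured compute node range.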
The /etc/beowulf/fdisk directory is created by the beofdisk utility when it evaluates local disks on individual compute nodes and creates partition tables for them. For each unique drive geometry discovered among the local disks on the compute nodes, beofdisk creates a file within this directory. The file naming convention is "head;ccc;hhh;sss", where "ccc" is the number of cylinders on the disk, "hhh" is the number of heads, and "sss" is the number of sectors per track.
These files contain the partition table information as read by beofdisk. Normally, these files should not be edited by hand.
You may create separate versions of this directory that end with the node number (for example, /etc/beowulf/fdisk.3). The master's BeoBoot software will look for these directories before using the general /etc/beowulf/fdisk directory.
For more information, see the section on beofdisk in the Reference Guide.
The /etc/beowulf/fstab file is the file system table for the mount points of the partitions on the compute nodes. It should be familiar to anyone who has dealt with an /etc/fstab file on a standard UNIX system. For details, see the Reference Guide.
You may create separate versions of this file that end with the node number (e.g., /etc/beowulf/fstab.3). The master's beoboot software will look for these files before using the general /etc/beowulf/fstab file.
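The per-node lookup order described for both the fdisk directory and the fstab file can be sketched as follows. This illustrates the documented precedence only; the function name `config_for_node` is hypothetical and is not part of the BeoBoot software.

```python
from pathlib import Path


def config_for_node(name, node, base_dir="/etc/beowulf"):
    """Return the node-specific path if it exists, else the general one.

    For name="fstab" and node=3, prefer <base_dir>/fstab.3 over
    <base_dir>/fstab.
    """
    base = Path(base_dir)
    specific = base / f"{name}.{node}"
    return specific if specific.exists() else base / name


if __name__ == "__main__":
    print(config_for_node("fstab", 3))
    print(config_for_node("fdisk", 3))
```

This kind of check is useful for confirming which file a given node will actually pick up before you reboot it.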
Note: On compute nodes, NFS directories must be mounted using either the IP address or the $MASTER keyword; the hostname cannot be used. This is because fstab is evaluated before /etc is populated on the compute node, so hostnames cannot be resolved at that point.
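The NFS restriction above lends itself to a simple lint check. The following sketch assumes the standard fstab field layout (device, mount point, file system type, and so on) and the usual `server:/path` form for NFS devices. The function name `bad_nfs_entries` is hypothetical.

```python
import ipaddress


def bad_nfs_entries(fstab_text):
    """Return (line_number, server) for NFS entries that use a hostname."""
    problems = []
    for lineno, raw in enumerate(fstab_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if len(fields) < 3 or not fields[2].startswith("nfs"):
            continue  # not an NFS mount
        server = fields[0].split(":", 1)[0]
        if server == "$MASTER":
            continue  # keyword is allowed
        try:
            ipaddress.IPv4Address(server)  # literal IP address is allowed
        except ValueError:
            problems.append((lineno, server))
    return problems
```

Running this over a candidate /etc/beowulf/fstab before booting nodes flags entries such as `master:/data`, which would fail to mount because the name cannot be resolved at that stage.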