Following a successful update or install of ClusterWare,
you may wish to make one or more configuration changes,
depending upon the local needs of your cluster.
As with every ClusterWare upgrade,
afterwards you should locate any ClusterWare
*.rpmsave and *.rpmnew files
and perform merges, as appropriate, to carry forward the local changes.
Sometimes a ClusterWare upgrade will save the locally modified version
as *.rpmsave and overwrite the original file
with a new version.
Other times the upgrade will keep the locally modified version untouched,
installing the new version as *.rpmnew.
For example,
cd /etc/beowulf
find . -name \*rpmnew
find . -name \*rpmsave
and examine each such file to understand how it differs from the
configuration file that existed prior to the update.
You may need to merge new lines from the newer
*.rpmnew
file into the existing file (common with
config.rpmnew),
or perhaps replace existing lines with new modifications.
Or you may need to merge older local modifications in
*.rpmsave into the newly installed pristine version of
the file (common with
fstab.rpmsave).
Contact Scyld Customer Support if you are unsure about how to resolve
particular differences,
especially with
/etc/beowulf/config.
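One way to survey the leftover files is a short loop that pairs each *.rpmnew or *.rpmsave variant with the file currently in effect and shows their differences. This is a minimal sketch, safe to run even where /etc/beowulf is absent (find then simply matches nothing):

```shell
# Pair each leftover .rpmnew/.rpmsave file under /etc/beowulf with the
# active configuration file it shadows, and show a unified diff of each.
for f in $(find /etc/beowulf \( -name '*.rpmnew' -o -name '*.rpmsave' \) 2>/dev/null); do
    active=${f%.rpm*}            # strip the .rpmnew/.rpmsave suffix
    echo "=== $active vs $f ==="
    diff -u "$active" "$f" || true   # diff exits nonzero when files differ
done
```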
ClusterWare execution currently requires that SELinux be disabled.
You can manually edit /etc/sysconfig/selinux and ensure
that:
SELINUX=disabled
is set.
If SELinux was not already set to
disabled,
then the master node must be rebooted for this change to take effect.
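The edit itself is a one-line substitution. The sketch below applies it to a scratch copy (with a synthetic fallback line when /etc/sysconfig/selinux is absent) so the command is safe to demonstrate anywhere; on a real master node you would run the same sed, as root, directly against /etc/sysconfig/selinux:

```shell
# Work on a scratch copy; fall back to a synthetic file when
# /etc/sysconfig/selinux does not exist (e.g. a non-SELinux system).
t=$(mktemp)
cp /etc/sysconfig/selinux "$t" 2>/dev/null || printf 'SELINUX=enforcing\n' > "$t"
# The actual edit: force the next-boot SELinux mode to "disabled".
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$t"
grep '^SELINUX=' "$t"
```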
If you wish to run TORQUE, confirm that it is enabled on the master node:
/sbin/chkconfig --list torque
and verify the settings
3:on 4:on 5:on.
After you successfully start the cluster compute nodes for the first time,
enable the /etc/beowulf/init.d/torque script using beochkconfig,
then restart TORQUE and restart the compute nodes:
service torque restart
bpctl -S all -R
See the
Administrator's Guide for more details
about TORQUE configuration,
and the
User's Guide for details about how to use TORQUE.
To enable the Ganglia cluster monitoring tool,
edit /etc/xinetd.d/beostat
to change disable=yes to disable=no,
followed by:
/sbin/chkconfig xinetd on
/sbin/chkconfig httpd on
/sbin/chkconfig gmetad on
then either reboot the master node, which automatically restarts these three
system services;
or without rebooting, manually restart
xinetd to re-read
the newly edited
beostat config file,
and start the remaining services that are not already running:
service xinetd restart
service httpd start
service gmetad start
See the
Administrator's Guide for more details.
The Integrated Management Framework (IMF) ClusterAdmin Web Interface
is used by a cluster administrator to monitor and administer the cluster
using a Web browser.
It requires Apache on the master node (service httpd)
and is access-protected with a Web application-specific username,
admin, and password combination.
To enable the ClusterAdmin Web interface,
perform the following steps on the master node:
Enable the httpd service,
if it is not already enabled:
/sbin/chkconfig httpd on
service httpd start
Initialize the admin account by
assigning it a unique password:
/usr/bin/htpasswd /etc/httpd/imf/htpasswd-users admin
After that, point your Web browser to your master node
and log into the IMF ClusterAdmin Interface via the Scyld ClusterWare IMF link.
See the Administrator's Guide for more details.
If you wish to use cluster-wide NFS locking,
then you must enable locking on the master node and on the compute nodes.
First ensure that NFS locking is enabled and running on the master:
/sbin/chkconfig nfslock on
service nfslock start
Then for each NFS mount point for which you need the locking functionality,
you must edit
/etc/beowulf/fstab (or the appropriate
node-specific
/etc/beowulf/fstab.N
file(s)) to remove the default option
nolock.
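Removing the nolock option is a small in-place edit. The sketch below demonstrates the substitution on a scratch copy containing one representative NFS entry (the paths and other options shown are illustrative, not taken from your actual fstab); run the same sed, as root, against the real /etc/beowulf/fstab:

```shell
# Scratch copy holding one illustrative NFS entry (not your real fstab).
t=$(mktemp)
printf '%s\n' 'master:/home  /home  nfs  nolock,nonfatal  0  0' > "$t"
# Drop "nolock" from the comma-separated option list, wherever it sits.
sed -i 's/nolock,//; s/,nolock//' "$t"
cat "$t"
```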
See the
Administrator's Guide for more details.
The default count of 8 nfsd NFS daemons may be
insufficient for large clusters.
One symptom of an insufficiency is a syslog message,
most commonly seen when you boot all the cluster nodes:
nfsd: too many open TCP sockets, consider increasing the number of nfsd threads
To increase the thread count (e.g., to 16):
echo 16 > /proc/fs/nfsd/threads
Ideally, the chosen thread count should be sufficient to eliminate the
syslog complaints, but not significantly higher,
as that would unnecessarily consume system resources.
One approach is to repeatedly double the thread count until the syslog
error messages stop occurring,
then make the satisfactory value
N persistent across
master node reboots by
creating the file
/etc/sysconfig/nfs,
if it does not already exist, and adding to it an entry that sets
the nfsd thread count to N. A value
N of 1.5x to 2x the number of nodes
is probably adequate,
although perhaps excessive.
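The form of that entry is not reproduced above; on Red Hat-derived systems the conventional variable in /etc/sysconfig/nfs is RPCNFSDCOUNT (the variable name is an assumption here, so verify it against your distribution's NFS init script):

```
# /etc/sysconfig/nfs -- persist the nfsd thread count across reboots
# (RPCNFSDCOUNT is the conventional Red Hat variable name; N is your chosen count)
RPCNFSDCOUNT=N
```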
See the
Administrator's Guide for a more detailed
discussion of NFS configuration.
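The repeated-doubling probe described above can be sketched as follows. The fallback of 8 matches the documented default when the proc file is absent, and writing the new value requires root; run elsewhere, the sketch merely reports what it would do:

```shell
# Read the current nfsd thread count, defaulting to 8 when the proc file
# is absent (e.g. nfsd not loaded, or not running on the master node).
cur=$(cat /proc/fs/nfsd/threads 2>/dev/null || echo 8)
new=$((cur * 2))
# Root required to write /proc; otherwise just report the intended change.
echo "$new" > /proc/fs/nfsd/threads 2>/dev/null \
    || echo "would raise nfsd threads from $cur to $new"
```

Repeat (and re-check syslog) until the complaints stop, then record the satisfactory value in /etc/sysconfig/nfs as described above.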
Certain workloads performing IP forwarding (most commonly seen when you boot
all the cluster nodes) may produce a syslog message of the form:
ip_conntrack: table full, dropping packet.
Use the command:
cat /proc/sys/net/ipv4/ip_conntrack_max
to see the current table size.
To increase the size:
echo N > /proc/sys/net/ipv4/ip_conntrack_max
e.g., where
N is double the current size.
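A value set through /proc does not survive a reboot. To persist it, the same key can be recorded in /etc/sysctl.conf using standard sysctl syntax (N is the value you settled on above):

```
# /etc/sysctl.conf -- persist the enlarged connection-tracking table
net.ipv4.ip_conntrack_max = N
```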
The kernel defaults to using a maximum of 32,768 process ID values.
Because BProc manages a common process space across the cluster,
this default may be insufficient for very large clusters and/or
workloads that create large numbers of concurrent processes.
The sysadmin can increase this upper bound by using the
sysctl command. For example,
sysctl -w kernel.pid_max=262144
instructs the kernel to use a range of values up to 262,144.
The maximum BProc-supported value is the same
sysctl-managed upper bound supported by the kernel:
4,194,304 [= (4*1024*1024)].
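Note that sysctl -w takes effect immediately but does not survive a reboot; to make the larger range persistent, the same key can go into /etc/sysctl.conf:

```
# /etc/sysctl.conf -- enlarge the PID range for BProc's common process space
kernel.pid_max = 262144
```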
OpenIB and mvapich-0.9.9 require an override to the limit of
how much memory can be locked.
ClusterWare adds a memlock override to
/etc/security/limits.conf during a ClusterWare upgrade,
if the override does not already exist in that file,
regardless of whether or not Infiniband is present in the cluster.
The new override line raises the limit to
unlimited.
The sysadmin may remove that override from
limits.conf
if Infiniband is not present,
or in an Infiniband cluster may reduce that
unlimited
to a discrete value,
though mvapich-0.9.9 requires a minimum of 16 MBytes,
which is a
limits.conf value of
16384.
If the value is too small, then MVAPICH reports a
"CQ Creation" or "QP Creation" error.
If you wish to enable automatic CPU frequency management,
you must have the base distribution's kernel-utils package
installed, and then enable the script:
beochkconfig 30cpuspeed on
You may optionally create a configuration file
/etc/beowulf/conf.d/cpuspeed.conf
(or node-specific
cpuspeed.conf.N),
ostensibly derived from the master node's
/etc/cpuspeed.conf, to override default behavior.
See
man cpuspeed for details.
If you wish to allow users to /usr/bin/ssh
or /usr/bin/scp from the master to a compute node,
or from one compute node to another compute node,
then you must enable sshd on the compute nodes
by enabling the corresponding /etc/beowulf/init.d/ script with beochkconfig.
See the
Administrator's Guide for details.
You may declare site-specific alternative node names for
cluster nodes by adding entries to
/etc/beowulf/config.
The syntax for a node name entry is:
nodename format-string [domain/netgroup] [IPv4addr]
For example, a suitable
nodename entry
allows the user to refer to node 4 using the traditional
.4 name,
or alternatively using names like
n4
or
n004.
See
man beowulf-config
and the
Administrator's Guide for details.
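A hypothetical entry matching the names mentioned above might look like this (the %N format-string is an assumption here; consult man beowulf-config for the exact substitution syntax):

```
# /etc/beowulf/config -- accept nN-style aliases for node N (hypothetical)
nodename n%N
```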