The following are known issues of significance with the latest version of ClusterWare 5 and suggested workarounds.
Cluster-wide ptrace functionality is not yet supported in ClusterWare 5. For example, you cannot use a debugger running on the master node to observe or manipulate a process that is executing on a compute node, e.g., using gdb -p procID, where procID is a processID of a compute node process. strace does function in its basic form, although you cannot use the -f or -F options to trace forked children if those children move away from the parent's node.
xpvm is not currently supported in ClusterWare 5.
ENV Modules are currently unsupported for OpenMPI jobs submitted through TORQUE. As a workaround, mimic what ENV Modules does by explicitly setting the environment variables with pathnames appropriate to the compiler that was used to build the OpenMPI application. For example, for an OpenMPI application compiled with GNU, add to the TORQUE script:
    export PATH=/usr/openmpi/gnu/bin:$PATH
    export LD_LIBRARY_PATH=/usr/lib64/OMPI/gnu:$LD_LIBRARY_PATH
    export OPAL_PKGDATADIR=/usr/openmpi/gnu/share
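The same settings can be wrapped in a small shell function so that every TORQUE script for a GNU-built OpenMPI application uses identical values. This is a minimal sketch: the three paths are the ones given above, while the function name setup_openmpi_gnu is hypothetical and not part of ClusterWare.

```shell
# Sketch: set the OpenMPI (GNU build) environment by hand, as ENV Modules would.
# setup_openmpi_gnu is a hypothetical helper name; the paths come from the text above.
setup_openmpi_gnu() {
    export PATH="/usr/openmpi/gnu/bin:$PATH"
    # ${VAR:+...} avoids a trailing ":" when LD_LIBRARY_PATH is initially unset
    export LD_LIBRARY_PATH="/usr/lib64/OMPI/gnu${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export OPAL_PKGDATADIR="/usr/openmpi/gnu/share"
}

setup_openmpi_gnu
echo "PATH now begins with: ${PATH%%:*}"
```

The only deliberate difference from the plain export lines is the handling of an unset LD_LIBRARY_PATH, since an empty trailing entry in that variable makes the loader search the current directory.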
At this time, we do not recommend using beosetup for observing or altering the cluster state while new compute nodes are booting.
If you access a cluster master node using ssh -X from a workstation, some graphical commands or programs may fail with:
    Gdk-ERROR **: BadMatch (invalid parameter attributes) serial 798 error_code 8 request_code 72 minor_code 0
    Gdk-ERROR **: BadMatch (invalid parameter attributes) serial 802 error_code 8 request_code 72 minor_code 0
As a workaround, set this environment variable before running the failing command:
    export XLIB_SKIP_ARGB_VISUALS=1
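If you prefer not to export XLIB_SKIP_ARGB_VISUALS for the whole login session, the variable can be set for a single command only. The following is a minimal sketch; run_no_argb is a hypothetical helper name, and the sh -c command merely stands in for whichever graphical program fails.

```shell
# run_no_argb (hypothetical helper): run one command with
# XLIB_SKIP_ARGB_VISUALS=1 in that command's environment only.
run_no_argb() {
    env XLIB_SKIP_ARGB_VISUALS=1 "$@"
}

# Demonstration: the child process sees the variable
run_no_argb sh -c 'echo "XLIB_SKIP_ARGB_VISUALS=$XLIB_SKIP_ARGB_VISUALS"'
```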
The Ganglia cluster monitoring tool may fail for large clusters. If the /var/log/httpd/error_log shows a fatal error of the form:
    PHP Fatal error: Allowed memory size of 16777216 bytes exhausted
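then the PHP memory limit is too small for the cluster. The 16777216 bytes in the message correspond to the following setting in the PHP configuration file (typically /etc/php.ini); increase it to a larger value:
    memory_limit = 16M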
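To check whether a log file contains this error, and to relate the byte count in the message to a memory_limit value, a sketch along the following lines may help. The helper names php_memory_exhausted and bytes_to_mib are hypothetical; the log path is the one named above.

```shell
# php_memory_exhausted FILE (hypothetical helper): succeed if FILE contains
# the PHP memory-exhaustion error quoted above.
php_memory_exhausted() {
    grep -q 'PHP Fatal error:.*Allowed memory size of [0-9]* bytes exhausted' "$1"
}

# bytes_to_mib N (hypothetical helper): convert a byte count to MiB (integer division).
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

log=/var/log/httpd/error_log
if [ -r "$log" ] && php_memory_exhausted "$log"; then
    echo "Ganglia hit the PHP memory limit; consider raising memory_limit"
fi

bytes_to_mib 16777216   # prints 16
```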
ClusterWare installs various scripts in /etc/beowulf/init.d/ that node_up executes when booting each node in the cluster. Any site-local modification to one of these scripts will be lost when a subsequent ClusterWare update overwrites the file with a newer version. If a local sysadmin believes a local modification is necessary, we suggest:
Copy the to-be-edited original script to a file with a unique name, e.g.:
    cd /etc/beowulf/init.d
    cp 20ipmi 20ipmi_local
Remove the executable state of the original:
    beochkconfig 20ipmi off
Edit 20ipmi_local as desired.
Thereafter, subsequent ClusterWare updates may install a new 20ipmi, but the update will not re-enable that script's executable state, so the new 20ipmi remains non-executable. The locally modified 20ipmi_local remains untouched. However, keep in mind that the newer ClusterWare version of 20ipmi may contain fixes or other changes that need to be merged into 20ipmi_local, because that edited file was based on an older ClusterWare version.
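One way to notice that an update has replaced the original script is to record a checksum of it when the local copy is made, and to compare again after each update. The following is a minimal sketch, not a ClusterWare feature: the helper names and the STATE_DIR location are assumptions, and STATE_DIR is deliberately kept outside /etc/beowulf/init.d so that no extra file appears among the boot scripts.

```shell
# Assumed locations; override INIT_DIR or STATE_DIR in the environment if needed.
INIT_DIR=${INIT_DIR:-/etc/beowulf/init.d}
STATE_DIR=${STATE_DIR:-/root/clusterware-baselines}   # hypothetical directory

# record_baseline SCRIPT: remember the checksum of the original script
# that the local copy was based on.
record_baseline() {
    mkdir -p "$STATE_DIR" && cksum < "$INIT_DIR/$1" > "$STATE_DIR/$1.cksum"
}

# baseline_changed SCRIPT: succeed if the original script no longer
# matches the recorded checksum, i.e. an update has replaced it.
baseline_changed() {
    [ "$(cksum < "$INIT_DIR/$1")" != "$(cat "$STATE_DIR/$1.cksum")" ]
}
```

For example, run record_baseline 20ipmi right after creating 20ipmi_local. After a later update, if baseline_changed 20ipmi succeeds, compare 20ipmi against 20ipmi_local by hand and carry over any relevant changes.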
Some software tools modify system configuration files that ClusterWare also modifies. These tools have no knowledge of the ClusterWare-specific changes and may therefore undo or damage them. Take care when using such tools. One example is /usr/sbin/authconfig, which manipulates /etc/nsswitch.conf.
ClusterWare modifies these system configuration files at install time:
    /etc/exports
    /etc/nsswitch.conf
    /etc/security/limits.conf
    /etc/sysconfig/syslog
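Before running a tool that may rewrite these files, it can be useful to take a snapshot and afterwards list which files differ. The sketch below assumes nothing about ClusterWare beyond the file list above; the helper names and the snapshot directory in the usage note are hypothetical. The functions use plain shell variables (dest, f, snap), which are global in a POSIX shell.

```shell
# Files listed above as modified by ClusterWare at install time.
CW_FILES="/etc/exports /etc/nsswitch.conf /etc/security/limits.conf /etc/sysconfig/syslog"

# snapshot_files DIR FILE...: copy each existing FILE into DIR, flattening
# its path so that /etc/exports becomes DIR/etc_exports.
snapshot_files() {
    dest=$1; shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        [ -f "$f" ] && cp -p "$f" "$dest/$(echo "${f#/}" | tr / _)"
    done
    return 0
}

# changed_files DIR FILE...: print each FILE that now differs from its snapshot.
changed_files() {
    dest=$1; shift
    for f in "$@"; do
        snap="$dest/$(echo "${f#/}" | tr / _)"
        [ -f "$snap" ] && ! cmp -s "$f" "$snap" && echo "$f"
    done
    return 0
}
```

A possible usage is snapshot_files /root/cw-snap $CW_FILES before running the tool, then changed_files /root/cw-snap $CW_FILES afterwards, followed by a manual review of each file that is printed.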
The nscd (Name Service Cache Daemon) service executes by default on each compute node. However, if this service is also enabled on the master node, then it may cause the ClusterWare name service kickbackdaemon to misbehave.
Workaround: when Beowulf starts, if it detects that nscd is running on the master node, then Beowulf automatically stops nscd and reports that it has done so. Beowulf does not invoke /sbin/chkconfig nscd off to permanently turn off the service.
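Because Beowulf stops nscd without disabling it, the service may still be configured to start at the next boot of the master node. The following sketch checks for that by inspecting the output of chkconfig --list nscd. The helper name enabled_in_any_runlevel is hypothetical, and the check is skipped on systems without chkconfig.

```shell
# enabled_in_any_runlevel (hypothetical helper): read `chkconfig --list SERVICE`
# output on stdin and succeed if the service is "on" in any runlevel.
enabled_in_any_runlevel() {
    grep -q '[0-6]:on'
}

if command -v chkconfig >/dev/null 2>&1; then
    if chkconfig --list nscd 2>/dev/null | enabled_in_any_runlevel; then
        echo "nscd is still enabled at boot on this node"
    fi
fi
```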
Note: even after stopping nscd on the master node,
    service nscd status
CW4.2.0 (and later releases) support InfiniBand via Open Source kernel drivers, OpenIB, OFED, and a ClusterWare-enhanced MVAPICH. The CW4.2.0 MVAPICH default behavior is to assign the threads of each multithreaded job to specific CPUs in each node, starting with cpu0 and incrementing upward. Keeping threads pinned to a specific CPU may be an optimal NUMA and CPU cache strategy for nodes that are dedicated to a single job, but it is usually suboptimal if multiple multithreaded jobs share a node, because each job's threads are permanently assigned to the same low-numbered CPUs. The CW4.2.1 (and later) default behavior is not to impose strict CPU affinity assignments, which allows the kernel CPU scheduler to migrate threads as it sees fit to load-balance the node's CPUs as workloads change over time.
However, the user may override this default using:
    export VIADEV_ENABLE_AFFINITY=1
ClusterWare 5 includes MPI-related packages that conflict with certain packages in the Red Hat or CentOS base distribution.
If yum informs you that it cannot install or update ClusterWare because various mpich and mpiexec packages conflict with various openmpi packages from the base distribution, then run the command:
    yum remove openmpi* mvapich*
to remove the conflicting packages, and then retry the install or update, e.g.:
    yum groupupdate Scyld-ClusterWare
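Before removing anything, it may be worth previewing which installed packages the openmpi* and mvapich* patterns would match. The following is a minimal sketch; matching_packages is a hypothetical helper name, and the rpm query is skipped on systems without rpm.

```shell
# matching_packages (hypothetical helper): filter package names on stdin down to
# those whose names begin with "openmpi" or "mvapich".
matching_packages() {
    grep -E '^(openmpi|mvapich)' || true   # no match is not treated as an error
}

if command -v rpm >/dev/null 2>&1; then
    # List installed package names only, one per line
    rpm -qa --qf '%{NAME}\n' | matching_packages
fi
```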
Currently, beofdisk only supports disks that already have partition tables, even if those tables are empty. Compute nodes with preconfigured hardware RAID, where partition tables have been created on the LUNs, should be configurable. Contact Customer Service for assistance with a disk without partition tables.
BProc interaction with getpid() may return incorrect processID values.
Details: Red Hat's glibc implements getpid() by asking the kernel once for the current processID value, then caching that value for subsequent calls to getpid(). If a program calls getpid() before calling bproc_rfork() or bproc_vrfork(), then bproc silently changes the child's processID, but a subsequent getpid() in the child continues to return the former cached processID value.
Workaround: do not call getpid() prior to calling bproc_[v]rfork.
Using the bpcp command, specifying master for both the source and destination, e.g.,
    bpcp master:/tmp/x master:/tmp/y
Workaround: Specify master for the source file or for the destination file, but not both. For example, do:
    bpcp /tmp/x master:/tmp/y
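In scripts that build bpcp arguments programmatically, the workaround can be applied automatically by dropping the master: prefix from the source whenever both arguments name the master node. The sketch below only prints the adjusted arguments and does not invoke bpcp; bpcp_safe_args is a hypothetical helper name, and the approach assumes the script runs on the master node and that paths contain no whitespace.

```shell
# bpcp_safe_args SRC DST (hypothetical helper): if both SRC and DST name the
# master node, drop the "master:" prefix from SRC; print the resulting pair.
bpcp_safe_args() {
    src=$1; dst=$2
    case "$src" in
        master:*)
            case "$dst" in
                master:*) src=${src#master:} ;;
            esac ;;
    esac
    printf '%s %s\n' "$src" "$dst"
}

bpcp_safe_args master:/tmp/x master:/tmp/y   # prints: /tmp/x master:/tmp/y
bpcp_safe_args /tmp/x master:/tmp/y          # prints: /tmp/x master:/tmp/y
```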
The RHEL5 base distribution includes a sky2 network driver that panics the kernel on Penguin Computing Bladerunner Xeon servers. Contact Penguin Computing Customer Service for a workaround.