Scyld ClusterWare HPC: Administrator's Guide
Configuring the Cluster Manually
The command bpstat can be used to quickly check the status of the cluster nodes and/or see what processes are running on the compute nodes. See the Reference Guide for details on usage.
To reboot a node or set its state from the command line, use the bpctl command. For example, to reboot node 5:
[root@cluster ~] # bpctl -S 5 -R
As the administrator, you may at some point need to prevent other users from running new jobs on a specific node without shutting it down. The unavailable state exists for this purpose. When a node is set to unavailable, non-root users cannot start new jobs on that node, but existing jobs continue running. Set the state using the bpctl command. For example, to set node 5 to unavailable:
[root@cluster ~] # bpctl -S 5 -s unavailable
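When several nodes need to be taken out of service at once, the same bpctl invocation can be wrapped in a loop. The following is a minimal sketch, not part of ClusterWare: the function name set_nodes_unavailable and the BPCTL variable are hypothetical, and BPCTL exists only so the loop can be dry-run before it touches real nodes.

```shell
#!/bin/sh
# Hypothetical helper: mark each listed node unavailable.
# BPCTL defaults to the real bpctl command; set BPCTL="echo bpctl"
# to print the commands instead of running them.
set_nodes_unavailable() {
    for node in "$@"; do
        # Same invocation as the single-node example in this section.
        ${BPCTL:-bpctl} -S "$node" -s unavailable || return 1
    done
}
```

With BPCTL set to "echo bpctl", calling set_nodes_unavailable 5 6 7 prints the three bpctl commands without executing them, which is a reasonable check before running the loop as root.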
If you are mounting local file systems on the compute nodes, shut the node down cleanly so that the file systems on the hard drives stay in a consistent state. The node_down script in /usr/lib/beoboot/bin does exactly this. It takes two arguments: the node number, followed by the state to which the node should go. For example, to cleanly reboot node 5:
[root@cluster ~] # /usr/lib/beoboot/bin/node_down 5 reboot
Alternatively, to cleanly power off node 5:
[root@cluster ~] # /usr/lib/beoboot/bin/node_down 5 pwroff
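Because node_down acts on whatever node number and state it is given, a small wrapper that validates both arguments first can guard against typos. This is a sketch under stated assumptions: the function name clean_node_down and the NODE_DOWN override are hypothetical, and the wrapper accepts only the two states shown in this section (reboot and pwroff); extend the list if your installation supports others.

```shell
#!/bin/sh
# Hypothetical wrapper around node_down that checks its arguments first.
# NODE_DOWN can be overridden (e.g. NODE_DOWN="echo node_down") for a dry run.
clean_node_down() {
    node="$1"
    state="$2"

    # The node number must be a non-empty string of digits.
    case "$node" in
        ''|*[!0-9]*)
            echo "node must be a non-negative integer" >&2
            return 2 ;;
    esac

    # Only the two target states shown in this section are accepted here.
    case "$state" in
        reboot|pwroff) ;;
        *)
            echo "state must be reboot or pwroff" >&2
            return 2 ;;
    esac

    ${NODE_DOWN:-/usr/lib/beoboot/bin/node_down} "$node" "$state"
}
```

For example, clean_node_down 5 reboot runs the same command as the reboot example, while a mistyped state returns status 2 without calling node_down.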
The node_down script works by first setting the node's state to unavailable, then remounting the file systems on the compute node read-only, then calling bpctl to change the node state. This can all be done by hand, but the script saves some keystrokes.
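The by-hand sequence described above might look roughly like the following sketch. The two bpctl invocations are the ones shown earlier in this section. Running mount on the node through bpsh, and the mount point /scratch, are assumptions made for illustration and may not match what node_down does internally. The function name manual_clean_reboot and the RUN prefix are hypothetical; RUN exists only so the sequence can be printed rather than executed.

```shell
#!/bin/sh
# Sketch of a manual equivalent of "node_down <node> reboot".
# RUN is a hypothetical prefix: set RUN=echo to print the commands
# instead of running them.
manual_clean_reboot() {
    node="$1"
    mountpoint="${2:-/scratch}"   # assumed local file system; adjust to your setup

    # 1. Keep non-root users from starting new jobs on the node.
    ${RUN:-} bpctl -S "$node" -s unavailable || return 1

    # 2. Remount the node's local file system read-only
    #    (assumes bpsh is used to run mount on the compute node).
    ${RUN:-} bpsh "$node" mount -o remount,ro "$mountpoint" || return 1

    # 3. Change the node state; -R reboots, as in the earlier example.
    ${RUN:-} bpctl -S "$node" -R
}
```

Running it with RUN=echo first shows exactly which commands would be issued. On a node with several local file systems, step 2 would need to be repeated for each one, which is part of why the script is the more convenient route.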
To configure node_down to use IPMI, set the ipmi value in /etc/beowulf/config to enabled as follows:
[root@cluster ~] # beoconfig ipmi enabled