Scyld ClusterWare HPC: Administrator's Guide
Ganglia is an open source distributed monitoring technology for high-performance computing systems, such as clusters and grids. In current versions of Scyld ClusterWare, Ganglia provides network metrics for the master node, time and string metrics (boottime, machine_type, os_release, and sys_clock), and constant metrics (cpu_num and mem_total). Ganglia uses a web server to display these statistics; thus, to use Ganglia, you must run a web server on the cluster's master node.
When installing Scyld ClusterWare, make sure the Ganglia package is selected among the package groups to be installed. Once you have completed the Scyld installation and configured your compute nodes, you will need to configure Ganglia as follows:
Name your cluster.
By default, Ganglia names your cluster "my cluster". To change this, edit the /etc/gmetad.conf file. On or about line 39, you will see:
data_source "my cluster" localhost
Change "my cluster" to the name of your cluster. For example, for a cluster named "iceberg":
data_source "iceberg" localhost
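The edit above can also be scripted. The following is a minimal sketch, not part of the Scyld tooling: the function name set_cluster_name is hypothetical, and it assumes the file still contains the default data_source "my cluster" line and that the new name contains no "/", "&", or backslash characters.

```shell
# Hypothetical helper: replace the default cluster name in a gmetad.conf-style file.
# Usage: set_cluster_name NEW_NAME [CONF_FILE]
set_cluster_name() {
    name=$1
    conf=${2:-/etc/gmetad.conf}
    # Keep a backup copy; the edited text is generated from this backup.
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the default data_source line; all other lines pass through unchanged.
    # Assumption: NEW_NAME has no characters that are special to sed (/, &, \).
    sed "s/^data_source \"my cluster\"/data_source \"$name\"/" "$conf.bak" > "$conf"
}
```

For example, running set_cluster_name iceberg as root would produce the data_source "iceberg" localhost line shown above and leave the previous file in /etc/gmetad.conf.bak. The new name takes effect the next time gmetad is started.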
Enable the Ganglia Master Daemon to start on boot.
[root@cluster ~] # chkconfig gmetad --level 345 on
Enable the web server to start on boot.
[root@cluster ~] # chkconfig httpd --level 345 on
Turn on the Ganglia Data Collection Service.
Edit the file /etc/xinetd.d/beostat. On or about line 13 of this file, enable the service by changing "disable = yes" to "disable = no".
Restart the xinetd service.
[root@cluster ~] # service xinetd restart
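The change to /etc/xinetd.d/beostat described in this step can be made non-interactively. The sketch below is an illustration only: the function name enable_xinetd_service is hypothetical, and it assumes the file contains a line of the form "disable = yes", possibly indented.

```shell
# Hypothetical helper: switch "disable = yes" to "disable = no" in an xinetd service file.
# Usage: enable_xinetd_service [SERVICE_FILE]
enable_xinetd_service() {
    file=${1:-/etc/xinetd.d/beostat}
    # Keep a backup copy; the edited text is generated from this backup.
    cp "$file" "$file.bak" || return 1
    # Preserve the original indentation and spacing around "="; change only the value.
    sed 's/^\([[:space:]]*disable[[:space:]]*=[[:space:]]*\)yes/\1no/' "$file.bak" > "$file"
}
```

If you edit the file this way, run enable_xinetd_service as root first and then restart xinetd as shown above so that the change takes effect.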
Start the httpd service:
[root@cluster ~] # service httpd start
Note that you will not need to start httpd manually each time the cluster reboots if you have correctly enabled the web server to start on boot (see the step "Enable the web server to start on boot" above).
Start the Ganglia Master Daemon.
[root@cluster ~] # service gmetad start
Visit http://localhost/ganglia in a web browser.
Note that if you are visiting the web page from a computer other than the cluster's master node, you must replace localhost with the hostname of the cluster. For example, if the cluster's name is "iceberg", you may need to use its fully qualified name, such as http://iceberg.penguincomputing.com/ganglia.
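Before opening a browser, you can confirm from the command line that the web server is answering at the Ganglia URL. This is a minimal sketch that assumes curl is installed; the function names ganglia_url and check_ganglia are hypothetical and are not part of Ganglia or Scyld ClusterWare.

```shell
# Hypothetical helper: build the Ganglia URL for a host (defaults to localhost).
ganglia_url() {
    host=${1:-localhost}
    printf 'http://%s/ganglia\n' "$host"
}

# Hypothetical helper: succeed only if the server returns a non-error HTTP response.
check_ganglia() {
    # --fail makes curl exit non-zero on HTTP errors such as 404;
    # --location follows a redirect if the server issues one.
    curl --silent --fail --location --output /dev/null "$(ganglia_url "$1")"
}
```

For example, check_ganglia && echo "Ganglia page is reachable" tests the master node itself, while check_ganglia followed by a hostname tests from another machine. A failure only indicates that no page was served at that URL; it does not identify which of the services above is not running.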
Note: The Ganglia graphs that track load (1-, 5-, and 15-minute), the number of CPUs, and the number of processes may appear inaccurate. These graphs are in fact reporting correct statistics, but for the system as a whole rather than just user processes. Scyld draws its statistics directly from system data structures and /proc. It does not take any further steps to interpret or post-process the metrics reported by these data structures.