Hydra Architecture - Multi-Core Cluster

Server name: Hydra.ucdenver.pvt


The multi-core cluster consists of the following primary components:

Master Node

The master node manages all computing resources and operations of the Hydra cluster and corresponds to node -1 in the cluster. It is also the machine that users log into to create, edit, and compile programs and to submit them for execution on one or more of the compute nodes. Users do not run their programs on the master node. Repeat: user programs MUST NOT be run on the master. Instead, they are submitted to the compute nodes for execution. The master node's hardware is summarized in the system specification table below.
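
To make this workflow concrete, here is a minimal sketch of the kind of program a user might create and compile on the master node and then submit for execution on the compute nodes. The mpicc and mpirun commands are assumptions (an MPI toolchain is typical for a Scyld ClusterWare system but is not confirmed by this page); consult the cluster documentation for the actual compile and job-submission commands.

/* hello_mpi.c - minimal MPI sketch; mpicc/mpirun are assumed to exist.
 *
 * Compile on the master node:  mpicc -O2 -o hello_mpi hello_mpi.c
 * Submit to the compute nodes: mpirun -np 24 ./hello_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                   /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);     /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);     /* total process count */
    MPI_Get_processor_name(name, &name_len);  /* compute node we landed on */

    printf("Rank %d of %d running on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}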

Compute Nodes

Compute nodes are the nodes that execute the jobs submitted by users. From the master node, users submit programs for execution on one or more compute nodes. There are 16 compute nodes (node 0 to node 15) on Hydra.
** Nodes 4 and 14 are currently unavailable.

The Processor Architecture of One Compute Node on Hydra

There are two six-core processors in each compute node on Hydra.
•    Number of cores: 12
•    L1 Instruction Cache: 64 KB per core
•    L1 Data Cache: 64 KB per core
•    L2 Cache: 512 KB per core
•    L3 Cache: 6 MB per processor
•    Memory: 24 GB
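
With two six-core processors per node, a shared-memory program can use all 12 cores of a single compute node. The sketch below is a hypothetical OpenMP example (not taken from this page) that reports the core count and runs one thread per core; it would be compiled with an OpenMP-capable compiler, e.g. gcc -fopenmp.

/* cores_omp.c - OpenMP sketch; expects 12 cores on a Hydra compute
 * node (2 x 6-core Opteron 2427). Compile: gcc -fopenmp cores_omp.c */
#include <omp.h>
#include <stdio.h>

int main(void)
{
    printf("Cores reported by the OS: %d\n", omp_get_num_procs());

    /* By default OpenMP starts one thread per available core. */
    #pragma omp parallel
    printf("Hello from thread %d of %d\n",
           omp_get_thread_num(), omp_get_num_threads());

    return 0;
}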

Multicore Cluster - System Specification

Host Name: hydra.ucdenver.pvt
Operating System: CentOS release 6.7 (Final)
Cluster Software: Scyld ClusterWare release 6.7.6
Number of Nodes: 16 compute nodes plus 1 master node
Total CPU Cores: 204 (12 cores on the master node plus 192 cores across the 16 compute nodes)
Number of GPUs: 8 Tesla Fermi GPUs
Total GPU CUDA Cores: 3584 CUDA cores (8 x 448), 1.15 GHz per core
Total Max GFLOPS of CPUs: 480 (192 compute-node cores x 2.5 GFLOPS per core)
Total Disk Space: 7566 GB
Total RAM: 544 GB
Total RAM of GPUs: 27 GB
Processors per Node: 2 x 6-core processors (Node 0 - Node 15)
Cores per Node: 12 cores (Node 0 - Node 15)
Processor Type: AMD Opteron 2427 (Node 0 - Node 15)
Processor Speed: 2.2 GHz
L1 Instruction Cache per Processor: 6 x 64 KB
L1 Data Cache per Processor: 6 x 64 KB
L2 Cache per Processor: 6 x 512 KB
L3 Cache per Processor: 6 MB
64-bit Support: yes
RAM on Master Node: 32 GB
Disk Space on Master Node: 238 GB (RAID1) plus 2,861 GB (RAID5)

NVIDIA Tesla Fermi S2050 GPU servers

Tesla S2050 GPU Features

CUDA Driver Version / Runtime Version: 6.5 / 6.5
CUDA Capability Major/Minor version number: 2.0
Total amount of global memory: 2687 MBytes (2817982464 bytes)
Multiprocessors x CUDA Cores/MP: (14) Multiprocessors x (32) CUDA Cores/MP = 448 CUDA Cores
GPU Clock rate: 1147 MHz (1.15 GHz)
Memory Clock rate: 1546 MHz
Memory Bus Width: 384-bit
L2 Cache Size: 786432 bytes
Maximum Texture Dimension Size (x,y,z): 1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid (x,y,z): (65535, 65535, 65535)
Integrated GPU sharing Host Memory: No
Compute Mode: Default (multiple host threads can use ::cudaSetDevice() with this device simultaneously)
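
The table above follows the output format of the CUDA deviceQuery sample. Below is a minimal sketch that reads the same properties through the CUDA runtime API; it assumes the CUDA 6.5 toolkit listed above and compilation with nvcc (the file name and printed fields are illustrative).

/* gpu_info.cu - print selected Tesla S2050 properties via the CUDA
 * runtime API. Sketch only; compile with: nvcc -o gpu_info gpu_info.cu */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        fprintf(stderr, "no CUDA devices visible on this node\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        struct cudaDeviceProp p;
        cudaGetDeviceProperties(&p, dev);
        printf("Device %d: %s\n", dev, p.name);
        printf("  Compute capability:      %d.%d\n", p.major, p.minor);
        printf("  Global memory:           %lu bytes\n",
               (unsigned long)p.totalGlobalMem);
        printf("  Multiprocessors:         %d\n", p.multiProcessorCount);
        printf("  GPU clock rate:          %d kHz\n", p.clockRate);
        printf("  Warp size:               %d\n", p.warpSize);
        printf("  Max threads per block:   %d\n", p.maxThreadsPerBlock);
        printf("  Shared memory per block: %lu bytes\n",
               (unsigned long)p.sharedMemPerBlock);
    }
    return 0;
}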