Dozer Architecture - 64-core machine

Server name: dozer.ucdenver.pvt

Dozer is a 64-core machine with one NVIDIA Tesla K40c GPU.

System Specification

Processor Model: AMD Opteron(TM) Processor 6274
Operating System: Linux - CentOS release 6.7
CPU MHz: 1400.000
TLB size: 1536 4K pages
Address sizes: 48 bits physical, 48 bits virtual
Memory: 128 GB RAM
Storage: 1,862 GB SATA
CPU op-mode(s): 32-bit, 64-bit
CPU(s): 64
Thread(s) per core: 2
L1d cache: 16 x 16 KB
L1i cache: 8 x 64 KB
L2 cache: 8 x 2 MB
L3 cache: 2 x 8 MB
NUMA node0 CPU(s): 0-7
NUMA node1 CPU(s): 8-15
NUMA node2 CPU(s): 16-23
NUMA node3 CPU(s): 24-31
NUMA node4 CPU(s): 32-39
NUMA node5 CPU(s): 40-47
NUMA node6 CPU(s): 48-55
NUMA node7 CPU(s): 56-63
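
The NUMA layout above assigns each of the 64 logical CPUs to one of eight nodes. As a minimal sketch, the mapping can be encoded and queried in Python, for example to decide which CPUs a job should be pinned to. The names `NUMA_CPU_RANGES`, `expand_range` and `node_of_cpu` are illustrative and do not refer to any tool installed on Dozer.

```python
# NUMA node -> CPU range, copied from the specification above.
NUMA_CPU_RANGES = {
    0: "0-7",
    1: "8-15",
    2: "16-23",
    3: "24-31",
    4: "32-39",
    5: "40-47",
    6: "48-55",
    7: "56-63",
}


def expand_range(cpu_range):
    """Expand an inclusive range such as '8-15' into a list of CPU ids."""
    first, last = (int(part) for part in cpu_range.split("-"))
    return list(range(first, last + 1))


# NUMA node -> explicit list of CPU ids.
NUMA_CPUS = {node: expand_range(r) for node, r in NUMA_CPU_RANGES.items()}


def node_of_cpu(cpu):
    """Return the NUMA node that owns the given logical CPU id."""
    for node, cpus in NUMA_CPUS.items():
        if cpu in cpus:
            return node
    raise ValueError("CPU %d is not listed in the NUMA table" % cpu)


if __name__ == "__main__":
    total = sum(len(cpus) for cpus in NUMA_CPUS.values())
    print("%d nodes, %d CPUs" % (len(NUMA_CPUS), total))
    print("CPU 13 is on node", node_of_cpu(13))
```

On Linux with Python 3.3 or later, a call such as `os.sched_setaffinity(0, NUMA_CPUS[1])` should restrict the current process to the CPUs of node 1. Whether that helps depends on where the process's memory is allocated, so treat it as a starting point rather than a tuning recipe.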

NVIDIA Tesla K40c GPU Specification

CUDA Driver Version / Runtime Version: 7.5 / 7.5
CUDA Capability Major/Minor version number: 3.5
(15) Multiprocessors, (192) CUDA Cores/MP: 2880 CUDA Cores
GPU Max Clock rate: 3004 MHz
Memory Clock rate: 3004 MHz
Memory Bus Width: 384-bit
L2 Cache Size: 1572864 bytes
Maximum Texture Dimension Size (x,y,z): 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Compute Mode: Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
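
Several useful quantities follow from the figures listed above by simple arithmetic. The Python sketch below derives the total core count, the number of resident warps per multiprocessor and a theoretical peak memory bandwidth, and checks a kernel launch configuration against the listed limits. All function and constant names are illustrative. The bandwidth formula assumes the reported memory clock is doubled for double-data-rate transfers, which is the usual convention but is not stated in the listing. The launch check covers only dimensions and thread counts, not shared memory or register usage.

```python
# Values copied from the GPU specification above.
SM_COUNT = 15
CORES_PER_SM = 192
WARP_SIZE = 32
MAX_THREADS_PER_SM = 2048
MAX_THREADS_PER_BLOCK = 1024
MAX_BLOCK_DIM = (1024, 1024, 64)
MAX_GRID_DIM = (2147483647, 65535, 65535)
MEMORY_CLOCK_MHZ = 3004
MEMORY_BUS_WIDTH_BITS = 384


def total_cuda_cores():
    """Multiprocessors times cores per multiprocessor."""
    return SM_COUNT * CORES_PER_SM


def max_resident_warps_per_sm():
    """Maximum resident threads per multiprocessor, expressed in warps."""
    return MAX_THREADS_PER_SM // WARP_SIZE


def peak_memory_bandwidth_gb_s():
    """Theoretical peak bandwidth in GB/s.

    Assumption: two transfers per memory clock cycle (double data rate).
    """
    bytes_per_transfer = MEMORY_BUS_WIDTH_BITS / 8
    return 2 * MEMORY_CLOCK_MHZ * 1e6 * bytes_per_transfer / 1e9


def launch_is_valid(block, grid):
    """Check block and grid dimensions against the listed limits."""
    block_ok = all(1 <= b <= limit for b, limit in zip(block, MAX_BLOCK_DIM))
    grid_ok = all(1 <= g <= limit for g, limit in zip(grid, MAX_GRID_DIM))
    threads = block[0] * block[1] * block[2]
    return block_ok and grid_ok and threads <= MAX_THREADS_PER_BLOCK


if __name__ == "__main__":
    print("CUDA cores:", total_cuda_cores())
    print("Resident warps per multiprocessor:", max_resident_warps_per_sm())
    print("Peak memory bandwidth: %.1f GB/s" % peak_memory_bandwidth_gb_s())
    print("32x32x1 block valid:", launch_is_valid((32, 32, 1), (1024, 1, 1)))
    print("1024x1024x64 block valid:",
          launch_is_valid((1024, 1024, 64), (1, 1, 1)))
```

The last check illustrates a common point of confusion: the per-dimension maxima (1024, 1024, 64) cannot all be used at once, because the product of the three block dimensions is still capped at 1024 threads per block.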



References:

[1] M. Butler, L. Barnes, D. D. Sarma and B. Gelinas, "Bulldozer: An Approach to Multithreaded Compute Performance," IEEE Micro, vol. 31, no. 2, pp. 6-15, March-April 2011. doi: 10.1109/MM.2011.23.

[2] AMD64 Technology, AMD64 Architecture Programmer’s Manual, Volume 3: General-Purpose and System Instructions.