Here is a snapshot of SDSC Trestles recorded at regular intervals using my nodeview program.

System Overview of Trestles

Total Nodes 313
Total Cores 10000
Total Jobs 261
Total Ranks 6711
Total Load 6522.5
Total SUs Running 372214
Total SUs Queued 212559
Metric              Current   Max      %
Node Availability   309       324      95.4%
CPU Utilization     6522.5    10000    65.2%
Core Utilization    6711      10000    67.1%
Slot Utilization    7771      10352    75.1%
Avail Slot Util     7771      9872     78.7%
Mem Utilization     1.9TB     19.9TB   9.4%
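
The percentage column is the current value divided by the maximum. As a minimal sketch, the rows above can be reproduced like this; the function name `percent_of_max` is illustrative and is not part of nodeview. The memory row is left out because the displayed TB figures are already rounded.

```python
def percent_of_max(current, maximum):
    """Return current as a percentage of maximum, rounded to one decimal."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return round(100.0 * current / maximum, 1)


# (current, max) pairs copied from the overview table
rows = {
    "Node Availability": (309, 324),
    "CPU Utilization": (6522.5, 10000),
    "Core Utilization": (6711, 10000),
    "Slot Utilization": (7771, 10352),
    "Avail Slot Util": (7771, 9872),
}

for name, (current, maximum) in rows.items():
    print(f"{name:<18} {percent_of_max(current, maximum):5.1f}%")
```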

Visual Node Status of Trestles

System Utilization

Current as of Thursday, July 17, 2014 at 7:00 AM

Shades of blue indicate the node's CPU load (darker = higher). Red nodes are down or offline, and yellow nodes are overloaded (load is significantly higher than the number of available CPUs).

Availability and Utilization over Time

The top figure below shows the utilization and availability of various resources. The bottom figure shows the system capacity that is running and waiting in the queue. This capacity is measured in CPU core-hours (SUs) and is calculated from the requested time of every running and queued job. It is generated using a few R scripts located in my GitHub repository.
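
As a rough sketch of the capacity calculation described above, the SUs for a set of jobs can be taken as cores multiplied by requested hours, summed per state. This is an assumption about the details, not the actual R scripts: the `Job` fields, the state labels, and the sample jobs are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    cores: int              # cores allocated (or requested, if still queued)
    requested_hours: float  # requested wall-clock limit in hours
    state: str              # "running" or "queued" (hypothetical labels)


def capacity_sus(jobs, state):
    """Sum cores * requested hours over all jobs in the given state."""
    return sum(j.cores * j.requested_hours for j in jobs if j.state == state)


# Made-up jobs for illustration; not taken from Trestles
jobs = [
    Job(cores=32, requested_hours=48.0, state="running"),
    Job(cores=64, requested_hours=12.0, state="running"),
    Job(cores=128, requested_hours=24.0, state="queued"),
]

running = capacity_sus(jobs, "running")  # 32*48 + 64*12 = 2304
queued = capacity_sus(jobs, "queued")    # 128*24 = 3072
print(f"SUs running: {running:.0f}, SUs queued: {queued:.0f}")
```

Using the requested time rather than the elapsed time means the numbers describe committed capacity, so a job that finishes early will have been counted for more SUs than it actually consumed.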

Current as of Thursday, July 17, 2014 at 7:00 AM

Current as of Tuesday, July 15, 2014 at 9:00 AM

Known Events

The following events highlight abnormal features in the above availability, utilization, and queue health data.

Date            Event
June 18, 2015   cipres reservation released
June 24, 2015   networking issue (due to Arista upgrade?)

Current Utilization Breakdown

Capacity Running

Capacity Waiting

Node Utilization

Core Utilization

Current as of Thursday, July 17, 2014 at 7:00 AM