Node Health Status

The status of a node is determined by the health status of the following components:

Ports
Cards
Fan Trays
Power Modules
Memory utilization
CPU utilization

The health of a card or a fan depends on the health status of its associated components. For example, the health of a card depends on the health of its ports: if more than 50% of the ports in a card are up and the operational status of the card is also up, the card is considered healthy (green). Similarly, the health of a fan depends on the operational status of the fan tray.
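
The card rule above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not GigaVUE-FM code: the `Card` class, its field names, and the `card_is_healthy` function are hypothetical, and treating a card with no ports as not healthy is an assumption the text does not cover.

```python
from dataclasses import dataclass


@dataclass
class Card:
    # Hypothetical model of a card; field names are illustrative only.
    oper_up: bool      # operational status of the card
    ports_up: int      # number of ports currently up
    ports_total: int   # total number of ports on the card


def card_is_healthy(card: Card) -> bool:
    """Return True when the card is up and more than 50% of its ports are up."""
    # Assumption: a card that is down, or that has no ports, is not healthy.
    if not card.oper_up or card.ports_total == 0:
        return False
    # Integer comparison avoids floating-point rounding at exactly 50%.
    return 2 * card.ports_up > card.ports_total
```

With this sketch, a card that is up with 3 of 4 ports up is healthy, while a card with exactly 2 of 4 ports up is not, because the rule requires more than 50%.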

Note: GigaVUE-FM computes the health status of the node based on FanChange and PowerChange traps, and the health status of a card based on the ModuleChange trap. Both are reflected in the GigaVUE-FM GUI.

A node is considered unhealthy if:

at least 50% of the cards are down
at least 50% of the ports in a card are down
at least 1 power module is down
the average memory usage over the past hour is more than 70%
the CPU load per core is more than 50% overloaded
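
As a rough sketch, the conditions above could be checked as follows. The `CardState` and `NodeState` classes, their fields, and the `unhealthy_reasons` function are hypothetical names introduced for illustration and are not part of any GigaVUE-FM API. Two assumptions are made: any single matching condition marks the node unhealthy, and the CPU condition is read as "the load on any core is above 50%".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CardState:
    # Hypothetical per-card snapshot; names are illustrative only.
    up: bool
    ports_up: int
    ports_total: int


@dataclass
class NodeState:
    # Hypothetical node snapshot; names are illustrative only.
    cards: List[CardState]
    power_modules_down: int
    avg_memory_pct_last_hour: float
    cpu_load_pct_per_core: List[float]


def unhealthy_reasons(node: NodeState) -> List[str]:
    """Return the listed conditions that the node currently meets."""
    reasons = []
    cards_down = sum(1 for c in node.cards if not c.up)
    if node.cards and 2 * cards_down >= len(node.cards):
        reasons.append("at least 50% of the cards are down")
    # Check whether any card has at least half of its ports down.
    if any(c.ports_total and 2 * (c.ports_total - c.ports_up) >= c.ports_total
           for c in node.cards):
        reasons.append("at least 50% of the ports in a card are down")
    if node.power_modules_down >= 1:
        reasons.append("at least 1 power module is down")
    if node.avg_memory_pct_last_hour > 70:
        reasons.append("average memory usage over the past hour is more than 70%")
    # Assumption: "per core" means any single core exceeding 50% load.
    if any(load > 50 for load in node.cpu_load_pct_per_core):
        reasons.append("CPU load per core is more than 50%")
    return reasons
```

An empty list means none of the listed conditions apply; under the assumption stated above, a non-empty list means the node would be treated as unhealthy.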

The change in the health status of a node is indicated in Events.

The health of a cluster is determined by the health status of the devices associated with the cluster.

The health status of a node is indicated by the following colors:

Color    Health Status
Green    Up (connected, healthy)
Amber    Warning. This state is displayed when the operational status of the card is up and 50% of the associated ports are up.
Red      Down (disconnected), unreachable
Gray     Unknown. This state is displayed when newly added nodes are yet to be discovered by GigaVUE-FM.