Cluster Health Dashboard

CAUTION

The Cluster page was introduced in version 8.0. If you are running an older version, you must upgrade to see this page and use its features.

The Cluster page is available in the SuperAdmin UI when the server is part of an HA (High Availability) cluster. It provides a real-time overview of the cluster's state and lets you perform maintenance actions.

NOTE

The Cluster menu item only appears in the navigation bar when HA is enabled. If you don't see it, your server is running in standalone mode.

Cluster Overview

The top section shows a summary of the cluster:

  • This Node: The unique ID of the current node
  • Total Nodes: How many nodes are in the cluster
  • Online Nodes: How many nodes are currently reachable (e.g. "2 / 2")
  • Last Replication Cycle: When the last full replication pull completed

A colored status badge indicates the overall cluster health:

| Badge | Meaning |
| --- | --- |
| ACTIVE (green) | All nodes are online and replicating normally |
| DEGRADED (orange) | Some nodes are offline but at least one peer is still reachable |
| STANDALONE (gray) | No peers are reachable; this node is operating alone |
| DISABLED (muted) | HA is not enabled on this server |
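
The badge rules above can be read as a small decision function. The following is a minimal sketch of that reading, not the product's actual implementation: the function name, its parameters, and the order of the checks are assumptions, and it looks only at reachability, ignoring whether replication itself is healthy.

```python
from enum import Enum


class ClusterHealth(Enum):
    """Badge names and colors as listed in the table above."""
    ACTIVE = "green"
    DEGRADED = "orange"
    STANDALONE = "gray"
    DISABLED = "muted"


def cluster_health(ha_enabled: bool, total_peers: int, online_peers: int) -> ClusterHealth:
    """Derive the overall badge from peer reachability (hypothetical helper)."""
    if not ha_enabled:
        return ClusterHealth.DISABLED      # HA is not enabled on this server
    if online_peers == 0:
        return ClusterHealth.STANDALONE    # no peer is reachable
    if online_peers < total_peers:
        return ClusterHealth.DEGRADED      # some peers offline, at least one reachable
    return ClusterHealth.ACTIVE            # every peer is reachable


print(cluster_health(ha_enabled=True, total_peers=2, online_peers=1).name)
```

Here `total_peers` and `online_peers` count the other nodes only, excluding the current node.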

Per-Node Details

Each node in the cluster is shown as a card with the following information:

  • Address: The node's IPC endpoint address
  • Version: The software version running on that node
  • Last Heartbeat: When the last heartbeat was received from that node
  • Last Sync: When the last data synchronization completed with that node
  • Clock Drift: The time difference in seconds between this node and the peer. Values over 5 seconds are highlighted in orange as a warning.
  • Status: The node's current status (ONLINE, OFFLINE, CLOCK_DRIFT, INCOMPATIBLE)

The current node is labeled with a "self" badge for easy identification.

Virtual Site Drain Status

Each node card includes a collapsible "Virtual site drain status" section. It lists every virtual site and shows, for that specific node:

| Column | Description |
| --- | --- |
| Virtual Site | The name of the virtual site |
| Sessions | The number of active client sessions for that virtual site on this node |
| Status on this node | Whether the virtual site is Running (green) or Draining (orange) on that node |

This section lets you see at a glance which sites are active and which are being gracefully wound down, without leaving the Cluster Health page.
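
As an illustration of how the three columns can be read together, the sketch below models one node's table as rows and lists the sites that are draining but still have sessions to finish. The `SiteRow` type, the helper name, and the site names are hypothetical; the product does not necessarily expose this data in this form.

```python
from typing import NamedTuple


class SiteRow(NamedTuple):
    virtual_site: str   # "Virtual Site" column
    sessions: int       # "Sessions" column
    draining: bool      # True when "Status on this node" reads Draining


def sites_still_winding_down(rows: list[SiteRow]) -> list[str]:
    """Names of draining sites that still have active sessions."""
    return [row.virtual_site for row in rows if row.draining and row.sessions > 0]


# Made-up example data for a single node card.
rows = [
    SiteRow("default", 12, False),
    SiteRow("partners", 3, True),
    SiteRow("archive", 0, True),
]
print(sites_still_winding_down(rows))
```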

Node Status Badges

| Badge | Color | Meaning |
| --- | --- | --- |
| ONLINE | Green | Node is reachable and replicating normally |
| OFFLINE | Red | Node is not responding to heartbeats |
| CLOCK_DRIFT | Orange | Node is online but has excessive clock drift (> 5s) |
| INCOMPATIBLE | Purple | Node is running an incompatible software version |
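
One way to picture how a node ends up with one of these badges is the sketch below. Only the 5-second drift threshold and the badge names and colors come from this page. The `PeerSnapshot` fields, the precedence of the checks, and the use of the absolute drift value are assumptions made for illustration, and how version compatibility is decided is not specified here.

```python
from dataclasses import dataclass

CLOCK_DRIFT_WARN_SECONDS = 5.0  # from the text: drift over 5 s is flagged

BADGE_COLOR = {
    "ONLINE": "green",
    "OFFLINE": "red",
    "CLOCK_DRIFT": "orange",
    "INCOMPATIBLE": "purple",
}


@dataclass
class PeerSnapshot:
    """Hypothetical view of one peer, loosely mirroring the node card."""
    responding: bool            # heartbeats are still arriving
    version_compatible: bool    # compatibility rule is not specified in the docs
    clock_drift_seconds: float  # time difference between this node and the peer


def node_status(peer: PeerSnapshot) -> str:
    """Pick a status badge; the order of the checks is an assumption."""
    if not peer.responding:
        return "OFFLINE"
    if not peer.version_compatible:
        return "INCOMPATIBLE"
    if abs(peer.clock_drift_seconds) > CLOCK_DRIFT_WARN_SECONDS:
        return "CLOCK_DRIFT"
    return "ONLINE"


status = node_status(PeerSnapshot(True, True, 7.5))
print(status, BADGE_COLOR[status])
```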

Actions

Drain / Undrain a Node

Every node card shows a Drain button. Draining a node stops it from accepting new incoming client connections while allowing all existing sessions to finish gracefully. This is the recommended first step before performing maintenance, updates, or a rolling restart on a single node.

  • Drain: Click the orange Drain button on any node card (self or peer).
  • Undrain: Once a node is draining, the button changes to allow resuming normal operation.

NOTE

Draining a node does not affect replication. The node continues to receive and push configuration changes to its peers while draining.

See Draining a Virtual Site for a deeper explanation of how drain mode works.
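
The drain behavior described in this section can be modeled in a few lines. This is a conceptual sketch only: the `DrainableNode` class and its methods are hypothetical and exist to show the rule that a draining node refuses new connections while existing sessions run to completion.

```python
class DrainableNode:
    """Toy model of a node that can be drained and undrained."""

    def __init__(self) -> None:
        self.draining = False
        self.sessions: set[str] = set()

    def drain(self) -> None:
        self.draining = True

    def undrain(self) -> None:
        self.draining = False

    def accept(self, session_id: str) -> bool:
        """Accept a new client connection unless the node is draining."""
        if self.draining:
            return False
        self.sessions.add(session_id)
        return True

    def close(self, session_id: str) -> None:
        """An existing session finishes on its own; draining never forces it."""
        self.sessions.discard(session_id)

    def is_idle(self) -> bool:
        """True once the node is draining and every session has ended."""
        return self.draining and not self.sessions


node = DrainableNode()
node.accept("client-1")
node.drain()
print(node.accept("client-2"), node.is_idle())  # new connection refused, not idle yet
node.close("client-1")
print(node.is_idle())
```

In this model, `is_idle()` returning true corresponds to the point where a drained node has no remaining sessions and maintenance will not interrupt any client.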

Force Re-Sync

If replication has fallen out of sync (e.g. after a network partition or extended downtime), you can force a full re-sync from a specific peer node. This resets the replication cursor and pulls all data from that peer.

Click Force Re-Sync From This Node on the card of the peer you want to sync from. A confirmation dialog will appear before the operation begins.

NOTE

The re-sync does not start instantly. It is scheduled for the next replication pull cycle, so the wait between clicking the button and the sync running is at most one pull interval (typically a few seconds).

WARNING

A full re-sync can be resource-intensive on large configurations. Only use it when normal incremental replication is insufficient.
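
To make the "reset the cursor, then wait for the next pull" behavior concrete, here is a minimal sketch of a cursor-based puller. It is an assumption-laden illustration: the `Replicator` class, an integer cursor indexing into a peer's change log, and the method names are all hypothetical and are not taken from the product.

```python
class Replicator:
    """Toy pull-based replicator with one cursor per peer."""

    def __init__(self) -> None:
        self.cursors: dict[str, int] = {}       # next log position to pull, per peer
        self.resync_requested: set[str] = set()

    def request_full_resync(self, peer_id: str) -> None:
        """Only records the request; nothing is pulled until the next cycle."""
        self.resync_requested.add(peer_id)

    def pull_cycle(self, peer_id: str, peer_log: list[str]) -> list[str]:
        """Run one pull: incremental normally, full if a re-sync is pending."""
        if peer_id in self.resync_requested:
            self.cursors[peer_id] = 0           # reset the replication cursor
            self.resync_requested.discard(peer_id)
        start = self.cursors.get(peer_id, 0)
        pulled = peer_log[start:]
        self.cursors[peer_id] = len(peer_log)
        return pulled


log = ["change-1", "change-2", "change-3"]
replicator = Replicator()
replicator.pull_cycle("peer-a", log)            # initial pull fetches everything
print(replicator.pull_cycle("peer-a", log))     # incremental pull: nothing new
replicator.request_full_resync("peer-a")
print(len(replicator.pull_cycle("peer-a", log)))  # full pull on the next cycle
```

The sketch also shows why a full re-sync costs more than an incremental pull: after the cursor reset, the whole log is transferred again rather than only the entries added since the last cycle.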

Remove Node

The Remove Node button on a peer's card removes that peer from the cluster permanently. This operation:

  • Removes the peer from the cluster's node list on this node
  • Stops all replication to and from that peer
  • Does not touch the removed peer's local data or its own configuration

CAUTION

If the peer being removed is the last remaining peer, the current node automatically transitions to standalone mode. HA is disabled and the node will no longer replicate. To re-join a cluster, you must go through the Setup UI join process again.

NOTE

You cannot remove the current node (self) using this button. To remove the current node from the cluster, use the Leave Cluster button in the Actions section below.
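
The rules in this section (the peer is dropped from the local node list, removing the last peer switches the node to standalone mode, and self cannot be removed) can be summarized in a short sketch. The `ClusterView` class and its attribute names are hypothetical and only model the local node's view of the cluster.

```python
class ClusterView:
    """Toy model of the current node's view of its peers."""

    def __init__(self, self_id: str, peers: list[str]) -> None:
        self.self_id = self_id
        self.peers = set(peers)
        self.ha_enabled = bool(self.peers)

    def remove_peer(self, node_id: str) -> None:
        """Remove a peer from this node's list; the peer's own data is untouched."""
        if node_id == self.self_id:
            raise ValueError("the current node cannot be removed; leave the cluster instead")
        self.peers.discard(node_id)
        if not self.peers:
            # Last peer gone: this node falls back to standalone mode.
            self.ha_enabled = False


view = ClusterView("node-1", ["node-2", "node-3"])
view.remove_peer("node-2")
print(view.ha_enabled)   # one peer remains, HA stays on
view.remove_peer("node-3")
print(view.ha_enabled)   # no peers left, node is standalone
```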

Leave Cluster

The Leave Cluster button removes the current node from the HA cluster. After leaving:

  • This node becomes a standalone server
  • It keeps a snapshot of the data it had at the time of departure
  • It stops sending and receiving replication updates
  • Other nodes remove it from their peer list

CAUTION

Leaving a cluster cannot be undone from the SuperAdmin UI. To rejoin, you must go through the Setup UI join process again.