When you have redundant systems in place, you can get detailed status information by going to the Gateway webpage and selecting Status > Redundancy to view the system's status and events.
Redundancy Status Page Overview
The Redundancy Status page was overhauled in 8.1.21. This section breaks down the Redundancy Status page and describes each feature.
Metrics and Information
The top fourth of the Redundancy Status page gives information about the Gateway and your redundancy setup, including:
- Redundancy configuration settings
- The current Gateway's role in the Redundant pair
- If the Gateway has a peer connected
- The current node's uptime after failover
- Current redundancy settings
- Force re-sync and failover options
Redundancy Providers Statistics
The next section of the Redundancy Status page details metrics about applicable redundancy providers.
The data presented includes:
- The name of the provider
- When the provider was last pulled
- How long the Gateway most recently took to get data from the provider
- How long the Gateway most recently took to apply the data received from the provider
- Whether a full sync is needed
- Whether a system restart is needed
System Event Information
The third section of the Redundancy Status page displays a table that logs a system event whenever a full sync is required, which helps establish a timeline of when full syncs were requested.
Information displayed in this table includes:
- The severity of the system event
- When the system event occurred
- The reason for the system event
The final section of the Redundancy Status page shows logger activity and allows users to enable DEBUG and TRACE logs for a specific redundancy provider.
Features of the log activity table include:
- Minimum logging level. Options are:
- An option to merge logs to the main diagnostic log viewer
- The specified logger
- The log's timestamp
- The issue being logged
The master and backup nodes communicate over TCP/IP, so they must be able to reach each other over the network, through any firewalls that might be in place. All communication goes from the backup to the master node over the Gateway Network (default port 8088 without SSL, port 8060 with SSL). That port must therefore accept incoming TCP connections on the master machine.
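A quick way to confirm that firewalls are not blocking the connection is to attempt a plain TCP connection from the backup machine to the master's port. The following is a minimal sketch, not part of Ignition: the helper name `can_reach` and the host name in the usage comment are hypothetical, and a successful connection only shows that the port is reachable, not that redundancy is configured correctly.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage, run from the backup machine (host name is a placeholder):
#   can_reach("master.example.local", 8060)   # Gateway Network port with SSL
#   can_reach("master.example.local", 8088)   # port without SSL
```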
The master node maintains the official version of the system configuration. You must make all changes to the system on the master Gateway, because the backup Gateway does not allow you to edit properties. Similarly, the Designer only connects to the master node.
When changes are made on the master, they are queued up to be sent to the backup node. When the backup connects, it retrieves these updates, or downloads a full system backup if it is too far out of date.
If the master node has modules that aren't present on the backup, they are sent across. Both types of backup transfers, data only and full, will trigger the Gateway to perform a soft reboot.
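The choice between sending queued updates and sending a full system backup can be pictured with a bounded change queue. This is an illustrative sketch only: the class `MasterChangeLog`, its sequence numbers, and the `max_queued` limit are assumptions made for the example and do not describe Ignition's internal implementation.

```python
from collections import deque


class MasterChangeLog:
    """Toy model of a master that queues configuration changes for a backup."""

    def __init__(self, max_queued: int = 3):
        self.max_queued = max_queued
        self._changes = deque()  # (sequence number, change description)
        self._next_seq = 1

    def record(self, change: str) -> None:
        """Queue a change, discarding the oldest entry once the queue is full."""
        self._changes.append((self._next_seq, change))
        self._next_seq += 1
        while len(self._changes) > self.max_queued:
            self._changes.popleft()

    def updates_for(self, last_seen_seq: int):
        """Return ("incremental", changes) or ("full", None) for a backup."""
        oldest_needed = last_seen_seq + 1
        if self._changes and self._changes[0][0] > oldest_needed:
            # The backup is too far out of date: changes it needs were dropped.
            return ("full", None)
        pending = [c for seq, c in self._changes if seq > last_seen_seq]
        return ("incremental", pending)
```

In this model, a backup that last saw change 2 of 4 receives only the two newer changes, while a backup that never connected falls back to the full transfer because the earliest change is no longer queued.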
Runtime State Synchronization
Information that is only relevant to the running state, such as current alarm states, is shared between nodes on a differential basis so that the backup can take over with the same state that the master had.
On first connection or if the backup node falls too far out of sync, a full state transfer is performed. This information is light-weight and does not trigger a Gateway restart.
After the Master Gateway and Backup Gateway reestablish a redundancy connection, the Backup Gateway checks whether it has any data that conflicts with the Master Gateway's data. If it does, the Backup Gateway drops those instances in favor of the Master Gateway's data.
Once connected, the nodes begin monitoring each other for liveness and configuration changes. While the master is up, the backup runs according to the standby activity level in the settings.
When the backup cannot contact the master for the specified amount of time, the master is considered down and the backup assumes responsibility. When the master becomes available again, responsibility is dictated by the recovery mode: the master either takes over immediately or waits for user interaction.
Historical data presents a unique challenge when working with redundancy because it is never possible for the backup node to know whether the master is truly down or simply unreachable. If the master is running but unreachable due to a network failure, the backup node becomes active and begins to log history at the same time as the master, which is still active.
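The failover and recovery behavior described above can be sketched as a small state machine. This is a simplified illustration, not Ignition code: the class `BackupMonitor`, the mode strings `"automatic"` and `"manual"`, and the method names are assumptions chosen for the example, and timestamps are passed in explicitly so the logic is easy to follow.

```python
class BackupMonitor:
    """Toy model of a backup node deciding when to take over from the master."""

    def __init__(self, failover_timeout_s: float, recovery_mode: str = "automatic"):
        if recovery_mode not in ("automatic", "manual"):
            raise ValueError("recovery_mode must be 'automatic' or 'manual'")
        self.failover_timeout_s = failover_timeout_s
        self.recovery_mode = recovery_mode
        self.last_contact = 0.0
        self.active = False  # True while the backup holds responsibility

    def on_master_contact(self, now: float) -> None:
        """Record that the master was reached at time `now`."""
        self.last_contact = now
        if self.active and self.recovery_mode == "automatic":
            # Master takes over immediately once it is available again.
            self.active = False

    def tick(self, now: float) -> None:
        """Periodic check: take over if the master has been silent too long."""
        if not self.active and now - self.last_contact >= self.failover_timeout_s:
            self.active = True

    def operator_handback(self) -> None:
        """User interaction that returns responsibility to the master."""
        self.active = False
```

With a 10 second timeout, a monitor that last heard from the master at time 0 becomes active at time 10. In the manual mode of this sketch, it stays active after the master returns until `operator_handback()` is called.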
In some cases this is OK because the immediate availability of the data is more important than the fact that duplicate entries are logged. But in other cases, it's desirable to avoid duplicates, even at the cost of not having the data available until information about the master state is available.
Ignition redundancy provides for both of these cases with the backup history level, which can be either Partial or Full.
- In Full mode, the backup node logs data directly to the database.
- In Partial mode, however, all historical data is cached until a connection is reestablished with the master. At that time, the backup and master communicate about the uptime of the master, and only the data that was collected while the master was truly down is forwarded to the database.
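The filtering step in Partial mode can be illustrated with a short sketch. It assumes, purely for the example, that cached records are `(timestamp, value)` pairs and that the master reports its downtime as a list of `(start, end)` intervals; the function name `records_to_forward` is hypothetical and the real exchange between the nodes is not documented here.

```python
from typing import Any, List, Tuple

Record = Tuple[float, Any]        # (timestamp, value) cached by the backup
Interval = Tuple[float, float]    # (start, end) of a period when the master was down


def records_to_forward(cached: List[Record], master_down: List[Interval]) -> List[Record]:
    """Keep only records collected while the master was truly down."""
    return [
        (ts, value)
        for ts, value in cached
        if any(start <= ts < end for start, end in master_down)
    ]


cached = [(100.0, "a"), (200.0, "b"), (300.0, "c")]
# The master reports it was down only between t=150 and t=250.
print(records_to_forward(cached, [(150.0, 250.0)]))  # [(200.0, 'b')]
```

Records outside the reported downtime are discarded in this sketch because the master was logging during those periods, so forwarding them would create duplicates.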
Failover to the other redundant node is now allowed if the nodes have different platform versions, which will allow attached clients to remain connected to at least one node during a redundant pair upgrade.
All Vision clients connect to the active node. When the active node fails and is no longer available, they automatically re-target to the other node. The reconnection and session establishment procedures are handled automatically, but the user is notified that they have been transferred to a different node so that they can notify the system administrator that the system may need attention.
Like Vision clients, Perspective sessions connect to the active node. When connection to the active node is lost, or the activity level of the Gateway changes from active, the session will simultaneously attempt to:
- Re-establish the connection to the Gateway it was connected to, and check to make sure its activity level is active.
- Monitor the backup Gateway. If the backup Gateway becomes reachable and active before the connection to the active Gateway can be re-established, the Perspective session navigates in the browser to the same project and page on the backup Gateway.
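The two simultaneous attempts above amount to a race between two polling loops, where the first Gateway found to be reachable and active wins. The sketch below models that idea with `asyncio`. It is not how Perspective is implemented: the function `find_active_gateway`, the probe callables, and the polling interval are all assumptions for illustration, and each probe stands in for "connect to this Gateway and check that its activity level is active".

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

Probe = Callable[[], Awaitable[bool]]  # returns True when the gateway is reachable and active


async def _poll_until_active(name: str, probe: Probe, interval_s: float) -> str:
    """Poll one gateway until its probe reports it as active."""
    while True:
        if await probe():
            return name
        await asyncio.sleep(interval_s)


async def find_active_gateway(
    probes: Dict[str, Probe], interval_s: float = 1.0, timeout_s: float = 30.0
) -> Optional[str]:
    """Poll all gateways concurrently; return the first one found active, or None on timeout."""
    tasks = [
        asyncio.create_task(_poll_until_active(name, probe, interval_s))
        for name, probe in probes.items()
    ]
    try:
        done, _pending = await asyncio.wait(
            tasks, timeout=timeout_s, return_when=asyncio.FIRST_COMPLETED
        )
        return next(iter(done)).result() if done else None
    finally:
        # Stop the losing poll loops and wait for them to finish cancelling.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
```

A caller would pass one probe for the originally connected Gateway and one for the backup, then redirect the session to whichever name is returned.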