GigaVUE‑FM High Availability
This section provides details about the GigaVUE‑FM High Availability (HA) feature and describes how to configure, upgrade, and troubleshoot the feature.
The GigaVUE‑FM High Availability (HA) feature supports a highly available fabric management environment with minimal interruption. The GigaVUE‑FM HA architecture consists of a minimum of three GigaVUE‑FM instances that run together as a highly available group. The highly available group provides protection from failure of any one of the members in the group.
The following figure shows the high-level architecture of the GigaVUE‑FM HA feature.
Starting in software version 5.14.00, you can add more GigaVUE-FM instances to the high availability group. Refer to the Dynamic Addition of GigaVUE-FM Instances to HA Group section for details.
To configure the GigaVUE‑FM HA feature, you must have access to at least three authenticated GigaVUE‑FM instances that reside on a trusted network. All GigaVUE‑FM instances must run the same software version. The interfaces in the GigaVUE‑FM instances must be up and must be assigned IPv4/IPv6 addresses. You can also choose to use DNS host names.
Note: You can configure the GigaVUE‑FM HA feature only if you have administrative privileges. You must have a Prime license installed to form the GigaVUE-FM HA group.
The GigaVUE‑FM instances are not required to be in the same subnet, but still must be able to communicate with each other.
In addition, ensure that the instances have unique host names. You must be able to ping a GigaVUE‑FM instance from the other instances using the hostname or the IP address.
To add a GigaVUE‑FM instance to a GigaVUE‑FM HA group:
- The GigaVUE‑FM instances in the HA group must be reachable from one another.
- If host names are used to configure the HA group, the host names of the GigaVUE‑FM instances must be resolvable through a DNS server.
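The host name prerequisites above (at least three instances, unique host names, names resolvable through DNS) can be checked before the group is created. The following is a minimal sketch, not a Gigamon tool: the function names and the example host names are hypothetical, and it does not test ping reachability, authentication, or software versions.

```python
import socket
from typing import Callable, Iterable, List


def _dns_lookup(name: str) -> None:
    # socket.getaddrinfo covers IPv4 and IPv6 and raises socket.gaierror
    # (a subclass of OSError) when the name cannot be resolved.
    socket.getaddrinfo(name, None)


def find_host_name_problems(
    host_names: Iterable[str],
    resolve: Callable[[str], object] = _dns_lookup,
) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    names = [name.strip().lower() for name in host_names]
    problems = []
    if len(names) < 3:
        problems.append("an HA group needs at least three instances")
    # Host names must be unique within the group.
    for name in sorted({n for n in names if names.count(n) > 1}):
        problems.append(f"host name is not unique: {name}")
    # Every host name must resolve through DNS.
    for name in sorted(set(names)):
        try:
            resolve(name)
        except OSError:
            problems.append(f"host name does not resolve: {name}")
    return problems
```

The resolver is passed in as a parameter so that the check can be exercised without a live DNS server; by default it uses the system resolver.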
You must install a Prime license on the active GigaVUE‑FM instance to configure a High Availability group.
If the Prime license expires or if you accidentally delete the license, the existing configurations will still be present in the GigaVUE-FMs that are part of the HA group, but you will not be able to perform any new configurations. Moreover, if you disassemble the HA group, you cannot reconfigure the HA group without installing a valid Prime license. You can re-use the same Prime license to form the HA group again.
The GigaVUE‑FM HA feature is supported on the following platforms:
- VMware vSphere
- GigaVUE‑FM Hardware Appliance
- OpenStack
- VMware NSX-T Manager
- Gigamon Containerized Broker
Keep in mind the following rules and notes when you configure the GigaVUE‑FM HA feature:
- The GigaVUE‑FM instances in the HA group must be identical in system configuration (hard disk, memory, and network interfaces) and in network settings (domain server, NTP server, and name server).
- When creating a GigaVUE-FM HA group, it is highly recommended to use DNS names. GigaVUE-FM HA cannot manage the individual GigaVUE-FM instances if their IP addresses change (for example, when the GigaVUE-FM IP addresses are dynamic).
- When configuring GigaVUE-FM HA with FQDNs, it is recommended to also configure reverse DNS.
- The GigaVUE-FMs in the high availability group can be accessed using the IPv4/IPv6 address or DNS name that was used to form the High Availability group.
- You can deploy the GigaVUE‑FM HA virtual machines on a WAN link with a maximum latency of 200 ms.
- You cannot add a GigaVUE‑FM Hardware Appliance and a GigaVUE‑FM virtual machine to the same HA group.
- VIP support is deprecated from software version 5.13.00. If a VIP was configured in an earlier GigaVUE-FM version, it continues to work after upgrading to software version 5.13.00. However, you can only delete the existing VIP; adding or updating a VIP is not allowed.
- Use the orchestrated upgrade procedure to upgrade the GigaVUE-FM instances if the software version of GigaVUE-FM is 5.10.00 or above.
- The Reload All and Reload any FM commands work only if there is an active GigaVUE-FM instance in the HA group.
- Third Party Orchestration is supported with GigaVUE-FM High Availability only in the OpenStack environment.
- If a failover occurs during a long-running operation, such as Fabric Launch, Monitoring Session Deployment, or Fabric Upgrade, the GigaVUE-FM HA cluster functions like standalone GigaVUE-FMs. In such cases, you must manually re-initiate the operation from the standalone or GigaVUE-FM HA nodes.
Until software version 5.13.00, the GigaVUE-FM HA architecture was based on a 2N+1 redundancy model (N = 1), with one active and two standby instances providing resiliency and fault tolerance. With horizontal scaling, the services and application requests are distributed to the standby GigaVUE-FMs. Because the HA group was limited to three GigaVUE-FM instances, the following use cases created a need for more compute and memory resources in the HA group:
- Longer data retention period (up to 13 months).
- Better search performance with more horizontal distribution of Elastic Search for features such as Fabric Health Analytics (FHA) and Topology Visualization.
- Better indexing performance with the number of instances collecting various statistics on the configured components.
Therefore, the number of GigaVUE-FM instances in the HA group had to be increased for scalable performance of the various features in GigaVUE-FM. Starting in software version 5.14.00, GigaVUE-FM allows you to add more than three GigaVUE-FM instances to the HA group. The hardware requirements and resource allocation process for the additional instances are the same as those for the active and standby instances. The additional instances can be configured either as supporting instances or as active-eligible instances:
Supporting instances are GigaVUE-FMs that run minimal services, thereby providing more resources to Elastic Search services. Supporting instances:
- Do not distribute application requests (all application services are turned off by default) and do not run any distributed services.
- Offer more scalability.
- Cannot become candidates for active instance election.
Active-eligible instances are GigaVUE-FMs that distribute application requests and run distributed services. Active-eligible instances:
- Add more value to a fault-tolerant and highly available system.
- Can be part of active instance election.
Refer to the following diagram for the roles in the HA group.
Note: When you add a GigaVUE-FM instance as either a supporting instance or a standby instance, the current configuration of that GigaVUE-FM instance will be removed once it is added to the existing GigaVUE-FM HA cluster.
The following table compares GigaVUE-FM HA in software version 5.13.00 and earlier with GigaVUE-FM HA in software version 5.14.00:
 | GigaVUE-FM 5.13.00 and earlier | GigaVUE-FM 5.14.00 |
---|---|---|
Number of GigaVUE-FM instances that can be configured in the HA group | Only three GigaVUE-FM instances can be added. | Minimum three. Additional GigaVUE-FM instances can be configured as supporting instances or active-eligible instances. |
Status and roles supported | | Note: GigaVUE-FM cannot support more than seven active-eligible instances in an HA group. |
Number of GigaVUE-FM instances that can be configured as active-eligible | All three GigaVUE-FM instances are active-eligible. | |
Remove GigaVUE-FM instances | You cannot remove the GigaVUE-FM instances if the HA state is "At Risk". | You can remove the supporting GigaVUE-FM instances in the "At Risk" state. |
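The sizing limits in this comparison (at least three instances in a group, no more than seven active-eligible instances) can be captured as a small planning check. This is an illustrative sketch only: the role strings and the function name are assumptions made for the example and are not GigaVUE-FM API values.

```python
from typing import List

MIN_GROUP_SIZE = 3        # minimum number of instances in an HA group
MAX_ACTIVE_ELIGIBLE = 7   # upper limit on active-eligible instances

KNOWN_ROLES = ("active-eligible", "supporting")  # hypothetical role labels


def validate_planned_roles(roles: List[str]) -> List[str]:
    """Return the rule violations for a planned list of instance roles."""
    errors = []
    unknown = sorted({role for role in roles if role not in KNOWN_ROLES})
    if unknown:
        errors.append("unknown roles: " + ", ".join(unknown))
    if len(roles) < MIN_GROUP_SIZE:
        errors.append(f"group needs at least {MIN_GROUP_SIZE} instances")
    if roles.count("active-eligible") > MAX_ACTIVE_ELIGIBLE:
        errors.append(f"at most {MAX_ACTIVE_ELIGIBLE} active-eligible instances are supported")
    return errors
```

For example, a plan with three active-eligible and four supporting instances passes, while a plan with eight active-eligible instances is rejected.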
To add instances and to configure the HA group:
1. On the left navigation pane, click and select High Availability.
2. Click Create. The High Availability wizard appears.
3. In the Group Name field, enter a unique name for the HA group, and then click Continue.
4. In the Add GigaVUE-FM Instance page, the details of the GigaVUE-FM instance from which the HA group is created are fetched automatically.
5. Enter the External IP Address and Third-Party Authentication details, if required. Click the save icon to save the configuration.
6. Click Add New Node to add the second and third GigaVUE-FM instances. Enter the following details.
Field | Description |
---|---|
Status | Status of the GigaVUE-FM instance. Can be Active or Standby. |
IP Address/DNS Name | IP address/DNS name of the GigaVUE-FM instance, used for communication between the GigaVUE-FMs and to configure the HA group. It is always recommended to use the DNS name while creating the GigaVUE-FM High Availability group. |
External IP Address | External IP address of the GigaVUE-FM instance, used for accessing the GigaVUE-FM HA group from outside the internal network. |
Role | Role of the GigaVUE-FM instance. Can be configured as either active-eligible or supporting. |
Username | GigaVUE-FM user name |
Password | GigaVUE-FM GUI password |
Third Party Authentication | Third-party authentication URL |
7. Click the Save Node icon to validate the details entered for the instances. Once the first three instances are added to the HA group, the Continue button is enabled.
Note: When creating an HA group, all the existing data on the first configured GigaVUE-FM instance is preserved, allowing the system to continue using the data after it is fully set up. However, the data on the subsequent nodes is lost during the setup.
8. Use the Add New Node button to continue adding GigaVUE-FM HA instances. When adding an instance, define its role as either:
   - Active-eligible
   - Supporting
9. Review the details of the GigaVUE-FM instances. Use the Back button to go back and change the details.
10. Click the Edit Node icon to edit the details of the GigaVUE-FM instances, if required.
After adding the instances to the HA group, click Submit. The HA group is configured and will be displayed as follows:
Click the ellipsis on a GigaVUE-FM instance to perform the following tasks:
- Edit: Edit the instance details.
- Reboot: Reboot a GigaVUE-FM instance.
- Remove from group: Remove an instance from the HA group.
The HA page displays the following details:
Field | Description |
---|---|
Status and Reachability | Status of the GigaVUE-FM instances and their reachability. |
IP Address/DNS Name | IP address/DNS name of the GigaVUE-FM instances. |
Role | Role of the GigaVUE-FM instance. |
Entity ID | Entity ID of GigaVUE-FM. It is a unique alphanumeric ID that must be configured for SSO authentication. |
Host Name | Host name of the GigaVUE-FM instance. |
Application Status | Status of the application services that are responsible for handling user requests. |
Software Version | Software version of the GigaVUE-FM instance. |
System Uptime | Time elapsed since the instance came up. |
Third Party Authentication URL | Third-party authentication URL required for SSO configuration. |
Upgrade Status | Upgrade status of GigaVUE-FM. This is displayed only when an upgrade is triggered. Otherwise, this field is empty. |
You can add instances to the HA group by clicking the Add New Node button, as required. You can also remove instances from the HA group to reduce its size. However, when removing instances, make sure that a quorum of instances remains in the group.
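The text does not define how the quorum is computed. A common convention in replicated databases is a strict majority of the voting members, and the sketch below assumes that convention applies to the active-eligible instances; treat this as an assumption rather than the documented GigaVUE-FM rule. It is consistent with the statement elsewhere in this section that a stable group needs a minimum of two active-eligible instances (a majority of three).

```python
def majority_quorum(voting_instances: int) -> int:
    """Smallest number of instances that forms a strict majority."""
    if voting_instances < 1:
        raise ValueError("the group must contain at least one instance")
    return voting_instances // 2 + 1


def tolerated_failures(voting_instances: int) -> int:
    """Number of instances that can be lost while a majority remains."""
    return voting_instances - majority_quorum(voting_instances)
```

Under this assumption, groups with 3, 5, and 7 voting instances need 2, 3, and 4 instances up respectively, and can lose 1, 2, and 3 instances.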
Note: To scale components such as Fabric Health Analytics and Topology Visualization, and to improve the performance of GigaVUE-FM by fully utilizing the available resources, it is recommended to configure three active-eligible instances and additional supporting instances.
The following are the behavioral changes observed in the HA group with active, standby and supporting instances:
Task | Scenario | Behavior |
---|---|---|
Backup and Restore Operation | All active-eligible instances are up and a few or all supporting instances are up. | The backup is taken on the active instance. During restore, all active-eligible GigaVUE-FMs are rebooted and perform the restore operation. Supporting instances do not participate in the restoration. |
 | Only some active-eligible instances are up (a quorum of instances is up). | The backup is taken on the active instance. Restore is not initiated in this case. |
Upgrade Operation | A few instances are not up (either active-eligible or supporting instances). | Upgrade is not initiated. |
 | A few supporting instances were removed from the HA group after the group was formed. | Upgrade is initiated. The supporting instances present in the HA group are upgraded first, followed by the active-eligible instances. |
 | All instances are up and running. | Upgrade is initiated on an active instance. The upgrade is done in a rolling fashion, starting with the supporting instances and ending with the active-eligible instances. |
System Configurations | All instances are active-eligible. | The GUI of every GigaVUE-FM is accessible, and you can configure the system through it. |
 | HA group with active-eligible instances and a few supporting instances. | For supporting instances, the GUI is not available (because the application service is shut down). An option is provided in the GUI to turn on the application service and its dependent services, and the status can be seen in the HA page. Once the application status of the supporting instance is up, you can configure system-related settings from the logged-in GigaVUE-FM instance. |
Background processes | | |
- FM Service Distribution | All GigaVUE-FMs are active-eligible instances. | Services are equally distributed/redistributed among the instances based on the availability of the instances. |
 | HA group with active-eligible instances and supporting instances. | Services are distributed/redistributed only to active-eligible instances. |
- Load Balancer | All GigaVUE-FMs are active-eligible instances. | All GigaVUE-FM API requests are distributed to all instances in the HA group. |
 | With supporting instances. | All GigaVUE-FM API requests are distributed only to active-eligible instances. |
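The Load Balancer rows above state that API requests go only to active-eligible instances when supporting instances are present. The sketch below illustrates that filtering step. The source does not name the distribution algorithm, so simple round-robin is assumed here, and the instance names and role labels are hypothetical.

```python
from itertools import cycle
from typing import Dict, Iterator


def request_targets(instances: Dict[str, str]) -> Iterator[str]:
    """Cycle through instance names, skipping supporting instances.

    `instances` maps an instance name to its role, either
    "active-eligible" or "supporting".
    """
    eligible = [name for name, role in instances.items() if role == "active-eligible"]
    if not eligible:
        raise ValueError("no active-eligible instance is available")
    # Round-robin is an assumption; the real selection policy may differ.
    return cycle(eligible)
```

With instances fm-a (active-eligible), fm-b (supporting), and fm-c (active-eligible), successive requests alternate between fm-a and fm-c and never reach fm-b.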
Keep in mind the following recommendations and notes when working with the HA group.
- GigaVUE-FM cannot support more than 7 active-eligible nodes (this is due to an underlying constraint in MongoDB).
- When reducing the size of an HA group that has both active-eligible and supporting instances, it is always recommended to remove the supporting instances first so that the active-eligible instances remain intact in the HA group. If active-eligible instances are removed first, the HA group may become unstable. This, however, depends on the number of active-eligible and supporting instances in the HA group. Consider a GigaVUE-FM HA group with 7 instances. The number of active-eligible and supporting instances in the HA group may be as follows:
Shrinking the size of the HA group:
HA group 1 | With 3 active-eligible instances and 4 supporting instances. | The recommendation is specifically applicable to HA group 1, in which there are only 3 active-eligible instances. Removing the active-eligible instances may cause the HA group to become unstable. |
HA group 2 | With 5 active-eligible instances and 2 supporting instances. | The recommendation is not applicable to HA group 2, because removing the active-eligible instances will not impact the stability of the HA group. |
For an HA group to be stable, it must have a minimum of two active-eligible instances in the group.
- When reducing the size of the HA group from 7 instances to 5 or 3 instances, the quorum is recomputed and adjusted accordingly. This ensures that the HA group remains intact with the available set of instances.
- When reducing the size of the HA cluster, after removing an instance, give Elastic Search time to rebalance and return to a completely (100%) healthy state before removing the next instance. If you remove the instances one immediately after the other, the health state of the HA group will turn red because there is no time for shard rebalancing among the available instances.
Log in to the GigaVUE-FM CLI as a root user and execute the following command to check the ES cluster health:
curl -XGET "http://localhost:9200/_cluster/health?pretty"
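The output of the cluster health command can be evaluated programmatically before the next instance is removed. The following sketch parses the JSON body and treats the cluster as settled only when the status is green and no shards are relocating, initializing, or unassigned. The field names are those of the standard Elasticsearch cluster health response; the function name and the exact combination of checks are assumptions for illustration.

```python
import json


def safe_to_remove_next_instance(health_json: str) -> bool:
    """Return True when the cluster health output shows a settled cluster."""
    health = json.loads(health_json)
    return (
        health.get("status") == "green"
        and health.get("relocating_shards", 0) == 0
        and health.get("initializing_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )
```

A yellow or red status, or any shard still moving, means the rebalancing has not finished and the removal of the next instance should wait.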
Supporting GigaVUE-FM Instances:
- When a supporting instance is added to the HA group, its system information is not available immediately; it becomes available only after a minute.
- When the HA group is upgraded, the application services of the supporting instances are turned on automatically for the upgrade and turned off automatically once the GigaVUE‑FM HA upgrade is completed. The application services are turned off irrespective of the upgrade completion status, that is, either success or failure.
- Because the application service is down by default on supporting instances, a scheduler populates their system information into MongoDB every minute. Hence, any changes to the system host name or GigaVUE-FM version are reflected in the GUI only after a minute.
- In suspended mode, MongoDB goes into write-protected mode. Therefore, any system-related updates performed on the supporting nodes are not reflected in the GUI.
- If a supporting instance is not reachable, the system-related information shown in the GUI (Application Status, Up Time, and so on) reflects the last known values and not the latest values.
- Whenever the application services are up and running, events are logged in the Events page (with the severity level Info).
When you log in to any of the GigaVUE-FM instances of the HA group, the dashboard page appears. Use the drop-down option in the GigaVUE‑FM GUI footer, available on specific pages, to view the details pertaining to that GigaVUE‑FM. For example, the following pages in the GigaVUE‑FM GUI have a drop-down option in the footer:
- FM Health
- IP Resolver
- NTP Server for Clock Synchronization
GigaVUE‑FM instances participating in the HA group act as Load Balancers allowing better distribution of load within the GigaVUE-FM instances. GigaVUE-FM load balancer functionality provides the following capabilities:
- Seamless access to the GigaVUE-FM Dashboard page: Accessing any GigaVUE-FM GUI always takes you to the GigaVUE-FM dashboard page after successful login. This provides a cluster view of the GigaVUE-FM GUI rather than individual views for the active and standby instances.
- Ability to access the available GigaVUE-FM GUI even during a failover: However, there will be an impact to the write operation, until a new active instance takes over. For example, when you create, update or delete any of the resources in GigaVUE-FM such as maps, GigaSMART groups, tags, etc. during failover, then the operation will fail with the following error message: Unable to Connect to Server.
- Enhanced distribution of load across the members of the HA group: This allows faster response to the HTTP GET requests.
- Ability to perform backup/restore/upgrade operation from any of the GigaVUE-FM instances.
The GigaVUE-FM Load Balancer functionality replaces the GetDistribution support provided in software version 5.12.00.
Note: You cannot disable the Load Balancer functionality.
The following behavioral changes are observed:
- After creating, updating, or deleting resources in GigaVUE-FM, the GigaVUE‑FM GUI will be updated immediately. However, if there is latency between the GigaVUE-FM instances, the GUI will not be updated immediately. The subsequent refresh will display the updated data.
- GetDistribution will be impacted if DNS resolution of the GigaVUE-FM instances fails:
  - If the DNS name of the active GigaVUE-FM instance is unresolvable, the load is distributed between the two standby instances, with an impact to the write operation.
  - If the DNS name of a standby GigaVUE-FM instance is unresolvable, the load on the active instance increases, thereby increasing the CPU and memory usage.
- If any GigaVUE-FM instance in the HA group is down and an API request is forwarded to that instance due to load balancing, the GigaVUE-FM GUI will not load any data for a few seconds. The subsequent refresh or reload will display the updated data.
- During an HA upgrade, while the standby GigaVUE-FM instances are upgrading, the active GigaVUE-FM instance is forced to handle all the HTTP GET requests. In this case, CPU and memory usage will be slightly higher than normal.
Note: Consider three GigaVUE-FM instances, FM-A, FM-B, and FM-C, and assume that you are logged in to the HA group with the IP address or DNS name of FM-A. When working with the GigaVUE-FM GUI, if FM-B goes down, the GUI will not load any response for requests forwarded to FM-B. This is because the load balancer (FM-A in this case) has not yet learned that FM-B is unreachable, which happens only during the next periodic health check.
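The behavior described in this note follows from the balancer routing on its last known view of the group. The toy model below shows why requests can still be sent to an instance that has gone down until the next health check refreshes that view. It is a conceptual sketch with hypothetical class and method names, not the GigaVUE-FM implementation, and it assumes round-robin selection.

```python
from typing import Dict, List


class HealthCheckedPool:
    """Round-robin pool that only learns about failures at health-check time."""

    def __init__(self, instances: List[str]) -> None:
        self._all = list(instances)
        self._believed_up = list(instances)  # the balancer's current view
        self._next = 0

    def pick(self) -> str:
        """Choose the next target based on the (possibly stale) view."""
        if not self._believed_up:
            raise RuntimeError("no instance is believed to be reachable")
        target = self._believed_up[self._next % len(self._believed_up)]
        self._next += 1
        return target

    def run_health_check(self, actually_up: Dict[str, bool]) -> None:
        """Refresh the view from the real reachability of each instance."""
        self._believed_up = [n for n in self._all if actually_up.get(n, False)]
        self._next = 0
```

If FM-B fails between two health checks, `pick()` keeps returning FM-B in its turn; only after `run_health_check` runs with FM-B marked as down do the requests go exclusively to FM-A and FM-C.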
Important Recommendations
- For seamless access to the HA group, configure a DNS name that resolves to all the GigaVUE-FM instances participating in the cluster.
- For a better user experience, the backup/restore operation must be carried out from an active GigaVUE-FM instance.
  - If you perform a restore operation from the active instance, you will be notified about the restore operational status before the GigaVUE-FM goes down.
  - If you perform the same restore operation from a standby GigaVUE-FM instance, the GigaVUE-FM GUI will go down without notifying users about the restore operation.
- If a GigaVUE-FM HA upgrade fails and the upgrade is completed manually, you must log in to the GigaVUE-FM CLI as root user and run the following command. To log in as root user, log in to the shell with admin credentials and type "sudo su -".
curl -XPOST "localhost:4466/fmcs/updateLoadBalancer?pretty"
Response :
{"operation":"success"}
Note: Contact Gigamon Technical Support if you do not see this response.
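When the command above is run from a script, the response body can be checked against the expected payload. This minimal sketch only inspects the `operation` field shown in the sample response; the function name is hypothetical, and any other payload shape is treated as a failure because the source documents only the success case.

```python
import json


def load_balancer_update_succeeded(response_body: str) -> bool:
    """Return True only for a JSON body of the form {"operation": "success"}."""
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        # Empty or non-JSON output is treated as a failure.
        return False
    return isinstance(payload, dict) and payload.get("operation") == "success"
```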
To remove or replace a standby GigaVUE‑FM instance in the GigaVUE‑FM HA group, follow these steps:
1. Log in to any GigaVUE‑FM instance.
2. On the left navigation pane, click and select High Availability.
3. Select the standby GigaVUE‑FM instance that you want to remove from the HA group.
4. Click the ellipsis on the GigaVUE‑FM instance widget in the High Availability page, as shown in Figure 2.
Figure 2: Disabling GigaVUE‑FM Instance
5. Select the Remove from group option. The selected standby GigaVUE‑FM instance is removed from the GigaVUE‑FM HA group.
Note: The status of the GigaVUE‑FM HA group changes from Healthy to At Risk. You will not be allowed to remove the other standby GigaVUE‑FM instance after the HA status changes to At Risk.
To completely disassemble the GigaVUE‑FM HA group:
1. Log in to the active GigaVUE‑FM instance.
Note: You cannot remove an active GigaVUE‑FM instance or disassemble the GigaVUE‑FM HA group by logging in from a standby GigaVUE‑FM instance.
2. On the left navigation pane, click and select High Availability.
3. Click the ellipsis on the GigaVUE‑FM instance widget in the High Availability page.
4. Select the Delete HA group option. The GigaVUE‑FM HA group is disassembled and each of the GigaVUE‑FM instances becomes a standalone GigaVUE‑FM instance.
Note: Executing the above steps disassembles the GigaVUE‑FM HA group completely. The database of the individual GigaVUE-FM HA group members (active, standby, and supporting instances) will be erased and reinitialized with the default database. If you had performed orchestrated configuration flows (such as Flexible Inline Arrangement or Intent Based Orchestration), you must manually remove and recreate them if you did not back up the GigaVUE‑FM database before choosing Delete HA Group.
If you cannot access the active GigaVUE-FM instance's GUI, use the following CLI command in all the GigaVUE-FM instances to disassemble the HA group:
/opt/fmcs/bin/fmcs leave force
Before disassembling the instances, take a backup of the configuration. After the GigaVUE‑FM instances are disassembled from the HA group, the configurations present in the instances will be completely removed.
The GigaVUE‑FM HA state depends on the status of the three GigaVUE‑FM instances. The following table lists the various states of the GigaVUE‑FM HA group. You can view the HA group state from Administration > High Availability in the GigaVUE‑FM GUI.
State | Number of GigaVUE‑FM Instances | Description |
---|---|---|
Healthy | Three GigaVUE‑FM instances are up and running | One GigaVUE‑FM instance is in active state. The other two instances are in standby state. |
At Risk | Two GigaVUE‑FM instances are up and running | One GigaVUE‑FM instance is in active state. Another GigaVUE‑FM instance is in standby state. The third GigaVUE‑FM instance has either not joined the HA group or has left the HA group. Note: You must recover the standby GigaVUE-FM instance within 3 days (72 hours). Failure to do so will cause the instance to move out of the High Availability group. |
Incomplete | One GigaVUE‑FM instance is up and running | Only one GigaVUE‑FM instance is in active state. The other two GigaVUE‑FM instances have either not joined the HA group or have left the HA group. |
Standalone | One GigaVUE‑FM instance is up and running | HA is not configured on the GigaVUE‑FM instance. |
Suspended | Two or more GigaVUE‑FM instances are up and running | HA is configured, but an active GigaVUE‑FM instance is yet to be elected. Note: The GigaVUE-FM high availability group can move to a Suspended state from the At Risk state when the HA group has only one active-eligible and reachable GigaVUE-FM instance. Events will not be captured for the status of the GigaVUE-FM instance that went down (which causes the HA group to move into a suspended state) as there is no active/leader instance in the High Availability group. |
If the GigaVUE-FM HA group moves from the At Risk state to the Suspended state, the reachability status of the node that went down (and caused the move to the Suspended state) is not audited, because writes are not accepted when there is no active/leader instance in the DB cluster.
4 | High Availability States |
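The state table can be read as a simple decision rule for a three-instance group. The sketch below encodes that reading. The function and parameter names are hypothetical, and the mapping covers only the basic three-instance case in the table; it does not model supporting instances or the 72-hour recovery window.

```python
def ha_group_state(ha_configured: bool, instances_up: int, active_elected: bool) -> str:
    """Map the condition of a three-instance group to the HA state name."""
    if not ha_configured:
        return "Standalone"
    if not 1 <= instances_up <= 3:
        raise ValueError("instances_up must be between 1 and 3")
    if instances_up >= 2 and not active_elected:
        # HA is configured, but no active instance has been elected yet.
        return "Suspended"
    if instances_up == 3:
        return "Healthy"
    if instances_up == 2:
        return "At Risk"
    return "Incomplete"
```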
The active GigaVUE‑FM instance in the high availability group may fail at times, in which case one of the standby instances takes over and becomes the active instance. This process is called failover.
The following table provides the reasons for failover:
Reason for Failover | Description |
---|---|
Reloading the active GigaVUE‑FM instance | An active GigaVUE‑FM instance is reloaded (using the Reboot option) to bring the HA group back to a healthy state. |
Planned downtime of the active GigaVUE‑FM instance | An active GigaVUE‑FM instance is brought down for various reasons, for example, to upgrade to a newer software version. |
The High Availability page (on the left navigation pane, click and select High Availability) displays the current state of the GigaVUE‑FM HA group. When a failover occurs, the HA group state changes in the GUI.
The following table lists the GUI changes for the various scenarios:
Scenario | Changes in GUI |
---|---|
What happens to the High Availability page immediately after a failover? | The High Availability page may not update immediately or may not show all the GigaVUE‑FM instances. Note: Refresh the page after a minute to view the new active GigaVUE‑FM instance. However, the page will get updated automatically after 5 minutes. |
What happens to an active GigaVUE‑FM instance when it fails (either by itself or if failover is triggered manually)? | |
What happens to the GigaVUE‑FM instances that were previously in standby state? | |
What happens to the embedded devices when a new active instance takes over? | There are no changes to the devices except that they are being managed by the new active GigaVUE‑FM instance. |
How do you trigger a failover? | Click the Reboot option on the current active GigaVUE‑FM instance to trigger a failover. |
Use the following table to troubleshoot issues that you might encounter while working with the HA feature.
Problem | Solution |
---|---|
Unable to add a license to the GigaVUE‑FM HA group after a failover. Reason: The MAC Address in the About page was used to generate the license. | Always use the Challenge MAC Address in the Licenses page of the active GigaVUE‑FM instance to generate licenses. Add the licenses to the HA group. |
Unable to join the GigaVUE‑FM HA group after changing the password. Reason: You did not log out of GigaVUE‑FM after changing the password and before joining the HA group. | Always log out of GigaVUE‑FM after changing the password and log in again with the new password before joining the HA group. |
Orchestrated upgrade of GigaVUE-FM instances in a High Availability group is similar to upgrading a standalone GigaVUE-FM instance. You can upgrade using an image that is located on an external image server, or you can use GigaVUE-FM as the image server. Refer to the Upgrade GigaVUE-FM section in the GigaVUE-FM Installation and Upgrade Guide for more details.
Note: Orchestrated upgrade of GigaVUE-FM instances in an HA group is supported from software version 5.10.01. For GigaVUE-FM software version 5.10.00 and above, it is recommended to use only the orchestrated upgrade procedure to upgrade the GigaVUE-FM instances in an HA group.
Prerequisites
Before upgrading the GigaVUE-FM instances in a High Availability group, ensure the following:
- The High Availability group must be in a healthy state.
- The latency between the GigaVUE-FM instances must be less than 100 ms.
- The config disk space allocated to GigaVUE-FM must have a maximum sustained transfer rate of more than 100 MB/s. A low disk rate impacts both file sync and installation.
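The three prerequisites above lend themselves to a preflight check before an upgrade is triggered. The sketch below assumes that the HA state, the worst measured latency between instances, and the sustained disk transfer rate have already been obtained by other means; the function and parameter names are hypothetical.

```python
from typing import List


def upgrade_preflight(ha_state: str, max_latency_ms: float, disk_rate_mb_s: float) -> List[str]:
    """Return the unmet upgrade prerequisites; an empty list means ready."""
    problems = []
    if ha_state.strip().lower() != "healthy":
        problems.append(f"HA group state is '{ha_state}', expected Healthy")
    if max_latency_ms >= 100:
        problems.append("latency between instances must be less than 100 ms")
    if disk_rate_mb_s <= 100:
        problems.append("sustained disk transfer rate must be above 100 MB/s")
    return problems
```

For example, a healthy group with 40 ms latency and a 150 MB/s disk passes, while an At Risk group with 120 ms latency and an 80 MB/s disk fails all three checks.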
Steps
To upgrade the GigaVUE-FM instances in a High Availability group from the GUI, click the Upgrade option from the User icon. Always trigger the upgrade from the active GigaVUE-FM instance.
Note: Do not run any scripts (for example, scripts that fetch polling statistics of maps) or queries (for example, database or stats queries) before or during the upgrade operation, because this drains GigaVUE-FM's memory and CPU resources and impacts the upgrade operation.
The following is the sequence of events that occur in the background:
- Active GigaVUE-FM Instance: Software image download process is triggered and the image is downloaded.
- Active GigaVUE-FM Instance: Syncs (copies) the downloaded image to one of the standby GigaVUE-FM instances, referred to here as the first standby GigaVUE-FM instance.
- First Standby Instance: Once the image is synced, the first standby instance is upgraded and rebooted.
- Second Standby Instance: The image is then synced to the second standby instance, which is upgraded and rebooted.
- Active GigaVUE-FM Instance: Once the standby instances are upgraded, the active GigaVUE-FM will start to upgrade and will be rebooted.
- A new active GigaVUE-FM instance will be elected while the active GigaVUE-FM reboots.
Note: As orchestrated upgrade is a background process, device management tasks will be carried out seamlessly. The overall time consumed for the upgrade process is around 60 minutes and the total management loss time during orchestrated upgrade is around 1 minute. This is the time required to elect the new active GigaVUE-FM instance when the current active GigaVUE-FM instance reboots post upgrade.
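The rolling order described in this section (supporting instances first, then standby active-eligible instances, and the current active instance last) can be expressed as a sort. This is a sketch of the ordering only, with hypothetical dictionary keys and labels; it does not perform image download, sync, or reboot.

```python
from typing import Dict, List


def upgrade_order(instances: List[Dict[str, str]]) -> List[str]:
    """Return instance names in the order they would be upgraded.

    Each instance is a dict with the keys "name", "role"
    ("supporting" or "active-eligible"), and "status"
    ("active" or "standby").
    """
    ordered = sorted(
        instances,
        # False sorts before True: supporting instances come first,
        # and the currently active instance comes last.
        key=lambda inst: (inst["role"] != "supporting", inst["status"] == "active"),
    )
    return [inst["name"] for inst in ordered]
```

Because Python's sort is stable, instances with the same role and status keep their original relative order.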
In a GigaVUE-FM high availability environment, you can perform the GigaVUE-FM and device configurations only from the active GigaVUE-FM instance. Standby instances provide minimal configuration options. In case of failover, it is important to be aware of the IP addresses of the three GigaVUE-FMs and also the DNS host names of the GigaVUE‑FM instances so that you can access the active GigaVUE‑FM instance.
To overcome this restriction and still access the GigaVUE‑FM instances, you can use one of the following options:
- Use Load Balancer
- Assign a DNS Name for the GigaVUE‑FM instances
A load balancer distributes traffic across a number of servers. Integrating a load balancer with GigaVUE‑FM forwards the traffic to the active GigaVUE‑FM instance. The load balancer performs a health check of GigaVUE‑FM and forwards the traffic destined to the external GigaVUE‑FM IP address to the GigaVUE-FM high availability group.
To use a load balancer for forwarding the traffic, do the following:
- Deploy the load balancer.
- Configure the load balancer to access the active instance in the GigaVUE-FM High Availability group. This is accomplished using the following GET API endpoint exposed by GigaVUE-FM:
https://<FM-IP>/api/v1.3/fmHa/status
Based on the returned response codes, the load balancer can be configured in such a way that healthy servers (active instance) are those that return 200 as response code when the endpoint API is queried. The remaining servers are considered as unhealthy servers (standby instances). With these configurations, the requests are forwarded to active GigaVUE-FM always.
Sample Response for GET https://<FM-IP>/api/v1.3/fmHa/status
If FM Role == active/standalone,
return 200 with Payload {"role" : "active" } or {"role" : "standalone" }
Else if FM Role == Standby
return 202 with Payload {"role" : "Standby" }
Else if FM Role == support, suspended, unknown
return 400 with Payload {"role" : "Unknown" }
Note : 5xx can be thrown if the API Gateway on FM is not available/down
- Configure the load balancer with external GigaVUE-FM IP address as the DNS IP, for example, myfm.com.
Note: If you add or remove nodes in the GigaVUE-FM HA group, you must update the corresponding IP addresses of the nodes in the load balancer.
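The response codes listed above translate directly into a health-check rule for an external load balancer. The sketch below classifies a status code returned by the fmHa/status endpoint; the category labels and function names are assumptions chosen for the example, and any code not listed in the sample response is reported as unexpected.

```python
def classify_fm_status(status_code: int) -> str:
    """Interpret the HTTP status code returned by the fmHa/status endpoint."""
    if status_code == 200:
        return "active-or-standalone"   # healthy target for traffic
    if status_code == 202:
        return "standby"
    if status_code == 400:
        return "support-suspended-or-unknown"
    if 500 <= status_code <= 599:
        return "api-gateway-unavailable"
    return "unexpected"


def should_receive_traffic(status_code: int) -> bool:
    """Only instances answering 200 are treated as healthy servers."""
    return status_code == 200
```

With this rule, the load balancer marks standby and supporting instances as unhealthy, so requests always reach the active instance.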
You can assign a DNS name to the three GigaVUE‑FM IP addresses, which helps to access the GigaVUE-FM instance in case of failures. Consider a DNS name, myfm.com that has the IP addresses of the three GigaVUE‑FM instances of the high availability group.
If you type the DNS name of the GigaVUE-FM, myfm.com, the DNS server returns the IP addresses of the three GigaVUE‑FM instances, and:
- the Dashboard page of the active GigaVUE-FM instance appears, or
- the High Availability page of the standby GigaVUE-FM instances may appear if the active instance is not reachable, as in the case of a failover. From the High Availability page of the standby instance, you can navigate to the GigaVUE‑FM active instance.
This ensures that the GigaVUE-FM High Availability group is always accessible.
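A script that uses the shared DNS name can combine both options by walking the returned addresses and selecting the instance that reports itself as active. In the sketch below, `probe` is a hypothetical caller-supplied function that queries the fmHa/status endpoint of one address and returns the HTTP status code; the addresses in the usage note are documentation examples, not real GigaVUE-FM addresses.

```python
from typing import Callable, Iterable, Optional


def first_active_address(
    addresses: Iterable[str],
    probe: Callable[[str], int],
) -> Optional[str]:
    """Return the first address whose status probe answers 200, else None."""
    for address in addresses:
        try:
            if probe(address) == 200:
                return address
        except OSError:
            # Unreachable instance: try the next address.
            continue
    return None
```

For example, if the DNS name resolves to 192.0.2.10, 192.0.2.11, and 192.0.2.12 and only 192.0.2.11 answers 200, that address is returned; if no instance answers 200 (as can happen briefly during a failover), the function returns None and the caller can retry.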