Controller HA in AWS

Overview

Aviatrix Controller HA in AWS uses an Auto Scaling group and a Lambda function to monitor the Controller, launch a new Controller, and restore its configuration when the active Controller instance becomes unreachable.

When a new Controller is launched, the existing Controller is terminated and its EIP is associated with the newly launched Controller. The existing configuration is then restored, resulting in a seamless experience when failover happens.

Prerequisites

  • Existing AVX Controller. If you have not yet launched an AVX Controller, please follow this guide.
    • Aviatrix version must be >= 3.4. If older than 3.4, please upgrade.
    • Enable Controller Backup.
    • AMI aviatrix_cloud_services_gateway_043018_YYYY-xxxxxx or later. If you are on an older AMI, please refer here to migrate to the latest Controller AMI first.
  • The Controller’s VPC should have one or more public subnets, preferably in different AZs for HA across multiple AZs.
  • S3 bucket for backups

Controller HA Details

Aviatrix Controller HA operates by relying on an AWS Auto Scaling Group. This ASG has a desired capacity of 1. If the Controller EC2 instance is stopped or terminated, it will be automatically re-deployed by the ASG.

An AWS Lambda script is notified via SNS when the Auto Scaling Group launches a new instance. This script restores the Controller configuration from a recent Controller backup file. The Aviatrix Controller manages these backups once Controller Backup is enabled.

Restoring the Aviatrix Controller configuration on a newly built instance requires access to the S3 bucket to retrieve the latest backup file. To do this, the newly built EC2 Controller instance must be granted permission to read files in the bucket. The simplest method is an IAM user with programmatic access to the S3 bucket.
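As an illustration of what "retrieve the latest backup file" can look like, the following minimal Python sketch picks the most recently modified object from an S3 listing. It is not the code used by the Controller HA Lambda script; the function names, the sample keys, and the choice to rank by the LastModified timestamp are assumptions made for this example. The boto3 helper is optional and needs boto3 and valid credentials.

```python
from datetime import datetime, timezone


def latest_backup(objects):
    """Return the most recently modified entry from an S3 object listing.

    `objects` is a list of dicts shaped like the "Contents" entries returned
    by S3 ListObjectsV2 (each has "Key" and "LastModified").
    """
    if not objects:
        raise ValueError("no backup files found in the bucket")
    return max(objects, key=lambda obj: obj["LastModified"])


def list_bucket_objects(bucket_name):
    """Fetch every object entry in the bucket (requires boto3 and credentials)."""
    import boto3  # imported lazily so latest_backup() works without boto3

    s3 = boto3.client("s3")
    entries = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket_name):
        entries.extend(page.get("Contents", []))
    return entries


# Offline example with made-up keys and timestamps (hypothetical names).
sample = [
    {"Key": "backup-a.enc", "LastModified": datetime(2018, 5, 1, tzinfo=timezone.utc)},
    {"Key": "backup-b.enc", "LastModified": datetime(2018, 5, 3, tzinfo=timezone.utc)},
]
newest = latest_backup(sample)
print(newest["Key"])  # prints "backup-b.enc"
```

With real credentials, `latest_backup(list_bucket_objects("your-bucket-name"))` would apply the same selection to the actual bucket contents.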

Steps to Enable Controller HA

Create IAM User

This procedure relies on an existing IAM user that has access to the S3 bucket where your backups reside.

Tip

Aviatrix recommends creating a new user that is granted access to the backup S3 bucket only.

Be sure to select Programmatic access for Access type when creating the user. Save the Access key ID and Secret access key for later use.
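One way to scope the new user to the backup bucket is an IAM policy that allows listing the bucket and reading its objects. The Python sketch below only builds and prints such a policy document; the bucket name is a placeholder, the statement IDs are arbitrary, and the read-only action set is an assumption based on the restore requirement described above. If the same user is also used to write backups, additional actions such as s3:PutObject would likely be needed.

```python
import json


def backup_bucket_read_policy(bucket_name):
    """Build a read-only IAM policy document limited to one S3 bucket."""
    # Assumes the standard "aws" partition (not GovCloud or China regions).
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBackupBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadBackupFiles",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


# "my-controller-backups" is a hypothetical bucket name.
policy = backup_bucket_read_policy("my-controller-backups")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy for the new user.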

Launch CloudFormation Stack

  1. Log in to the AWS console and switch to the region where your existing AVX Controller is installed.

  2. Launch this CloudFormation stack

  3. Click Next

  4. Populate the fields as follows:

    Field | Expected Value
    Stack name | Any valid stack name.
    Enter VPC of existing controller instance. | Select the VPC in this region where the AVX Controller is installed.
    Enter one or more subnets in different Availability zones within that VPC. | Select the subnet where the Controller is installed and optionally one additional subnet for redundancy.
    Enter Name tag of the existing Aviatrix Controller instance. | Enter the Name tag for the existing Controller EC2 instance.
    Enter S3 Bucket which will be used to store backup files. | Name of the S3 bucket that stores the backup files from the AVX Controller.
    Enter AWS Access Key with permission to access S3 bucket. | Access key ID for the IAM user above.
    Enter AWS Secret Key with permission to access S3 bucket. | Secret access key for the IAM user above.
    Enter an email to receive notifications for autoscaling group events | Enter an email address that will be notified whenever a new Controller is provisioned.

  5. Click Next

  6. Populate any additional CloudFormation Options.

  7. Click Next

  8. Check “I acknowledge that AWS CloudFormation might create IAM resources with custom names.”

  9. Click Create

  10. Refresh the Stacks page and wait for the status of this stack to change to CREATE_COMPLETE

    Note

    If the stack fails (ending with a status of ROLLBACK_COMPLETE), check the log messages in the Events section. If you see an error that says “Failed to create resource. AMI is not latest. Cannot enable Controller HA. Please backup/restore to the latest AMI before enabling controller HA.”, then follow the steps outlined here.
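Instead of refreshing the Stacks page by hand, the wait in step 10 can be scripted. The following is a minimal Python sketch, not part of the Aviatrix tooling: `wait_for_stack` and `boto3_status_getter` are hypothetical helper names, and the poll interval and timeout are arbitrary defaults. The status source is passed in as a function so the polling logic can be tried offline; the boto3 helper needs boto3 and credentials.

```python
import time

# CloudFormation statuses that mean stack creation will not succeed.
TERMINAL_FAILURES = {"CREATE_FAILED", "ROLLBACK_COMPLETE", "ROLLBACK_FAILED"}


def wait_for_stack(get_status, poll_seconds=15, timeout_seconds=1800, sleep=time.sleep):
    """Poll get_status() until the stack reaches CREATE_COMPLETE or fails."""
    waited = 0
    while True:
        status = get_status()
        if status == "CREATE_COMPLETE":
            return status
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"stack creation failed with status {status}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"stack still in {status} after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


def boto3_status_getter(stack_name, region_name):
    """Return a function that reads the live stack status (requires boto3)."""
    import boto3  # imported lazily so wait_for_stack() works without boto3

    cfn = boto3.client("cloudformation", region_name=region_name)

    def get_status():
        return cfn.describe_stacks(StackName=stack_name)["Stacks"][0]["StackStatus"]

    return get_status


# Offline demonstration with a scripted sequence of statuses and no real sleeping.
scripted = iter(["CREATE_IN_PROGRESS", "CREATE_IN_PROGRESS", "CREATE_COMPLETE"])
result = wait_for_stack(lambda: next(scripted), sleep=lambda seconds: None)
print(result)  # prints "CREATE_COMPLETE"
```

Against a real stack, the call would look like `wait_for_stack(boto3_status_getter("your-stack-name", "your-region"))`. A RuntimeError here corresponds to the failure case described in the note above, where the Events section holds the details.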

Note

This stack creates the following:

  • An Auto Scaling group of size 1 and an associated security group
  • An SNS topic with the same name as the existing Controller instance
  • An email subscription to the SNS topic (optional)
  • A Lambda function for setting up HA and restoring configuration automatically
  • An IAM role for Lambda and a corresponding role policy with the required permissions

Tip

Additional instructions and code are available here.

Steps to Disable Controller HA

You can disable Controller HA by deleting the Controller HA CloudFormation stack.

Log in to the AWS Console, go to the CloudFormation service, identify the CloudFormation stack you used to enable Controller HA, and delete the stack.

FAQ

  • Can two Controllers in two different regions be linked so that each can detect whether the other is down?
    Our Controller HA script leverages EC2 Auto Scaling. EC2 Auto Scaling does not work across regions, but it does work across AZs. The script will automatically bring up a new Controller if the existing Controller enters an unhealthy state.
  • Could a controller in a different region be used to restore saved configuration in case of disaster recovery? Will the change in controller’s IP cause any issues?
    A Controller can be manually launched in a different region and the backed-up configuration can be restored on it. The Controller’s new EIP shouldn’t cause any issues unless SAML VPN authentication is being used (all peering tunnels will still work). In that case, the SAML VPN client will need to reach the Controller IP address. If an FQDN hostname is used for the Controller for SAML, then it should work after updating the Route 53 record to resolve to the correct EIP in the different region.
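The Route 53 update mentioned in the last answer can be sketched in Python as follows. This is an illustration only: the hostname, the IP address (taken from a documentation-only range), the TTL, and both function names are placeholders, and the record is assumed to be a plain A record in a hosted zone you control. `apply_change` needs boto3 and credentials; `controller_dns_change` runs offline.

```python
import ipaddress


def controller_dns_change(hostname, new_eip, ttl=60):
    """Build a Route 53 change batch that points hostname at new_eip."""
    ipaddress.IPv4Address(new_eip)  # raises ValueError if not a valid IPv4 address
    return {
        "Comment": "Point Controller hostname at the restored Controller EIP",
        "Changes": [
            {
                # UPSERT creates the record if missing, otherwise updates it.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": hostname,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": new_eip}],
                },
            }
        ],
    }


def apply_change(hosted_zone_id, change_batch):
    """Submit the change batch to Route 53 (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    route53 = boto3.client("route53")
    return route53.change_resource_record_sets(
        HostedZoneId=hosted_zone_id, ChangeBatch=change_batch
    )


# Hypothetical hostname and a documentation-range address.
batch = controller_dns_change("controller.example.com", "203.0.113.10")
print(batch["Changes"][0]["ResourceRecordSet"]["Name"])  # prints "controller.example.com"
```

A short TTL is used in this sketch so that clients pick up the new address quickly after a regional recovery; the right value depends on your environment.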