Skip to main content

Overview

This guide provides comprehensive instructions for upgrading your cloud networking infrastructure, including gateways, controllers, and associated services. Follow these procedures to ensure smooth upgrades with minimal downtime and reduced risk.

Upgrade Planning

Pre-Upgrade Assessment

System Health Check

# Check system health status
copilot health check --component all

# Verify gateway status
copilot gateway status --all

# Check connectivity
copilot connectivity test --all-sites

# Review resource utilization
copilot metrics summary --timeframe 7d

Capacity Planning

  • Review current resource utilization trends
  • Assess impact of new features on resource consumption
  • Plan for potential performance improvements or changes
  • Ensure adequate capacity for upgrade process

Dependencies and Requirements

  • Cloud provider API compatibility
  • Network connectivity requirements
  • Certificate and authentication dependencies
  • Integration with monitoring and security tools

Upgrade Strategy

Maintenance Windows

Business Hours Considerations
  • Identify lowest-impact time windows
  • Coordinate with business stakeholders
  • Plan for different time zones if globally distributed
  • Consider backup windows for critical operations
Rollback Planning
  • Define rollback triggers and decision points
  • Test rollback procedures in non-production
  • Ensure ability to quickly revert changes
  • Maintain previous version backups

Communication Plan

Stakeholder Communication:
  Before Upgrade:
    - 1 week notice: High-level timeline and impact
    - 3 days notice: Detailed maintenance window
    - 1 day notice: Final confirmation and contacts

  During Upgrade:
    - Start notification: Maintenance begins
    - Progress updates: Every 30 minutes for major steps
    - Completion notice: Service restoration confirmation

  After Upgrade:
    - Success notification: Upgrade completion status
    - Issues summary: Any problems encountered
    - Next steps: Follow-up actions required

Upgrade Types

Controller Upgrades

Preparing for Controller Upgrade

Backup Current Configuration
# Export complete configuration
copilot config export --format yaml --output backup-$(date +%Y%m%d).yaml

# Backup certificates and keys
copilot cert export --all --output cert-backup-$(date +%Y%m%d)

# Document current version and settings
copilot version --detailed > version-info-$(date +%Y%m%d).txt
Pre-Upgrade Validation
# Verify configuration consistency
copilot config validate --comprehensive

# Check for deprecated features
copilot compatibility check --target-version 6.8.0

# Review upgrade prerequisites
copilot upgrade prereq --version 6.8.0

Controller Upgrade Process

Staged Upgrade Approach
  1. Standby Controller First (if HA deployment)
    • Upgrade standby controller
    • Verify functionality and health
    • Perform failover to upgraded controller
    • Upgrade original primary controller
  2. Single Controller Upgrade
    • Schedule maintenance window
    • Perform upgrade during low-traffic period
    • Monitor closely during upgrade process
Upgrade Commands
# Check available upgrades
copilot upgrade check

# Download upgrade package
copilot upgrade download --version 6.8.0

# Perform upgrade (with dry-run option)
copilot upgrade start --version 6.8.0 --dry-run

# Execute actual upgrade
copilot upgrade start --version 6.8.0 --confirm

Gateway Upgrades

Gateway Upgrade Strategy

Rolling Upgrade (Recommended)
  • Upgrade gateways one at a time or in small batches
  • Maintain service availability during upgrade process
  • Allows for immediate rollback if issues occur
  • Minimizes impact on production traffic
Parallel Upgrade
  • Upgrade multiple gateways simultaneously
  • Faster completion but higher risk
  • Requires careful capacity planning
  • Use only in well-tested environments

Gateway Upgrade Process

Pre-Upgrade Steps
# List all gateways and versions
copilot gateway list --show-version

# Check gateway health
copilot gateway health --name production-transit-gw

# Verify redundancy and failover capability
copilot gateway failover test --name production-transit-gw
Individual Gateway Upgrade
# Put gateway in maintenance mode
copilot gateway maintenance enable --name production-transit-gw

# Verify traffic has been redirected
copilot traffic status --gateway production-transit-gw

# Perform gateway upgrade
copilot gateway upgrade --name production-transit-gw --version 6.8.0

# Verify upgrade completion
copilot gateway status --name production-transit-gw

# Exit maintenance mode
copilot gateway maintenance disable --name production-transit-gw
Automated Rolling Upgrade
# Configure rolling upgrade parameters
copilot upgrade configure --strategy rolling --batch-size 2 --delay 10m

# Start rolling upgrade
copilot gateway upgrade --all --strategy rolling --version 6.8.0

# Monitor upgrade progress
copilot upgrade status --watch

Software and Feature Updates

Feature Flag Management

Enabling New Features
Feature Rollout Strategy:
  Phase 1:
    Development Environment - Enable feature in dev environment - Validate
    functionality and performance - Test integration with existing features

  Phase 2:
    Staging Environment - Enable feature in staging - Perform comprehensive
    testing - Validate with production-like data

  Phase 3:
    Production Rollout - Enable for small subset of users - Monitor performance
    and issues - Gradually expand to all users
Feature Configuration
# List available features
copilot features list --version 6.8.0

# Enable specific feature
copilot feature enable --name enhanced-analytics --environment staging

# Configure feature parameters
copilot feature configure --name enhanced-analytics --config analytics-config.yaml

# Monitor feature performance
copilot feature metrics --name enhanced-analytics --timeframe 24h

Upgrade Procedures by Component

Certificate Updates

Certificate Rotation

# Check certificate expiration
copilot cert list --expiring-days 30

# Generate new certificates
copilot cert generate --type gateway --duration 365d

# Deploy new certificates
copilot cert deploy --cert-id new-cert-001 --gateway production-transit-gw

# Verify certificate installation
copilot cert verify --gateway production-transit-gw

Certificate Authority Updates

  • Plan for CA certificate rotation
  • Update trust stores across all components
  • Coordinate with client certificate updates
  • Test certificate chain validation

Security Policy Updates

Policy Version Management

# Export current policies
copilot policy export --all --version current

# Import new policy version
copilot policy import --file updated-policies.yaml --version 6.8.0

# Test policy changes
copilot policy test --source test-vm --destination prod-db --new-version

# Apply policy updates
copilot policy apply --version 6.8.0 --strategy gradual

Integration Updates

Third-Party Integration Updates

  • SIEM Integration: Update connectors and data formats
  • Identity Providers: Verify SAML/OIDC compatibility
  • Monitoring Tools: Update API integrations and metrics
  • Backup Systems: Validate backup and restore procedures

Validation and Testing

Post-Upgrade Validation

Functional Testing

# Comprehensive connectivity test
copilot test connectivity --all-sites --comprehensive

# Performance baseline comparison
copilot test performance --baseline pre-upgrade-baseline.json

# Security policy validation
copilot test security --policy-compliance --all-rules

# User access verification
copilot test access --user-groups all --resources critical

Automated Test Suite

Post-Upgrade Test Suite:
  Connectivity Tests:
    - Gateway-to-gateway connectivity
    - User VPN connectivity
    - Site-to-site VPN functionality
    - Internet breakout functionality

  Security Tests:
    - Firewall rule enforcement
    - Intrusion prevention system
    - Certificate validation
    - Access control verification

  Performance Tests:
    - Latency measurements
    - Throughput testing
    - Resource utilization check
    - Scalability validation

  Integration Tests:
    - SIEM data flow
    - Monitoring alerts
    - Backup operations
    - API functionality

Performance Validation

Baseline Comparison

# Generate performance report
copilot performance report --compare-baseline pre-upgrade

# Check for performance regressions
copilot performance analyze --threshold 10% --timeframe 24h

# Monitor resource utilization changes
copilot metrics compare --before upgrade --after upgrade

Load Testing

  • Execute load tests similar to production patterns
  • Verify auto-scaling functionality
  • Test failover and recovery scenarios
  • Validate capacity limits and thresholds

Rollback Procedures

When to Rollback

Immediate Rollback Triggers
  • Critical functionality failures
  • Security vulnerabilities introduced
  • Severe performance degradation (>25% regression)
  • Data corruption or loss incidents
Planned Rollback Triggers
  • User acceptance criteria not met
  • Integration failures with critical systems
  • Unacceptable stability issues
  • Compliance or regulatory violations

Rollback Process

Controller Rollback

# Check rollback readiness
copilot rollback check --component controller

# Perform controller rollback
copilot rollback execute --component controller --target-version 6.7.5

# Verify rollback completion
copilot version verify --expected 6.7.5

Gateway Rollback

# Rollback specific gateway
copilot gateway rollback --name production-transit-gw --target-version 6.7.5

# Rollback all gateways
copilot gateway rollback --all --target-version 6.7.5 --strategy rolling

# Verify gateway functionality
copilot gateway test --name production-transit-gw --comprehensive

Configuration Rollback

# Restore previous configuration
copilot config restore --backup backup-20240115.yaml

# Verify configuration consistency
copilot config validate --comprehensive

# Test restored functionality
copilot test all --quick-validation

Best Practices

Pre-Upgrade Preparation

Testing Strategy
  • Test upgrades in non-production environments first
  • Use identical configurations and data patterns
  • Validate all integrations and dependencies
  • Document test results and performance baselines
Risk Mitigation
  • Maintain current backup and restore procedures
  • Verify rollback capabilities before starting upgrade
  • Plan for extended maintenance windows
  • Prepare emergency contact procedures

During Upgrade

Monitoring and Communication
  • Continuous monitoring of system health
  • Regular progress updates to stakeholders
  • Document any issues or deviations from plan
  • Be prepared to pause or rollback if needed
Change Control
  • Follow established change management procedures
  • Maintain detailed logs of all actions taken
  • Coordinate with other maintenance activities
  • Ensure proper approval for any deviations

Post-Upgrade

Validation and Monitoring
  • Extended monitoring period (24-48 hours minimum)
  • Comparison with baseline performance metrics
  • User acceptance testing and feedback collection
  • Documentation of lessons learned
Knowledge Transfer
  • Update operational procedures and documentation
  • Training for operations team on new features
  • Communication of changes to end users
  • Update disaster recovery and rollback procedures

Maintenance Windows

Planning Considerations

Impact Assessment
Service Impact Matrix:
  Critical Services (0-minute tolerance):
    - Financial trading systems
    - Emergency services
    - Real-time monitoring

  High Priority (15-minute tolerance):
    - Customer-facing applications
    - Production databases
    - Authentication services

  Standard Services (1-hour tolerance):
    - Internal applications
    - Development environments
    - Reporting systems
Window Scheduling
  • Coordinate across multiple time zones
  • Consider business calendar and peak periods
  • Allow buffer time for unexpected issues
  • Plan for potential window extension

Emergency Upgrades

Security Patch Deployment

# Emergency security patch process
copilot security patch check --critical

# Deploy critical security patches
copilot patch deploy --security --critical --emergency

# Verify patch installation
copilot security validate --patch-level current

Rapid Response Procedures

  • Abbreviated testing for critical security fixes
  • Emergency change approval processes
  • Accelerated communication procedures
  • Post-incident review and documentation

Upgrade Automation

Scripted Upgrades

Automation Framework

#!/bin/bash
# Automated upgrade script template

# Pre-upgrade checks
./pre-upgrade-check.sh || exit 1

# Backup current state
./backup-configuration.sh

# Perform upgrade
copilot upgrade start --version $TARGET_VERSION --automated

# Post-upgrade validation
./post-upgrade-test.sh

# Notification
./send-notification.sh "Upgrade completed successfully"

CI/CD Integration

  • Integration with deployment pipelines
  • Automated testing and validation
  • Rollback automation on failure
  • Integration with monitoring and alerting
For additional information and advanced upgrade scenarios, see: