Why Secure Kubernetes at the Network Fabric

Traditional firewalls match on source and destination IP addresses. Kubernetes breaks this model. Pods get ephemeral IPs that change on every restart, scale event, or rolling deployment — making IP-based rules meaningless within minutes. Securing Kubernetes traffic requires a shift from address-based policy to identity-based policy.

The Problem with IP-Based Security in Kubernetes

A pod running at 100.64.3.47 right now may be gone in 30 seconds. A completely different workload may inherit that same IP moments later. Firewall rules like “allow 100.64.3.47 to 10.0.1.5 on port 443” are stale before the next deployment.
  • Ephemeral IPs — Pods receive random IPs from a CIDR pool. Every restart, scale-up, or rolling update shuffles the assignments. Static rules cannot keep pace.
  • Overlapping CIDRs at scale — When multiple clusters use the same secondary CIDR (e.g., 100.64.0.0/16), the same IP address maps to entirely different workloads in different clusters. CIDR-based rules are ambiguous.
  • IPs do not represent identity — Kubernetes workloads are defined by namespace, labels, and service accounts. These are the primitives that express intent — not IP addresses.
IP overlap is a real operational problem, but it is a symptom. The deeper issue is that IP addresses do not represent workload identity in Kubernetes. Solving overlap without solving identity leaves security gaps.

External Firewalling: Don’t Touch the Cluster

Aviatrix enforces Kubernetes security at the network fabric — outside the cluster — using the spoke gateway as the policy enforcement point. This keeps the cluster clean:
  • Use the cloud-native CNI — The VPC CNI is battle-tested, AWS-supported, and deeply integrated with EC2 networking. Aviatrix works with it, not against it. No CNI replacement required.
  • No DaemonSets — In-cluster network policy engines (Calico, Cilium) require DaemonSets on every node — another component to patch, monitor, and troubleshoot inside the blast radius of the cluster itself.
  • No sidecars — Service mesh approaches (Istio, Linkerd) inject proxies into every pod, adding latency, memory overhead, and operational surface. For network-level segmentation, this is unnecessary.
  • Separation of concerns — The networking team manages firewall policy in the Aviatrix fabric. Application teams manage workloads in the cluster. Neither modifies the other’s domain. A platform engineer can enforce “namespace X can only talk to namespace Y” without ever running kubectl.
  • Consistent across clusters and clouds — The same DCF policies work whether the cluster is EKS, AKS, or GKE. Policy follows the architecture, not the Kubernetes distribution.
For a full overview of DCF capabilities for Kubernetes — including SmartGroup types and supported distributions — see Distributed Cloud Firewall for Kubernetes.

Platform Engineering Models

This architecture enables two common platform delivery patterns:
  • Namespace as a service — Multiple teams share a single EKS cluster, isolated by namespace. DCF uses Kubernetes-type SmartGroups to enforce per-namespace egress policy at the spoke gateway. Teams get self-service namespaces; the platform team controls what leaves the cluster.
  • Cluster as a service — Each tenant gets a dedicated EKS cluster in its own VPC with its own (potentially overlapping) pod CIDR. DCF uses VPC-type SmartGroups for cluster-level segmentation. Tenants get full cluster isolation; the platform team manages inter-cluster and egress policy centrally.
  • Multi-cluster, multi-cloud — Both models compose. An organization can run namespace-as-a-service in EKS and cluster-as-a-service in AKS, governed by a single set of DCF policies.

Overview

One of the practical challenges with EKS at scale is IP exhaustion: the VPC CNI plugin assigns one VPC IP address per pod, rapidly consuming routable RFC 1918 address space. A cluster running 500 pods consumes 500 IPs from your enterprise IP plan — and that is a single cluster. AWS documents this problem and a native solution in Addressing IPv4 address exhaustion in Amazon EKS clusters using private NAT gateways.
The AWS approach uses a private NAT gateway as the SNAT translation point between non-routable pod subnets and routable VPC subnets. When Aviatrix is the network fabric, the Aviatrix spoke gateway replaces the private NAT gateway — it becomes the SNAT translation point. This is simpler (no additional NAT gateway infrastructure), integrates with DCF for policy enforcement, and works consistently across multi-cloud environments.
This architecture supports both the namespace-as-a-service and cluster-as-a-service platform engineering models described above.
SmartGroup policies are enforced on cluster egress traffic only. Traffic between namespaces inside the cluster never reaches the Aviatrix fabric — use Kubernetes network policies, outside of Aviatrix, to restrict it.
This guide walks through end-to-end configuration: AWS VPC setup, EKS VPC CNI custom networking, Aviatrix custom SNAT policies, transit route exclusion, and DCF SmartGroup considerations.
A complete Terraform reference implementation is available in the AWS EKS Multi-Cluster Blueprint.

Routable vs. Non-Routable Address Space

The EKS VPC uses two CIDR blocks with distinct roles:
| Layer | CIDR Source | Routable? | Purpose |
| --- | --- | --- | --- |
| Primary CIDR | Enterprise IP plan (e.g., 10.10.0.0/23) | Yes — unique, advertised across the fabric | Cluster ingress, EKS control plane ENIs, node management, Aviatrix gateway |
| Secondary CIDR | RFC 6598 space (e.g., 100.64.0.0/16) | No — never advertised, can overlap across VPCs | Pod networking via VPC CNI custom networking |
The primary CIDR is intentionally small — it only needs to cover infrastructure, not pods. A /23 (512 IPs) is often sufficient. The heavy lifting moves to the secondary CIDR, which provides tens of thousands of pod IPs without consuming enterprise address space.
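The sizing arithmetic can be sanity-checked with Python's ipaddress module. This is a quick illustration using the example CIDRs from the table above, not part of the deployment:

```python
import ipaddress

# Primary CIDR: small, routable, infrastructure only (example value from the table)
primary = ipaddress.ip_network("10.10.0.0/23")
print(primary.num_addresses)  # 512

# Secondary CIDR: large, non-routable, dedicated to pod networking
secondary = ipaddress.ip_network("100.64.0.0/16")
print(secondary.num_addresses)  # 65536

# The secondary CIDR sits inside the RFC 6598 shared address range
print(secondary.subnet_of(ipaddress.ip_network("100.64.0.0/10")))  # True
```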

What Lives on the Routable (Primary) CIDR

  • Cluster ingress — Load balancer ENIs (ALB, NLB, or any ingress controller) sit in routable subnets so they are reachable from other VPCs, on-prem, or the internet.
  • EKS control plane ENIs — The managed cross-account ENIs that EKS places in your VPC for API server connectivity. These must be in routable subnets so nodes and pods can reach the Kubernetes API.
  • EKS nodes — Node EC2 instances launch in routable private subnets. Their primary ENI gets a routable IP; secondary ENIs (for pods) are placed in non-routable subnets via ENIConfig.
  • Aviatrix spoke gateway — The gateway ENIs need routable IPs since they are the SNAT translation point.

What Lives on the Non-Routable (Secondary) CIDR

  • Pod ENIs — Every pod gets an IP from the 100.64.0.0/16 secondary CIDR via VPC CNI custom networking. These IPs are never exposed directly outside the VPC — all egress is SNATed by the Aviatrix spoke gateway.

Traffic Flow

[Diagram: EKS pod traffic flow through the Aviatrix spoke gateway with SNAT]

Step 1: AWS VPC Configuration

1.1 Add a Secondary CIDR Block

Attach a secondary CIDR from RFC 6598 space (100.64.0.0/10) to the EKS VPC. A /16 provides 65,536 pod IPs — sufficient for most clusters.
resource "aws_vpc_ipv4_cidr_block_association" "secondary" {
  vpc_id     = aws_vpc.this.id
  cidr_block = "100.64.0.0/16"
}
This CIDR is non-routable by design. It can safely overlap with the same secondary CIDR in other VPCs because it is never advertised across the Aviatrix fabric.

1.2 Create Pod Subnets from the Secondary CIDR

Create one pod subnet per Availability Zone from the secondary CIDR:
  • AZ-a: 100.64.0.0/17 (32,768 IPs)
  • AZ-b: 100.64.128.0/17 (32,768 IPs)
These subnets must depend on the aws_vpc_ipv4_cidr_block_association resource.
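A sketch of the per-AZ pod subnets in Terraform. Resource names and AZ values here are illustrative; the explicit depends_on enforces the ordering described above:

```hcl
resource "aws_subnet" "pods" {
  for_each = {
    "us-east-2a" = "100.64.0.0/17"
    "us-east-2b" = "100.64.128.0/17"
  }

  vpc_id            = aws_vpc.this.id
  availability_zone = each.key
  cidr_block        = each.value

  # The secondary CIDR must be associated with the VPC before
  # subnets can be carved from it
  depends_on = [aws_vpc_ipv4_cidr_block_association.secondary]
}
```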

1.3 Subnet Layout

| Subnet | CIDR Source | Routable? | Purpose |
| --- | --- | --- | --- |
| Aviatrix gateway (x2 AZs) | Primary | Yes | Spoke gateway ENIs |
| Load balancer / ingress (x2 AZs) | Primary | Yes | ALB, NLB, or any ingress controller ENIs |
| Infrastructure / nodes (x2 AZs) | Primary | Yes | EKS nodes, control plane ENIs |
| Pod networking (x2 AZs) | Secondary | No | Pod ENIs via VPC CNI custom networking |
The load balancer subnets should be tagged appropriately for your ingress controller. For the AWS Load Balancer Controller:
  • Public-facing: kubernetes.io/role/elb = 1
  • Internal-facing: kubernetes.io/role/internal-elb = 1
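As a sketch, these tags can be applied directly on the load balancer subnet resources. Resource names and CIDR slices are illustrative:

```hcl
resource "aws_subnet" "ingress_public" {
  vpc_id     = aws_vpc.this.id
  cidr_block = "10.10.0.0/27"   # illustrative slice of the primary CIDR

  tags = {
    # Discovered by the AWS Load Balancer Controller for internet-facing LBs
    "kubernetes.io/role/elb" = "1"
  }
}

resource "aws_subnet" "ingress_internal" {
  vpc_id     = aws_vpc.this.id
  cidr_block = "10.10.0.32/27"

  tags = {
    # Discovered for internal-facing LBs
    "kubernetes.io/role/internal-elb" = "1"
  }
}
```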

Step 2: EKS VPC CNI Configuration

2.1 Enable Custom Networking on the vpc-cni Addon

Three environment variables control the behavior:
| Variable | Value | Purpose |
| --- | --- | --- |
| AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG | "true" | Pods use ENIConfig-specified subnets instead of the node’s subnet |
| ENI_CONFIG_LABEL_DEF | "topology.kubernetes.io/zone" | CNI selects the ENIConfig matching the node’s AZ label |
| AWS_VPC_K8S_CNI_EXTERNALSNAT | "true" | Disables CNI-level SNAT — delegates NAT to Aviatrix |
addons = {
  vpc-cni = {
    most_recent = true
    configuration_values = jsonencode({
      env = {
        AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG = "true"
        ENI_CONFIG_LABEL_DEF               = "topology.kubernetes.io/zone"
        AWS_VPC_K8S_CNI_EXTERNALSNAT       = "true"
      }
    })
  }
}
AWS_VPC_K8S_CNI_EXTERNALSNAT is critical. By default, the VPC CNI performs SNAT on pod egress, translating the pod IP to the node’s primary IP. This must be disabled so packets leave the node with their 100.64.x.x source intact. The Aviatrix spoke gateway then performs SNAT at the network edge — this is what replaces the AWS private NAT gateway.

2.2 Create ENIConfig Custom Resources

One ENIConfig per AZ, created before node groups launch:
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-2a            # Must match the AZ name
spec:
  subnet: subnet-0abc123...   # Pod subnet ID (100.64.x.x) in this AZ
  securityGroups:
    - sg-0def456...            # Pod security group
If nodes launch before ENIConfigs exist, pods will incorrectly receive IPs from the node’s primary (routable) subnet.

2.3 Pod Security Group

Create a dedicated security group for pod ENIs with the following rules:
Ingress:
  • All traffic from node security group (node-to-pod communication)
  • All traffic from self (pod-to-pod within cluster)
  • TCP 9443 from EKS cluster primary security group (control plane webhook callbacks)
Egress:
  • All traffic to 0.0.0.0/0
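The rules above might be sketched in Terraform as follows. The node security group and EKS module references are assumptions; adapt them to your module layout:

```hcl
resource "aws_security_group" "pod" {
  name   = "eks-pod-eni"
  vpc_id = aws_vpc.this.id
}

# Ingress: all traffic from the node security group (node-to-pod)
resource "aws_vpc_security_group_ingress_rule" "pod_from_nodes" {
  security_group_id            = aws_security_group.pod.id
  referenced_security_group_id = aws_security_group.node.id   # assumed node SG
  ip_protocol                  = "-1"
}

# Ingress: all traffic from self (pod-to-pod within the cluster)
resource "aws_vpc_security_group_ingress_rule" "pod_from_self" {
  security_group_id            = aws_security_group.pod.id
  referenced_security_group_id = aws_security_group.pod.id
  ip_protocol                  = "-1"
}

# Ingress: TCP 9443 from the cluster primary SG (control plane webhook callbacks)
resource "aws_vpc_security_group_ingress_rule" "pod_from_cluster" {
  security_group_id            = aws_security_group.pod.id
  referenced_security_group_id = module.eks.cluster_primary_security_group_id
  ip_protocol                  = "tcp"
  from_port                    = 9443
  to_port                      = 9443
}

# Egress: all traffic to anywhere
resource "aws_vpc_security_group_egress_rule" "pod_all" {
  security_group_id = aws_security_group.pod.id
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "-1"
}
```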

2.4 Cluster Management Access from Pods

Pods on the non-routable secondary CIDR (100.64.x.x) need to reach the EKS API server for normal Kubernetes operations — service account token refresh, watch streams, operator controllers, etc. This works because the control plane ENIs and pod ENIs are in the same VPC, even though they are on different CIDRs. The key security group rule is:
# Allow pods (on 100.64.x.x) to reach the EKS control plane (on 10.x.x.x)
resource "aws_vpc_security_group_ingress_rule" "cluster_from_pods" {
  security_group_id            = module.eks.cluster_primary_security_group_id
  referenced_security_group_id = aws_security_group.pod.id
  ip_protocol                  = "-1"
  description                  = "Allow pods to reach EKS control plane"
}
This rule opens the cluster’s primary security group (attached to control plane ENIs) to all traffic from the pod security group (attached to pod ENIs via ENIConfig). Because security group rules are evaluated at the ENI level regardless of subnet CIDR, the cross-CIDR reference works — pods on 100.64.x.x can reach control plane ENIs on 10.x.x.x within the VPC. The reciprocal rule (in section 2.3 above) allows the control plane to call back to pod webhooks on port 9443. This is required for admission controllers like the AWS Load Balancer Controller or cert-manager.
No SNAT is involved in this path — traffic stays entirely within the VPC and does not traverse the Aviatrix gateway.

2.5 Cluster Ingress on the Routable CIDR

Ingress load balancers must be placed in routable subnets (from the primary CIDR) so they are reachable from outside the VPC. The pods behind them are on the non-routable secondary CIDR — the load balancer bridges the gap.
Target type must be ip regardless of which ingress controller you use. With CNI-level SNAT disabled, the instance target type (which routes through kube-proxy on the node) does not work correctly. The ip target type registers pod IPs (100.64.x.x) directly with the load balancer, and the load balancer reaches them within the VPC.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
AWS Load Balancer Controller caveat: When pods run on the secondary CIDR, they cannot reach EC2 instance metadata (IMDS) to auto-detect VPC ID and region. The controller must be configured with explicit values:
--set vpcId=<vpc-id>
--set region=<aws-region>
This caveat is specific to the AWS Load Balancer Controller. Other ingress controllers may not have this dependency.
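One way to pin these values is through the Terraform helm provider. This is a sketch: the chart repository and value names follow the controller's public Helm chart, but verify them against the controller's install documentation, and the module/region references are placeholders:

```hcl
resource "helm_release" "aws_load_balancer_controller" {
  name       = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"
  namespace  = "kube-system"

  set {
    name  = "clusterName"
    value = module.eks.cluster_name
  }

  # Required when pods on the secondary CIDR cannot reach IMDS
  # to auto-detect the VPC ID and region
  set {
    name  = "vpcId"
    value = aws_vpc.this.id
  }
  set {
    name  = "region"
    value = "us-east-2"   # illustrative
  }
}
```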

Step 3: Aviatrix Spoke Gateway — Custom SNAT

This is where Aviatrix replaces the AWS private NAT gateway. Instead of deploying a NAT gateway in the VPC and configuring route tables to point non-routable subnets at it, the Aviatrix spoke gateway performs the same SNAT function — and integrates with DCF for policy enforcement.

3.1 Do NOT Enable single_ip_snat

The spoke gateway must not use single_ip_snat = true. That mode applies blanket NAT and cannot be combined with policy-based rules. Leave it at the default (false).

3.2 Configure Custom SNAT Policies

Create an aviatrix_gateway_snat resource with three policies:
Policy 1 — Pod traffic toward the transit (east-west to other VPCs, on-prem, etc.)
snat_policy {
  src_cidr   = "100.64.0.0/16"
  dst_cidr   = "0.0.0.0/0"
  protocol   = "all"
  connection = "<transit-gateway-name>"
  snat_ips   = "<spoke-gateway-private-ip>"
}
Policy 2 — Pod traffic toward the internet (north-south)
snat_policy {
  src_cidr   = "100.64.0.0/16"
  dst_cidr   = "0.0.0.0/0"
  protocol   = "all"
  interface  = "eth0"
  snat_ips   = "<spoke-gateway-private-ip>"
}
Policy 3 — Node traffic toward the internet
Since single_ip_snat is disabled, nodes also lose default internet NAT. Add a policy per infrastructure subnet:
snat_policy {
  src_cidr   = "10.10.0.160/26"   # Node subnet
  dst_cidr   = "0.0.0.0/0"
  protocol   = "all"
  interface  = "eth0"
  snat_ips   = "<spoke-gateway-private-ip>"
}

3.3 Field Notes

  • connection and interface are mutually exclusive — when specifying connection, do not set interface, and vice versa.
  • snat_ips is the spoke gateway’s primary private IP — the routable VPC address.
  • Custom SNAT policies must be configured on each gateway individually. If the spoke has an HA gateway, create a separate aviatrix_gateway_snat resource targeting the HA gateway with its own private IP as the snat_ips value.
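A sketch of the HA-gateway counterpart. The resource and attribute names follow the primary-gateway example above; the HA gateway name, snat_mode value, and IPs are placeholders to verify against the Aviatrix provider documentation:

```hcl
resource "aviatrix_gateway_snat" "spoke_ha" {
  gw_name   = "<spoke-gateway-name>-hagw"   # HA gateway name (placeholder)
  snat_mode = "customized_snat"             # assumed mode for policy-based SNAT

  # Repeat the same three policies as the primary gateway, but translate
  # to the HA gateway's own private IP
  snat_policy {
    src_cidr   = "100.64.0.0/16"
    dst_cidr   = "0.0.0.0/0"
    protocol   = "all"
    connection = "<transit-gateway-name>"
    snat_ips   = "<ha-gateway-private-ip>"
  }
}
```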

Step 4: Aviatrix Transit Gateway — Route Exclusion

The transit must not advertise the non-routable pod CIDR to other spokes or on-prem:
excluded_advertised_spoke_routes = "100.64.0.0/16"
Without this, the spoke advertises 100.64.0.0/16 into the transit routing domain. If multiple clusters use the same secondary CIDR, this creates a routing conflict. Even with a single cluster, advertising a non-routable CIDR is unnecessary — all pod traffic is SNATed to a routable address before entering the fabric.
Multiple CIDRs can be comma-separated: "100.64.0.0/16,100.65.0.0/16".
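In Terraform this setting lives on the transit gateway resource. A minimal sketch, assuming the attribute name matches the Aviatrix provider's aviatrix_transit_gateway schema:

```hcl
resource "aviatrix_transit_gateway" "transit" {
  # ...existing transit gateway configuration...

  # Never advertise the non-routable pod CIDR(s) to other spokes or on-prem
  excluded_advertised_spoke_routes = "100.64.0.0/16,100.65.0.0/16"
}
```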

Step 5: Aviatrix DCF and SmartGroup Considerations

How DCF Handles SNAT

DCF evaluates traffic at the spoke gateway — the entry point into the Aviatrix fabric. Transit gateways are forwarders and do not perform policy evaluation. The DCF engine is NAT-aware and automatically accounts for SNAT translations when programming policies. You do not need to manually adjust rules for pre- vs. post-SNAT addresses.

Use Dynamic-Type SmartGroups

Because the pod CIDR is non-routable and potentially overlapping across VPCs, CIDR-based SmartGroups matching 100.64.0.0/16 are ambiguous — DCF cannot distinguish which cluster a 100.64.x.x address belongs to. Use dynamic-type SmartGroups that resolve to identity:
| Dynamic Type | Use Case |
| --- | --- |
| VPC | Cluster-level / VPC-level segmentation |
| Kubernetes | Workload-level segmentation by namespace, label, or service account |
resource "aviatrix_smart_group" "eks_cluster" {
  name = "sg-eks-cluster"
  selector {
    match_expressions {
      type = "vpc"
      name = "eks-production"
    }
  }
}

Prerequisites for Kubernetes-Type SmartGroups

EKS clusters must be onboarded to the Aviatrix Controller via aviatrix_kubernetes_cluster. The Controller needs read access to the cluster (EKS access entries with view permissions) to discover pod, namespace, and label metadata.
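Once a cluster is onboarded, a namespace-scoped SmartGroup might look like the following sketch. The selector key names for Kubernetes-type matches are assumptions — confirm them against the aviatrix_smart_group provider documentation before use:

```hcl
resource "aviatrix_smart_group" "payments_namespace" {
  name = "sg-eks-payments"
  selector {
    match_expressions {
      type      = "k8s"        # Kubernetes-type match (key name assumed)
      namespace = "payments"   # match all workloads in this namespace (key name assumed)
    }
  }
}
```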